Business Administration; Data Mining; Bank; Ho Chi Minh Stock Exchange
Issue Date:
2021
Publisher:
FPTU Hà Nội
Abstract:
This thesis aims to identify and apply data mining techniques to predict the share prices of
banks listed on HOSE. In this study, the authors divided banks into three groups (large,
medium, and small banks) in order to test the performance of the forecasting programs.
For each group, one representative bank share was selected for the test.
The representative of the large banks listed on HOSE is BIDV (stock code BID), the medium
one is Vietnam Technological and Commercial Joint Stock Bank (stock code TCB), and the
small one is Tien Phong Commercial Joint Stock Bank (stock code TPB).
The input data used in the program are textual data and numerical data. The textual data set
is large, comprising 23,879 articles published on four official websites: cafef.vn,
thanhnien.vn, vnexpress.net, and StockBiz.vn. The numerical data are collected from the
historical stock prices of BID, TCB, and TPB. After being downloaded from the
above-mentioned websites, the articles published on the same day are labelled and
classified as good news, moderate news, or bad news based on the next day's stock price
movement. The data then undergo pre-processing and are split into two parts in a 70%:30%
proportion. The first 70% of the data is used for model training and the remaining 30% for
testing program performance.
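The labelling and splitting steps described above can be sketched as follows. This is only a minimal illustration, not the authors' program: the abstract does not say how "moderate" news is delimited, so the `threshold` value is an assumption, and the function names `label_by_next_day_move` and `split_train_test` are hypothetical. The split is taken in order (the first 70% of rows for training), which is one reading of "the first 70% of data".

```python
from typing import List, Tuple


def label_by_next_day_move(closes: List[float], threshold: float = 0.005) -> List[str]:
    """Label each day by the relative change from its close to the next day's close.

    `threshold` is an assumed cut-off; the source does not state one.
    The last day has no next-day price, so it receives no label.
    """
    labels = []
    for today, tomorrow in zip(closes, closes[1:]):
        change = (tomorrow - today) / today
        if change > threshold:
            labels.append("good")
        elif change < -threshold:
            labels.append("bad")
        else:
            labels.append("moderate")
    return labels


def split_train_test(rows: list, train_percent: int = 70) -> Tuple[list, list]:
    """Split rows in order: the first `train_percent`% for training, the rest for testing."""
    cut = len(rows) * train_percent // 100
    return rows[:cut], rows[cut:]


# Placeholder closing prices, not real BID, TCB, or TPB data.
closes = [100.0, 102.0, 101.9, 99.0, 99.0, 100.0, 100.2, 98.0, 98.1, 99.5, 99.4]
labels = label_by_next_day_move(closes)
train, test = split_train_test(labels)
```

With eleven placeholder prices, ten labels are produced, seven of which fall in the training part and three in the testing part.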
Based on the results of the models, the best accuracy level is reported for each bank share
representing the three groups of banks. In addition, the authors selected the website that
provides the most effective information affecting investors' sentiment to buy, sell, or
hold bank shares. Recognizing that using the information extracted from all four websites
together would reduce the overall efficiency of the program, the authors decided to take
data only from the one website that provides the highest accuracy, and then applied
information from this website to run the programs. In data mining, four models (SVM,
Random Forest, Decision Tree, and K-Nearest Neighbors) are applied, and only one model is
selected for each bank share. The research results show that SVM is selected for the BID
and TPB shares, while Random Forest performs best for TCB.
The results also show that the highest prediction accuracy reaches 52.94%, and that
Stockbiz.vn works best for BID, Cafef.vn for TCB, and Vnexpress.net for TPB.
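The selection rule described above (keep, for each bank share, the model and website with the highest test accuracy) can be sketched as follows. This is an illustrative assumption about how such a comparison might be coded, not the authors' implementation. The names `accuracy` and `best_candidate`, and the keys `model_a`, `model_b`, `site_a`, and `site_b`, are hypothetical, and the labels are placeholder values rather than results from the thesis.

```python
from typing import Dict, List, Tuple

Candidate = Tuple[str, str]  # (model name, website name)


def accuracy(predicted: List[str], actual: List[str]) -> float:
    """Fraction of test labels that the predictions match exactly."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


def best_candidate(
    predictions: Dict[Candidate, List[str]], actual: List[str]
) -> Tuple[Candidate, float]:
    """Return the (model, website) pair with the highest accuracy, and that accuracy."""
    scores = {key: accuracy(pred, actual) for key, pred in predictions.items()}
    winner = max(scores, key=scores.get)
    return winner, scores[winner]


# Placeholder test labels and predictions for one bank share.
actual = ["good", "bad", "moderate", "good"]
predictions = {
    ("model_a", "site_a"): ["good", "bad", "bad", "good"],
    ("model_b", "site_a"): ["good", "good", "good", "good"],
    ("model_a", "site_b"): ["bad", "bad", "moderate", "bad"],
}
winner, score = best_candidate(predictions, actual)
```

Running the same comparison once per bank share gives one selected model and one selected website for each share.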