Keywords:
Capstone Project; Artificial Intelligence; LaTeX code; End-to-end; Attention mechanism; SP24AI12; Image conversion
Issue Date: 2024
Publisher: FPTU HCM
Abstract:
Recognizing mathematical formulas in images and translating them into LaTeX sequences, both printed and handwritten, is challenging due to the complexity of two-dimensional formulas and a lack of training data. Traditional methods can only handle simple formulas and are not effective for complex ones. In this paper, we introduce the Sumen (Scaling Up Image-to-LaTeX Performance) model, a Transformer-based encoder-decoder architecture with an attention mechanism, trained on the largest dataset compiled from previous works. The model achieves a BLEU score of 95.59, an Edit Distance (ED) of 97.3, and an Exact Match (EM) of 69.23 on the img2latex100k benchmark. On the CROHME 2014/2016/2019 benchmarks, the Expression Recognition Rates (ExpRate) are 58.01/82.39/78.99 and the Word Error Rates (WER) are 9.46/2.55/4.51. All of our metrics outperform state-of-the-art methods on both printed and handwritten formulas.
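
For orientation only, the sketch below shows what a Transformer-based encoder-decoder of this kind looks like in code: formula-image patches are embedded and encoded, and a decoder cross-attends to them while generating LaTeX tokens autoregressively. It is a minimal PyTorch illustration under assumed settings (convolutional patch embedding, arbitrary model sizes and vocabulary); it is not the authors' Sumen implementation or training setup.

import torch
import torch.nn as nn

class Image2LatexSketch(nn.Module):
    """Minimal encoder-decoder Transformer for image-to-LaTeX (illustrative only)."""

    def __init__(self, vocab_size=1000, d_model=256, nhead=8,
                 num_layers=4, patch=16, img_size=224):
        super().__init__()
        num_patches = (img_size // patch) ** 2
        # Split the formula image into patches and project each to d_model.
        self.patch_embed = nn.Conv2d(3, d_model, kernel_size=patch, stride=patch)
        self.pos_enc = nn.Parameter(torch.zeros(1, num_patches, d_model))
        self.tok_embed = nn.Embedding(vocab_size, d_model)
        # Encoder self-attends over image patches; decoder cross-attends to them
        # while predicting the next LaTeX token.
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, images, latex_tokens):
        # images: (B, 3, H, W); latex_tokens: (B, T) token ids of the LaTeX sequence
        patches = self.patch_embed(images).flatten(2).transpose(1, 2)  # (B, N, d_model)
        src = patches + self.pos_enc
        tgt = self.tok_embed(latex_tokens)
        # Causal mask so each position only attends to earlier LaTeX tokens.
        causal = self.transformer.generate_square_subsequent_mask(latex_tokens.size(1))
        out = self.transformer(src, tgt, tgt_mask=causal)
        return self.lm_head(out)  # (B, T, vocab_size) logits over LaTeX tokens

# Quick shape check with random inputs.
model = Image2LatexSketch()
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 1000, (2, 20)))
print(logits.shape)  # torch.Size([2, 20, 1000])

At inference time such a model would be decoded token by token (greedy or beam search), feeding each predicted LaTeX token back into the decoder until an end-of-sequence symbol is produced.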