Capstone Project; Artificial Intelligence; Bilingual; Hybrid; Text to Speech; SP24AI08; Question-Answering System; Natural Speech
Issue Date:
2024
Publisher:
FPTU HCM
Abstract:
This project introduces a robust bilingual question-answering and natural Text-to-Speech (TTS) system, primarily designed to support both Vietnamese and English. Leveraging pre-trained Large Language Models (LLMs), the system enables seamless interaction in both the text and speech domains. Key components include a Retrieval-Augmented Generation (RAG) pipeline for efficient information retrieval, an LLM module for response generation, a TTS module for human-like speech synthesis, and a user-friendly demo web application. The proposed solution spans several stages, including data collection, RAG framework development, fine-tuning of LLMs, and rigorous TTS evaluation using metrics and user surveys. Notable achievements of the project include the development of "T-LLama", a 7-billion-parameter bilingual QA LLM ranked among the top 5 on the VMLU Leaderboard; the implementation of a bilingual RAG system; the creation of "viXTTS", the first LLM-based Vietnamese TTS model with voice cloning and multilingual capability through transfer learning; and the assembly of "viVoice", a large Vietnamese speech dataset with over 1000 hours of audio.