AI-powered transcription of Garchen Rinpoche’s speech.
SUMMARY | |
---|---|
Version | 1.1 |
Purpose | Build a training dataset and train a custom Speech-to-Text model that accurately transcribes Garchen Rinpoche’s oral teachings, making them accessible and searchable. |
Champion(s) | @DavidYesheNyima @Lhakpa_Wangyal |
Communication | Discord Channel, GitHub Team, Google Calendar |
Documentation | PRD (Requirements), GitHub Project Board, Visuals, Meeting Minutes |
Proposal Details
1. Problem Statement / Motivation
Build a training dataset and train a custom Speech-to-Text model that accurately transcribes Garchen Rinpoche’s oral teachings, making them accessible and searchable in multiple languages.
2. Scope
- In Scope:
  - 000 hours of recorded teachings
  - Custom STT model for GR
  - 000 hours post-corrected
  - Translations in 00 languages
- Out of Scope:
  - …
3. Potential Deliverables
A list of tangible things the SIG might produce.
- STT discovery phase: determine whether fine-tuning a model on GR’s speech improves performance, and how much data is needed for a given gain in performance.
- MT discovery phase: …
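One way the STT discovery question could be measured is sketched below. It assumes an edit-distance error rate over tokens, where the token unit (words, syllables, or characters) is a choice the SIG would still need to make. It then compares a baseline model against a fine-tuned one. The function names `edit_distance`, `error_rate`, and `relative_gain` are hypothetical helpers, not part of any existing project code, and the sample transcripts are made-up toy data.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: Sequence[Sequence[str]], hyps: Sequence[Sequence[str]]) -> float:
    """Corpus-level error rate: total edits divided by total reference tokens."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_tokens = sum(len(r) for r in refs)
    if total_tokens == 0:
        raise ValueError("reference transcripts contain no tokens")
    return total_edits / total_tokens


def relative_gain(baseline: float, finetuned: float) -> float:
    """Relative error reduction of the fine-tuned model over the baseline."""
    return (baseline - finetuned) / baseline


# Toy data (made up): one reference transcript and two model outputs.
references = [["a", "b", "c", "d"]]
baseline_output = [["a", "x", "c"]]
finetuned_output = [["a", "b", "c"]]

baseline_err = error_rate(references, baseline_output)
finetuned_err = error_rate(references, finetuned_output)
print(f"baseline={baseline_err:.2f} finetuned={finetuned_err:.2f} "
      f"gain={relative_gain(baseline_err, finetuned_err):.2f}")
```

Repeating this comparison for models fine-tuned on increasing amounts of post-corrected audio would give the data-versus-gain curve that the discovery phase asks about.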
4. Team
Members:
- @DavidYesheNyima - Champion
- @Lhakpa_Wangyal - Coordinator
- @Ganga_Gyatso - Developer
Annotators:
- Kunchok Gawa @kunchok73
- Karma Tsepak @Karmatsepak7
- Jampa Lobsang @jamluv227
- Kalsang Thardoe @K-Thardoe
SIG Meeting Calendar
- Meetings: 2nd Tuesday of each month, 16:00 UTC
- Meeting Minutes