Attendees: [@DavidYesheNyima - Champion, @Ganga_Gyatso - Developer, @Lhakpa_Wangyal - Coordinator]
Date: September 02, 2025
Agenda:
- SRT files and associated technical issues
- Transcription Quality
- General aspects of the project, such as work progress, additional training of HEGR’s STT model with
Key Discussion Points
- Mr. Ganga Gyatso discussed the .srt files generated for the audio training data and shared the Google Drive folder containing them [Garchen Rinpoche SRT files]. He also discussed in detail the extra gaps and overlapping timecodes, with reference to his article on the OpenPecha forum.
- With regard to transcription quality, Mr. Ganga Gyatso addressed the concerns raised earlier by Mr. David. In particular, regarding the nature of the transcription, he emphasised that we have been producing verbatim transcriptions so that fillers, non-speech sounds, etc. can be cleaned up by the LLM.
- Later, Mr. Ganga Gyatso demonstrated the model he had trained with wav2vec2 and suggested training Garchen Rinpoche’s model with Whisper.
- Mr. David suggested continuing the model training for better results.
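
The extra gaps and overlapping timecodes mentioned above can be checked automatically. The following is a minimal sketch only, not the method described in the forum article: it assumes standard SRT timecode lines (`HH:MM:SS,mmm --> HH:MM:SS,mmm`), and the function names and the 2-second `max_gap` threshold are hypothetical choices made for illustration.

```python
import re
from datetime import timedelta

# Matches an SRT timecode line, e.g. "00:00:01,000 --> 00:00:04,000".
TIMECODE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def parse_cues(srt_text):
    """Return a list of (start, end) timedeltas, one per cue, in file order."""
    cues = []
    for match in TIMECODE.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
        start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
        end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
        cues.append((start, end))
    return cues


def find_timing_issues(cues, max_gap=timedelta(seconds=2)):
    """Compare each cue with the next one and report overlaps and long gaps.

    max_gap is an assumed threshold, not a value agreed in the meeting.
    Cue numbers in the result are 1-based positions in the file.
    """
    issues = []
    for i in range(len(cues) - 1):
        end, next_start = cues[i][1], cues[i + 1][0]
        if next_start < end:
            # The next cue starts before the current one has ended.
            issues.append(("overlap", i + 1, i + 2, end - next_start))
        elif next_start - end > max_gap:
            # Silence between cues is longer than the allowed threshold.
            issues.append(("gap", i + 1, i + 2, next_start - end))
    return issues


sample = (
    "1\n00:00:00,000 --> 00:00:03,000\nline one\n\n"
    "2\n00:00:02,500 --> 00:00:05,000\nline two\n\n"
    "3\n00:00:09,000 --> 00:00:10,000\nline three\n"
)
for kind, a, b, amount in find_timing_issues(parse_cues(sample)):
    print(f"{kind} between cue {a} and cue {b}: {amount.total_seconds():.3f}s")
```

Running a check like this over each file in the shared folder would give a list of cue pairs to inspect by hand; the threshold would need to be tuned to the pauses that are normal in the recordings.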
Action Items
- @DavidYesheNyima [will provide another 30 minutes of Garchen Rinpoche audio suitable for benchmarking] - Due: [ASAP]
- @Ganga_Gyatso [will give an update on the reviewer for the transcription] - Due: [after communicating with Mr. Ngawang Trinley]
- @Ganga_Gyatso [will discuss transcript cleanup and translation with his colleague (Mr. Gayche)] - Due: [ASAP]
Decisions Made
- Review the transcripts and audio files.
- Continue the model training for better results.