A custom ASR model to transcribe the speech of Kabjye Dilgo Khyentse Rinpoche

Training from scratch (that is, fine-tuning from the base checkpoint rather than resuming from the previously fine-tuned one) on the combined old and new datasets gives balanced learning and avoids overfitting to patterns baked into the earlier checkpoint. It leverages the base model's generalizability, weights all data equally, and removes bias toward the earlier fine-tuning data. Because the dataset is small (under 50 hours), this approach is fast and cheap (under $10 of compute), and it gives a fresh optimization run, which tends to converge better and leaves more flexibility for future fine-tuning.
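A minimal sketch of the data-combining step described above, using plain Python with hypothetical sample lists (the actual dataset format, file names, and split ratio are assumptions, not from the original setup). The point is that the old and new recordings are merged and shuffled into one pool before fine-tuning from the base checkpoint, so neither subset is favored:

```python
import random

# Hypothetical placeholders for the two datasets: each item is an
# (audio_path, transcript) pair. Sizes are illustrative only.
old_data = [(f"old_{i}.wav", f"old transcript {i}") for i in range(30)]
new_data = [(f"new_{i}.wav", f"new transcript {i}") for i in range(20)]

# Merge and shuffle so both datasets are weighted equally during training,
# instead of resuming from a checkpoint already biased toward old_data.
combined = old_data + new_data
random.Random(42).shuffle(combined)  # fixed seed for reproducibility

# Assumed 90/10 train/eval split.
train_cut = int(0.9 * len(combined))
train_set, eval_set = combined[:train_cut], combined[train_cut:]

print(len(train_set), len(eval_set))  # prints "45 5"

# Fine-tuning would then start from the *base* pretrained ASR checkpoint
# (not the previously fine-tuned one) on train_set, e.g. via the usual
# Hugging Face Seq2SeqTrainer workflow for a Whisper-style model.
```

Shuffling the merged pool (rather than concatenating old after new) matters: it interleaves both sources within every training epoch, which is what "treats all data equally" amounts to in practice.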
