- 14 May, 2024 4 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 13 May, 2024 5 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 12 May, 2024 2 commits
-
-
Fangjun Kuang authored
-
* Install naudiodon2 manually. It is needed only when using a microphone. The CI tests don't need it.
Fangjun Kuang authored
-
- 11 May, 2024 3 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 10 May, 2024 4 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 09 May, 2024 2 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 08 May, 2024 2 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 07 May, 2024 2 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 06 May, 2024 3 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Update to match the changes in infer_sv.py at 3D-Speaker. Add two more supported models and the "zh_en" language.
chiiyeh authored
-
- 04 May, 2024 1 commit
-
-
Fangjun Kuang authored
-
- 03 May, 2024 1 commit
-
-
Fangjun Kuang authored
-
- 01 May, 2024 1 commit
-
-
Fangjun Kuang authored
-
- 29 Apr, 2024 1 commit
-
-
Fangjun Kuang authored
-
- 28 Apr, 2024 1 commit
-
-
Fangjun Kuang authored
-
- 26 Apr, 2024 8 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
* Adding temperature scaling on Joiner logits:
  - T hard-coded to 2.0
  - so far best result NCE 0.122 (still not so high)
  - the BPE scores were rescaled with 0.2 (but then also incorrect words get high confidence; visually reasonable histograms are for 0.5 scale)
  - BPE->WORD score merging done by min(.) function (also tried prob-product, and arithmetic, geometric, and harmonic mean)
  - without temperature scaling (i.e. scale 1.0), the best NCE was 0.032 (here product merging was best)
  Results seem consistent with: https://arxiv.org/abs/2110.15222
  Everything was tuned on a very small set of 100 sentences with 813 words and 10.2% WER, using a Czech model.
  I also experimented with blank posteriors mixed into the BPE confidences, but found no NCE improvement, so I am not pushing that.
  Temperature scaling was also added to the Greedy search confidences.
* making `temperature_scale` configurable from outside
Karel Vesely authored
-
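The temperature-scaling commit above can be illustrated with a minimal Python sketch. It is not the code from that commit: the function names, the "▁" word-boundary convention for BPE pieces, and the example values are assumptions made for illustration. It shows only the two ideas the message names, dividing the logits by a temperature T before the softmax, and merging BPE-level scores into a word-level score with min(.). The additional 0.2 / 0.5 rescaling of BPE scores mentioned in the message is not modeled here.

```python
import math


def softmax_with_temperature(logits, temperature=2.0):
    """Softmax over logits / temperature (hypothetical helper, not the project's API)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def token_confidence(logits, token_id, temperature=2.0):
    """Confidence of the emitted token: its probability after temperature scaling."""
    return softmax_with_temperature(logits, temperature)[token_id]


def merge_bpe_to_words(pieces, scores):
    """Merge BPE-level scores into word-level scores with min(.).

    Assumes SentencePiece-style pieces where a leading "▁" starts a new word.
    """
    words, word_scores = [], []
    for piece, score in zip(pieces, scores):
        if piece.startswith("▁") or not words:
            words.append(piece.lstrip("▁"))
            word_scores.append(score)
        else:
            words[-1] += piece
            word_scores[-1] = min(word_scores[-1], score)
    return words, word_scores


if __name__ == "__main__":
    logits = [2.0, 0.0]
    print(token_confidence(logits, 0, temperature=1.0))
    print(token_confidence(logits, 0, temperature=2.0))
    print(merge_bpe_to_words(["▁he", "llo", "▁wor", "ld"], [0.9, 0.5, 0.8, 0.7]))
```

A temperature above 1.0 flattens the distribution, so the top token's probability drops and over-confident scores are pulled down; taking the minimum over a word's pieces makes the word only as confident as its least certain piece.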
Fangjun Kuang authored
-
Daniel Doña authored
-