- 29 Jul, 2025 1 commit
-
-
Ming-Hsuan-Tu authored
-
- 27 Jul, 2025 3 commits
-
-
Fangjun Kuang authored
-
Yiwei Shao authored
-
Fangjun Kuang authored
-
- 26 Jul, 2025 4 commits
-
-
This PR fixes a data type compatibility issue in the GigaAM transducer encoder where the output length tensor could be either int32 or int64, but the code only handled int32. The fix adds runtime type checking and supports both data types.
  - Adds runtime detection of encoder output length tensor data type (int32 vs int64)
  - Implements conditional data access based on the detected type
  - Adds error handling for unsupported data types
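A minimal sketch of the kind of dtype dispatch this fix describes, written in Python for brevity (the actual change is in the C++ encoder code). `ElemType` and `read_lengths` are hypothetical names for illustration only, not sherpa-onnx or ONNX Runtime APIs, and the raw little-endian buffer stands in for the tensor's data.

```python
import struct
from enum import Enum
from typing import List


class ElemType(Enum):
    """Hypothetical stand-in for the runtime's tensor element type tag."""
    INT32 = "int32"
    INT64 = "int64"
    FLOAT32 = "float32"


def read_lengths(raw: bytes, elem_type: ElemType) -> List[int]:
    """Decode an encoder output length buffer as int32 or int64."""
    if elem_type is ElemType.INT32:
        fmt = "<i"  # 4-byte little-endian signed integer
    elif elem_type is ElemType.INT64:
        fmt = "<q"  # 8-byte little-endian signed integer
    else:
        # Anything else is rejected explicitly instead of being misread.
        raise TypeError(f"unsupported length tensor type: {elem_type.value}")
    return [value for (value,) in struct.iter_unpack(fmt, raw)]


# The same logical lengths decode identically from either representation.
as_int32 = read_lengths(struct.pack("<2i", 7, 9), ElemType.INT32)
as_int64 = read_lengths(struct.pack("<2q", 7, 9), ElemType.INT64)
```

Reading an int64 buffer through the int32 path would silently yield interleaved zeros and wrong lengths, which is why the type is checked before the data is accessed.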
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 17 Jul, 2025 2 commits
-
-
infinite42 authored
-
infinite42 authored
-
- 16 Jul, 2025 3 commits
-
-
infinite42 authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 15 Jul, 2025 1 commit
-
-
A new method and property were introduced in the VoiceActivityDetector C++ and Python APIs to provide access to the current speech segment as soon as speech is detected, rather than only after the segment completes.
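The behavior can be illustrated with a toy detector. This is only a sketch under stated assumptions: `ToyVad`, `accept`, and `current_segment` are hypothetical names, and the amplitude threshold is a stand-in for a real VAD model. It is not the VoiceActivityDetector API itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    start: int  # index of the first speech sample
    samples: List[float] = field(default_factory=list)


class ToyVad:
    """Hypothetical amplitude-threshold detector, for illustration only."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold
        self.completed: List[Segment] = []  # segments that have ended
        self._current: Optional[Segment] = None
        self._pos = 0

    def accept(self, sample: float) -> None:
        if abs(sample) >= self.threshold:
            if self._current is None:
                # Speech just started: open a segment immediately.
                self._current = Segment(start=self._pos)
            self._current.samples.append(sample)
        elif self._current is not None:
            # Silence after speech: the segment is now complete.
            self.completed.append(self._current)
            self._current = None
        self._pos += 1

    @property
    def current_segment(self) -> Optional[Segment]:
        """The in-progress segment, or None when no speech is active."""
        return self._current


vad = ToyVad()
for s in [0.0, 0.9, 0.8]:
    vad.accept(s)
in_progress = vad.current_segment  # available before the segment ends
vad.accept(0.0)                    # silence closes the segment
```

Exposing the in-progress segment lets a caller start processing audio while the speaker is still talking instead of waiting for the trailing silence.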
Fangjun Kuang authored
-
- 14 Jul, 2025 1 commit
-
-
Fangjun Kuang authored
-
- 12 Jul, 2025 12 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
This PR adds support for the new ten-vad model in both the Node.js addon examples and the HarmonyOS wrapper.
  - Introduce TenVadConfig alongside existing SileroVadConfig and extend the VadConfig API.
  - Update C++ addon to parse ten-vad parameters and pass them through to the detector.
  - Modify Node.js example scripts to let users switch between silero and ten-vad and to normalize generated filenames.
Fangjun Kuang authored
-
Add support for the ten-vad model alongside silero-vad in the WebAssembly VAD API, update the UI and documentation, and extend examples and CI workflows to handle the new model.
  - Extend C++ bindings and printing logic to include ten-vad configuration.
  - Implement JavaScript init/free routines and runtime detection for ten-vad.
  - Update UI layout, README assets, example scripts, and CI workflow to support ten-vad.
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
This PR fixes a Linux build error by removing the non-portable std::logf call and using the C logf function instead.
  - Replace std::logf with logf in TenVadModel::Impl::LogMel to compile on Linux.
Fangjun Kuang authored
-
- 11 Jul, 2025 2 commits
-
-
This PR adds support for the TEN VAD model alongside the existing Silero VAD in both C++ and Python interfaces.
  - Introduces TenVadModelConfig with Python bindings and integrates it into VadModelConfig.
  - Implements TenVadModel in C++ and extends the factory (VadModel::Create) and detector logic to choose between Silero and TEN VAD.
  - Updates build files (CMake), fixes a spelling typo, and extends the Python example script to demonstrate --ten-vad-model.
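A rough sketch of how a factory might choose between the two models based on the config. All classes here are simplified, hypothetical stand-ins that only borrow the names mentioned above; the field names and the "exactly one model path must be set" rule are assumptions, not the actual VadModel::Create logic.

```python
from dataclasses import dataclass, field


@dataclass
class SileroVadModelConfig:
    model: str = ""  # path to the Silero VAD model; empty means unused


@dataclass
class TenVadModelConfig:
    model: str = ""  # path to the TEN VAD model; empty means unused


@dataclass
class VadModelConfig:
    silero_vad: SileroVadModelConfig = field(default_factory=SileroVadModelConfig)
    ten_vad: TenVadModelConfig = field(default_factory=TenVadModelConfig)


class SileroVadModel:
    def __init__(self, config: VadModelConfig) -> None:
        self.path = config.silero_vad.model


class TenVadModel:
    def __init__(self, config: VadModelConfig) -> None:
        self.path = config.ten_vad.model


def create_vad_model(config: VadModelConfig):
    """Pick the implementation from whichever model path is set (assumed rule)."""
    has_silero = bool(config.silero_vad.model)
    has_ten = bool(config.ten_vad.model)
    if has_silero == has_ten:
        raise ValueError("set exactly one of silero_vad.model or ten_vad.model")
    return SileroVadModel(config) if has_silero else TenVadModel(config)


model = create_vad_model(
    VadModelConfig(ten_vad=TenVadModelConfig(model="ten-vad.onnx"))
)
```

Keeping the choice inside one factory function means the detector code only depends on the returned model object, not on which backend was configured.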
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 10 Jul, 2025 2 commits
-
-
Fangjun Kuang authored
-
Add support for the new NeMo Canary ASR model across multiple language bindings by introducing a Canary model configuration and setter method on the offline recognizer.
  - Define Canary model config in Pascal, Go, C#, Dart and update converter functions
  - Add SetConfig API for offline recognizer (Pascal, Go, C#, Dart)
  - Extend CI/workflows and example scripts to test non-streaming Canary decoding
Fangjun Kuang authored
-
- 09 Jul, 2025 3 commits
-
-
# New Features
  - Added new example programs demonstrating streaming speech recognition from a microphone using Parakeet-TDT CTC and Zipformer Transducer models with voice activity detection.
  - These examples support microphone input via PortAudio and display recognized text incrementally.
# Bug Fixes
  - Improved error handling and logic when opening microphone devices in several example programs for more reliable device initialization.
# Chores
  - Updated build configuration to include new executable examples when PortAudio support is enabled.
Fangjun Kuang authored
-
This PR integrates LODR (Low-Order Density Ratio) support from Icefall into both online and offline recognizers, enabling LODR for LM shallow fusion and LM rescoring.
  - Extended OnlineLMConfig and OfflineLMConfig to include lodr_fst, lodr_scale, and lodr_backoff_id.
  - Implemented LodrFst and LodrStateCost classes and wired them into RNN LM scoring in both online and offline code paths.
  - Updated Python bindings, CLI entry points, examples, and CI test scripts to accept and exercise the new LODR options.
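As a worked illustration of the scoring idea, the sketch below combines an acoustic-model score with a neural LM score and a low-order n-gram score, where `lodr_scale` is typically negative so that the low-order score is subtracted. The function name, the example numbers, and the exact combination rule are assumptions for illustration; the real implementation evaluates the low-order model through an FST.

```python
from typing import List, Tuple


def lodr_fused_score(
    am_logp: float,
    nn_lm_logp: float,
    low_order_logp: float,
    lm_scale: float,
    lodr_scale: float,
) -> float:
    """Shallow-fusion score with a low-order LM correction (assumed form)."""
    return am_logp + lm_scale * nn_lm_logp + lodr_scale * low_order_logp


# Hypothetical hypotheses: (text, AM log-prob, neural LM log-prob, bigram log-prob)
hyps: List[Tuple[str, float, float, float]] = [
    ("a", -1.0, -2.0, -4.0),
    ("b", -1.0, -2.0, -1.0),
]

scored = {
    text: lodr_fused_score(am, nn, low, lm_scale=0.5, lodr_scale=-0.25)
    for text, am, nn, low in hyps
}
best = max(scored, key=scored.get)
```

With `lodr_scale=0.0` the two hypotheses above tie, so the ranking in this example comes entirely from the low-order term.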
Askars Salimbajevs authored
-
Refactors and extends model export support to include new NeMo Parakeet TDT int8 variants for English and Japanese, updating the Kotlin API, export scripts, test runners, and CI workflows.
  - Added support for two new int8 model types in OfflineRecognizer.kt.
  - Enhanced Python export scripts to perform dynamic quantization and metadata injection.
  - Updated shell scripts and GitHub workflows to package, test, and publish int8 model artifacts.
Fangjun Kuang authored
-
- 08 Jul, 2025 3 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Add support for the NeMo Canary model in both Java and Kotlin APIs, wiring it through JNI and updating examples and CI.
  - Introduce OfflineCanaryModelConfig in Kotlin and Java with builder patterns
  - Extend OfflineRecognizer to accept and apply the new canary config via setConfig
  - Update JNI binding (GetOfflineConfig) and getOfflineModelConfig mapping (type 32), plus examples and CI workflows
Fangjun Kuang authored
-
- 07 Jul, 2025 3 commits
-
-
This PR introduces support for NeMo Canary models across C, C++, and JavaScript APIs by adding new Canary configuration structures, updating bindings, extending examples, and enhancing CI workflows.
  - Add OfflineCanaryModelConfig to all language bindings (C, C++, JS, ETS).
  - Implement SetConfig methods and NAPI wrappers for updating recognizer config at runtime.
  - Update examples and CI scripts to demonstrate and test NeMo Canary model usage.
Fangjun Kuang authored
-
Fangjun Kuang authored
-