- 16 Jul, 2025 2 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 15 Jul, 2025 1 commit
-
-
A new method and property were introduced in the VoiceActivityDetector C++ and Python APIs to provide access to the current speech segment as soon as speech is detected, rather than only after the segment completes.
Fangjun Kuang authored
-
- 14 Jul, 2025 1 commit
-
-
Fangjun Kuang authored
-
- 12 Jul, 2025 12 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
This PR adds support for the new ten-vad model in both the Node.js addon examples and the HarmonyOS wrapper.
  - Introduce TenVadConfig alongside existing SileroVadConfig and extend the VadConfig API.
  - Update C++ addon to parse ten-vad parameters and pass them through to the detector.
  - Modify Node.js example scripts to let users switch between silero and ten-vad and to normalize generated filenames.
Fangjun Kuang authored
-
Add support for the ten-vad model alongside silero-vad in the WebAssembly VAD API, update the UI and documentation, and extend examples and CI workflows to handle the new model.
  - Extend C++ bindings and printing logic to include ten-vad configuration.
  - Implement JavaScript init/free routines and runtime detection for ten-vad.
  - Update UI layout, README assets, example scripts, and CI workflow to support ten-vad.
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
This PR fixes a Linux build error by removing the non-portable std::logf call and using the C logf function instead.
  - Replace std::logf with logf in TenVadModel::Impl::LogMel to compile on Linux.
Fangjun Kuang authored
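
The std::logf fix in the entry above can be illustrated with a minimal sketch. It assumes the failure comes from a toolchain whose `<cmath>` does not declare `logf` inside namespace `std`, which the commit text does not state explicitly. `LogCompress` and its `floor_value` parameter are hypothetical names for illustration and are not the actual `TenVadModel::Impl::LogMel` code.

```cpp
#include <math.h>  // declares the C function logf in the global namespace

#include <vector>

// Hypothetical helper (not the sherpa-onnx implementation): take the natural
// log of each mel energy after clamping it to a small positive floor.
std::vector<float> LogCompress(const std::vector<float> &mel,
                               float floor_value = 1e-10f) {
  std::vector<float> out;
  out.reserve(mel.size());
  for (float e : mel) {
    float clamped = e > floor_value ? e : floor_value;
    // Unqualified logf comes from <math.h>; writing std::logf here is what
    // can fail to compile on toolchains that omit it from namespace std.
    out.push_back(logf(clamped));
  }
  return out;
}
```

Calling the overloaded `std::log` with a `float` argument is another portable option, since it also selects the single-precision version.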
-
- 11 Jul, 2025 2 commits
-
-
This PR adds support for the TEN VAD model alongside the existing Silero VAD in both C++ and Python interfaces.
  - Introduces TenVadModelConfig with Python bindings and integrates it into VadModelConfig.
  - Implements TenVadModel in C++ and extends the factory (VadModel::Create) and detector logic to choose between Silero and TEN VAD.
  - Updates build files (CMake), fixes a spelling typo, and extends the Python example script to demonstrate --ten-vad-model.
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 10 Jul, 2025 2 commits
-
-
Fangjun Kuang authored
-
Add support for the new NeMo Canary ASR model across multiple language bindings by introducing a Canary model configuration and setter method on the offline recognizer.
  - Define Canary model config in Pascal, Go, C#, Dart and update converter functions
  - Add SetConfig API for offline recognizer (Pascal, Go, C#, Dart)
  - Extend CI/workflows and example scripts to test non-streaming Canary decoding
Fangjun Kuang authored
-
- 09 Jul, 2025 3 commits
-
-
# New Features
  - Added new example programs demonstrating streaming speech recognition from a microphone using Parakeet-TDT CTC and Zipformer Transducer models with voice activity detection.
  - These examples support microphone input via PortAudio and display recognized text incrementally.
# Bug Fixes
  - Improved error handling and logic when opening microphone devices in several example programs for more reliable device initialization.
# Chores
  - Updated build configuration to include new executable examples when PortAudio support is enabled.
Fangjun Kuang authored
-
This PR integrates LODR (Low-Order Density Ratio) support from Icefall into both online and offline recognizers, enabling LODR for LM shallow fusion and LM rescoring.
  - Extended OnlineLMConfig and OfflineLMConfig to include lodr_fst, lodr_scale, and lodr_backoff_id.
  - Implemented LodrFst and LodrStateCost classes and wired them into RNN LM scoring in both online and offline code paths.
  - Updated Python bindings, CLI entry points, examples, and CI test scripts to accept and exercise the new LODR options.
Askars Salimbajevs authored
-
Refactors and extends model export support to include new NeMo Parakeet TDT int8 variants for English and Japanese, updating the Kotlin API, export scripts, test runners, and CI workflows.
  - Added support for two new int8 model types in OfflineRecognizer.kt.
  - Enhanced Python export scripts to perform dynamic quantization and metadata injection.
  - Updated shell scripts and GitHub workflows to package, test, and publish int8 model artifacts.
Fangjun Kuang authored
-
- 08 Jul, 2025 3 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Add support for the NeMo Canary model in both Java and Kotlin APIs, wiring it through JNI and updating examples and CI.
  - Introduce OfflineCanaryModelConfig in Kotlin and Java with builder patterns
  - Extend OfflineRecognizer to accept and apply the new canary config via setConfig
  - Update JNI binding (GetOfflineConfig) and getOfflineModelConfig mapping (type 32), plus examples and CI workflows
Fangjun Kuang authored
-
- 07 Jul, 2025 3 commits
-
-
This PR introduces support for NeMo Canary models across C, C++, and JavaScript APIs by adding new Canary configuration structures, updating bindings, extending examples, and enhancing CI workflows.
  - Add OfflineCanaryModelConfig to all language bindings (C, C++, JS, ETS).
  - Implement SetConfig methods and NAPI wrappers for updating recognizer config at runtime.
  - Update examples and CI scripts to demonstrate and test NeMo Canary model usage.
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 06 Jul, 2025 4 commits
-
-
Fangjun Kuang authored
-
Unreal Engine has its own memory management, so we cannot return a struct containing a std::vector object.
Fangjun Kuang authored
-
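
The constraint in the entry above (no struct containing a std::vector across the library boundary) is commonly handled by returning a plain pointer and length and pairing it with a free function owned by the library. The sketch below shows that pattern only; `SamplesView`, `CreateSamples`, and `DestroySamples` are hypothetical names, and the commit itself may have solved the problem differently.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical result type: only plain-old-data members, so the caller never
// constructs or destroys an STL object with its own allocator.
struct SamplesView {
  const float *data;  // buffer allocated inside the library
  int32_t n;          // number of valid samples in data
};

// Hypothetical library-side function: copy the samples into a heap buffer
// that the library allocates with new[].
SamplesView CreateSamples(const std::vector<float> &src) {
  float *buf = new float[src.size()];
  std::copy(src.begin(), src.end(), buf);
  return SamplesView{buf, static_cast<int32_t>(src.size())};
}

// Hypothetical library-side function: release the buffer with the matching
// delete[], so allocation and deallocation stay on the same side.
void DestroySamples(SamplesView view) { delete[] view.data; }
```

A caller reads `view.data[0]` through `view.data[view.n - 1]` and then hands the view back to `DestroySamples` instead of freeing the memory itself.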
lucaelin authored
-
- 04 Jul, 2025 5 commits
-
-
Fangjun Kuang authored
-
linsui authored
-
Fangjun Kuang authored
-
Adds support for building and packaging Linux AArch64 (arm64) artifacts alongside x64 for Dart/Flutter plugins.
  - Detects host architecture in CMake and adjusts library paths
  - Extends test workflows to run on an ARM runner and handle linux-aarch64 paths
  - Splits release pipeline into separate x64 and aarch64 build/package jobs
Fangjun Kuang authored
-
This PR adds support for non-streaming Zipformer CTC ASR models across multiple language bindings, WebAssembly, examples, and CI workflows.
  - Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
  - Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
  - Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models
Model doc is available at https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
Fangjun Kuang authored
-
- 03 Jul, 2025 1 commit
-
-
wenjie.Li authored
-
- 02 Jul, 2025 1 commit
-