- 12 Jul, 2025 2 commits
-
-
Fangjun Kuang authored
-
This PR fixes a Linux build error by removing the non-portable std::logf call and using the C logf function instead.
  - Replace std::logf with logf in TenVadModel::Impl::LogMel to compile on Linux.
Fangjun Kuang authored
-
- 11 Jul, 2025 2 commits
-
-
This PR adds support for the TEN VAD model alongside the existing Silero VAD in both C++ and Python interfaces.
  - Introduces TenVadModelConfig with Python bindings and integrates it into VadModelConfig.
  - Implements TenVadModel in C++ and extends the factory (VadModel::Create) and detector logic to choose between Silero and TEN VAD.
  - Updates build files (CMake), fixes a spelling typo, and extends the Python example script to demonstrate --ten-vad-model.
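The choice between the two VAD backends can be pictured with a small sketch. It is not the code from the PR: the dataclasses below only borrow the config names mentioned above, `select_vad_backend` is a hypothetical helper, and the rule "use whichever backend has a model path set" is an assumption about how such a factory might decide.

```python
from dataclasses import dataclass, field


@dataclass
class SileroVadModelConfig:
    model: str = ""  # path to a Silero VAD model; empty means "not configured"


@dataclass
class TenVadModelConfig:
    model: str = ""  # path to a TEN VAD model; empty means "not configured"


@dataclass
class VadModelConfig:
    # Stand-in for the real config; only the two model paths are modelled here.
    silero_vad: SileroVadModelConfig = field(default_factory=SileroVadModelConfig)
    ten_vad: TenVadModelConfig = field(default_factory=TenVadModelConfig)


def select_vad_backend(config: VadModelConfig) -> str:
    """Return the name of the backend to build (assumed selection rule)."""
    has_silero = bool(config.silero_vad.model)
    has_ten = bool(config.ten_vad.model)
    if has_silero and has_ten:
        raise ValueError("Provide only one of silero_vad.model and ten_vad.model")
    if has_silero:
        return "silero_vad"
    if has_ten:
        return "ten_vad"
    raise ValueError("No VAD model configured")


if __name__ == "__main__":
    cfg = VadModelConfig(ten_vad=TenVadModelConfig(model="ten-vad.onnx"))
    print(select_vad_backend(cfg))
```

Rejecting a config that sets both paths keeps the choice explicit; the actual implementation may resolve that case differently.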
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 10 Jul, 2025 2 commits
-
-
Fangjun Kuang authored
-
Add support for the new NeMo Canary ASR model across multiple language bindings by introducing a Canary model configuration and setter method on the offline recognizer.
  - Define Canary model config in Pascal, Go, C#, Dart and update converter functions
  - Add SetConfig API for offline recognizer (Pascal, Go, C#, Dart)
  - Extend CI/workflows and example scripts to test non-streaming Canary decoding
Fangjun Kuang authored
-
- 09 Jul, 2025 3 commits
-
-
# New Features
  - Added new example programs demonstrating streaming speech recognition from a microphone using Parakeet-TDT CTC and Zipformer Transducer models with voice activity detection.
  - These examples support microphone input via PortAudio and display recognized text incrementally.
# Bug Fixes
  - Improved error handling and logic when opening microphone devices in several example programs for more reliable device initialization.
# Chores
  - Updated build configuration to include new executable examples when PortAudio support is enabled.
Fangjun Kuang authored
-
This PR integrates LODR (low-order density ratio) support from Icefall into both online and offline recognizers, enabling LODR for LM shallow fusion and LM rescoring.
  - Extended OnlineLMConfig and OfflineLMConfig to include lodr_fst, lodr_scale, and lodr_backoff_id.
  - Implemented LodrFst and LodrStateCost classes and wired them into RNN LM scoring in both online and offline code paths.
  - Updated Python bindings, CLI entry points, examples, and CI test scripts to accept and exercise the new LODR options.
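As a rough illustration of what the LODR options control, the sketch below combines three log-probabilities the way density-ratio fusion is usually described: the ASR score, a scaled neural LM score, and a scaled low-order n-gram score. The function names and the example numbers are hypothetical, and the sign convention (a negative `lodr_scale` subtracts the n-gram term) is an assumption; the PR description does not say how the recognizers apply `lodr_scale` or `lodr_backoff_id`.

```python
import math


def fused_score(am_log_prob: float,
                lm_log_prob: float,
                lodr_log_prob: float,
                lm_scale: float,
                lodr_scale: float) -> float:
    """Combine per-hypothesis log-probabilities for shallow fusion with LODR.

    Assumption: lodr_scale is negative, so the low-order n-gram score is
    effectively subtracted from the fused score.
    """
    return am_log_prob + lm_scale * lm_log_prob + lodr_scale * lodr_log_prob


def best_hypothesis(hyps, lm_scale: float, lodr_scale: float):
    """Pick the hypothesis with the highest fused score.

    Each hypothesis is a tuple: (text, am_log_prob, lm_log_prob, lodr_log_prob).
    """
    return max(
        hyps,
        key=lambda h: fused_score(h[1], h[2], h[3], lm_scale, lodr_scale),
    )


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    hyps = [
        ("hello word", -1.0, -6.0, -2.0),
        ("hello world", -1.2, -3.0, -2.5),
    ]
    print(best_hypothesis(hyps, lm_scale=0.5, lodr_scale=-0.1)[0])
    assert math.isclose(fused_score(-1.0, -2.0, -3.0, 0.5, -0.1), -1.7)
```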
Askars Salimbajevs authored
-
Refactors and extends model export support to include new NeMo Parakeet TDT int8 variants for English and Japanese, updating the Kotlin API, export scripts, test runners, and CI workflows.
  - Added support for two new int8 model types in OfflineRecognizer.kt.
  - Enhanced Python export scripts to perform dynamic quantization and metadata injection.
  - Updated shell scripts and GitHub workflows to package, test, and publish int8 model artifacts.
Fangjun Kuang authored
-
- 08 Jul, 2025 3 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Add support for the NeMo Canary model in both Java and Kotlin APIs, wiring it through JNI and updating examples and CI.
  - Introduce OfflineCanaryModelConfig in Kotlin and Java with builder patterns
  - Extend OfflineRecognizer to accept and apply the new canary config via setConfig
  - Update JNI binding (GetOfflineConfig) and getOfflineModelConfig mapping (type 32), plus examples and CI workflows
Fangjun Kuang authored
-
- 07 Jul, 2025 3 commits
-
-
This PR introduces support for NeMo Canary models across C, C++, and JavaScript APIs by adding new Canary configuration structures, updating bindings, extending examples, and enhancing CI workflows.
  - Add OfflineCanaryModelConfig to all language bindings (C, C++, JS, ETS).
  - Implement SetConfig methods and NAPI wrappers for updating recognizer config at runtime.
  - Update examples and CI scripts to demonstrate and test NeMo Canary model usage.
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 06 Jul, 2025 4 commits
-
-
Fangjun Kuang authored
-
Unreal Engine has its own memory management, so we cannot return a struct containing a std::vector object.
Fangjun Kuang authored
-
lucaelin authored
-
- 04 Jul, 2025 5 commits
-
-
Fangjun Kuang authored
-
linsui authored
-
Fangjun Kuang authored
-
Adds support for building and packaging Linux AArch64 (arm64) artifacts alongside x64 for Dart/Flutter plugins.
  - Detects host architecture in CMake and adjusts library paths
  - Extends test workflows to run on an ARM runner and handle linux-aarch64 paths
  - Splits release pipeline into separate x64 and aarch64 build/package jobs
Fangjun Kuang authored
-
This PR adds support for non-streaming Zipformer CTC ASR models across multiple language bindings, WebAssembly, examples, and CI workflows.
  - Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
  - Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
  - Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models
The model documentation is available at https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
Fangjun Kuang authored
-
- 03 Jul, 2025 1 commit
-
-
wenjie.Li authored
-
- 02 Jul, 2025 1 commit
-
- 30 Jun, 2025 2 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 27 Jun, 2025 3 commits
-
-
Refactors the release scripts to centralize and simplify version updates across multiple files. Key changes include:
  - Introducing variables (old_version, new_version, replace_str) for version substitution.
  - Replacing hard-coded sed expressions with dynamic ones in various files.
  - Ensuring backup files generated by sed are cleaned up after execution.
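The idea of driving every substitution from one pair of version variables can be sketched in Python. This swaps in Python for the sed-based shell scripts the PR actually changes; `replace_version` and `update_files` are hypothetical helpers, and the file names and version numbers in the usage lines are placeholders.

```python
from pathlib import Path


def replace_version(text: str, old_version: str, new_version: str) -> str:
    """Replace every literal occurrence of old_version with new_version.

    str.replace treats the version as plain text, so the dots in "1.0.0"
    cannot match arbitrary characters the way an unescaped regex dot would.
    """
    return text.replace(old_version, new_version)


def update_files(paths, old_version: str, new_version: str):
    """Rewrite each file in place and return the paths that changed.

    Files are written directly, so there are no backup copies to clean up.
    """
    changed = []
    for path in map(Path, paths):
        original = path.read_text(encoding="utf-8")
        updated = replace_version(original, old_version, new_version)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed


if __name__ == "__main__":
    # Placeholder values; a real release script would define these once.
    old_version, new_version = "1.0.0", "1.0.1"
    print(replace_version('version = "1.0.0"', old_version, new_version))
```

Keeping the old and new versions in one place means a release only edits two values, which is the simplification the PR description points to.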
Fangjun Kuang authored
-
Fangjun Kuang authored
-
Adds support for Zipformer transducer ASR models that use Whisper-style features by introducing a new feature flag, parsing metadata, and integrating per-chunk normalization.
  - Introduce UseWhisperFeature in the model interface and Zipformer implementation
  - Parse "feature" metadata to set the whisper flag and wire it into the recognizer
  - Update feature extraction logic to handle Whisper filterbanks with early returns
Fangjun Kuang authored
-
- 26 Jun, 2025 2 commits
-
-
Fangjun Kuang authored
-
Replace the deprecated portaudio-go integration with malgo in the Go real-time speech recognition example and correct version string typos in the Node.js examples.
  - Fixed “verison” typo in Node.js console logs.
  - Swapped out portaudio-go for malgo in the Go microphone example, introducing initRecognizer, callback-driven streaming, and sample conversion.
  - Removed portaudio-go from go.mod.
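The "sample conversion" step in a callback-driven capture loop typically turns raw device bytes into floating-point samples. The sketch below shows that step in Python rather than Go, and assumes the device delivers signed 16-bit little-endian PCM and that the recognizer expects floats in [-1, 1]; neither detail is stated in the PR description, and `int16_bytes_to_floats` is a hypothetical helper.

```python
import struct


def int16_bytes_to_floats(data: bytes) -> list:
    """Convert signed 16-bit little-endian PCM bytes to floats in [-1, 1)."""
    if len(data) % 2 != 0:
        raise ValueError("PCM16 data must contain an even number of bytes")
    count = len(data) // 2
    # "<" selects little-endian, "h" is a signed 16-bit integer.
    samples = struct.unpack(f"<{count}h", data)
    # Dividing by 32768 maps the int16 range [-32768, 32767] onto [-1, 1).
    return [s / 32768.0 for s in samples]


if __name__ == "__main__":
    raw = struct.pack("<3h", 0, 16384, -32768)
    print(int16_bytes_to_floats(raw))
```

In a capture callback, each incoming buffer would be converted this way before being passed on to the recognizer stream.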
Fangjun Kuang authored
-
- 25 Jun, 2025 1 commit
-
-
Fangjun Kuang authored
-
- 24 Jun, 2025 3 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 20 Jun, 2025 2 commits
-
-
Fangjun Kuang authored
-
Fangjun Kuang authored
-
- 18 Jun, 2025 1 commit
-
-
  - Generate samples for https://k2-fsa.github.io/sherpa/onnx/tts/all/
  - Provide an int8 model for kokoro v0.19: kokoro-int8-en-v0_19.tar.bz2
Fangjun Kuang authored
-