Commits · 32c248b8a0765bb821a9fa3f6cee4086565bb924 · xuning / sherpaonnx

12 Sep, 2025 4 commits
- Release v1.12.13 (#2593) · 32c248b8
  32c248b8 Browse files
  
  Fangjun Kuang authored 2025-09-12 16:03:15 +0800
- Upload RKNN models for sense-voice (#2592) · c415092f
  c415092f Browse files
  
  Fangjun Kuang authored 2025-09-12 15:54:03 +0800
- Support RK NPU for SenseVoice non-streaming ASR models (#2589) · c691318b ...
  c691318b Browse files
```
This PR adds RK NPU support for SenseVoice non-streaming ASR models by implementing a new RKNN backend with greedy CTC decoding.

- Adds offline RKNN implementation for SenseVoice models including model loading, feature processing, and CTC decoding
- Introduces export tools to convert SenseVoice models from PyTorch to ONNX and then to RKNN format
- Implements provider-aware validation to prevent mismatched model and provider usage
```
  Fangjun Kuang authored 2025-09-12 10:46:38 +0800
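
For readers unfamiliar with the greedy CTC decoding mentioned in #2589, the general technique can be sketched as below. This is a minimal illustration and not the code from the PR: the function name `ctc_greedy_decode` and the choice of `blank_id = 0` are assumptions made for the example.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    log_probs: Sequence[Sequence[float]],
    blank_id: int = 0,  # assumption: the blank token has ID 0
) -> List[int]:
    """Take the best token per frame, merge consecutive repeats, drop blanks."""
    tokens: List[int] = []
    prev = blank_id
    for frame in log_probs:
        # Index of the highest-scoring token in this frame
        best = max(range(len(frame)), key=frame.__getitem__)
        # Emit only when the token is not blank and differs from the previous frame
        if best != blank_id and best != prev:
            tokens.append(best)
        prev = best
    return tokens


frames = [
    [0.1, 0.8, 0.1],    # token 1
    [0.1, 0.8, 0.1],    # token 1 again (merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # token 1 (kept, because a blank separates it)
    [0.1, 0.1, 0.8],    # token 2
]
decoded = ctc_greedy_decode(frames)
```

A blank between two identical tokens keeps both, which is how CTC represents repeated symbols.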
- Fix initializing symbol table for OnlineRecognizer. (#2590) · 926b2885
  926b2885 Browse files
  
  Fangjun Kuang authored 2025-09-12 09:37:06 +0800
10 Sep, 2025 4 commits
- Release v1.12.12 (#2586) · 04a98ca8
  04a98ca8 Browse files
  
  Fangjun Kuang authored 2025-09-10 22:55:01 +0800
- Add various language bindings for Wenet non-streaming CTC models (#2584) · 7e42ba2c ...
  7e42ba2c Browse files
```
This PR adds support for Wenet non-streaming CTC models to sherpa-onnx by introducing the SherpaOnnxOfflineWenetCtcModelConfig struct and integrating it across all language bindings and APIs. The implementation follows the same pattern as other CTC model types like Zipformer CTC.

- Introduces SherpaOnnxOfflineWenetCtcModelConfig struct with a single model field for the ONNX model path
- Adds the new config to SherpaOnnxOfflineModelConfig and updates all language bindings (C++, Pascal, Kotlin, Java, Go, C#, Swift, JavaScript, etc.)
- Provides comprehensive examples and tests across all supported platforms and languages
```
  Fangjun Kuang authored 2025-09-10 18:52:18 +0800
- Export ASLP-lab/WSYue-ASR/tree/main/u2pp_conformer_yue to sherpa-onnx (#2582) · 71f87e18
  71f87e18 Browse files
  
  Fangjun Kuang authored 2025-09-10 14:27:09 +0800
- Upload new sense-voice models (#2580) · 19b01899
  19b01899 Browse files
  
  Fangjun Kuang authored 2025-09-10 09:41:33 +0800
09 Sep, 2025 4 commits

Export KittenTTS mini v0.1 to sherpa-onnx (#2578) · 9a73770e
9a73770e Browse files

Fangjun Kuang authored 2025-09-09 18:33:37 +0800
Fix the missing online punctuation in android aar (#2577) · a1d6592d
a1d6592d Browse files

Fangjun Kuang authored 2025-09-09 18:01:43 +0800

Add various language bindings for streaming T-one Russian ASR models (#2576) · 686b909e ...

This PR adds support for streaming T-one Russian ASR models across various language bindings in the sherpa-onnx library. The changes enable T-one CTC (Connectionist Temporal Classification) model integration by adding new configuration structures and example implementations.

- Introduces OnlineToneCtcModelConfig structures across all language bindings (C, C++, Swift, Java, Kotlin, Go, etc.)
- Adds T-one CTC model support to WASM implementations for both ASR and keyword spotting
- Provides comprehensive example implementations demonstrating T-one model usage in multiple programming languages

authored 2025-09-09 16:51:18 +0800

Add C++ and Python support for T-one streaming Russian ASR models (#2575) · 858b5052 ...

858b5052 Browse files

This PR adds support for T-one streaming Russian ASR models in both C++ and Python APIs. The T-one model is a CTC-based Russian speech recognition model with specific characteristics including float16 state handling, 300ms frame lengths, and 8kHz sampling rate.

- Added new OnlineToneCtcModel implementation with specialized processing for T-one models
- Integrated T-one support into the existing CTC model pipeline and Python bindings
- Added Python example and test scripts for the new functionality

authored 2025-09-09 12:07:34 +0800

08 Sep, 2025 1 commit

Export models from https://github.com/voicekit-team/T-one to sherpa-onnx (#2571) · e4f48ce6 ...

e4f48ce6 Browse files

This PR exports models from the T-one repository (https://github.com/voicekit-team/T-one) to sherpa-onnx format, creating a complete pipeline for Russian speech recognition using streaming CTC models.

- Adds scripts to download, process, and test T-one models in sherpa-onnx format
- Creates GitHub workflow for automated model export and publishing
- Updates kaldi-native-fbank dependency to version 1.22.1

authored 2025-09-08 17:22:23 +0800

05 Sep, 2025 2 commits

Update README to include https://github.com/Mentra-Community/MentraOS (#2565) · e870afc0 ...

e870afc0 Browse files

This PR adds documentation for MentraOS, a smart glasses operating system that integrates sherpa-onnx for speech recognition functionality. The addition showcases another real-world application using the sherpa-onnx library.

- Adds a new section documenting MentraOS integration with sherpa-onnx
- Includes description of MentraOS features and platform support
- References related pull request for implementation details

authored 2025-09-05 16:23:28 +0800

Add hint for loading model files from SD card on Android. (#2564) · 4167b86c ...

4167b86c Browse files

This PR adds a helpful hint for Android developers who are trying to load model files from the SD card instead of the app's assets. The change detects when an absolute path is provided while an asset manager is still being used, which is a common configuration mistake.

- Adds validation to detect absolute paths when using Android asset manager
- Provides clear error messages guiding users to set assetManager to null for SD card file access
- References the related issue for additional context (#2562)

authored 2025-09-05 16:06:42 +0800

04 Sep, 2025 3 commits
- Avoid appending blanks for Cantonese vits tts. (#2559) · 1568ac27
  1568ac27 Browse files
  
  Fangjun Kuang authored 2025-09-04 15:01:20 +0800
- Fix cantonese vits tts (#2558) · e254c38f
  e254c38f Browse files
  
  Fangjun Kuang authored 2025-09-04 14:00:14 +0800
- Disable loading libs from jar on Android. (#2557) · 0823ddcb ...
  0823ddcb Browse files
```
This PR disables loading native libraries from JAR resources specifically on Android platforms. The change prevents potential issues with JAR-based library loading on Android while maintaining compatibility with other platforms.
```
  Fangjun Kuang authored 2025-09-04 12:13:27 +0800
02 Sep, 2025 1 commit
- Support armv8l in Java API (#2556) · daac04bd
  daac04bd Browse files
  
  凌封 authored 2025-09-02 20:13:19 +0800
01 Sep, 2025 6 commits
- Update kaldifst and kaldi-decoder (#2551) · b0f35572
  b0f35572 Browse files
  
  Fangjun Kuang authored 2025-09-01 16:59:03 +0800
- Fix using sherpa-onnx as a cmake sub-project. (#2550) · c2cad93e
  c2cad93e Browse files
  
  Fangjun Kuang authored 2025-09-01 15:29:19 +0800
- Fix building for risc-v (#2549) · 0b5af832
  0b5af832 Browse files
  
  Fangjun Kuang authored 2025-09-01 15:04:51 +0800
- Release v1.12.11 (#2547) · a9187d5c
  a9187d5c Browse files
  
  Fangjun Kuang authored 2025-09-01 14:09:24 +0800
- Fix linking (#2546) · f0e68cde
  f0e68cde Browse files
  
  Fangjun Kuang authored 2025-09-01 11:59:46 +0800
- Fix c api (#2545) · 27311b8a ...
  27311b8a Browse files
```
This PR fixes the C API by adding proper support for durations in offline recognition results. It addresses problems introduced in a previous PR, where the durations field was added to the C API struct but not handled properly across all language bindings.

Key changes:

- Adds durations field handling across multiple language bindings (Swift, Kotlin, Java, C#)
- Fixes field ordering in C API struct to ensure ABI compatibility
- Updates JNI implementation to properly extract and pass durations data
```
  Fangjun Kuang authored 2025-09-01 11:23:49 +0800
27 Aug, 2025 2 commits

Add Zipvoice (#2487) · c149696c ...
c149696c Browse files
```
Co-authored-by: yaozengwei <yaozengwei@outlook.com>
```
Wei Kang authored 2025-08-27 19:50:00 +0800

Fix uploading win32 libs to huggingface (#2537) · 6768ca78 ...

6768ca78 Browse files

This PR fixes the uploading process for win32 libraries to Hugging Face by updating Windows OS detection and correcting the file copy destination path.

- Replaces deprecated wmic command with PowerShell-based OS detection for better reliability
- Adds fallback mechanism using cmd /c ver when PowerShell is unavailable
- Corrects the destination path for win32 library archives to include version subdirectory

authored 2025-08-27 16:47:53 +0800

26 Aug, 2025 4 commits
- Add one more German tts model from OpenVoiceOS. (#2536) · d30aa980
  d30aa980 Browse files
  
  Fangjun Kuang authored 2025-08-26 23:19:31 +0800
- Fix wasm for kws (#2535) · 408808b3
  408808b3 Browse files
  
  Fangjun Kuang authored 2025-08-26 22:30:04 +0800
- Simplify the usage of our non-Android Java API (#2533) · 7c9d071e ...
  7c9d071e Browse files
```
This PR simplifies the usage of the non-Android Java API by providing platform-specific JAR files that include native shared libraries, eliminating the need for users to manually manage native dependencies.

- Refactored LibraryUtils.java to support multiple library loading methods including extracting from JAR resources
- Added build infrastructure to create platform-specific native library JAR files
- Introduced debug capabilities and improved error handling for library loading
```
  Fangjun Kuang authored 2025-08-26 20:13:07 +0800
- Support BPE models with byte fallback. (#2531) · 9d0adcd3
  9d0adcd3 Browse files
  
  Fangjun Kuang authored 2025-08-26 12:03:02 +0800
25 Aug, 2025 8 commits
- Add license info about tts models from OpenVoiceOS (#2530) · f45cd87a
  f45cd87a Browse files
  
  Fangjun Kuang authored 2025-08-26 07:24:02 +0800
- Fix releasing go packages (#2529) · eaf2eb2e
  eaf2eb2e Browse files
  
  Fangjun Kuang authored 2025-08-25 20:01:02 +0800
- Generate tts samples for MatchaTTS (English). (#2527) · f1f8149a
  f1f8149a Browse files
  
  Fangjun Kuang authored 2025-08-25 16:04:50 +0800
- Add two more Piper tts models (#2525) · 4694d675 ...
  4694d675 Browse files
```
This PR adds support for two new Piper TTS (Text-to-Speech) models: an Indonesian model (id_ID-news_tts-medium) and a Hindi model (hi_IN-rohan-medium).
```
  Fangjun Kuang authored 2025-08-25 14:42:25 +0800
- Release v1.12.10 (#2523) · 6b1fbded
  6b1fbded Browse files
  
  Fangjun Kuang authored 2025-08-25 11:49:31 +0800
- Fix kokoro tts for punctuations (#2522) · 3d5d1b9b
  3d5d1b9b Browse files
  
  Fangjun Kuang authored 2025-08-25 11:06:28 +0800
- Split sherpa-onnx Python package (#2521) · e8dd5cd2
  e8dd5cd2 Browse files
  
  Fangjun Kuang authored 2025-08-25 10:16:58 +0800
- Support 16KB page size for Android (#2520) · 44a92efb
  44a92efb Browse files
  
  Fangjun Kuang authored 2025-08-25 10:00:51 +0800
21 Aug, 2025 1 commit
- Add tdt duration to APIs (#2514) · 06ae4a7c
  06ae4a7c Browse files
  
  Brad Murray authored 2025-08-21 10:55:04 +0800