Supported functions

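| Speech recognition | Speech synthesis | Speaker verification | Speaker identification |
|--------------------|------------------|----------------------|------------------------|
| ✔️                 | ✔️               | ✔️                   | ✔️                     |

| Spoken language identification | Audio tagging | Voice activity detection | Keyword spotting |
|--------------------------------|---------------|--------------------------|------------------|
| ✔️                             | ✔️            | ✔️                       | ✔️               |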

Supported platforms

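| Architecture | Android | iOS | Windows | macOS | Linux |
|--------------|---------|-----|---------|-------|-------|
| x64          | ✔️      |     | ✔️      | ✔️    | ✔️    |
| x86          | ✔️      |     | ✔️      |       |       |
| arm64        | ✔️      | ✔️  | ✔️      | ✔️    | ✔️    |
| arm32        | ✔️      |     |         |       | ✔️    |
| riscv64      |         |     |         |       | ✔️    |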

Supported programming languages

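| C++ | C  | Python | C# | Java | JavaScript | Kotlin | Swift | Go | Dart |
|-----|----|--------|----|------|------------|--------|-------|----|------|
| ✔️  | ✔️ | ✔️     | ✔️ | ✔️   | ✔️         | ✔️     | ✔️    | ✔️ | ✔️   |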

It also supports WebAssembly.

Introduction

This repository supports running the following functions locally:

  • Speech-to-text (i.e., ASR); both streaming and non-streaming are supported
  • Text-to-speech (i.e., TTS)
  • Speaker identification
  • Speaker verification
  • Spoken language identification
  • Audio tagging
  • VAD (e.g., silero-vad)
  • Keyword spotting

on the platforms and operating systems listed under Supported platforms, with the following APIs:

  • C++, C, Python, Go, C#
  • Java, Kotlin, JavaScript
  • Swift
  • Dart
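
As an illustration of the Python API listed above, here is a minimal sketch of non-streaming speech recognition. It assumes the `sherpa_onnx` Python package and NumPy are installed and that you have downloaded a transducer model; the `encoder`, `decoder`, `joiner`, and `tokens` paths are placeholders, and the helper names `read_wave_as_floats` and `transcribe` are hypothetical. The scripts in python-api-examples are the authoritative reference for the actual usage.

```python
import struct
import wave


def read_wave_as_floats(path):
    """Read a 16-bit mono WAV file; return (sample_rate, samples in [-1, 1))."""
    with wave.open(path, "rb") as f:
        if f.getnchannels() != 1 or f.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        sample_rate = f.getframerate()
        raw = f.readframes(f.getnframes())
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return sample_rate, [s / 32768.0 for s in ints]


def transcribe(wav_path, encoder, decoder, joiner, tokens):
    """Decode one file with a non-streaming transducer model (paths are placeholders)."""
    # Imported lazily so the WAV helper above works without these packages.
    import numpy as np
    import sherpa_onnx

    recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(
        encoder=encoder,
        decoder=decoder,
        joiner=joiner,
        tokens=tokens,
        num_threads=1,
    )
    sample_rate, samples = read_wave_as_floats(wav_path)
    stream = recognizer.create_stream()
    # The whole utterance is passed in at once, which is what makes this non-streaming.
    stream.accept_waveform(sample_rate, np.asarray(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    return stream.result.text
```

Calling `transcribe("test.wav", "encoder.onnx", "decoder.onnx", "joiner.onnx", "tokens.txt")` with real model files should return the recognized text. Streaming recognition follows a similar pattern but feeds audio in chunks as it arrives.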

Links for pre-built Android APKs

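| Description                            | URL     | Users in China |
|----------------------------------------|---------|----------------|
| Streaming speech recognition           | Address | Click here     |
| Text-to-speech                         | Address | Click here     |
| Voice activity detection (VAD)         | Address | Click here     |
| VAD + non-streaming speech recognition | Address | Click here     |
| Two-pass speech recognition            | Address | Click here     |
| Audio tagging                          | Address | Click here     |
| Audio tagging (WearOS)                 | Address | Click here     |
| Speaker identification                 | Address | Click here     |
| Spoken language identification         | Address | Click here     |
| Keyword spotting                       | Address | Click here     |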

Links for pre-built Flutter apps

| Description                  | URL     | Users in China |
|------------------------------|---------|----------------|
| Streaming speech recognition | Address | Click here     |

Links for pre-trained models

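| Description                                  | URL                                                             |
|----------------------------------------------|-----------------------------------------------------------------|
| Speech recognition (speech to text, ASR)     | Address                                                         |
| Text-to-speech (TTS)                         | Address                                                         |
| VAD                                          | Address                                                         |
| Keyword spotting                             | Address                                                         |
| Audio tagging                                | Address                                                         |
| Speaker identification (Speaker ID)          | Address                                                         |
| Spoken language identification (Language ID) | See multi-lingual Whisper ASR models from Speech recognition    |
| Punctuation                                  | Address                                                         |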

Useful links

How to reach us

Please see https://k2-fsa.github.io/sherpa/social-groups.html for the Next-gen Kaldi (新一代 Kaldi) WeChat and QQ discussion groups.