Speech recognition with Visual C++ MFC

This directory contains examples showing how to use Next-gen Kaldi in MFC
for speech recognition.

Caution: You need Windows with Visual Studio installed to build and run
these examples. The commands below are written as a bash script for
demonstration; please adapt them accordingly for Windows.


Streaming speech recognition

mkdir -p $HOME/open-source
cd $HOME/open-source

git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build
cd build

cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX=./install ..
cmake --build . --config Release --target install

cd ../mfc-examples

msbuild ./mfc-examples.sln /property:Configuration=Release /property:Platform=x64

# now run the program

./x64/Release/StreamingSpeechRecognition.exe


Note that the program also needs pre-trained models, which must be
downloaded separately. Please refer to
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html
for a list of streaming models.

We use the following model for demonstration.

cd $HOME/open-source/sherpa-onnx/mfc-examples/x64/Release
wget https://huggingface.co/pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/exp/encoder-epoch-12-avg-4-chunk-16-left-128.onnx
wget https://huggingface.co/pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/exp/decoder-epoch-12-avg-4-chunk-16-left-128.onnx
wget https://huggingface.co/pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/exp/joiner-epoch-12-avg-4-chunk-16-left-128.onnx
wget https://huggingface.co/pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/data/lang_char/tokens.txt

# now rename
mv encoder-epoch-12-avg-4-chunk-16-left-128.onnx encoder.onnx
mv decoder-epoch-12-avg-4-chunk-16-left-128.onnx decoder.onnx
mv joiner-epoch-12-avg-4-chunk-16-left-128.onnx joiner.onnx

# Now run it!
./StreamingSpeechRecognition.exe
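
Before launching, it can help to confirm that the renamed files are in place. The following is a minimal sketch, assuming the program looks for encoder.onnx, decoder.onnx, joiner.onnx, and tokens.txt in the directory it is started from, as the steps above suggest. The function name check_model_files is hypothetical and is not part of sherpa-onnx.

```bash
# Hypothetical helper (not part of sherpa-onnx): report which of the
# expected model files are missing from a directory.
# Assumption: the program reads these four file names from the
# directory it is started from.
check_model_files() {
  local dir="${1:-.}"   # directory to check; defaults to the current one
  local missing=0
  local f
  for f in encoder.onnx decoder.onnx joiner.onnx tokens.txt; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  # Returns 0 when all files are present, 1 otherwise.
  return "$missing"
}
```

For example, running `check_model_files . && ./StreamingSpeechRecognition.exe` from the Release directory would start the program only when all four files are present.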