Wyoming protocol server for the FunASR speech-to-text (STT) system: wyoming-funasr on arm64

Step 1. Create a Python virtual environment

mkdir -p /funasr-wyoming
cd /funasr-wyoming
python3 -m venv venv
source venv/bin/activate

python --version

Python 3.11.2

 

apt list --installed

 

 

 

(venv) root@raspberrypi:/funasr-wyoming# pip3 show funasr
Name: funasr
Version: 1.3.0
Summary: FunASR: A Fundamental End-to-End Speech Recognition Toolkit
Home-page: https://github.com/alibaba-damo-academy/FunASR.git
Author: Speech Lab of Alibaba Group
Author-email: [email protected]
License: The MIT License
Location: /funasr-wyoming/venv/lib/python3.11/site-packages
Requires: editdistance, hydra-core, jaconv, jamo, jieba, kaldiio, librosa, modelscope, oss2, pytorch_wpe, PyYAML, requests, scipy, sentencepiece, soundfile, tensorboardX, torch_complex, tqdm, umap_learn

 

Requirements

python>=3.8
torch>=1.13
torchaudio
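
The requirements above can be checked from inside the venv. This is a minimal sketch using only the standard library. The helper names (version_tuple, meets_minimum, check_requirements) are made up for this example, and the comparison only looks at the leading numeric part of a version, so 2.1.0+cpu is treated as 2.1.0. A value of None means the package is not installed yet.

```python
import re
import sys
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '2.1.0+cpu' into (2, 1, 0); any non-numeric suffix is ignored."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements() -> dict:
    """Map each requirement to True/False, or None when the package is missing."""
    report = {"python>=3.8": sys.version_info >= (3, 8)}
    for package, minimum in (("torch", "1.13"), ("torchaudio", None)):
        label = package if minimum is None else f"{package}>={minimum}"
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[label] = None
            continue
        report[label] = True if minimum is None else meets_minimum(installed, minimum)
    return report


if __name__ == "__main__":
    # sys.prefix differs from sys.base_prefix only inside a virtual environment.
    print("inside venv:", sys.prefix != sys.base_prefix)
    for requirement, ok in check_requirements().items():
        print(requirement, "->", ok)
```

Run it once before Step 2 and once after; torch and torchaudio should change from None to True.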

 

Step 2. Install torch, torchaudio and FunASR

(venv) root@raspberrypi:/funasr-wyoming# pip3 --version

pip 23.0.1 from /funasr-wyoming/venv/lib/python3.11/site-packages/pip (python 3.11)

 

Install torch (CPU-only) via PyPI

pip3 install torch==2.1.0

output

Installing collected packages: mpmath, sympy, networkx, MarkupSafe, fsspec, jinja2, torch
Successfully installed MarkupSafe-3.0.3 fsspec-2026.1.0 jinja2-3.1.6 mpmath-1.3.0 networkx-3.6.1 sympy-1.14.0 torch-2.1.0

 

 

 

If ffmpeg is not installed, torchaudio is used to load audio, so install it as well (CPU-only):

pip3 install torchaudio==2.1.0

output

Successfully installed torchaudio-2.1.0
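
To see which of the two loaders applies on a given machine, the rule above (ffmpeg if present, otherwise torchaudio) can be mirrored in a few lines. This is only a sketch of the rule as stated here, not FunASR's own detection code; pick_audio_loader is a hypothetical helper name.

```python
import shutil
from typing import Callable, Optional


def pick_audio_loader(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return 'ffmpeg' when the binary is found on PATH, otherwise 'torchaudio'."""
    return "ffmpeg" if which("ffmpeg") else "torchaudio"


if __name__ == "__main__":
    print("audio will be loaded with:", pick_audio_loader())
```

The which parameter exists only so the function can be exercised without changing PATH.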

 

 

Install FunASR 1.3.0 via PyPI

pip3 install -U funasr==1.3.0

 

This will pull:

Downloading https://www.piwheels.org/simple/threadpoolctl/threadpoolctl-3.6.0-py3-none-any.whl (18 kB)

Installing collected packages: jieba, jamo, jaconv, crcmod, antlr4-python3-runtime, urllib3, typing_extensions, tqdm, threadpoolctl, six, sentencepiece, PyYAML, pycryptodome, pycparser, protobuf, platformdirs, packaging, numpy, msgpack, llvmlite, joblib, jmespath, idna, filelock, editdistance, decorator, charset_normalizer, certifi, audioread, torch_complex, tensorboardX, soxr, scipy, requests, pytorch_wpe, omegaconf, numba, lazy_loader, kaldiio, cffi, soundfile, scikit-learn, pooch, modelscope, hydra-core, cryptography, pynndescent, librosa, aliyun-python-sdk-core, umap_learn, aliyun-python-sdk-kms, oss2, funasr

output

Successfully installed PyYAML-6.0.3 aliyun-python-sdk-core-2.16.0 aliyun-python-sdk-kms-2.16.5 antlr4-python3-runtime-4.9.3 audioread-3.1.0 certifi-2026.1.4 cffi-2.0.0 charset_normalizer-3.4.4 crcmod-1.7 cryptography-46.0.3 decorator-5.2.1 editdistance-0.8.1 filelock-3.20.3 funasr-1.3.0 hydra-core-1.3.2 idna-3.11 jaconv-0.4.1 jamo-0.4.1 jieba-0.42.1 jmespath-0.10.0 joblib-1.5.3 kaldiio-2.18.1 lazy_loader-0.4 librosa-0.11.0 llvmlite-0.46.0 modelscope-1.34.0 msgpack-1.1.2 numba-0.63.1 numpy-2.3.5 omegaconf-2.3.0 oss2-2.19.1 packaging-26.0 platformdirs-4.5.1 pooch-1.8.2 protobuf-6.33.4 pycparser-3.0 pycryptodome-3.23.0 pynndescent-0.6.0 pytorch_wpe-0.0.1 requests-2.32.5 scikit-learn-1.8.0 scipy-1.17.0 sentencepiece-0.2.1 six-1.17.0 soundfile-0.13.1 soxr-1.0.0 tensorboardX-2.6.4 threadpoolctl-3.6.0 torch_complex-0.4.4 tqdm-4.67.1 typing_extensions-4.15.0 umap_learn-0.5.11 urllib3-2.6.3

 

 

 









sudo apt install ffmpeg

output

ffmpeg is already the newest version (8:5.1.8-0+deb12u1+rpt1).

 


 

 

Verify installation

python - << 'EOF'
from funasr import AutoModel
print("FunASR imported OK")
EOF

output

FunASR imported OK

 

 

Step 3. Download and test a model (example: paraformer-zh)

test.py

from funasr import AutoModel

model = AutoModel(
    model="paraformer-zh",
    model_revision="v2.0.4",
    vad_model="fsmn-vad",
    vad_model_revision="v2.0.4",
    punc_model="ct-punc",
    punc_model_revision="v2.0.4",
)

res = model.generate(input="test.wav")
print(res)

res = model.generate(input="https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav")
print(res)

 

python3 test.py
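
test.py expects a local test.wav. If recognition of that file looks wrong, it is worth checking its format first. The sketch below reads the WAV header with the standard library. The model that paraformer-zh resolves to has 16k in its name, so 16 kHz mono 16-bit PCM is assumed here to be the safest input; that is an inference from the name, not a confirmed FunASR requirement. describe_wav and looks_like_16k_mono are hypothetical helper names.

```python
import wave


def describe_wav(source) -> dict:
    """Read the header of a PCM WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "seconds": wav.getnframes() / wav.getframerate(),
        }


def looks_like_16k_mono(info: dict) -> bool:
    """True for 16 kHz, one channel, 16-bit samples (the assumed target format)."""
    return (
        info["sample_rate"] == 16000
        and info["channels"] == 1
        and info["sample_width_bytes"] == 2
    )
```

For example, looks_like_16k_mono(describe_wav("test.wav")) returning False suggests converting the file (for instance with ffmpeg) before blaming the model.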

 

Models are cached in:

/root/.cache/modelscope/hub/models/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch

output

Downloading Model from https://www.modelscope.cn to directory: /root/.cache/modelscope/hub/models/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch


2026-01-25 08:58:28,492 - modelscope - INFO - Use user-specified model revision: v2.0.4
2026-01-25 08:58:28,595 - modelscope - INFO - Got 11 files, start to download ...
Downloading [fig/res.png]: 100%|███████████████████████████████████████████████████| 192k/192k [00:00<00:00, 386kB/s]
Downloading [am.mvn]: 100%|█████████████████████████████████████████████████████| 10.9k/10.9k [00:00<00:00, 21.7kB/s]
Downloading [example/hotword.txt]: 100%|███████████████████████████████████████████| 7.00/7.00 [00:00<00:00, 11.9B/s]
Downloading [config.yaml]: 100%|████████████████████████████████████████████████| 3.34k/3.34k [00:00<00:00, 5.66kB/s]
Downloading [configuration.json]: 100%|███████████████████████████████████████████████| 478/478 [00:00<00:00, 766B/s]
Downloading [README.md]: 100%|██████████████████████████████████████████████████| 11.3k/11.3k [00:00<00:00, 18.2kB/s]
Downloading [example/asr_example.wav]: 100%|███████████████████████████████████████| 141k/141k [00:00<00:00, 208kB/s]
Downloading [fig/seaco.png]: 100%|█████████████████████████████████████████████████| 167k/167k [00:00<00:00, 296kB/s]
Downloading [tokens.json]: 100%|█████████████████████████████████████████████████| 91.5k/91.5k [00:00<00:00, 165kB/s]
Downloading [seg_dict]: 100%|███████████████████████████████████████████████████| 7.90M/7.90M [00:03<00:00, 2.76MB/s]
Downloading [model.pt]: 100%|█████████████████████████████████████████████████████| 944M/944M [01:31<00:00, 10.8MB/s]
Processing 11 items: 100%|████████████████████████████████████████████████████████| 11.0/11.0 [01:31<00:00, 8.34s/it]
2026-01-25 09:00:00,347 - modelscope - INFO - Download model 'iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch' successfully.
WARNING:root:trust_remote_code: False
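
To confirm that the download is complete, the files in the cache directory can be listed by size and compared with the log above (model.pt should be about 944M). A minimal sketch, standard library only; cached_files is a hypothetical helper name.

```python
from pathlib import Path


def cached_files(model_dir: Path) -> list:
    """Return (relative path, size in bytes) for each file, largest first."""
    files = [
        (path.relative_to(model_dir).as_posix(), path.stat().st_size)
        for path in model_dir.rglob("*")
        if path.is_file()
    ]
    return sorted(files, key=lambda item: item[1], reverse=True)
```

Call it with the cache path shown above, for example cached_files(Path("/root/.cache/modelscope/hub/models/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")), and print the pairs.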

 

python3 -c "from funasr import AutoModel; AutoModel(model='paraformer-zh', device='cpu')"

 

Step 4. FunASR + Wyoming STT full server

 

Install wyoming

pip3 install wyoming==1.8.0

output (this capture shows wyoming 1.5.0 being installed; the pip3 list below shows 1.8.0)

Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Collecting wyoming==1.5.0
  Downloading wyoming-1.5.0-py3-none-any.whl (23 kB)
Installing collected packages: wyoming
Successfully installed wyoming-1.5.0

 

A Wyoming server consists of an AsyncServer and an AsyncEventHandler. The handler processes events such as Describe, to which it replies with an Info event describing the service.
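
For speech to text, the handler also has to collect the audio it receives and hand it to FunASR once the client signals the end of the utterance. Below is a minimal sketch of just that buffering step, standard library only. UtteranceBuffer is a hypothetical helper, not part of wyoming or FunASR, and 16 kHz 16-bit mono is an assumed default that should be replaced by whatever the client announces.

```python
import wave


class UtteranceBuffer:
    """Collects raw PCM for one utterance and writes it out as a WAV file."""

    def __init__(self, rate: int = 16000, width: int = 2, channels: int = 1) -> None:
        self.rate = rate          # samples per second
        self.width = width        # bytes per sample
        self.channels = channels
        self._chunks = []

    def add(self, pcm: bytes) -> None:
        """Append the raw bytes of one audio chunk."""
        self._chunks.append(pcm)

    def seconds(self) -> float:
        """Duration of the buffered audio in seconds."""
        total_bytes = sum(len(chunk) for chunk in self._chunks)
        return total_bytes / (self.rate * self.width * self.channels)

    def write_wav(self, target) -> None:
        """Write the buffered audio to a path or binary file object."""
        with wave.open(target, "wb") as wav:
            wav.setnchannels(self.channels)
            wav.setsampwidth(self.width)
            wav.setframerate(self.rate)
            wav.writeframes(b"".join(self._chunks))

    def clear(self) -> None:
        """Drop the buffered audio so the next utterance starts empty."""
        self._chunks.clear()
```

In a handler, add() would be called for each audio chunk and write_wav() when the audio stops; the resulting file path can then be passed to model.generate(input=...) exactly as in test.py.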

 

python3 server.py
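
Once server.py is running, a Describe round trip can be checked without installing anything else. The sketch below speaks the protocol directly over a TCP socket, assuming the event format documented by the wyoming project: one JSON header line, then optional extra JSON data, then an optional binary payload. The function names are hypothetical; the wyoming package provides its own client classes.

```python
import json
import socket
from typing import BinaryIO, Optional


def encode_event(event_type: str, data: Optional[dict] = None, payload: bytes = b"") -> bytes:
    """Serialize one event: JSON header line, optional data bytes, optional payload."""
    data_bytes = json.dumps(data).encode("utf-8") if data else b""
    header = {"type": event_type}
    if data_bytes:
        header["data_length"] = len(data_bytes)
    if payload:
        header["payload_length"] = len(payload)
    return json.dumps(header).encode("utf-8") + b"\n" + data_bytes + payload


def read_event(stream: BinaryIO) -> Optional[dict]:
    """Read one event from a binary stream; returns None at end of stream."""
    line = stream.readline()
    if not line:
        return None
    header = json.loads(line)
    # Data may arrive inline in the header, as separate bytes, or both.
    data = dict(header.get("data") or {})
    data_length = header.get("data_length") or 0
    if data_length:
        data.update(json.loads(stream.read(data_length)))
    payload = stream.read(header.get("payload_length") or 0)
    return {"type": header["type"], "data": data, "payload": payload}


def describe(host: str, port: int, timeout: float = 5.0) -> Optional[dict]:
    """Send a describe event and return the server's first reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_event("describe"))
        with sock.makefile("rb") as stream:
            return read_event(stream)
```

With the server up, describe("127.0.0.1", 10300) should return a dict whose type is info. Port 10300 is only an example; use the address server.py actually listens on.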

 

 

 

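Final package list (it shows numpy 1.26.4, not the numpy 2.3.5 pulled in by the FunASR install above):

pip3 list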

Package                Version
---------------------- --------
aliyun-python-sdk-core 2.16.0
aliyun-python-sdk-kms  2.16.5
antlr4-python3-runtime 4.9.3
audioread              3.1.0
certifi                2026.1.4
cffi                   2.0.0
charset-normalizer     3.4.4
crcmod                 1.7
cryptography           46.0.3
decorator              5.2.1
editdistance           0.8.1
filelock               3.20.3
fsspec                 2026.1.0
funasr                 1.3.0
hydra-core             1.3.2
idna                   3.11
ifaddr                 0.2.0
jaconv                 0.4.1
jamo                   0.4.1
jieba                  0.42.1
Jinja2                 3.1.6
jmespath               0.10.0
joblib                 1.5.3
kaldiio                2.18.1
lazy_loader            0.4
librosa                0.11.0
llvmlite               0.46.0
MarkupSafe             3.0.3
modelscope             1.34.0
mpmath                 1.3.0
msgpack                1.1.2
networkx               3.6.1
numba                  0.63.1
numpy                  1.26.4
omegaconf              2.3.0
oss2                   2.19.1
packaging              26.0
pip                    23.0.1
platformdirs           4.5.1
pooch                  1.8.2
protobuf               6.33.4
pycparser              3.0
pycryptodome           3.23.0
pynndescent            0.6.0
pytorch-wpe            0.0.1
PyYAML                 6.0.3
requests               2.32.5
scikit-learn           1.8.0
scipy                  1.17.0
sentencepiece          0.2.1
setuptools             66.1.1
six                    1.17.0
soundfile              0.13.1
soxr                   1.0.0
sympy                  1.14.0
tensorboardX           2.6.4
threadpoolctl          3.6.0
torch                  2.1.0
torch_complex          0.4.4
torchaudio             2.1.0
tqdm                   4.67.1
typing_extensions      4.15.0
umap-learn             0.5.11
urllib3                2.6.3
wyoming                1.8.0
zeroconf               0.148.0

 
