
WeKws

Roadmap | Paper

Production First and Production Ready End-to-End Keyword Spotting Toolkit.

The goal of this toolkit is to...

Small-footprint keyword spotting (KWS), or more specifically wake-up word (WuW) detection, is a typical and important module in Internet of Things (IoT) devices. It provides a way for users to control IoT devices with a hands-free experience. A WuW detection system usually runs locally and persistently on the device, which requires low power consumption, a small number of model parameters, low computational complexity, and the ability to detect predefined keywords in a streaming way, i.e., with low latency.
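
The streaming requirement above can be illustrated with a minimal detection loop. The sketch below is illustrative only and is not the wekws implementation: it assumes some acoustic model already produces a per-frame keyword posterior, smooths those posteriors over a short window, and fires when the smoothed score crosses a threshold, with a refractory interval to suppress repeated activations.

# Minimal streaming detection sketch (illustrative, not the wekws code).
from collections import deque

class StreamingDetector:
    def __init__(self, threshold=0.8, window=30, min_gap_frames=100):
        self.threshold = threshold          # trigger threshold on the smoothed score
        self.scores = deque(maxlen=window)  # recent per-frame keyword posteriors
        self.min_gap_frames = min_gap_frames
        self.last_fire = -min_gap_frames
        self.frame = 0

    def push(self, posterior):
        """Feed one frame's keyword posterior; return True if the keyword fires."""
        self.scores.append(posterior)
        smoothed = sum(self.scores) / len(self.scores)
        fired = (smoothed > self.threshold
                 and self.frame - self.last_fire >= self.min_gap_frames)
        if fired:
            self.last_fire = self.frame
        self.frame += 1
        return fired

# Usage: for each incoming audio chunk, run your acoustic model to get
# per-frame keyword posteriors, then feed them one by one:
#   for p in model_posteriors(chunk):
#       if detector.push(p):
#           print("keyword detected")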

Typical Scenario

We are going to support the following typical applications of wakeup word:

  • Single wake-up word
  • Multiple wake-up words
  • Customizable wake-up word
  • Personalized wake-up word, i.e., a combination of wake-up word detection and voiceprint recognition

Installation

  • Clone the repo
git clone https://github.com/wenet-e2e/wekws.git
  • Create the Conda environment and install the dependencies
conda create -n wekws python=3.8
conda activate wekws
pip install -r requirements.txt
conda install pytorch=1.10.0 torchaudio=0.10.0 cudatoolkit=11.1 -c pytorch -c conda-forge
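
As an optional sanity check (not part of the official setup), you can confirm that PyTorch and torchaudio were installed correctly and whether CUDA is visible:

# Optional check, assuming the Conda environment created above is active.
import torch
import torchaudio

print(torch.__version__, torchaudio.__version__)
print("CUDA available:", torch.cuda.is_available())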

Dataset

We plan to support a variety of open-source wake-up word datasets, including but not limited to:

All the well-trained models on these datasets will be made publicly available.

Runtime

We plan to support a variety of hardware platforms, including:

  • Web browser
  • x86
  • Android
  • Raspberry Pi
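
For platforms such as x86, a common deployment route is to export the trained model to ONNX and run it with onnxruntime. The snippet below is only a generic sketch under that assumption; the model file name, input name, and feature shape are placeholders, not the actual wekws runtime interface.

# Generic ONNX inference sketch (placeholder model and shapes, not the wekws runtime).
import numpy as np
import onnxruntime as ort  # assumes onnxruntime is installed

# "kws.onnx" is a placeholder for an exported KWS model.
session = ort.InferenceSession("kws.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Dummy fbank-like features: (batch, frames, feature dim); adjust to your model.
feats = np.random.randn(1, 100, 80).astype(np.float32)
outputs = session.run(None, {input_name: feats})
print(outputs[0].shape)  # first model output; for a typical KWS head, per-frame posteriors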

Discussion

For Chinese users, you can scan the QR code on the left to follow our official account of WeNet. We also created a WeChat group for better discussion and quicker response. Please scan the QR code on the right to join the chat group.

Reference

  • Mining Effective Negative Training Samples for Keyword Spotting (github, paper)
  • Max-pooling Loss Training of Long Short-term Memory Networks for Small-footprint Keyword Spotting (paper)
  • A depthwise separable convolutional neural network for keyword spotting on an embedded system (github, paper)
  • Hello Edge: Keyword Spotting on Microcontrollers (github, paper)
  • An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling (github, paper)