42 Commits

Author SHA1 Message Date
Jean Du
b233d46552
[ctc] KWS with CTCloss training and CTC prefix beam search detection. (#135)
* add ctcloss training scripts.

* update compute_det_ctc

* fix typo.

* add fsmn model, can use pretrained kws model from modelscope.

* Add streaming detection for the CTC model. Add ONNX export for the CTC model. Add the CTC model's results to the README. The CTC model runtime is not supported yet.

* QA run.sh; the max-pooling training scripts remain compatible. Ready for PR.

* Add a streaming kws demo, support fsmn online forward

* fix typo.

* Align Stream FSMN and Non-Stream FSMN, both in feature extraction and model forward.

* fix repeated activation, add an interval restriction.

* fix timestamp when subsampling!=1.

* fix flake8, update training script and README, give pretrained ckpt.

* fix quickcheck and flake8

* Add realtime CTC-KWS demo in README.

---------

Co-authored-by: dujing <dujing@xmov.ai>
2023-08-16 10:07:04 +08:00
Menglong Xu
6da85d4662
[wekws] add online noise and rir augmentation (#115)
* [wekws] add online noise and rir augmentation

* format

* format

* update copyright

Co-authored-by: menglong.xu <menglong.xu>
2022-11-28 21:12:26 +08:00
Menglong Xu
64ccd5bb86
[wekws] add cache support for mdtc (#105)
* [wekws] add cache support for mdtc

* format

Co-authored-by: 02Bigboy <570843154@qq.com>
2022-11-08 09:18:18 +08:00
xiaoqiang306
80285fa696
[fix] fix export model input parameter in speechcommand example (#101)
Co-authored-by: jiqiang.fu <jiqiang.fu@rokid.com>
2022-11-02 21:13:26 +08:00
sugarcase
fc02c8887b
[doc] fix ds_tcn params count in hi_xiaowen recipe (#96)
Co-authored-by: sugarcase <924708577@qq.com>
2022-09-23 09:20:16 +08:00
02Bigboy
de234ef4c7
[doc] update result on hi_xiaowen (#94)
Co-authored-by: 02bigboy <wangjie2017@mail.nwpu.edu.cn>
2022-09-21 14:35:06 +08:00
Binbin Zhang
1a90b6dca7
[doc] update spec_aug result on hi_xiaowen (#92) 2022-09-19 10:00:22 +08:00
Binbin Zhang
9f29e033aa
[examples] remove static quantization (#87) 2022-09-12 15:49:28 +08:00
Binbin Zhang
c9a262866f
[wekws] rename kws to wekws (#76)
* [wekws] rename kws to wekws

* fix lint
2022-08-27 11:57:44 +08:00
ryoha
41a3432198
fix export in export_onnx (#71) 2022-05-29 09:30:53 +08:00
Cyan
7d142b9528
[examples] refactor FAR computation to support long audio test (#64)
* add .gitattributes

* add long wav

* fix some bugs

* updated lint error

* revert hi_xiaowen/run.sh to its original state

* remove the space

* better one

* remove 'num_keyword' parameter

* remove files

* flake8 check

* override the score and compute_det file

* remove defaultdict

* remove import defaultdict
2022-03-24 14:35:07 +08:00
Menglong Xu
66fcfa2ce5
[doc] add result on GSC dataset (#61)
* [examples] reset grad_clip

* [doc] add basic result of mdtc model
2022-02-13 19:53:19 +08:00
Menglong Xu
d805c55560
[examples] update to use torchrun launch (#60) 2022-02-11 14:51:00 +08:00
Binbin Zhang
57021924cb
[kws] support onnx export (#53) 2022-01-15 13:50:34 +08:00
Menglong Xu
f622c55b04
[kws] update parameter for plotting det curve (#54) 2021-12-17 20:52:45 +08:00
Menglong Xu
768900307a
[kws] add code for plotting det curve (#52)
* [kws] add code for plotting det curve

* format

* format

* format

* format

* set xlim and ylim by parameter

* set xlim and ylim optional

* update help information

* update parser type

* Update run.sh
2021-12-16 18:21:04 +08:00
Menglong Xu
6a58993390
[examples] update to use torchrun launch (#50) 2021-12-15 21:03:59 +08:00
Menglong Xu
566baca343
[examples] update ds_tcn config for hey_snips (#49) 2021-12-15 19:31:58 +08:00
Binbin Zhang
dc1ac8fecd
[examples] use big model for ds_tcn (#47) 2021-12-15 11:12:24 +08:00
Binbin Zhang
e3bfcf9f4e
[doc] add quantize result (#46) 2021-12-15 11:07:59 +08:00
Binbin Zhang
f86a797b10
[kws] add static quantize (#44)
* [kws] add static quantize

* refine lint error in shuffle_list.py

* refine lint

* fix typo
2021-12-14 14:32:54 +08:00
Binbin Zhang
a61db05ff4
[examples] add weight decay in aishell (#43) 2021-12-13 19:53:24 +08:00
Binbin Zhang
171309bd9e
[bin] use torchrun to launch ddp training (#42) 2021-12-13 19:45:55 +08:00
Binbin Zhang
fd255fd7c6
[examples] update spec aug parameters in hi xiaowen (#40) 2021-12-10 15:48:16 +08:00
Binbin Zhang
bc8d9f1c37
[examples] fix mdtc small config in hi_xiaowen (#39) 2021-12-09 17:53:34 +08:00
Menglong Xu
4a875776e5
[example] support hey_snips_kws_4.0 dataset (#38)
* [example] support hey_snips_kws_4.0 dataset

* format

* format
2021-12-08 23:46:05 +08:00
xiaohou
afbc1d2960
[example] add testing code for speech command dataset (#32)
* update run.sh

* update run.sh

* rename test.py to compute_accuracy.py

* update run.sh
2021-12-07 10:56:30 +08:00
Binbin Zhang
92a4c19ffe
[examples] use ds_tcn as default model (#34)
* [examples] use ds_tcn as default model

* fix scoring gpu id
2021-12-07 10:36:38 +08:00
xiaohou
37f56db5af
[examples] add speechcommand train (#30)
* [example] added code for training speech command dataset

* update kws_model.py

* update kws_model.py

* format

* format

* add more comments to explain the new classifier designed for speech command classification task

* add copyright info

* update copyright info of classifier.py
2021-12-06 17:14:33 +08:00
xiaohou
8be4bef405
[examples] speech command data prepare (#27)
* [examples] added speech command data preparation code

* update

* update path.sh
2021-12-06 12:00:25 +08:00
Menglong Xu
88444ab177
[examples] correct a spelling mistake (#24) 2021-12-05 21:24:06 +08:00
Binbin Zhang
dfe8b2536b
Revert "[recipe] support speech command dataset (#21)" (#22)
This reverts commit c48c959807e7e80cdd514be9bd019b16e3b816eb.
2021-12-04 13:55:58 +08:00
xiaohou
c48c959807
[recipe] support speech command dataset (#21)
* [recipe] support speech command dataset

* format

* format

* format

* update run.sh
2021-12-03 21:07:42 +08:00
lxiao336
ba6919baaf
modifications to get the mdtc model torch-scriptable (#14)
* modify some implementations of mdtc so the model passes torch scripting

* modifications to get the mdtc model torch-scriptable

Co-authored-by: lxiao336 <shawl336@163.com>
2021-11-29 11:15:30 +08:00
jingyong hou
9aaa4fc26c add manual random seed so we can reproduce the experimental results 2021-11-19 16:23:03 +08:00
jingyong hou
edfc6de743 add results of mdtc 2021-11-19 15:31:11 +08:00
Jingyong Hou
674c372c9a formatting 2021-11-11 09:40:05 +08:00
Jingyong Hou
8f5e1beee4 formatting 2021-11-11 09:37:41 +08:00
Jingyong Hou
3326c6d37f formatting 2021-11-11 09:30:37 +08:00
Jingyong Hou
7df9ced666 fixed bug in compute_cmvn_stats.py 2021-11-10 22:40:21 +08:00
Jingyong Hou
4db050eb67 add model mdtc for mobvoi-hotword example 2021-11-10 22:13:46 +08:00
Binbin Zhang
dbebee86fd [examples] support hi xiaowen dataset 2021-11-10 18:57:52 +08:00