112 Commits

Author SHA1 Message Date
Jean Du
059fd87a8f
[ctc]fix false rejection result from long time silence. (#149)
* [ctc]fix false rejection result from long time silence.

* fix list index out of range.

---------

Co-authored-by: dujing <dujing@xmov.ai>
2023-10-09 17:34:36 +08:00
Di Wu
58859d580d
[wekws] fix log (only one process print model) (#150)
Co-authored-by: di.wu <di.wu@diwudeMacBook-Pro.local>
2023-10-09 17:33:52 +08:00
Tiance Wang
6ae98ef111
[fix] Convert target to torch.int64 for cross_entropy (#141)
On my machine the original code threw an error:
RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int'

I followed https://github.com/wenet-e2e/wekws#installation to set up the environment, so I'm curious whether this error has ever occurred for other people.
2023-09-10 09:02:28 +08:00
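
The dtype conversion described in commit 6ae98ef111 above can be illustrated with a minimal sketch. This is not the actual patch from #141; the tensor shapes, label values, and variable names are assumptions chosen only for illustration. It shows class-index targets being cast to torch.int64 before they are passed to cross_entropy.

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes for illustration: 4 utterances, 3 keyword classes.
logits = torch.randn(4, 3)

# Labels stored as 32-bit integers, as can happen when they come
# from a data loader or a NumPy array.
target = torch.tensor([0, 2, 1, 2], dtype=torch.int32)

# cross_entropy expects class-index targets as int64 (torch.long),
# so cast before computing the loss.
target = target.to(torch.int64)
loss = F.cross_entropy(logits, target)
print(loss.item())
```

Casting at the call site keeps the data pipeline unchanged; casting once when the labels are loaded would be an equally reasonable choice.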
Menglong Xu
2c3c9ce383
[fix] Fix fsmn export error (#143) (#146)
* fix: Fix fsmn export error (#143)

* fix flake8

* fix flake8
2023-09-10 09:01:15 +08:00
linlin
694928d706
fix: Android Demo detect invalid when the button is pressed again (#142)
* fix: Android Demo detect invalid when the button is pressed again

* fix: Android demo detect invalid when the button is pressed again

---------

Co-authored-by: chuhonglin <chuhonglin@papegames.com>
2023-08-31 11:49:01 +08:00
Jean Du
b233d46552
[ctc] KWS with CTCloss training and CTC prefix beam search detection. (#135)
* add ctcloss training scripts.

* update compute_det_ctc

* fix typo.

* add fsmn model, can use pretrained kws model from modelscope.

* Add streaming detection of CTC model. Add CTC model onnx export. Add CTC model's result in README. The CTC model runtime is not supported yet.

* QA run.sh; maxpooling training scripts are compatible. Ready for PR.

* Add a streaming kws demo, support fsmn online forward

* fix typo.

* Align Stream FSMN and Non-Stream FSMN, both in feature extraction and model forward.

* fix repeated activation, add an interval restriction.

* fix timestamp when subsampling!=1.

* fix flake8, update training script and README, give pretrained ckpt.

* fix quickcheck and flake8

* Add realtime CTC-KWS demo in README.

---------

Co-authored-by: dujing <dujing@xmov.ai>
2023-08-16 10:07:04 +08:00
Hongji Wang
85350c38a8
[lint] fix flake8 errors (#125) 2023-03-11 13:10:08 +08:00
Hongji Wang
846524fc8c
[doc] update paper link in README.md (#124)
* [doc] update paper link in README.md

* [wekws] remove unused variables in wekws/utils/executor.py
2023-02-09 17:01:28 +08:00
Binbin Zhang
e08fb683de
[doc] add roadmap (#122) 2022-12-11 10:59:30 +08:00
Menglong Xu
6da85d4662
[wekws] add online noise and rir augmentation (#115)
* [wekws] add online noise and rir augmentation

* format

* format

* update copyright

Co-authored-by: menglong.xu <menglong.xu>
2022-11-28 21:12:26 +08:00
Binbin Zhang
5c6088f947
[doc] add paper link (#112) 2022-11-24 17:25:39 +08:00
veelion
184e8a8da4
[runtime/android] Add how to run on Android, and change onnxruntime to 1.12.1 (#111)
* add Usage

* LookupCustomMetadataMap() is deprecated in 1.13.1, so change to 1.12.1

Co-authored-by: weiliang <weiliang.chong@day-care.cn>
2022-11-24 17:17:43 +08:00
Liangcd
16e20ed0f2
[tcn] remove unused variables (#107) 2022-11-17 16:30:54 +08:00
Menglong Xu
64ccd5bb86
[wekws] add cache support for mdtc (#105)
* [wekws] add cache support for mdtc

* format

Co-authored-by: 02Bigboy <570843154@qq.com>
2022-11-08 09:18:18 +08:00
xiaoqiang306
80285fa696
[fix] fix export model input parameter in speechcommand example (#101)
Co-authored-by: jiqiang.fu <jiqiang.fu@rokid.com>
2022-11-02 21:13:26 +08:00
彭震东
f3d6a0a40e
[docs] fix typo (#100) 2022-11-01 16:55:39 +08:00
彭震东
3abedc209d
[runtime] support raspberry pi (#99)
* [runtime] support raspberry pi

* [runtime] add readme for raspberry pi
2022-11-01 16:52:46 +08:00
sugarcase
fc02c8887b
[doc] fix ds_tcn params count in hi_xiaowen recipe (#96)
Co-authored-by: sugarcase <924708577@qq.com>
2022-09-23 09:20:16 +08:00
02Bigboy
de234ef4c7
[doc] update result on hi_xiaowen (#94)
Co-authored-by: 02bigboy <wangjie2017@mail.nwpu.edu.cn>
2022-09-21 14:35:06 +08:00
Binbin Zhang
1a90b6dca7
[doc] update spec_aug result on hi_xiaowen (#92) 2022-09-19 10:00:22 +08:00
Binbin Zhang
ff5dcf29ed
[runtime/android] refine android demo, use hey_snips as the default wakeup word (#90) 2022-09-12 21:55:06 +08:00
彭震东
4bacb81f7f
[android] add build.gradle (#89) 2022-09-12 19:09:44 +08:00
彭震东
508938f537
[android] add build.gradle and rename model name (#88) 2022-09-12 17:50:05 +08:00
Binbin Zhang
9f29e033aa
[examples] remove static quantization (#87) 2022-09-12 15:49:28 +08:00
彭震东
0d9237b8c0
[runtime/android] add android runtime (#83)
* [android] init android runtime

* [android] add voice rectangle view

* [android] finished

* [android] fix lint
2022-09-07 15:25:45 +08:00
Binbin Zhang
1ad3102c8c
[fix] fix mdtc training cache (#82) 2022-09-01 18:25:43 +08:00
Binbin Zhang
490a474d4e
[fix] fix training and export error (#81) 2022-08-28 16:49:24 +08:00
Binbin Zhang
50354a38e0
[fix/runtime] fix topo error (#80) 2022-08-28 16:35:19 +08:00
Binbin Zhang
53d7b8f807
[runtime/onnxruntime] add onnxruntime support (#79)
* [runtime/onnxruntime] add onnxruntime support

* add cpplint and clang-format

* fix lint
2022-08-28 13:35:21 +08:00
Binbin Zhang
5037d51ed9
[wekws] add cache support (#78) 2022-08-27 16:44:22 +08:00
Binbin Zhang
8aa68ad750
[doc] rename conda env to wekws (#77) 2022-08-27 16:23:10 +08:00
Binbin Zhang
c9a262866f
[wekws] rename kws to wekws (#76)
* [wekws] rename kws to wekws

* fix lint
2022-08-27 11:57:44 +08:00
Wall.E
51f0fe6dc3
fixed the parameter transfer problem for criterion (#75)
* fixed the parameter transfer problem for criterion

Co-authored-by: yangyyt <yuntingyang@yuntingdeMacBook-Pro.local>
2022-07-13 23:50:48 +08:00
胡大炮
141d40704f
[fix bug] add optimizer.zero_grad() in kws/utils/executor.py (#72) (#73)
* fix bug in kws/utils/executor.py (#72)

* [fix bug] add zero_grad() above backward() in kws/utils/executor.py (#72)
2022-06-05 22:39:26 +08:00
ryoha
41a3432198
fix export in export_onnx (#71) 2022-05-29 09:30:53 +08:00
Binbin Zhang
663a31d9ea
Update doc.yml (#68) 2022-04-14 16:06:02 +08:00
Cyan
015748b94e
learning rate won't initiate from 0.001 when continuing training from checkpoint (#67)
* add .gitattributes

* add long wav

* fix some bugs

* updated lint error

* back the hi_xiaowen/run.sh to the same

* remove the space

* better one

* remove 'num_keyword' parameter

* remove files

* flake8 check

* override the score and compute_det file

* remove defaultdict

* remove import defaultdict

* learning rate won't initialize from 0.001 when continuing training from checkpoint

* fix intent bug with initial learning rate != 0.001
2022-04-14 16:02:18 +08:00
Cyan
7d142b9528
[examples] refactor FAR computation to support long audio test (#64)
* add .gitattributes

* add long wav

* fix some bugs

* updated lint error

* back the hi_xiaowen/run.sh to the same

* remove the space

* better one

* remove 'num_keyword' parameter

* remove files

* flake8 check

* override the score and compute_det file

* remove defaultdict

* remove import defaultdict
2022-03-24 14:35:07 +08:00
Menglong Xu
ff4b47f94d
[kws] update cross_entropy loss (#62)
* [kws] update cross_entropy loss

replace nn.CrossEntropyLoss() with F.cross_entropy()

* format
2022-03-15 19:34:28 +08:00
Menglong Xu
66fcfa2ce5
[doc] add result on GSC dataset (#61)
* [examples] reset grad_clip

* [doc] add basic result of mdtc model
2022-02-13 19:53:19 +08:00
Menglong Xu
d805c55560
[examples] update to use torchrun launch (#60) 2022-02-11 14:51:00 +08:00
lxiao336
db2685d1a4
[tools] add a bash script that trims silence using sox and split-based multi-processing (#56)
Co-authored-by: hp <shawl336@163.com>
2022-01-15 13:54:44 +08:00
Binbin Zhang
57021924cb
[kws] support onnx export (#53) 2022-01-15 13:50:34 +08:00
Binbin Zhang
665df6113e
[doc] update qr code (#55) 2021-12-21 17:38:01 +08:00
Menglong Xu
f622c55b04
[kws] update parameter for plotting det curve (#54) 2021-12-17 20:52:45 +08:00
Menglong Xu
768900307a
[kws] add code for plotting det curve (#52)
* [kws] add code for plotting det curve

* format

* format

* format

* format

* [kws] add code for plotting det curve

format

format

format

format

* set xlim and ylim by parameter

* set xlim and ylim optional

* update help information

* update parser type

* Update run.sh
2021-12-16 18:21:04 +08:00
Menglong Xu
20891f90e6
Merge pull request #51 from wenet-e2e/binbin-activation
[kws] put activation in model, so the activation could be exported in…
2021-12-15 21:32:52 +08:00
Binbin Zhang
8943acb51f
[kws] put activation in model, so the activation could be exported in script model 2021-12-15 21:10:27 +08:00
Menglong Xu
6a58993390
[examples] update to use torchrun launch (#50) 2021-12-15 21:03:59 +08:00
Menglong Xu
566baca343
[examples] update ds_tcn config for hey_snips (#49) 2021-12-15 19:31:58 +08:00