Robots
https://github.com/peng-zhihui/ElectronBot
https://github.com/peng-zhihui/Dummy-Robot
https://github.com/peng-zhihui/L-ink_Card
https://github.com/peng-zhihui/PocketLCD
https://github.com/peng-zhihui/XUAN
https://github.com/peng-zhihui/HDMI-PI
https://github.com/peng-zhihui/HoloCubic
Chinese text recognition in images (OCR)
https://github.com/PaddlePaddle/PaddleOCR
https://github.com/breezedeus/cnocr
python3 scripts/cnocr_predict.py --file text.png
Text processing
https://github.com/hankcs/HanLP
https://github.com/hankcs/pyhanlp
https://github.com/ownthink/Jiagu
Word segmentation
hanlp segment <<< '欢迎新老师生前来就餐'
Syntactic parsing
hanlp parse <<< '欢迎新老师生前来就餐'
Keyword extraction
HanLP.extractKeyword('欢迎新老师生前来就餐', 2)
Automatic summarization
HanLP.extractSummary('欢迎新老师生前来就餐', 3)
Dependency parsing
HanLP.parseDependency('欢迎新老师生前来就餐')
Speech recognition
https://github.com/nl8590687/ASRT_SpeechRecognition
python3 asrserver.py
https://github.com/kaldi-asr/kaldi
http://kaldi-asr.org/
https://github.com/jackyyy0228/Chinese-ASR
TTS
https://github.com/espnet/espnet
Pre-trained model mirror
https://coggle.club/note/dl/pretrained-models
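Text-to-speech (macOS say)
Speak a phrase aloud:
say {{"I like to ride my bike."}}
Read a file aloud:
say -f {{filename.txt}}
Speak a phrase with a custom voice and speech rate:
say -v {{voice}} -r {{words_per_minute}} {{"I'm sorry Dave, I can't let you do that."}}
List the available voices:
say -v ?
Create an audio file of the spoken text: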
say -o {{filename.aiff}} {{"Here's to the Crazy Ones."}}
Tsinghua University NLP (THULAC)
https://github.com/thunlp/THULAC-Python
python3 test.py
python3 -m thulac input.txt output.txt
Jieba word segmentation
https://github.com/fxsjy/jieba
Chinese text processing
https://github.com/isnowfy/snownlp
https://github.com/tsroten/pynlpir
https://github.com/stacklikemind/deepnude_official
https://github.com/lwlodo/deep_nude/
https://github.com/emperorwushi/xi/
https://github.com/NVIDIA/pix2pixHD
Text-To-Speech
######
Multilingual:
https://github.com/pndurette/gTTS
pip install gTTS
gtts-cli 'hello' --output hello.mp3
https://github.com/cboard-org/cboard
https://github.com/zlargon/google-tts/
npm install google-tts-api --save
https://github.com/vilic/cordova-plugin-tts
https://github.com/naoufal/react-native-speech
web:
https://github.com/guest271314/SpeechSynthesisRecorder
https://github.com/kripken/speak.js
https://github.com/Marak/say.js
Command line:
https://www.npmjs.com/package/voc-cli
py:
https://github.com/buriburisuri/speech-to-text-wavenet
https://github.com/readbeyond/aeneas
http://espeak.sourceforge.net/test/latest.html
https://github.com/Kyubyong/tacotron
A fully end-to-end text-to-speech synthesis model: it converts text directly into speech and ships with pre-trained models
https://github.com/keithito/tacotron
https://keithito.com/LJ-Speech-Dataset/
https://librivox.org/
https://github.com/DragonComputer/Dragonfire
https://github.com/r9y9/deepvoice3_pytorch
git clone https://github.com/r9y9/deepvoice3_pytorch && cd deepvoice3_pytorch
pip install -e ".[bin]"
python synthesis.py --preset=20180505_deepvoice3_ljspeech.json \
20180505_deepvoice3_checkpoint_step000640000.pth \
sentences.txt \
output_dir
python preprocess.py --preset=presets/deepvoice3_ljspeech.json ljspeech ~/data/LJSpeech-1.0
python train.py --preset=presets/deepvoice3_ljspeech.json --data-root=./data/ljspeech
python preprocess.py ljspeech ~/data/LJSpeech-1.0
# warning! this may use different hyper parameters used at preprocessing stage
python train.py --preset=presets/deepvoice3_ljspeech.json --data-root=./data/ljspeech
https://github.com/mozilla/TTS
https://github.com/hgneng/ekho #Chinese
http://www.eguidedog.net/ekho.php
speech-to-text
######
https://github.com/mozilla/DeepSpeech
pip3 install deepspeech
deepspeech --model models/output_graph.pbmm --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio my_audio_file.wav
pip3 install deepspeech-gpu
deepspeech --model models/output_graph.pbmm --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio my_audio_file.wav
pre-trained model
wget https://github.com/mozilla/DeepSpeech/releases/download/v0.1.1/deepspeech-0.1.1-models.tar.gz
tar xvfz deepspeech-0.1.1-models.tar.gz
audio files
wget https://github.com/mozilla/DeepSpeech/releases/download/v0.1.1/audio-0.1.1.tar.gz
tar xvfz audio-0.1.1.tar.gz
deepspeech models/output_graph.pb audio/2830-3980-0043.wav models/alphabet.txt models/lm.binary models/trie
Text: experience proves this
deepspeech models/output_graph.pb audio/4507-16021-0012.wav models/alphabet.txt models/lm.binary models/trie
Text: why should one halt on the way
deepspeech models/output_graph.pb audio/8455-210777-0068.wav models/alphabet.txt models/lm.binary models/trie
Text: your power is sufficient i said
deepspeech --model models/output_graph.pbmm --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio my_audio_file.wav
To download the pre-built binaries, use util/taskcluster.py:
python3 util/taskcluster.py --target .
or if you're on macOS:
python3 util/taskcluster.py --arch osx --target .
https://github.com/asticode/go-astideepspeech
http://www.cstr.ed.ac.uk/projects/festival/
Recognizing text in images with DenseNet
https://github.com/yinchangchang/ocr_densenet
Alibaba Cloud voice verification codes
https://github.com/qingdie/qingdie-aliyun
https://dysmsapi.aliyuncs.com/ # SMS verification codes
https://dyvmsapi.aliyuncs.com/ # voice verification codes
https://github.com/kerlomz/captcha_trainer
https://github.com/kerlomz/captcha_library_c
https://github.com/kerlomz/captcha_demo_csharp
https://github.com/kerlomz/captcha_platform
https://mp.weixin.qq.com/s/6IAEus9OTg-hP9NGKJRm_Q
Dictionaries
http://www.zd9999.com/
https://github.com/GopherCoder/dictionary-of-chinese
https://github.com/pwxcoo/chinese-xinhua
Classical Chinese poetry
https://github.com/chinese-poetry/chinese-poetry
https://github.com/KomaBeyond/chinese-poetry-mysql
https://github.com/Werneror/Poetry
Polayoutu (photo collections)
http://www.polayoutu.com/collections
Weather lookup
https://github.com/tangjiahao/robotofwx/blob/master/robotmain.py
https://api.seniverse.com/v3/weather/now.json?key=Skb40T46PiBDM35V2&location=%s&language=zh-Hans&unit=c
https://free-api.heweather.net/s6/weather?location=%s&key=a3269a0918a44a62ae97c314dd24f02a
Kugou Music
http://www.kugou.com/yy/index.php?r=play/getdata&hash=%s&album_id=%s&_=1497972864535
https://wwwapi.kugou.com/yy/index.php?r=play/getdata&callback=jQuery191014887140948582345_1557824383110&hash=%s&album_id=%s&dfid=0zpwSa44LtGp0D89Gr371MJb&mid=51eafc9b0e5eaca4e106b905175401ec&platid=4&_=1557824383112
http://songsearch.kugou.com/song_search_v2?keyword=%s&page=1&pagesize=3&userid=-1&clientver=&platform=WebFilter&tag=em&filter=2&iscorrection=1&privilege_filter=0
https://github.com/tangjiahao/robotofwx/blob/master/robotmain.py
Machine learning
https://github.com/eriklindernoren/ML-From-Scratch
https://github.com/NELSONZHAO/zhihu
Machine translation (NMT)
https://github.com/tensorflow/nmt
https://github.com/OpenNMT/OpenNMT-py
http://opennmt.net/OpenNMT-py/speech2text.html
https://github.com/OpenNMT/OpenNMT
https://github.com/THUNLP-MT/MT-Reading-List
https://github.com/xuwenshen/Machine-Translation
https://github.com/foamliu/Machine-Translation-v2
English-Chinese dictionaries
https://github.com/ChestnutHeng/Wudao-dict
https://github.com/program-in-chinese/vscode_english_chinese_dictionary
https://github.com/skywind3000/ECDICT
https://github.com/skywind3000/ECDICT/releases
https://github.com/program-in-chinese/webextension_english_chinese_dictionary
https://github.com/fxsjy/diaosi
https://github.com/chienlungcheung/MyDict
SMS verification codes, voice verification codes, phone credit top-up, mobile data top-up
https://github.com/gitchenze/panguPhone
http://www.miaodiyun.com/
Speech recognition
https://github.com/xxbb1234021/speech_recognition
Training data download: Tsinghua University Chinese corpus (THCHS-30) http://www.openslr.org/18/
Training
Configure the settings in the conf.ini file under the conf directory
Run python train.py in a terminal to start training
Run python test.py in a terminal to test
The project can also be opened in PyCharm
Convert a WAV file to 16 kHz, 16-bit, mono PCM
ffmpeg -y -i 16k.wav -acodec pcm_s16le -f s16le -ac 1 -ar 16000 16k.pcm
Convert a 44100 Hz, 16-bit, mono PCM file to 16000 Hz, 16-bit, mono PCM
ffmpeg -y -f s16le -ac 1 -ar 44100 -i test44.pcm -acodec pcm_s16le -f s16le -ac 1 -ar 16000 16k.pcm
Convert an MP3 file to 16 kHz, 16-bit, mono PCM
ffmpeg -y -i aidemo.mp3 -acodec pcm_s16le -f s16le -ac 1 -ar 16000 16k.pcm
// -acodec pcm_s16le  use the 16-bit PCM encoder
// -f s16le           write raw 16-bit PCM
// -ac 1              mono
// -ar 16000          16000 Hz sample rate
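
The ffmpeg commands above all target the same raw format: signed 16-bit little-endian samples, one channel, 16000 Hz. As a rough sketch of what that format means, the following uses only Python's standard wave module to strip the header from a WAV file and to compute the duration of raw PCM data. The helper names (wav_to_raw_pcm, pcm_duration_seconds, make_silent_wav) are made up for this illustration, and the code assumes the WAV is already 16 kHz, 16-bit, mono; it does no resampling, which is what ffmpeg is for.

```python
import io
import wave


def wav_to_raw_pcm(wav_bytes: bytes) -> bytes:
    """Return the raw sample data of a WAV file (hypothetical helper).

    Raises ValueError unless the file is 16 kHz, 16-bit, mono.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        params = (wav.getframerate(), wav.getsampwidth(), wav.getnchannels())
        if params != (16000, 2, 1):
            raise ValueError(f"expected 16 kHz/16-bit/mono, got {params}")
        return wav.readframes(wav.getnframes())


def pcm_duration_seconds(num_bytes: int, rate: int = 16000,
                         sample_width: int = 2, channels: int = 1) -> float:
    """Duration of raw PCM data: bytes / (rate * bytes per sample * channels)."""
    return num_bytes / (rate * sample_width * channels)


def make_silent_wav(seconds: float) -> bytes:
    """Build a silent 16 kHz, 16-bit, mono WAV in memory, for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(16000)
        wav.writeframes(b"\x00\x00" * int(16000 * seconds))
    return buffer.getvalue()


if __name__ == "__main__":
    pcm = wav_to_raw_pcm(make_silent_wav(0.5))
    print(len(pcm), pcm_duration_seconds(len(pcm)))
```

The duration formula is handy for sanity-checking a .pcm file, since raw PCM carries no header to say how long it is.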
Facebook AI Research's automatic speech recognition toolkit
https://github.com/facebookresearch/wav2letter
https://github.com/brightmart/roberta_zh
https://github.com/facebookresearch/SlowFast video classification / video understanding / action detection
Chinese speech recognition (AISHELL)
https://github.com/libai3/masr
Recognizing your own voice
brew install portaudio
pip3 install pyaudio
Language model
https://deepspeech.bj.bcebos.com/zh_lm/zh_giga.no_cna_cmn.prune01244.klm
https://github.com/Uberi/speech_recognition
https://pypi.org/project/pocketsphinx/
pip3 install SpeechRecognition
pip3 install https://github.com/bambocher/pocketsphinx-python/archive/master.zip
brew install cmu-pocketsphinx cmu-sphinxbase cmu-sphinxtrain cmuclmtk
https://realpython.com/python-speech-recognition/
https://blog.csdn.net/weixin_40490238/article/details/84841825
https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/Mandarin/cmusphinx-zh-cn-5.2.tar.gz/download
https://jaist.dl.sourceforge.net/project/cmusphinx/Acoustic%20and%20Language%20Models/Mandarin/cmusphinx-zh-cn-5.2.tar.gz
Extract into:
cd /usr/local/lib/python3.7/site-packages/speech_recognition/pocketsphinx-data
mkdir -p zh-CN/acoustic-model
Extract zh_broadcastnews_16k_ptm256_8000.tar.bz2 into zh-CN/acoustic-model
Rename zh_broadcastnews_utf8.dic to pronounciation-dictionary.dict and put it in the zh-CN folder
Use the SphinxBase tools to convert zh_broadcastnews_64000_utf8.DMP to language-model.lm.bin and put it in the zh-CN folder
pocketsphinx_continuous -hmm /usr/local/lib/python3.7/site-packages/speech_recognition/pocketsphinx-data/zh-CN/acoustic-model/ -lm zh_broadcastnews_64000_utf8.DMP -dict pronounciation-dictionary.dic
pocketsphinx_continuous -hmm zh_broadcastnews_ptm256_8000 -lm zh_broadcastnews_64000_utf8.DMP -dict zh_broadcastnews_utf8.dic -infile myfile-16000.wav > myfile.txt
pocketsphinx_continuous -inmic yes -hmm ../share/pocketsphinx/model/cmusphinx-zh-cn-5.2/zh_cn.cd_cont_5000 -lm ../share/pocketsphinx/model/cmusphinx-zh-cn-5.2/zh_cn.lm.bin -dict ../share/pocketsphinx/model/cmusphinx-zh-cn-5.2/zh_cn.dic
pocketsphinx_continuous -inmic yes -hmm /usr/local/pocketsphinx/share/pocketsphinx/model/cmusphinx-zh-cn-5.2/zh_cn.cd_cont_5000 -lm ./4648.lm -dict ./4648.dic
https://github.com/cmusphinx/sphinxbase
.\sphinx_lm_convert.exe -i .\zh_broadcastnews_64000_utf8.DMP -o language-model.lm -ofmt arpa
.\sphinx_lm_convert.exe -i .\language-model.lm -o language-model.lm.bin
sphinx_lm_convert -i zh_broadcastnews_64000_utf8.DMP -o language-model.lm -ofmt arpa
sphinx_lm_convert -i language-model.lm -o language-model.lm.bin
https://www.cnblogs.com/henjay724/p/9576670.html
http://www.speech.cs.cmu.edu/tools/lextool.html
http://www.speech.cs.cmu.edu/tools/lmtool-new.html
pip3 install cmudict
vi test.txt
窗口 ch uang k ou
打开 d a k ai
关闭 g uan b i
记事本 j i sh ib b en
浏览器 l iu l an q i
音乐 y in uxs uxe
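
Each line of test.txt above is a word followed by its phoneme sequence, separated by whitespace. A minimal sketch of loading that format into a Python dict; the function name parse_pronunciation_dict is an assumption made for this example and is not part of PocketSphinx.

```python
def parse_pronunciation_dict(text: str) -> dict:
    """Map each word to its list of phonemes.

    Assumes one entry per line: the word first, then its phonemes,
    all separated by whitespace. Blank lines are skipped.
    """
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        word, phonemes = fields[0], fields[1:]
        entries[word] = phonemes
    return entries


# First three entries of the sample dictionary above.
SAMPLE = """\
窗口 ch uang k ou
打开 d a k ai
关闭 g uan b i
"""

if __name__ == "__main__":
    for word, phonemes in parse_pronunciation_dict(SAMPLE).items():
        print(word, phonemes)
```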
http://www.speech.cs.cmu.edu/tools/lmtool-new.html
pocketsphinx_continuous -lm 6177.lm -dict 6177.dic
/usr/local/Cellar/cmu-pocketsphinx/0.8/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k
pocketsphinx_continuous -hmm tdt_sc_8k -lm 6177.lm -dict 6177.dic
/usr/local/share/pocketsphinx/model/hmm/zh/tdt_sc_8k/
/usr/local/Cellar/cmu-pocketsphinx/0.8/share/pocketsphinx/model/hmm/zh/tdt_sc_8k/
Training a language model on a large text corpus
vi weather.txt
<s> 天气 </s>
<s> 有雨 </s>
<s> 晴朗 </s>
<s> 多云 </s>
<s> 雷电 </s>
Generate the vocabulary file:
text2wfreq < weather.txt | wfreq2vocab > weather.vocab
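text2wfreq: counts how many times each word occurs in the text file; the result is conventionally saved with a .wfreq suffix
wfreq2vocab: turns those word frequencies into the vocabulary, i.e. the list of distinct words in the text
Generate the language model in ARPA format: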
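
To make the two steps concrete, here is a conceptual sketch in plain Python of what the text2wfreq | wfreq2vocab pipeline computes on weather.txt. It is only an illustration under simplifying assumptions: the real CMU-Cambridge tools have their own file formats and cutoff options that this does not reproduce, and the function names are invented for the example.

```python
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count whitespace-separated tokens, including the <s> and </s> markers."""
    return Counter(text.split())


def vocabulary(freqs: Counter) -> list:
    """Return the distinct words in sorted order."""
    return sorted(freqs)


# Contents of weather.txt as listed above.
CORPUS = """\
<s> 天气 </s>
<s> 有雨 </s>
<s> 晴朗 </s>
<s> 多云 </s>
<s> 雷电 </s>
"""

if __name__ == "__main__":
    freqs = word_frequencies(CORPUS)
    print(freqs["<s>"], freqs["天气"])
    print(vocabulary(freqs))
```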
text2idngram -vocab weather.vocab -idngram weather.idngram < weather.txt
idngram2lm -vocab_type 0 -idngram weather.idngram -vocab weather.vocab -arpa weather.arpa
Convert to CMU's binary format (DMP):
sphinx_lm_convert -i weather.arpa -o weather.lm.DMP
cp -a /usr/local/share/pocketsphinx/model/hmm/zh/tdt_sc_8k .
sphinx_fe -argfile tdt_sc_8k/feat.params -samprate 16000 -c arctic20.fileids -di . -do . -ei wav -eo mfc -mswav yes
https://www.cnblogs.com/qiuhong/articles/3671991.html
sphinx_lm_convert -i model.lm -o model.dmp
sphinx_lm_convert -i model.dmp -ifmt dmp -o model.lm -ofmt arpa
http://www.voidcn.com/article/p-tiryhtrm-zk.html
rec_wav.sh
for i in `seq 1 12`; do
fn=`printf arctic_%04d $i`;
read sent; echo $sent;
rec -r 16000 -e signed-integer -b 16 -c 1 $fn.wav 2>/dev/null;
done < arctic20.txt
Raspberry Pi voice control with PocketSphinx
https://my.oschina.net/RagingTyphoon/blog/493072
IBM:
https://github.com/watson-developer-cloud/speech-to-text-nodejs
https://stream-wdc.watsonplatform.net/speech-to-text/api
https://gateway-syd.watsonplatform.net/speech-to-text/api
https://speech-to-text-demo.ng.bluemix.net/
GOOGLE:
https://console.developers.google.com/
http://www.chromium.org/developers/how-tos/api-keys
https://github.com/gillesdemey/google-speech-v2
brew install sox
rec --encoding signed-integer --bits 16 --channels 1 --rate 16000 test.wav
curl -X POST \
--data-binary @'audio/hello (16bit PCM).wav' \
--header 'Content-Type: audio/l16; rate=16000;' \
'https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=yourkey'
curl -X POST \
--data-binary @audio/good-morning-google.flac \
--header 'Content-Type: audio/x-flac; rate=44100;' \
'https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=yourkey'
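
The two curl calls above differ only in the audio payload and its Content-Type header. A hedged Python equivalent using the standard library is sketched below; it only builds the request and does not send it, the function name is invented for this example, and whether the v2 endpoint still accepts requests is not something these notes confirm.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://www.google.com/speech-api/v2/recognize"


def build_recognize_request(audio: bytes, key: str, content_type: str,
                            lang: str = "en-us") -> urllib.request.Request:
    """Build (but do not send) a POST request like the curl calls above."""
    query = urllib.parse.urlencode({"output": "json", "lang": lang, "key": key})
    return urllib.request.Request(
        f"{ENDPOINT}?{query}",
        data=audio,
        headers={"Content-Type": content_type},
        method="POST",
    )


if __name__ == "__main__":
    request = build_recognize_request(
        b"\x00\x00" * 16000,                  # placeholder audio: one second of silence
        key="yourkey",                         # placeholder key, as in the curl examples
        content_type="audio/l16; rate=16000;",
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request).read()
```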
https://github.com/evancohen/sonus
npm install --save sonus
iFLYTEK
https://www.xfyun.cn/
https://www.xfyun.cn/services/voicedictation
http://member.voicecloud.cn/index.php/default/register
https://www.xfyun.cn/solutions/robots
http://www.devstore.cn/evaluation/testInfo/107-127.html
Text to pinyin
https://github.com/janx/ruby-pinyin
https://github.com/sofish/han
WeChat client
https://github.com/trazyn/weweChat # PC only
Open-source IM
https://github.com/hcxiong/xuanxuan # PC only
https://github.com/meili/TeamTalk
https://github.com/YiChat
https://github.com/duckchat/gaga
https://github.com/dianbaer/anychat
https://github.com/zulip
https://github.com/gunthercox/ChatterBot
https://github.com/pandolia/qqbot
https://github.com/huangzk/qqchatbot
https://gitee.com/airgzn/QQChatBot
https://gitee.com/airgzn/xiaofeichatbot
APPLE
https://developer.apple.com/documentation/avfoundation/speech_synthesis
https://github.com/CoderTitan/TextAndVoice
http://ai.youdao.com/
https://openapi.youdao.com/api
https://openapi.youdao.com/ocrtransapi
https://openapi.youdao.com/speechtransapi
https://openapi.youdao.com/ocrapi
https://openapi.youdao.com/ocr_structure
https://openapi.youdao.com/ocr_formula
https://www.cnblogs.com/alchemystar/p/13668470.html
http://ai.youdao.com/DOCSIRMA/html/%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E7%BF%BB%E8%AF%91/API%E6%96%87%E6%A1%A3/%E6%96%87%E6%9C%AC%E7%BF%BB%E8%AF%91%E6%9C%8D%E5%8A%A1/%E6%96%87%E6%9C%AC%E7%BF%BB%E8%AF%91%E6%9C%8D%E5%8A%A1-API%E6%96%87%E6%A1%A3.html
http://ai.youdao.com/DOCSIRMA/html/%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E7%BF%BB%E8%AF%91/API%E6%96%87%E6%A1%A3/%E8%AF%AD%E9%9F%B3%E7%BF%BB%E8%AF%91%E6%9C%8D%E5%8A%A1/%E8%AF%AD%E9%9F%B3%E7%BF%BB%E8%AF%91%E6%9C%8D%E5%8A%A1-API%E6%96%87%E6%A1%A3.html
http://ai.youdao.com/DOCSIRMA/html/%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E7%BF%BB%E8%AF%91/API%E6%96%87%E6%A1%A3/%E5%9B%BE%E7%89%87%E7%BF%BB%E8%AF%91%E6%9C%8D%E5%8A%A1/%E5%9B%BE%E7%89%87%E7%BF%BB%E8%AF%91%E6%9C%8D%E5%8A%A1-API%E6%96%87%E6%A1%A3.html
http://ai.youdao.com/DOCSIRMA/html/%E6%96%87%E5%AD%97%E8%AF%86%E5%88%ABOCR/API%E6%96%87%E6%A1%A3/%E9%80%9A%E7%94%A8OCR%E6%9C%8D%E5%8A%A1/%E9%80%9A%E7%94%A8OCR%E6%9C%8D%E5%8A%A1-API%E6%96%87%E6%A1%A3.html
BAIDU
```
https://github.com/Baidu-AIP/nodejs-sdk
npm install baidu-aip-sdk
http://ai.baidu.com/docs#/
http://yuyin.baidu.com/
https://github.com/ChenHao96/VoiceInteraction
https://github.com/eisneim/cytron_tts_gui
https://github.com/apetab/vbot-voice
http://tsn.baidu.com/text2audio
Converting QQ and WeChat SILK voice messages to WAV
brew install gcc ffmpeg
git clone https://github.com/kn007/silk-v3-decoder.git silk-v3-decoder
cd silk-v3-decoder/silk
make && make decoder
./decoder 123.silk 123.pcm
ffmpeg -y -f s16le -ar 24000 -ac 1 -i 123.pcm -f wav -ar 16000 -b:a 16 -ac 1 123.wav
https://www.jianshu.com/p/b092da81feb0
Speech recognition
len + speech method
http://vop.baidu.com/server_api?format=wav&rate=16000&channel=1&token=&cuid=9e:eb:e8:d4:67:00&len=<size in bytes>&speech=<base64-encoded audio>
url + callback method
http://vop.baidu.com/server_api?format=wav&rate=16000&channel=1&token=&cuid=9e:eb:e8:d4:67:00&url=123.wav&callback=<callback URL>
http://tts.baidu.com/text2audio?lan=zh&ie=UTF-8&spd=2&text=
https://ai.baidu.com/aidemo?type=tns2&idx=1&tex=%s&cuid=baidu_speech_demo&cod=2&lan=zh&ctp=1&pdt=1&spd=5&per=4&vol=5&pit=5
http://tts.baidu.com/text2audio?lan=zh&ie=UTF-8&spd=2&text=<text to convert>
https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id={}&client_secret={}&
```
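
For the len + speech method in the Baidu block above, the audio travels base64-encoded in the speech field. The sketch below assembles those fields, assuming that len is the size in bytes of the audio before base64 encoding and that the fields are sent as a JSON body; both points should be checked against the Baidu documentation linked above. The function name and the TOKEN placeholder are illustrative only.

```python
import base64
import json


def build_asr_payload(audio: bytes, token: str, cuid: str,
                      fmt: str = "wav", rate: int = 16000) -> str:
    """Return a JSON body carrying the fields from the server_api URL above."""
    payload = {
        "format": fmt,
        "rate": rate,
        "channel": 1,
        "token": token,
        "cuid": cuid,
        "len": len(audio),  # assumed: size of the raw audio, not of the base64 text
        "speech": base64.b64encode(audio).decode("ascii"),
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # Placeholder audio bytes and token; the cuid is the one used in the notes.
    print(build_asr_payload(b"\x01\x02\x03", "TOKEN", "9e:eb:e8:d4:67:00"))
```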
https://github.com/nl8590687/ASRT_SpeechRecognition
cp -rf datalist/* dataset/
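The models currently available are 24, 25 and 251
To start training, run:
$ python3 train_mspeech.py
To start testing, run:
$ python3 test_mspeech.py
Before testing, make sure the model file path set in the code exists.
To start the ASRT API server, run:
$ python3 asrserver.py
To train and use model 251, change the corresponding import SpeechModel line in the code.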
dataset/data_thchs30/train/*.wav
dataset/data_thchs30/dev/*.wav
dataset/data_thchs30/test/*.wav
dataset/ST-CMDS-20170001_1-OS/*.wav
https://github.com/nl8590687/ASRT_SpeechRecognition/wiki
https://github.com/apachecn/AiLearning
https://feisky.xyz/machine-learning/
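Natural language processing: Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, keyword extraction, new word discovery, phrase extraction, automatic summarization, text classification, pinyin, Simplified/Traditional conversion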
http://hanlp.com/
https://github.com/hankcs/HanLP
https://github.com/hankcs/pyhanlp
pip3 install pyhanlp
hanlp update
hanlp --help
hanlp segment <<< '欢迎新老师生前来就餐'
hanlp parse <<< '徐先生还具体帮助他确定了把画雄鹰、松鼠和麻雀作为主攻目标。'
https://github.com/fighting41love/cocoNLP
https://github.com/ownthink/Jiagu
https://github.com/PaddlePaddle/LARK/tree/develop/ERNIE
pip3 install paddlepaddle
https://mp.weixin.qq.com/s/nb3g1RV3fk2rm8a_v_ZUEA
https://mp.weixin.qq.com/s/osfV54FRU1vw5c4CZuSR1A
https://github.com/PaddlePaddle/book
http://www.paddlepaddle.org/documentation/docs/zh/1.2/beginners_guide/quick_start/index.html
https://github.com/explosion/spaCy
https://github.com/visipedia/iwildcam_comp
https://github.com/visipedia/inat_comp
https://github.com/macaodha/inat_comp_2018
https://github.com/Microsoft/AirSim
https://www.microsoft.com/en-us/ai/ai-for-earth?activetab=pivot1%3aprimaryr6
http://cocodataset.org/#download
https://hackaday.io/project/159737-spectra-open-biomedical-imaging
AI development platforms
https://github.com/ifeegoo/Prometheus
https://github.com/huanghe/ai
https://mp.weixin.qq.com/s/-y_01EBYVxiCwLddCvyFfg
https://github.com/intel-analytics/analytics-zoo
https://analytics-zoo.github.io/0.4.0/
Ethereum smart contract + DApp workflow in practice: a lottery application
https://github.com/wangshijun/ethereum-lottery-dapp
https://infura.io/project/08ed39a60be74cd78974ecfed000fe6f
https://infura.io/docs/gettingStarted/authentication
npm install wscat -g
wscat -c wss://mainnet.infura.io/ws/v3/08ed39a60be74cd78974ecfed000ff
> {"jsonrpc": "2.0", "id": 1, "method": "eth_blockNumber", "params": []}
npm install -g solc truffle ganache-cli
https://github.com/trufflesuite/ganache-cli
https://truffleframework.com/ganache
https://github.com/wangzukun/truffle4-demo
truffle init
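
The eth_blockNumber call typed into wscat above is an ordinary JSON-RPC 2.0 message, and the reply carries the block number as a 0x-prefixed hexadecimal string. A small sketch of building the request and decoding the reply in Python follows; the function names and the sample response are invented for illustration and are not real chain data.

```python
import json


def block_number_request(request_id: int = 1) -> str:
    """Serialize the eth_blockNumber call shown above."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id,
                       "method": "eth_blockNumber", "params": []})


def parse_block_number(response_text: str) -> int:
    """Extract the block number from a JSON-RPC response.

    Ethereum JSON-RPC encodes quantities as 0x-prefixed hex strings.
    """
    response = json.loads(response_text)
    if "error" in response:
        raise RuntimeError(response["error"])
    return int(response["result"], 16)


if __name__ == "__main__":
    print(block_number_request())
    # Made-up response for illustration only.
    print(parse_block_number('{"jsonrpc": "2.0", "id": 1, "result": "0x4b7"}'))
```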
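Recommendation
https://mp.weixin.qq.com/s/E6EH6aJjzTwN2UZf_4nwoA
Synonyms: a Chinese synonym toolkit that can be used for many natural language understanding tasks, such as text alignment, recommendation algorithms, similarity calculation, semantic shift, keyword extraction, concept extraction, automatic summarization and search engines
https://github.com/huyingxi/Synonyms
Language/knowledge representation tools
https://github.com/PaddlePaddle/LARK
Sentence and QA similarity matching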
https://github.com/NTMC-Community/MatchZoo
https://polyglot.readthedocs.io/en/latest/Installation.html
https://github.com/aboSamoor/polyglot
brew install polyglot
pip3 install polyglot
pyltp
https://github.com/HIT-SCIR/pyltp
https://mp.weixin.qq.com/s/gLzdYZVoegjAPmnMAUq19g
pip3 install pyltp
https://pyltp.readthedocs.io/zh_CN/develop/api.html
Backpropagation: https://www.cnblogs.com/charlotte77/p/5629865.html
How CNNs work: http://www.cnblogs.com/charlotte77/p/7759802.html
How RNNs work: https://blog.csdn.net/qq_39422642/article/details/78676567
A good, accessible article on LSTMs: https://blog.csdn.net/roslei/article/details/61912618
Speech translation: face-to-face translation mini program
https://github.com/Tencent/Face2FaceTranslator
Chinese-English translation
https://github.com/xuwenshen/Machine-Translation
https://github.com/liuhuanyong/ChineseTextualInference
https://github.com/quincyliang/nlp-public-dataset
Chinese corpora
https://github.com/yanwii/machine-translation
https://github.com/brightmart/nlp_chinese_corpus
https://github.com/FeeiCN/dict
http://www.iciba.com/
http://www.iciba.com/hello
https://github.com/justinyhuang/BashCiba
https://github.com/Neoyyy/google-CommandLine-Translation-Tool
https://translate.google.cn/translate_a/single?hl=zh-CN&sl=zh-CN&tl=en&q=%E4%B8%AD%E5%9B%BD&client=tw-ob
http://www.baidu.com/
http://fanyi.baidu.com/basetrans
https://github.com/AnuoF/TranslateTool
http://fanyi.youdao.com/openapi?path=data-mode
http://fanyi.youdao.com/openapi.do?keyfrom=wufeifei&key=716426270&type=data&doctype=json&version=1.1&q=测试
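
The translate.google.cn URL listed above carries the query 中国 percent-encoded as UTF-8 (%E4%B8%AD%E5%9B%BD). A short sketch of producing such a query string with Python's urllib follows; the parameter names are copied from that URL, while the function name is invented and nothing here implies the endpoint is still reachable.

```python
from urllib.parse import urlencode

BASE_URL = "https://translate.google.cn/translate_a/single"


def build_translate_url(text: str, source: str = "zh-CN", target: str = "en") -> str:
    """Percent-encode the query the same way as the URL listed above."""
    params = {"hl": "zh-CN", "sl": source, "tl": target, "q": text, "client": "tw-ob"}
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_translate_url("中国"))
```

The same urlencode call works for the other %s and q= placeholders in these notes, which avoids hand-escaping Chinese text.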
Code security auditing
https://github.com/WhaleShark-Team/cobra
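XLNet: a new NLP pre-training method
Autoregressive language model (autoregressive LM); XLNet is a generalized, permutation-based variant
https://github.com/zihangdai/xlnet
BERT Chinese classification in practice / ELMo
BERT is an autoencoding language model (autoencoder LM); ELMo is autoregressive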
https://github.com/NLPScott/bert-Chinese-classification-task
https://github.com/yuanxiaosc/BERT_Paper_Chinese_Translation
https://github.com/terrifyzhao/bert-utils
http://icrc.hitsz.edu.cn/info/1037/1162.htm
https://github.com/NVIDIA/Megatron-LM
https://github.com/ymcui/Chinese-BERT-wwm
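Taking the TensorFlow version as an example, extracting the downloaded zip file yields:
chinese_wwm_L-12_H-768_A-12.zip
|- bert_model.ckpt # model weights
|- bert_model.meta # model meta information
|- bert_model.index # model index information
|- bert_config.json # model configuration
|- vocab.txt # vocabulary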
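
A quick way to confirm that the archive was extracted completely is to check for the five entries listed above. The sketch below does that with pathlib; it matches the names exactly as written in the listing, which is an assumption, since the checkpoint files on disk may carry longer names. The function name is made up for this example.

```python
import tempfile
from pathlib import Path

# File names exactly as they appear in the listing above.
EXPECTED_FILES = [
    "bert_model.ckpt",
    "bert_model.meta",
    "bert_model.index",
    "bert_config.json",
    "vocab.txt",
]


def missing_model_files(model_dir: str) -> list:
    """Return the expected entries that are absent from model_dir."""
    directory = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).exists()]


if __name__ == "__main__":
    # Demonstrate on a temporary directory that contains only vocab.txt.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "vocab.txt").write_text("[PAD]\n", encoding="utf-8")
        print(missing_model_files(tmp))
```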
https://github.com/ymcui/cmrc2018
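CMRC 2018 is a Chinese machine reading comprehension dataset released by the Joint Laboratory of HIT and iFLYTEK Research. Given a question, the system must extract a span from the passage as the answer; the format is the same as SQuAD.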
https://github.com/DRCKnowledgeTeam/DRCD
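The DRCD dataset was released by Delta Research Center in Taiwan, China. It has the same format as SQuAD and is an extractive reading comprehension dataset in Traditional Chinese.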
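
Both datasets above follow the SQuAD layout, in which each answer is stored as its text plus the character offset (answer_start) where it begins in the passage. The following sketch shows how an extractive answer is recovered from that offset. The record is made up for illustration and is not taken from CMRC 2018 or DRCD, and the exact field names in those releases should be checked against their repositories.

```python
def answer_span(context: str, answer: dict) -> str:
    """Slice the answer out of the passage using its character offset."""
    start = answer["answer_start"]
    return context[start:start + len(answer["text"])]


# A made-up record in SQuAD-style layout.
RECORD = {
    "context": "HanLP is an NLP toolkit. jieba is a word segmenter.",
    "qas": [{
        "id": "example-1",
        "question": "What is jieba?",
        "answers": [{"text": "a word segmenter", "answer_start": 34}],
    }],
}

if __name__ == "__main__":
    for qa in RECORD["qas"]:
        for answer in qa["answers"]:
            # The slice should reproduce the stored answer text exactly.
            print(qa["question"], "->", answer_span(RECORD["context"], answer))
```

Checking that the slice equals the stored text is a cheap validation step when converting between Simplified and Traditional Chinese versions, where offsets can silently drift.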
https://github.com/shiyybua/NER
For the Chinese named entity recognition (NER) task, we used the classic People's Daily data and the NER data released by Microsoft Research Asia.
THUCNews
http://thuctc.thunlp.org/
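A news dataset released by the Tsinghua University NLP lab; each news item must be classified into one of 10 categories.
Face recognition / face synthesis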
https://deepfakes.com.cn/
https://deepfakes.com.cn/index.php/95.html
https://deepfakes.com.cn/index.php/265.html
https://deepfakes.com.cn/index.php/243.html
https://github.com/deepfakes/faceswap
https://github.com/deepfakes/faceswap/blob/master/INSTALL.md
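From your installation folder, run python faceswap.py extract. This takes the photos in the src folder and extracts the faces into the extract folder.
From your installation folder, run python faceswap.py train. This takes photos from two folders, each containing pictures of one of the two faces, and trains a model that is saved in the models folder.
From your installation folder, run python faceswap.py convert. This takes the photos in the original folder and applies the new face, writing the results to the modified folder.
You can start the GUI by running python faceswap.py gui
Face swapping / head swapping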
https://github.com/iperov/DeepFaceLab
https://radek350.wordpress.com/2018/02/17/myfakeapp-fakeapp-alternative/
https://github.com/sunattic/AISuperstar
https://github.com/joshua-wu/deepfakes_faceswap
https://github.com/llSourcell/deepfakes
https://github.com/dfaker/df
https://github.com/gsurma/face_generator
CTR
https://github.com/shenweichen/DeepCTR
Pre-trained word vectors for more than 30 languages
https://github.com/hcxiong/wordvectors
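Extraction of names, addresses, email addresses, mobile numbers, mobile number locations and similar information; RAKE phrase extraction algorithm.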
pip3 install cocoNLP
Tsinghua University XLORE: a Chinese-English cross-lingual encyclopedic knowledge graph
https://xlore.org/ttl/xlore.all.zip
Smart home
https://github.com/apanly/piRobot
https://github.com/apanly/autohome
https://github.com/2shou/TextGrocery.git # short-text classification tool
BeautifulSoup (HTML/XML parser)
http://www.pm25.in/api_doc
https://www.faceplusplus.com.cn/
pip3 install jieba tgrocery
Remote-controlled toy car
https://github.com/pjq/rpi
sudo apt-get install libttspico-utils
https://github.com/GwadaLUG/pico-read-speaker
- libttspico-data (https://openrepos.net/content/mickaelh/libttspico-data)
- libttspico0 (https://openrepos.net/content/mickaelh/libttspico0)
- libttspico-utils (https://openrepos.net/content/mickaelh/libttspico-utils)
- libttspico-dev (https://openrepos.net/content/mickaelh/libttspico-dev)
or
- sudo apt-get install libttspico0 libttspico-utils libttspico-data
wget https://raw.githubusercontent.com/stevenmirabito/asterisk-picotts/master/picotts-install.sh -O - | sh
- svox (pico2wave) https://packages.debian.org/source/squeeze/svox
https://github.com/mscdex/speaky
https://github.com/grigi/talkey
https://pbxinaflash.com/community/threads/svox-pico-tts-for-asterisk.17859/
pico2wave -l fr-FR -w /tmp/test.wav "Ceci est un test"
aplay /tmp/test.wav
https://github.com/zaf/asterisk-googletts
brew install sox mpg123 pulseaudio espeak
soxi sox play
play existing-file.wav
sox existing-file.wav -d
https://www.google.com.hk/speech-api/v1/recognize?xjerr=1&client=chromium&pfilter=2&lang=zh-CN&maxresults=6
https://github.com/apanly/piRobot/blob/master/stt/google.py
cd /
wget http://incrediblepbx.com/picotts.tar.gz
tar zxvf picotts.tar.gz
cd /root
./picotts-install.sh
sed -i 's|en)|en-US)|' /etc/asterisk/extensions_custom.conf
sed -i 's|googletts|picotts|' /etc/asterisk/extensions_custom.conf
asterisk -rx "dialplan reload"
espeak --stdout "this is a test" | paplay
echo "these are my notes" > text.txt
espeak --stdout -f text.txt > text.wav
paplay text.wav # you should hear "these are my notes"
play text.wav
Isolated-word speech recognition on STM32
https://github.com/gk969/stm32-speech-recognition
http://gk969.com/stm32-speech-recognition/
Tuling (Turing Robot)
Chat
http://www.tuling123.com/openapi/api
Recognizing text in images (OCR)
pip3 install baidu-aip
https://github.com/shuoGG1239/Image2Text
https://github.com/lancezhange/smoke_recognition smoke detection in images
python3 smokeDetection.py
pip3 install pytesseract
https://www.cnblogs.com/wzben/p/5930538.html
brew install --with-training-tools --all-languages tesseract
https://github.com/tesseract-ocr/tessdata
https://github.com/tesseract-ocr/tessdata/tree/3.04.00
https://github.com/tesseract-ocr/tessdata_fast/
Show the version and the installed languages:
tesseract -v
tesseract --list-langs
Syntax: tesseract <image name> <output file base name> -l <language data>
tesseract test.jpg result -l chi_sim
tesseract -l chi_sim+eng
tesseract 1234.png 1234 -l chi_sim -psm 6
tesseract --help-psm
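0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD and no OCR (optical character recognition).
3 Fully automatic page segmentation, but no OSD (default).
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.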
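
When tesseract is driven from a script, the page segmentation mode is just one more argument. The sketch below builds the argument list for the invocation used in these notes. The function name is invented, and the legacy_flag switch reflects the fact that Tesseract 3.x spells the option -psm (as written above) while 4.x and later spell it --psm.

```python
def tesseract_command(image: str, output_base: str, lang: str,
                      psm: int, legacy_flag: bool = True) -> list:
    """Build the argument list for a tesseract call.

    legacy_flag=True emits -psm (Tesseract 3.x); False emits --psm (4.x+).
    """
    psm_flag = "-psm" if legacy_flag else "--psm"
    return ["tesseract", image, output_base, "-l", lang, psm_flag, str(psm)]


if __name__ == "__main__":
    command = tesseract_command("1234.png", "1234", "chi_sim", 6)
    print(" ".join(command))
    # To run it: import subprocess; subprocess.run(command, check=True)
```

Passing the arguments as a list rather than one shell string avoids quoting problems with image names that contain spaces.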
Convert the image to TIFF format so the box file can be generated from it later. One way is to open it in Paint and save it as TIFF.
[lang].[fontname].exp[num].tif
Generate the box file
tesseract mjorcen.normal.exp0.jpg mjorcen.normal.exp0 -l chi_sim batch.nochop makebox
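The box file and its matching TIFF must be in the same directory; otherwise the box file cannot be opened later.
Open jTessBoxEditor to correct the errors and train (open train.bat)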
tesseract mjorcen.normal.exp0.jpg mjorcen.normal.exp0 nobatch box.train
unicharset_extractor mjorcen.normal.exp0.box
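Create a font_properties file
Write "normal 0 0 0 0 0" into it, meaning the default regular font (the five flags are italic, bold, fixed, serif and fraktur)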
shapeclustering -F font_properties -U unicharset mjorcen.normal.exp0.tr
mftraining -F font_properties -U unicharset -O unicharset mjorcen.normal.exp0.tr
cntraining mjorcen.normal.exp0.tr
Five files are produced at the end: unicharset, inttemp, pffmtable, shapetable and normproto. Prefix each of them with "normal."
combine_tessdata normal.
This yields the trained language data.
Copy normal.traineddata into the tessdata folder under the Tesseract-OCR installation directory
tesseract mjorcen.normal.exp0.jpg mjorcen.normal.exp0 -l normal
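Merging samples (combining several images)
Open jTessBoxEditor, choose Tools -> Merge TIFF... from the menu bar, select the images to merge and save the result as huiyi.tif.
Generate the box file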
tesseract huiyi.tif huiyi -l chi_sim -psm 10 batch.nochop makebox
This produces a box file named huiyi.box.
Open it in a text editor or Xcode, edit it and save.
Generate the .tr file
tesseract huiyi.tif huiyi -psm 10 nobatch box.train
Generate the unicharset file
unicharset_extractor huiyi.box
jTessBoxEditor
https://sourceforge.net/projects/vietocr/files/jTessBoxEditor/
java -Xms4096m -Xmx4096m -jar jTessBoxEditor.jar
Converting images to TIFF
pip3 install tifffile
python3 /usr/local/lib/python3.7/site-packages/tifffile/tifffile.py --help
vi ~/.bash_profile
alias tifffile='python3 /usr/local/lib/python3.7/site-packages/tifffile/tifffile.py'
source ~/.bash_profile
tifffile --help
go get -u github.com/brunsgaard/img2tiff
cd $GOPATH/src/github.com/brunsgaard/img2tiff
https://blog.csdn.net/qq_25806863/article/details/67637567
vi process-tessdata.sh
#!/bin/bash
# Train Tesseract data from <lang>.<font>.exp0.tif and its .box file.
# bash is required because read -p is not available in plain POSIX sh.
set -e
read -r -p "Enter the language: " lang
echo "${lang}"
read -r -p "Enter the font: " font
echo "${font}"
echo "The full file name is therefore:"
echo "${lang}.${font}.exp0.tif"
echo "Starting..."
echo "${font} 0 0 0 0 0" > font_properties
tesseract "${lang}.${font}.exp0.tif" "${lang}.${font}.exp0" nobatch box.train
unicharset_extractor "${lang}.${font}.exp0.box"
shapeclustering -F font_properties -U unicharset "${lang}.${font}.exp0.tr"
mftraining -F font_properties -U unicharset -O unicharset "${lang}.${font}.exp0.tr"
cntraining "${lang}.${font}.exp0.tr"
echo "Renaming the output files"
mv inttemp "${font}.inttemp"
mv normproto "${font}.normproto"
mv pffmtable "${font}.pffmtable"
mv shapetable "${font}.shapetable"
mv unicharset "${font}.unicharset"
echo "Building the final file"
combine_tessdata "${font}."
echo "Done"
License plate recognition
https://github.com/zeusees/HyperLPR
pip install hyperlpr
CNN-based OCR license plate recognition
https://github.com/huxiaoman7/mxnet-cnn-plate-recognition
An entry-level project for face, video and text detection and recognition.
https://github.com/vipstone/faceai
https://github.com/bairdzhang/smallhardface
pip3 install dlib
训练模型用于是人脸识别的关键,用于查找图片的关键点。
wget http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
You can also train your own facial landmark model; that is covered later.
I stored the downloaded model file at: C:\Python36\Lib\site-packages\dlib-data\shape_predictor_68_face_landmarks.dat.bz2
Extract shape_predictor_68_face_landmarks.dat.bz2 to get shape_predictor_68_face_landmarks.dat
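If no bzip2 tool is at hand (for example on Windows), the archive can be unpacked with Python's standard library. This is a minimal sketch; `decompress_bz2` is a helper name made up here, and the path in the usage note is a placeholder.

```python
import bz2
import shutil
from pathlib import Path


def decompress_bz2(src: Path) -> Path:
    """Decompress name.ext.bz2 next to itself as name.ext and return the new path."""
    if src.suffix != ".bz2":
        raise ValueError(f"expected a .bz2 file, got {src.name}")
    dst = src.with_suffix("")  # drops only the trailing .bz2
    # Stream the data so a large model file is never held fully in memory.
    with bz2.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Usage: `decompress_bz2(Path("shape_predictor_68_face_landmarks.dat.bz2"))` writes the .dat file into the same directory.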
https://github.com/hcxiong/faceai/blob/master/doc/detectionDlib.md
https://github.com/hcxiong/faceai/blob/master/doc/videoOpenCV.md
https://github.com/hcxiong/faceai/blob/master/doc/videoDlib.md
https://github.com/hcxiong/faceai/blob/master/doc/faceRecognitionOutline.md
Face detection
https://github.com/610265158/DSFD-tensorflow
https://github.com/kpzhang93/MTCNN_face_detection_alignment
https://github.com/ydwen/caffe-face
https://github.com/deepinsight/insightface
https://github.com/deepinsight/insightface/wiki/Model-Zoo
MS1M-Arcface
https://pan.baidu.com/s/1S6LJZGdqcZRle1vlcMzHOQ
https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB
https://github.com/ZhaoJ9014/face.evoLVe.PyTorch
https://github.com/Cadene/pretrained-models.pytorch
PASCAL Visual Object Classes Challenge 2012 (VOC2012)
http://host.robots.ox.ac.uk:8080/pascal/VOC/voc2012/index.html
https://github.com/fighting41love/funNLP
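Chinese/English sensitive-word filtering https://github.com/observerss/textfilter
Language detection for 97 languages https://github.com/saffsd/langid.py
Another language detector https://code.google.com/archive/p/language-detection/
Chinese mobile number location lookup https://github.com/ls0f/phone
phone: international mobile and landline number location lookup https://github.com/AfterShip/phone
Guess gender from a name: https://github.com/observerss/ngender
Personal name corpus https://github.com/wainshine/Chinese-Names-Corpus
Chinese abbreviation dataset https://github.com/zhangyics/Chinese-abbreviation-dataset/blob/master/dev_set.txt
Chinese character decomposition dictionary https://github.com/kfcd/chaizi
Word sentiment values https://github.com/rainarch/SentiBridge/blob/master/Entity_Emotion_Express/CCF_data/pair_mine_result
Chinese word lists, stop words, sensitive words https://github.com/dongxiexidian/Chinese
Chinese characters to pinyin https://github.com/mozillazg/python-pinyin
Traditional/Simplified Chinese conversion https://github.com/skydark/nstools/tree/master/zhtools
Engine that mimics Chinese pronunciation using English words (funny Chinese text-to-speech engine) https://github.com/tinyfool/ChineseWithEnglish
Synonym, antonym, and negation word lists https://github.com/phunterlau/wangfeng-rnn
Split English text without spaces into words https://github.com/keredson/wordninja
jieba Chinese word segmentation https://github.com/fxsjy/jieba
Baidu Chinese lexical analysis system (word segmentation + POS tagging + named entities) https://github.com/baidu/lac
https://github.com/baidu/AnyQ Baidu FAQ question-answering system
https://github.com/baidu/Senta Baidu sentiment analysis system
Scattertext text visualization: https://github.com/JasonKessler/scattertext
Chinese character data: https://github.com/skishore/makemeahanzi
Chinese text recognition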
https://github.com/breezedeus/cnocr
python3 scripts/cnocr_predict.py --file multi-line_cn1.png
https://github.com/diaomin/crnn-mxnet-chinese-text-recognition
Corpora
https://github.com/codemayq/chaotbot_corpus_Chinese
https://github.com/gunthercox/chatterbot-corpus
https://github.com/MarkWuNLP/MultiTurnResponseSelection
https://github.com/wb14123/couplet-dataset
Automatic classical Chinese poetry generator
https://github.com/jinfagang/tensorflow_poems
python3 train.py
python3 compose_poem.py
python3 main.py -w poem --no-train
Context-aware chatbot based on vector matching
https://github.com/zake7749/Chatbot
https://github.com/zake7749/PTT-Chat-Generator
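For topic modeling, document indexing, and similarity retrieval over large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) communities.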
https://github.com/RaRe-Technologies/gensim
Chinese Q&A corpus from the PTT Gossiping board
https://github.com/zake7749/Gossiping-Chinese-Corpus
Chinese text processing
https://github.com/isnowfy/snownlp
Text similarity
https://github.com/seatgeek/fuzzywuzzy
https://github.com/sloria/TextBlob
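For a quick check without installing anything, the standard library's difflib gives a comparable character-level score. This is a stand-in sketch, not the fuzzywuzzy API; scores from that library may differ, and `similarity` is a name made up here.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score based on matching character blocks."""
    # ratio() is 2 * matches / (len(a) + len(b)), a float in [0, 1].
    return round(100 * SequenceMatcher(None, a, b).ratio())


if __name__ == "__main__":
    print(similarity("tesseract", "teseract"))
    print(similarity("人工智能", "人工智慧"))
```

It works on any Unicode string, so Chinese text can be compared character by character without segmentation.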
ocr
http://apis.baidu.com/apistore/idlocr/ocr
https://github.com/deloz/baiduocr
https://github.com/tesseract-ocr/tesseract
brew install --with-training-tools --all-languages tesseract
brew install imagemagick
tesseract imagename outputbase [-l lang] [--oem ocrenginemode] [--psm pagesegmode] [configfiles...]
tesseract -l chi_sim data/test_data.png out_test_data
chi_sim.traineddata
eng.traineddata
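The synopsis above can be wrapped in a small helper when calling tesseract from a script. This sketch only builds the argument list in the order the synopsis gives; `tesseract_command` is a made-up name. Note that the 3.x commands elsewhere in these notes spell the option `-psm`, while the synopsis uses `--psm`.

```python
from typing import List, Optional, Sequence


def tesseract_command(
    image: str,
    output_base: str,
    lang: Optional[str] = None,
    oem: Optional[int] = None,
    psm: Optional[int] = None,
    configs: Sequence[str] = (),
) -> List[str]:
    """Build an argv list following: tesseract imagename outputbase [options] [configfiles]."""
    cmd = ["tesseract", image, output_base]
    if lang:
        cmd += ["-l", lang]
    if oem is not None:
        cmd += ["--oem", str(oem)]
    if psm is not None:
        cmd += ["--psm", str(psm)]
    cmd += list(configs)
    return cmd


if __name__ == "__main__":
    print(" ".join(tesseract_command("data/test_data.png", "out_test_data", lang="chi_sim")))
```

Passing the list to `subprocess.run(cmd, check=True)` runs it without shell quoting problems, provided tesseract is on the PATH.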
https://github.com/naptha/tesseract.js
http://tesseract.projectnaptha.com/
https://github.com/madmaze/pytesseract
https://github.com/thiagoalessio/tesseract-ocr-for-php
https://github.com/otiai10/gosseract
https://github.com/Greedysky/TTKOCR
https://github.com/Aixtuz/CardScanner
https://github.com/iChenwin/pytesseractID
https://github.com/csxiaoyaojianxian/BloodTestReportOCR
https://github.com/bigchao8/Opencv-ImageBase
Based on Caffe
https://github.com/JinpengLI/deep_ocr
python reco_chars.py
CTPN + CRNN + CTC for OCR of variable-length scene text
https://github.com/xiaofengShi/CHINESE-OCR
Environment setup
## GPU environment
sh setup.sh
## CPU environment
sh setup-cpu.sh
## CPU python3 environment
sh setup-python3.sh
Environment used: python3.6 + tensorflow1.7 + cpu/gpu
https://github.com/jimmyleaf/ocr_tensorflow_cnn
Installation
http://caffe.berkeleyvision.org/install_osx.html
brew tap homebrew/science
brew install hdf5 opencv
Text recognition
http://www.robots.ox.ac.uk/~vgg/data/text/
https://yq.aliyun.com/articles/109555?t=t1
https://github.com/YCG09/chinese_ocr
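sh setup.sh # environment setup
python demo.py # demo: put test images in the test_images directory; detection results are saved to test_result
Training dataset: https://pan.baidu.com/s/1QkI7kjah8SPHwOQ40rS1Pw (password: lu7m). After extracting the images, place them in train/images and put the annotation files in train.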
cd train
python train.py
https://github.com/JarveeLee/SynthText_Chinese_version
https://github.com/Belval/TextRecognitionDataGenerator
https://github.com/Sanster/text_renderer
OCR text localization and recognition implemented in Keras
https://github.com/xiaomaxiao/keras_ocr
https://github.com/eragonruan/text-detection-ctpn CTPN text region detection
https://github.com/eragonruan/text-detection-ctpn/releases
python ./ctpn/demo_pb.py
cd lib/utils
chmod +x make.sh
./make.sh
prepare data
cd lib/prepare_training_data
python split_label.py
it will generate the prepared data in current folder, and then run
python ToVoc.py
python ./ctpn/train_net.py
An experimental project for studying mainstream OCR algorithms; it currently implements a CNN + BLSTM + CTC architecture.
https://github.com/senlinuc/caffe_ocr
https://github.com/isee15/Card-Ocr
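A Caffe-based LSTM OCR example that can be used for sequence recognition, including long, variable-length content such as CAPTCHAs, license plates, ID card numbers, and addresses.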
https://github.com/dlunion/CaffeLSTM-OCR
Object recognition
https://github.com/open-mmlab/mmdetection
https://github.com/HRNet/HRNet-Object-Detection
CTC can be used to train end-to-end speech recognition systems.
https://github.com/baidu-research/warp-ctc
git clone https://github.com/baidu-research/warp-ctc.git
cd warp-ctc
mkdir build
cd build
cmake ../
make
http://ilovin.me/2017-04-06/tensorflow-lstm-ctc-ocr/
https://github.com/ilovin/lstm_ctc_ocr
https://github.com/bgshih/crnn
https://github.com/meijieru/crnn.pytorch
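What the CTC-based projects above have in common is the output rule: the network emits one label per frame, consecutive repeats are merged, and blanks are dropped. The sketch below shows only that collapse rule with greedy (best-path) decoding. It is not the warp-ctc API, which computes the training loss, and the blank index 0 is an assumption.

```python
from itertools import groupby
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label


def ctc_collapse(path: Sequence[int], blank: int = BLANK) -> List[int]:
    """Merge consecutive repeated labels, then remove blanks."""
    merged = [label for label, _ in groupby(path)]
    return [label for label in merged if label != blank]


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], blank: int = BLANK) -> List[int]:
    """Take the highest-scoring label in each frame, then apply the collapse rule."""
    best_path = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    return ctc_collapse(best_path, blank)


if __name__ == "__main__":
    # A blank between two identical labels keeps them as separate outputs.
    print(ctc_collapse([1, 1, 0, 1, 2, 2, 0]))
```

The blank is what lets a sequence contain the same character twice in a row, which matters for plate numbers and ID card numbers.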
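Tencent YouTu OCR cloud platform: recognizes ID cards, bank cards, vehicle licenses, and driver's licenses; the dependency package is small and recognition calls are free.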
https://api.youtu.qq.com/youtu/ocrapi/
https://open.youtu.qq.com/#/open
https://github.com/Tencent-YouTu/nodejs_sdk
https://github.com/Tencent-YouTu/Python_sdk
https://github.com/Tencent-YouTu/Go_sdk
https://github.com/TencentYouTu/ios_sdk
https://github.com/TencentYouTu/android_sdk
Xception-based recognition of Tencent CAPTCHAs (samples + code)
https://github.com/bojone/n2n-ocr-for-qqcaptcha
100,000 CAPTCHA samples are publicly available here:
Link: https://pan.baidu.com/s/1mhO1sG4 Password: j2rj
https://github.com/keras-team/keras
Purchase-sniping scripts for Baidu Laici Dog (莱茨狗, pet-chain)
https://github.com/Acamy/pet-chain-buyer
https://github.com/yanwii/pet-chain
https://pet-chain.baidu.com/
Image CAPTCHA recognition
https://www.showapi.com/api/view/184
https://github.com/Yaoshicn/decaptcha
https://github.com/dingyaguang117/ImageRecognizeOf58
https://github.com/CrazyHusen/IdentificationCodes
Baidu second-generation ID card recognition
https://github.com/DophinL/baidu-ocr-idcard
https://github.com/Freeza91/baidu_ocr
Baidu OCR text recognition API for Ruby (gem)
https://rubygems.org/gems/baidu_ocr
https://aip.baidubce.com/rest/2.0/ocr/v1/general
https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic
https://aip.baidubce.com/rest/2.0/ocr/v1/general_enhanced
https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic
https://aip.baidubce.com/rest/2.0/ocr/v1/accurate
https://aip.baidubce.com/rest/2.0/ocr/v1/bankcard
https://aip.baidubce.com/rest/2.0/ocr/v1/idcard
https://aip.baidubce.com/rest/2.0/ocr/v1/webimage
https://aip.baidubce.com/rest/2.0/ocr/v1/driving_license
https://aip.baidubce.com/rest/2.0/ocr/v1/vehicle_license
https://aip.baidubce.com/rest/2.0/ocr/v1/license_plate
https://aip.baidubce.com/rest/2.0/ocr/v1/business_license
https://aip.baidubce.com/rest/2.0/ocr/v1/receipt
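The endpoints above share one request shape. The sketch below builds such a request without sending it, on the assumption that the API expects a POST with an `access_token` query parameter and a form-encoded body whose `image` field holds the base64 image. Check the current Baidu documentation before relying on this. The token and image bytes are placeholders.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

OCR_BASE = "https://aip.baidubce.com/rest/2.0/ocr/v1/"


def build_ocr_request(endpoint: str, image_bytes: bytes, access_token: str) -> Request:
    """Build (but do not send) a POST request for one of the OCR endpoints."""
    url = f"{OCR_BASE}{endpoint}?{urlencode({'access_token': access_token})}"
    # Assumed body format: image=<base64>, URL-encoded as a form field.
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    body = urlencode({"image": image_b64}).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_ocr_request("general_basic", b"placeholder image bytes", "YOUR_ACCESS_TOKEN")
    print(req.get_method(), req.full_url)
```

Sending it is then a matter of `urllib.request.urlopen(req)` and parsing the JSON response. Switching endpoints (idcard, bankcard, license_plate, ...) changes only the first argument, though some of them may need extra form fields.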
https://github.com/UEdge/OCRCard
https://github.com/chasecs/react-native-baidu-ocr
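Go SDK for calling APIs including speech recognition, speech synthesis, Chinese word segmentation, Chinese word embeddings, short-text similarity, Chinese DNN language model, review opinion extraction, part-of-speech tagging, face recognition, N:N face comparison, ID card recognition, pornographic image detection, image search, and more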
https://github.com/ghostwwl/baiduai
aliyun
https://market.aliyun.com/aliyunocrnew
https://data.aliyun.com/product/ocr
Image recognition
http://image.baidu.com/pictureup/uploadshitu?fr=flash&fm=index&pos=upload
curl -i -F '[email protected]' 'http://image.baidu.com/pictureup/uploadshitu?pos=upload&uptype=upload_pc&fm=index' -L