{"id":6969,"date":"2024-10-09T10:55:09","date_gmt":"2024-10-09T02:55:09","guid":{"rendered":"https:\/\/www.aisharenet.com\/?p=6969"},"modified":"2024-10-09T11:13:52","modified_gmt":"2024-10-09T03:13:52","slug":"voicecraft","status":"publish","type":"post","link":"https:\/\/www.kdjingpai.com\/ja\/voicecraft\/","title":{"rendered":"VoiceCraft: An Open-Source Zero-Shot Voice Cloning and Text-to-Speech Tool"},"content":{"rendered":"<p>VoiceCraft is an open-source tool for speech editing and zero-shot speech synthesis, built on a neural codec language model. It uses a novel codec-sequence generation method that can insert, delete, and replace segments within an existing speech sequence while producing natural, coherent edited speech. VoiceCraft also supports zero-shot speech synthesis, with no additional fine-tuning for a specific speaker. It is reported to outperform prior state-of-the-art (SOTA) models on several speech processing tasks.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-6974\" title=\"VoiceCraft: An Open-Source Zero-Shot Voice Cloning and Text-to-Speech Tool-1\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/10\/45e2e9bd9ff11d2.png\" alt=\"VoiceCraft: An Open-Source Zero-Shot Voice Cloning and Text-to-Speech Tool-1\" width=\"1920\" height=\"870\" srcset=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/10\/45e2e9bd9ff11d2.png 1920w, 
https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/10\/45e2e9bd9ff11d2-300x136.png 300w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/10\/45e2e9bd9ff11d2-1024x464.png 1024w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/10\/45e2e9bd9ff11d2-768x348.png 768w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/10\/45e2e9bd9ff11d2-1536x696.png 1536w\" sizes=\"auto, (max-width: 1920px) 100vw, 1920px\" \/><\/p>\n<p>&nbsp;<\/p>\n<h2>Features<\/h2>\n<ul>\n<li>Speech editing: supports insertion, deletion, and replacement, producing natural, fluent edited speech.<\/li>\n<li>Zero-shot speech synthesis: generates speech in a target speaker's voice without additional fine-tuning.<\/li>\n<li>Transformer-based architecture: uses causal masking and delayed stacking to improve generation quality.<\/li>\n<li>Open-source models: free to download and use from Hugging Face and AI\u5feb\u7ad9.<\/li>\n<li>Interactive UI: integrates the Gradio library so users can control and test the model directly.<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h2>Usage Guide<\/h2>\n<h3>Installation<\/h3>\n<ol>\n<li>Clone the project repository to a local directory:\n<pre><code class=\"language-bash\">git <span class=\"hljs-built_in\">clone<\/span> git@github.com:jasonppy\/VoiceCraft.git\r\n<span class=\"hljs-built_in\">cd<\/span> VoiceCraft\r\n<\/code><\/pre>\n<\/li>\n<li>Make sure Docker and the NVIDIA Container Toolkit are installed on your system (on Windows, the drivers already include this support):\n<pre><code class=\"language-bash\">sudo apt-get install -y 
nvidia-container-toolkit-base\r\n<\/code><\/pre>\n<\/li>\n<li>Build the Docker image:\n<pre><code class=\"language-bash\">docker build --tag <span class=\"hljs-string\">\"voicecraft\"<\/span> .\r\n<\/code><\/pre>\n<\/li>\n<li>Start the existing container, or create a new one, passing in all GPUs:\n<pre><code class=\"language-bash\">.\/start-jupyter.sh  <span class=\"hljs-comment\"># Linux<\/span>\r\nstart-jupyter.bat   <span class=\"hljs-comment\"># Windows<\/span>\r\n<\/code><\/pre>\n<\/li>\n<li>Print the container logs, then open the URL they show in a browser:\n<pre><code class=\"language-bash\">docker logs jupyter\r\n<\/code><\/pre>\n<\/li>\n<li>Optional: enter the container from another terminal:\n<pre><code class=\"language-bash\">docker <span class=\"hljs-built_in\">exec<\/span> -it jupyter \/bin\/bash\r\n<span class=\"hljs-built_in\">export<\/span> USER=(your_linux_username_used_above)\r\n<span class=\"hljs-built_in\">export<\/span> HOME=\/home\/<span class=\"hljs-variable\">$USER<\/span>\r\nsudo apt-get update\r\n<\/code><\/pre>\n<\/li>\n<li>Confirm that the GPU is visible inside the container:\n<pre><code class=\"language-bash\">nvidia-smi\r\n<\/code><\/pre>\n<\/li>\n<li>Open <code>inference_tts.ipynb<\/code> in the browser and run each cell in order.<\/li>\n<\/ol>\n<h3>Environment Setup<\/h3>\n<ol>\n<li>Create and activate a virtual environment:\n<pre><code class=\"language-bash\">conda create -n voicecraft python=3.9.16\r\nconda activate voicecraft\r\n<\/code><\/pre>\n<\/li>\n<li>Install the required dependencies:\n<pre><code class=\"language-bash\">pip install -e git+https:\/\/github.com\/facebookresearch\/audiocraft.git@c5157b5bf14bf83449c17ea1eeb66c19fb4bc7f0<span class=\"hljs-comment\">#egg=audiocraft<\/span>\r\npip install 
xformers==0.0.22\r\npip install torchaudio==2.0.2 torch==2.0.1\r\napt-get install ffmpeg\r\napt-get install espeak-ng\r\npip install tensorboard==2.16.2\r\npip install phonemizer==3.2.1\r\npip install datasets==2.16.0\r\npip install torchmetrics==0.11.1\r\npip install huggingface_hub==0.22.2\r\nconda install -c conda-forge montreal-forced-aligner=2.2.17 openfst=1.8.2 kaldi=5.5.1068\r\nmfa model download dictionary english_us_arpa\r\nmfa model download acoustic english_us_arpa\r\nconda install -n voicecraft ipykernel --no-deps --force-reinstall\r\n<\/code><\/pre>\n<\/li>\n<\/ol>\n<h3>Inference Examples<\/h3>\n<ol>\n<li>Speech editing: the script below phonemizes transcripts and extracts EnCodec codes, so it appears to be a data-preparation step rather than editing inference itself:\n<pre><code class=\"language-bash\">python phonemize_encodec_encode_hf.py --dataset_size xs --download_to path\/to\/store_huggingface_downloads --save_dir path\/to\/store_extracted_codes_and_phonemes --encodec_model_path path\/to\/encodec_model --mega_batch_size 120 --batch_size 32 --max_len 30000\r\n<\/code><\/pre>\n<\/li>\n<li>Zero-shot speech synthesis inference (the <code>-h<\/code> flag prints the available options):\n<pre><code class=\"language-bash\">python tts_demo.py -h\r\n<\/code><\/pre>\n<\/li>\n<\/ol>\n<h3>Gradio<\/h3>\n<ol>\n<li>Run in Colab: use the project's \"Open in Colab\" link.<\/li>\n<li>Run locally:\n<pre><code class=\"language-bash\">apt-get install -y espeak espeak-data libespeak1 libespeak-dev\r\napt-get install -y festival*\r\napt-get install -y build-essential\r\napt-get install -y flac libasound2-dev libsndfile1-dev vorbis-tools\r\napt-get install -y libxml2-dev libxslt-dev zlib1g-dev\r\npip install -r gradio_requirements.txt\r\npython gradio_app.py\r\n<\/code><\/pre>\n<\/li>\n<\/ol>\n<h3>FAQ<\/h3>\n<ul>\n<li><strong>How can I make the generated speech sound more natural?<\/strong> 
Make sure the input text matches the style and context of the target voice sample.<\/li>\n<li><strong>What should I do if the generated audio file is noisy?<\/strong> Try a higher-quality voice sample or adjust the model parameters.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>VoiceCraft is an open-source tool for speech editing and zero-shot speech synthesis, built on a neural codec language model. It uses a novel codec-sequence generation method that can insert, delete, and replace segments within an existing speech sequence while producing natural, coherent edited speech. VoiceCraft also supports zero-shot speech synthesis,&#8230;<\/p>\n","protected":false},"author":1,"featured_media":61072,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[230,237],"class_list":["post-6969","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tool","tag-aikaiyuanxiangmu","tag-aiyuyinkelong"],"_links":{"self":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/posts\/6969","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/comments?post=6969"}],"version-history":[{"count":0,"hre
f":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/posts\/6969\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/media\/61072"}],"wp:attachment":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/media?parent=6969"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/categories?post=6969"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/tags?post=6969"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}