{"id":20737,"date":"2024-06-10T14:59:51","date_gmt":"2024-06-10T06:59:51","guid":{"rendered":"https:\/\/www.aisharenet.com\/?p=20737"},"modified":"2025-07-17T08:47:59","modified_gmt":"2025-07-17T00:47:59","slug":"parler-tts","status":"publish","type":"post","link":"https:\/\/www.kdjingpai.com\/ja\/parler-tts\/","title":{"rendered":"Parler-TTS: A Text-to-Speech Model That Generates Speech in a Specified Speaker Style from Input Text"},"content":{"rendered":"<p>Parler-TTS is an open-source text-to-speech (TTS) model library developed by Hugging Face for generating high-quality, natural-sounding speech. The model can synthesize speech from input text in a specified speaker style, controlling attributes such as gender, pitch, and speaking style. Parler-TTS builds on the research in the paper &#8220;Natural language guidance of high-fidelity text-to-speech with synthetic annotations&#8221; and is fully open source: the datasets, preprocessing code, training code, and weights are all publicly released so that the community can build on and improve them.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-20738\" title=\"Parler-TTS: A Text-to-Speech Model That Generates Speech in a Specified Speaker Style from Input Text-1\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/02\/244bcc5060632b4.png\" alt=\"Parler-TTS: A Text-to-Speech Model That Generates Speech in a Specified Speaker Style from Input Text-1\" 
width=\"1355\" height=\"867\" srcset=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/02\/244bcc5060632b4.png 1355w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/02\/244bcc5060632b4-768x491.png 768w\" sizes=\"auto, (max-width: 1355px) 100vw, 1355px\" \/><\/p>\n<p>&nbsp;<\/p>\n<h2>Features<\/h2>\n<ul>\n<li><strong>High-quality speech generation<\/strong>: Produces natural, fluent speech in a variety of speaker styles.<\/li>\n<li><strong>Open-source code<\/strong>: All code and model weights are public, making community development and improvement straightforward.<\/li>\n<li><strong>Lightweight dependencies<\/strong>: Easy to install and use, with few dependencies.<\/li>\n<li><strong>Multiple model versions<\/strong>: Available in different parameter sizes, such as Parler-TTS Mini and Parler-TTS Large.<\/li>\n<li><strong>Fast generation<\/strong>: Generation speed is optimized, with support for SDPA and Flash Attention 2.<\/li>\n<li><strong>Datasets and weights<\/strong>: Datasets and pretrained model weights are provided for training and fine-tuning.<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h2>Usage Guide<\/h2>\n<h3>Installation<\/h3>\n<ol>\n<li>Make sure a Python environment is installed.<\/li>\n<li>Install the Parler-TTS library with the following command:<\/li>\n<\/ol>\n<pre><code>pip install git+https:\/\/github.com\/huggingface\/parler-tts.git\r\n<\/code><\/pre>\n<ol start=\"3\">\n<li>Apple Silicon users also need to run the following command, which installs a nightly PyTorch build, to get bfloat16 support:<\/li>\n<\/ol>\n<pre><code>pip3 
install --pre torch torchaudio --index-url https:\/\/download.pytorch.org\/whl\/nightly\/cpu\r\n<\/code><\/pre>\n<h3>Usage<\/h3>\n<h4>Generating Speech with a Random Voice<\/h4>\n<ol>\n<li>Import the required libraries:<\/li>\n<\/ol>\n<pre><code>import torch\r\nfrom parler_tts import ParlerTTSForConditionalGeneration\r\nfrom transformers import AutoTokenizer\r\nimport soundfile as sf\r\n<\/code><\/pre>\n<ol start=\"2\">\n<li>Load the model and tokenizer:<\/li>\n<\/ol>\n<pre><code>device = \"cuda:0\" if torch.cuda.is_available() else \"cpu\"\r\nmodel = ParlerTTSForConditionalGeneration.from_pretrained(\"parler-tts\/parler-tts-mini-v1\").to(device)\r\ntokenizer = AutoTokenizer.from_pretrained(\"parler-tts\/parler-tts-mini-v1\")\r\n<\/code><\/pre>\n<ol start=\"3\">\n<li>Provide the text and a voice description, then generate speech:<\/li>\n<\/ol>\n<pre><code>prompt = \"Hey, how are you doing today?\"\r\ndescription = \"A female speaker delivers a slightly expressive and animated speech with a moderate speed and pitch.\"\r\n# The description conditions the voice; the prompt is the text to be spoken.\r\ninput_ids = tokenizer(description, return_tensors=\"pt\").input_ids.to(device)\r\nprompt_input_ids = tokenizer(prompt, return_tensors=\"pt\").input_ids.to(device)\r\ngeneration = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)\r\n# Write the waveform at the model's own sampling rate.\r\nsf.write(\"output.wav\", generation.cpu().numpy().squeeze(), model.config.sampling_rate)\r\n<\/code><\/pre>\n<h4>Generating Speech in a Specific Speaker Style<\/h4>\n<ol>\n<li>Change the description to specify the speaker style you want:<\/li>\n<\/ol>\n<pre><code>description = \"A male speaker with a deep voice and slow pace.\"\r\ninput_ids = tokenizer(description, return_tensors=\"pt\").input_ids.to(device)\r\ngeneration = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)\r\nsf.write(\"output_specific.wav\", generation.cpu().numpy().squeeze(), 
model.config.sampling_rate)\r\n<\/code><\/pre>\n<h3>Training the Model<\/h3>\n<ol>\n<li>Download and prepare the dataset.<\/li>\n<li>Train the model with the training code provided in the repository. The script name and flags shown here are placeholders; consult the repository's training guide for the actual entry point and configuration:<\/li>\n<\/ol>\n<pre><code>python train.py --dataset_path \/path\/to\/dataset --output_dir \/path\/to\/output\r\n<\/code><\/pre>\n<h3>Optimizing Inference<\/h3>\n<ol>\n<li>Select an optimized attention implementation, SDPA or Flash Attention 2, when loading the model:<\/li>\n<\/ol>\n<pre><code># attn_implementation accepts \"eager\", \"sdpa\", or \"flash_attention_2\" (the last needs the flash-attn package and fp16\/bf16 weights)\r\nmodel = ParlerTTSForConditionalGeneration.from_pretrained(\"parler-tts\/parler-tts-mini-v1\", attn_implementation=\"sdpa\").to(device)<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Parler-TTS is an open-source text-to-speech (TTS) model library developed by Hugging Face for generating high-quality, natural-sounding speech. The model can synthesize speech from input text in a specified speaker style, controlling attributes such as gender, pitch, and speaking style. Parler-TTS 
builds on the research in the paper &#8220;N&#8230;<\/p>\n","protected":false},"author":1,"featured_media":60875,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20,392,400],"tags":[230,215],"class_list":["post-20737","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tool","category-models","category-speech-model","tag-aikaiyuanxiangmu","tag-aiwenbenzhuanyuyin"],"_links":{"self":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/posts\/20737","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/comments?post=20737"}],"version-history":[{"count":0,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/posts\/20737\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/media\/60875"}],"wp:attachment":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/media?parent=20737"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/categories?post=20737"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/tags?post=20737"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}