The Complete Guide to Hotword Library Configuration
File structure
The software directory contains three types of hotword files:
- hot-zh.txt: Chinese hotword list (matched by Pinyin)
- hot-en.txt: English hotword list (matched by spelling)
- hot-rule.txt: custom replacement rules
Configuration method
- Chinese hotwords: one word per line (e.g. "卷积神经网络", convolutional neural network)
- English hotwords: one word per line (e.g. "ReLU")
- Rule hotwords: one rule per line in equals-sign format (e.g. "NLP = Natural Language Processing")
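The two line formats above can be illustrated with a small parser. This is only a sketch of the formats as described here, not the tool's own loader: the function names `parse_word_list` and `parse_rules` are hypothetical, and skipping blank lines and `#` comment lines is an assumption.

```python
from typing import Dict, List


def parse_word_list(text: str) -> List[str]:
    """Return one hotword per non-empty line (the hot-zh.txt / hot-en.txt layout)."""
    words = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        words.append(line)
    return words


def parse_rules(text: str) -> Dict[str, str]:
    """Return {left: right} for each 'left = right' line (the hot-rule.txt layout)."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        # Lines without an equals sign are skipped rather than treated as errors.
        if not line or line.startswith("#") or "=" not in line:
            continue
        left, right = line.split("=", 1)  # split on the first equals sign only
        rules[left.strip()] = right.strip()
    return rules


if __name__ == "__main__":
    print(parse_word_list("ReLU\n\nSoftmax\n"))
    print(parse_rules("NLP = Natural Language Processing\n"))
```

Splitting on the first equals sign only keeps any later equals signs inside the replacement text, which seems the safer reading of the "left = right" format.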
Best Practice Recommendations
- For a specialized field, maintain a core list of 100-500 terms
- Put mixed Chinese-English terms in hot-rule.txt first (e.g. "CNN = Convolutional Neural Network")
- Update the hotword files regularly (the client loads them dynamically, so no restart is required)
- For complex abbreviations, configure both case variants (e.g. "AI" and "ai")
Practical tests show that a well-configured hotword library can reduce the recognition error rate on professional text by more than 60%, which makes it especially useful in legal, medical, and other specialized fields.
This answer comes from the article "CapsWriter-Offline: Speech Input and Subtitle Transcription Tool for the PC".