Terminology Understanding Enhancement Program
When the model encounters terms outside the knowledge base, they can be handled with a five-step process:
- Term capture: Log requests that contain unknown terminology with the monitor_unanswered.py script.
- Automatic expansion: Configure Azure OpenAI so that gen_synthetic_data.py automatically generates explanations for the new terms.
- Semantic alignment: Run train_synonym.py to map the new terms to existing knowledge.
- Closed-loop validation: Add new terms to pending_review.json; they enter the knowledge base only after manual review.
- Active learning: Enable the active_learning mode to collect user feedback.
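The capture and review steps above could be sketched roughly as follows. This is a minimal illustration, not the actual contents of monitor_unanswered.py: the function names, the whitespace-based tokenization, and the JSON layout assumed for pending_review.json are all hypothetical.

```python
import json
from pathlib import Path


def find_unknown_terms(query, known_terms):
    """Return words in `query` that are absent from `known_terms`.

    Hypothetical helper: a real system would likely use a proper
    tokenizer or term extractor instead of whitespace splitting.
    """
    seen = set()
    unknown = []
    for word in query.lower().split():
        word = word.strip(".,;:!?()\"'")
        if word and word not in known_terms and word not in seen:
            seen.add(word)
            unknown.append(word)
    return unknown


def queue_for_review(terms, path):
    """Append terms to a review file with status 'pending'.

    The file layout (a JSON list of {"term", "status"} objects) is an
    assumption made for this sketch.
    """
    path = Path(path)
    entries = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    queued = {entry["term"] for entry in entries}
    for term in terms:
        if term not in queued:
            entries.append({"term": term, "status": "pending"})
            queued.add(term)
    path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries
```

Keeping every captured term in a pending state until a reviewer approves it mirrors the closed-loop validation step: nothing reaches the knowledge base without a manual check.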
Implementation data from a legal tech company shows that the program made the model four times faster at adapting to newly effective regulatory terminology, and that the semantic_fallback mechanism raised the accuracy of responses for unregistered terms from the level of a random guess to 72%. It is recommended to run a weekly terminology health check (check_terminology_coverage).
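The source does not describe what check_terminology_coverage computes. As one plausible reading, a weekly health check might measure what share of the terms seen in recent requests is already in the knowledge base. The function name and return shape below are assumptions made for illustration.

```python
def terminology_coverage(observed_terms, known_terms):
    """Return (coverage ratio, sorted list of missing terms).

    Hypothetical sketch: coverage is the fraction of distinct observed
    terms that already exist in the knowledge base.
    """
    observed = set(observed_terms)
    if not observed:
        # Nothing was observed, so nothing is missing.
        return 1.0, []
    missing = sorted(observed - set(known_terms))
    coverage = 1 - len(missing) / len(observed)
    return coverage, missing
```

A falling ratio from one week to the next would suggest that new terminology is arriving faster than it is being reviewed and added.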
This answer comes from the article "KBLaM: An Open Source Enhanced Tool for Embedding External Knowledge in Large Models".