Four measures are needed to reduce hallucinations in academic settings:
- Enable RAT mode: In `RATPipeline`, configure `reasoning_model_name="deepseek-r1:1.5b"` and `reflection=2` to strengthen fact-checking.
- Source labeling: Set `return_sources=True` when initializing `RAGPipeline`; the output will then include the location of each reference.
- Document cleaning: Remove non-text content (e.g. headers and footers) when pre-processing PDFs to reduce noise.
- Parameter tuning:
  - Raise `k` to 7 (`k=7`) to retrieve more supporting material.
  - Set the LLM's `temperature=0.3` to reduce randomness.
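The document-cleaning step can be illustrated with a minimal sketch. It assumes the PDF has already been converted to one text string per page by whatever extractor you use, and it does not call any RAGLight API. The function name `strip_repeated_lines` and the `min_share` threshold are hypothetical choices made for this example.

```python
import re
from collections import Counter


def strip_repeated_lines(pages: list[str], min_share: float = 0.6) -> list[str]:
    """Drop lines that recur on most pages (likely headers/footers) and bare page numbers.

    `pages` is assumed to hold the extracted text of each PDF page, in order.
    """
    page_lines = [[ln.strip() for ln in page.splitlines()] for page in pages]

    # Count on how many pages each distinct non-empty line appears.
    counts = Counter(ln for lines in page_lines for ln in set(lines) if ln)

    # A line must appear on at least two pages to count as a header or footer.
    threshold = max(2, int(min_share * len(pages)))
    repeated = {ln for ln, n in counts.items() if n >= threshold}

    # Matches lines such as "3", "Page 3", or "3 / 12".
    page_number = re.compile(r"^(page\s+)?\d+(\s*/\s*\d+)?$", re.IGNORECASE)

    cleaned = []
    for lines in page_lines:
        kept = [
            ln for ln in lines
            if ln and ln not in repeated and not page_number.match(ln)
        ]
        cleaned.append("\n".join(kept))
    return cleaned


pages = [
    "Journal of Examples\nIntro text here.\n1",
    "Journal of Examples\nMethods text.\n2",
    "Journal of Examples\nResults text.\n3",
]
cleaned_pages = strip_repeated_lines(pages)
```

This heuristic also drops any body line that consists only of a number, so it is worth checking the cleaned text on a few documents before indexing a whole corpus.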
We also recommend manually spot-checking key findings and setting up an accuracy assessment process to support continuous optimization.
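One possible way to organize the manual sampling is sketched below. It is independent of RAGLight: the record layout (a dict with a `correct` flag filled in by a human reviewer) and the helper names `sample_for_review` and `accuracy` are assumptions made for illustration.

```python
import random


def sample_for_review(answers: list[dict], n: int, seed: int = 0) -> list[dict]:
    """Pick up to n answers at random for a human reviewer to check."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    return rng.sample(answers, min(n, len(answers)))


def accuracy(reviewed: list[dict]) -> float:
    """Share of reviewed answers that the reviewer marked as correct."""
    if not reviewed:
        raise ValueError("no reviewed answers to score")
    return sum(1 for r in reviewed if r["correct"]) / len(reviewed)


# Hypothetical generated answers awaiting review.
answers = [{"id": i, "text": f"finding {i}"} for i in range(10)]
to_review = sample_for_review(answers, n=3)

# After review, each sampled record carries a reviewer-assigned flag.
reviewed = [
    {"id": 1, "correct": True},
    {"id": 4, "correct": False},
    {"id": 7, "correct": True},
    {"id": 9, "correct": True},
]
score = accuracy(reviewed)
```

Tracking this score over time, before and after changing `k` or `temperature`, gives a simple signal for whether a tuning step actually reduced hallucinations.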
This answer comes from the article "RAGLight: Lightweight Retrieval Augmentation Generation Python Library".