Medical-RAG is an intelligent Q&A program for the Chinese medical field. It is based on Retrieval-Augmented Generation (RAG), which improves the accuracy and safety of Large Language Model (LLM) medical advice by grounding answers in an external knowledge base. At its core, the project uses Milvus, a high-performance vector database, to store and retrieve medical knowledge, and integrates the LangChain framework to orchestrate the entire Q&A process. The project implements a complete automated data processing pipeline, including LLM-based intelligent data annotation, construction of medical domain-specific vocabularies, and efficient data loading. It adopts an advanced hybrid retrieval architecture that combines semantic retrieval with dense vectors and keyword retrieval with sparse vectors (BM25), fusing the results from multiple channels with a configurable re-ranking algorithm to improve retrieval accuracy. Developers can deploy and manage the whole system through flexible YAML configuration files, adapting it to different operating environments and requirements.
Function List
- Automated data processing: The project provides an automated data annotation pipeline that supports large model inference via HTTP or local GPU calls to accelerate the annotation process.
- Automated vocabulary management: A built-in multi-threaded builder with a medical-domain lexicon automates the construction and management of the vocabulary used for sparse retrieval, improving query accuracy.
- Hybrid search architecture: Supports both dense and sparse vector retrieval. Dense retrieval supports various embedding providers such as Ollama, OpenAI, and HuggingFace, while sparse retrieval uses a BM25 algorithm optimized for the medical field.
- Result re-ranking and fusion: Supports fusing results from multiple retrieval channels with RRF (Reciprocal Rank Fusion) or weighted fusion to improve the relevance of the final answer.
- Deep optimization for the medical field: A predefined professional classification system with 6 major department categories and 8 major question categories, plus the `pkuseg` medical-domain segmentation model for text processing.
- High-performance vector database: Based on Milvus v2.6+, supporting efficient vector search, batch embedding, and concurrent queries.
- Flexible configuration system: All core parameters, such as database connection, model selection, and retrieval strategy, are configured through YAML files, making deployment and tuning easy across different environments.
- Efficient interface wrapper: Encapsulates common Milvus interfaces and provides core tools such as `RAGSearchTool`, making secondary development and integration easy for developers.
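The RRF fusion mentioned above can be sketched in a few lines. This is an illustrative implementation, not the project's actual code: the function name `rrf_fuse` is an assumption, and `k=60` is simply the constant commonly used in the RRF literature.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: each document earns 1/(k + rank) from
    every ranked list it appears in; documents ranked highly by several
    channels accumulate the largest totals."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort document ids by fused score, best first
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical ranked ids from a dense channel and a sparse (BM25) channel
dense_ranking = ["d1", "d2", "d3"]
sparse_ranking = ["d2", "d4", "d1"]
print(rrf_fuse([dense_ranking, sparse_ranking]))  # d2 wins: ranked in both lists
```

Note that RRF only uses ranks, never raw scores, which is why it needs no score normalization across channels.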
Usage Guide
The project provides a complete workflow from environment preparation to final querying. The detailed steps below are designed to help users get started quickly.
Step 1: Environment preparation
Before you start, you need to prepare the basic runtime environment, including cloning the project, installing dependencies, and starting the required services.
- Clone the project code
First, clone the `medical-rag` source code from GitHub to your local machine.
```shell
git clone https://github.com/yolo-hyl/medical-rag
cd medical-rag/src
```
- Install project dependencies
The project is developed in Python and all dependencies are declared in `setup.py`. Install with pip:
```shell
pip install -e .
```
- Start the Milvus vector database
The project uses Milvus as its vector database, and Docker is the recommended way to run it. A handy startup script is already included in the project code.
```shell
cd Milvus
bash standalone_embed.sh start
```
This command starts a standalone Milvus instance.
- Start the Ollama service (optional)
If you plan to use a locally run large model (such as Qwen) for data annotation or answer generation, you need to install and launch Ollama.
```shell
# Start the Ollama service
ollama serve
# Pull the required models
# bge-m3 is a common embedding model, used to generate vectors
ollama pull bge-m3:latest
# qwen2:7b is a capable annotation and Q&A model
ollama pull qwen2:7b
```
Step 2: Basic Configuration
Before running any of the pipelines, the core parameters need to be configured. The configuration file is located at `src/MedicalRag/config/default.yaml`. Modify the following key information according to your environment:
- Milvus connection information: Ensure that `uri` and `token` match the Milvus instance you started.
```yaml
milvus:
  client:
    uri: "http://localhost:19530"
    token: "root:Milvus"
  collection:
    name: "qa_knowledge"
```
- Embedding model configuration: Specifies the model used to generate dense vectors. The following configuration uses the `bge-m3` model served by the local Ollama service.
```yaml
embedding:
  dense:
    provider: ollama
    model: "bge-m3:latest"
    base_url: "http://localhost:11434"
```
Step 3: Data processing and loading
Data processing is the core of building the Q&A system. The project divides it into four stages: data annotation, vocabulary construction, collection creation, and data insertion.
- Data annotation
This step uses a large language model to automatically categorize the raw Q&A data (e.g., department affiliation, question type). First, configure the annotation parameter file `src/MedicalRag/config/data/annotator.yaml`. Then run the annotation script:
```shell
python scripts/annotation.py src/MedicalRag/config/data/annotator.yaml
```
- Build the vocabulary
To support BM25 sparse retrieval, a domain vocabulary must be built from the medical corpus.
```shell
python scripts/build_vocab.py
```
The script processes the data and generates a vocabulary file named `vocab.pkl.gz`.
- Create a Milvus collection (Collection)
This step creates a collection in Milvus for storing vectors and related information. The collection schema is defined by the `default.yaml` configuration file.
```shell
# Create the collection using the default configuration file
python scripts/create_collection.py -c src/MedicalRag/config/default.yaml
# To force-delete and recreate the collection, add the --force-recreate flag
python scripts/create_collection.py --force-recreate
```
- Insert the data
The processed and annotated data is vectorized and written into the Milvus collection.
```shell
python scripts/insert_data_to_collection.py
```
This script automatically generates both dense and sparse vectors for the data and batch-inserts them into the database.
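To illustrate what a sparse vector means in this pipeline: conceptually, each text is tokenized and mapped to `{vocab_index: weight}` pairs using BM25-style term weighting against the vocabulary. The sketch below is a simplified illustration, not the project's implementation: the project tokenizes with `pkuseg` and loads its vocabulary from `vocab.pkl.gz`, whereas here the token list, vocabulary, and IDF values are stand-ins.

```python
from collections import Counter


def bm25_sparse_vector(tokens, vocab, idf, k1=1.5, b=0.75, avgdl=20.0):
    """Build a {vocab_index: weight} sparse vector with BM25 term weighting.

    tokens: pre-segmented words (the project would produce these via pkuseg)
    vocab:  term -> integer index (the project loads this from vocab.pkl.gz)
    idf:    term -> inverse document frequency, precomputed over the corpus
    """
    tf = Counter(t for t in tokens if t in vocab)  # out-of-vocab terms drop out
    dl = len(tokens)                               # document length
    vec = {}
    for term, f in tf.items():
        # Standard BM25 term-frequency saturation with length normalization
        w = idf.get(term, 0.0) * (f * (k1 + 1)) / (f + k1 * (1 - b + b * dl / avgdl))
        if w > 0:
            vec[vocab[term]] = w
    return vec


# Hypothetical vocabulary and IDF table for demonstration only
vocab = {"梅毒": 0, "症状": 1, "治疗": 2}
idf = {"梅毒": 2.1, "症状": 1.3, "治疗": 1.5}
print(bm25_sparse_vector(["梅毒", "的", "症状"], vocab, idf))
```

The stop-word "的" is absent from the vocabulary, so it contributes nothing, which is exactly why a well-built domain vocabulary matters for sparse retrieval quality.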
Step 4: Search and retrieval
Once all the data is loaded, Q&A retrieval can begin.
- Configure the query strategy
Modify the `src/MedicalRag/config/search/search_answer.yaml` file to define the retrieval strategy, e.g., to adjust the weights of the different retrieval channels (dense, sparse).
- Run the query script
Use the `search_pipline.py` script to execute queries.
```shell
# Run a query with the specified search configuration file
python scripts/search_pipline.py --search-config src/MedicalRag/config/search/search_answer.yaml
```
The script enters an interactive mode where you can type a question (e.g., "What are the symptoms of syphilis?") to test retrieval.
Using the core tool
The project also provides a `RAGSearchTool` class for calling the retrieval function directly from other code.
```python
from MedicalRag.tools.rag_search_tool import RAGSearchTool

# Initialize the tool from a configuration file
tool = RAGSearchTool("config/search.yaml")

if tool.is_ready():
    # Run a single query
    results = tool.search("梅毒的症状有哪些?")
    print(results)

    # Run a batch of queries
    results_batch = tool.search(["梅毒的治疗方法", "高血压的预防措施"])
    print(results_batch)

    # Query with filter conditions (e.g., search only "surgery"-related knowledge)
    results_filtered = tool.search("骨折怎么办", filters={"dept_pk": "3"})  # assuming 3 denotes surgery
    print(results_filtered)
```
Application Scenarios
- Intelligent diagnosis and treatment assistant
The system can serve as a clinical aid for doctors. When doctors encounter complex or rare cases, they can quickly look up relevant diagnosis and treatment guidelines, drug information, and the latest medical research for decision support.
- Medical education and training
It can be used to build a simulated consultation system that helps medical students practice questioning, diagnosis, and treatment planning in a virtual environment. By providing standardized answers and related knowledge points for students' questions, the system can accelerate learning.
- Patient health counseling
It can be deployed as a public-facing intelligent customer service agent or chatbot, providing initial health counseling 24/7. Users can ask about common diseases, symptoms, medication precautions, and more, and the system provides safe, accurate answers from an authoritative knowledge base, easing the load on hospital outpatient services.
- Medical knowledge base management and retrieval
For hospitals and research organizations, the system can integrate massive internal medical documents, medical records, and research papers into an intelligent knowledge management platform, letting researchers and healthcare professionals find the information they need quickly and precisely through natural language.
QA
- What problem does this program solve?
It mainly addresses the fact that general-purpose large language models lack knowledge in specialized domains (especially medicine) and are prone to "hallucinating" or providing inaccurate information. With RAG, model responses are grounded in a reliable external medical knowledge base, yielding more accurate and safer medical advice.
- What key technologies does the project use?
The project mainly uses Retrieval-Augmented Generation (RAG), a vector database (Milvus), a natural language processing framework (LangChain), hybrid retrieval (dense vectors combined with sparse BM25 vectors), and a variety of optional large language model back-ends (e.g., Ollama, OpenAI).
- How do I replace the embedding model or language model used in the project?
Changing models only requires modifying the corresponding YAML configuration file. For example, to change the dense embedding model, edit the `provider` and `model` fields under `embedding.dense` in `default.yaml`. Similarly, the LLM used for data annotation is configured in `annotator.yaml`.
- How should I optimize if retrieval results are unsatisfactory?
There are several options. First, try adjusting the `weight` values of the different retrieval channels in the `search_answer.yaml` configuration file to change the fusion ratio of dense and sparse results. Second, review and expand the corpus used to build the vocabulary so that a higher-quality `vocab.pkl.gz` file is generated, improving sparse retrieval accuracy. Finally, ensuring that your knowledge base data is high quality and covers the domain broadly is fundamental to better results.
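For reference, the weighted-fusion alternative to RRF mentioned above can be illustrated as follows. This is a simplified sketch rather than the project's actual scheme: the min-max normalization step and the example weights (0.7 dense, 0.3 sparse) are assumptions chosen for illustration.

```python
def weighted_fuse(dense_hits, sparse_hits, w_dense=0.7, w_sparse=0.3):
    """Combine two channels' {doc_id: score} maps by weighted sum.

    Dense similarity scores and sparse BM25 scores live on different
    scales, so each channel is min-max normalized to [0, 1] first.
    """
    def normalize(hits):
        if not hits:
            return {}
        scores = list(hits.values())
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0  # avoid division by zero for uniform scores
        return {d: (s - lo) / span for d, s in hits.items()}

    nd, ns = normalize(dense_hits), normalize(sparse_hits)
    fused = {d: w_dense * nd.get(d, 0.0) + w_sparse * ns.get(d, 0.0)
             for d in set(nd) | set(ns)}
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical raw scores: cosine-like dense scores vs. BM25 sparse scores
dense = {"d1": 0.92, "d2": 0.85, "d3": 0.40}
sparse = {"d2": 7.1, "d4": 5.0}
print(weighted_fuse(dense, sparse))  # d2 ranks first: strong in both channels
```

Raising `w_sparse` shifts the fused ranking toward exact keyword matches, which corresponds to tuning the channel `weight` values in the search configuration described above.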