{"id":16854,"date":"2024-12-31T20:12:19","date_gmt":"2024-12-31T12:12:19","guid":{"rendered":"https:\/\/www.aisharenet.com\/?p=16854"},"modified":"2024-12-31T20:12:19","modified_gmt":"2024-12-31T12:12:19","slug":"wanzizhangwenjiangtou-r","status":"publish","type":"post","link":"https:\/\/www.kdjingpai.com\/en\/wanzizhangwenjiangtou-r\/","title":{"rendered":"A 10,000-Word Deep Dive: Optimizing RAG in Real-World DB-GPT Deployments"},"content":{"rendered":"<h2>Preface<\/h2>\n<p>Over the past two years, Retrieval-Augmented Generation (RAG) has gradually become a core component for improving agents. By combining retrieval with generation, RAG brings in external knowledge and opens up more possibilities for applying large models in complex scenarios. In real-world deployments, however, retrieval accuracy is often low, noise is heavy, and recall is neither complete nor specialized enough, which leads to severe LLM hallucination. This article focuses on the details of knowledge processing and retrieval in real-world RAG deployments, and on how to optimize the RAG 
pipeline so that recall accuracy ultimately improves.<\/p>\n<p>&nbsp;<\/p>\n<p>Quickly building a RAG question-answering application is easy, but putting it into production in a real business scenario still takes a great deal of preparation.<\/p>\n<p>&nbsp;<\/p>\n<h2><strong>1. Source code walkthrough of the key RAG flows<\/strong><\/h2>\n<p>This section covers two parts, <strong>knowledge processing<\/strong> and <strong>the key RAG flows<\/strong>:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/4a23debbd60c739.png\" \/><\/p>\n<p><strong>1. Knowledge processing<\/strong><\/p>\n<p>Knowledge loading -&gt; knowledge chunking -&gt; information extraction -&gt; knowledge processing (embedding\/graph\/keywords) -&gt; knowledge storage<\/p>\n<p>&nbsp;<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/2d848bb79485a36.png\" \/><\/p>\n<p>&nbsp;<\/p>\n<ul>\n<li>Knowledge loading<\/li>\n<\/ul>\n<pre># Instantiate through the knowledge factory\r\nKnowledgeFactory -&gt; create() -&gt; load() -&gt; Document\r\n- knowledge\r\n- markdown\r\n- pdf\r\n- docx\r\n- txt\r\n- html\r\n- pptx\r\n- url\r\n- ...<\/pre>\n<p>&nbsp;<\/p>\n<p>How to extend:<\/p>\n<pre>from abc import ABC\r\nfrom typing import List, Any\r\n\r\nclass Knowledge(ABC):\r\n    def load(self) -&gt; List[Document]:\r\n        \"\"\"Load knowledge from data loader.\"\"\"\r\n        pass\r\n\r\n    @classmethod\r\n    def document_type(cls) -&gt; Any:\r\n        \"\"\"Get document type.\"\"\"\r\n        pass\r\n\r\n    @classmethod\r\n    def support_chunk_strategy(cls) -&gt; List[ChunkStrategy]:\r\n        \"\"\"Return supported chunk strategy.\"\"\"\r\n        return 
[\r\n            ChunkStrategy.CHUNK_BY_SIZE,\r\n            ChunkStrategy.CHUNK_BY_PAGE,\r\n            ChunkStrategy.CHUNK_BY_PARAGRAPH,\r\n            ChunkStrategy.CHUNK_BY_MARKDOWN_HEADER,\r\n            ChunkStrategy.CHUNK_BY_SEPARATOR,\r\n        ]\r\n\r\n    @classmethod\r\n    def default_chunk_strategy(cls) -&gt; ChunkStrategy:\r\n        \"\"\"\r\n        Return default chunk strategy.\r\n\r\n        Returns:\r\n            ChunkStrategy: default chunk strategy\r\n        \"\"\"\r\n        return ChunkStrategy.CHUNK_BY_SIZE<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li>Knowledge chunking<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/df8dbdb45a3d866.png\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>ChunkManager: takes the loaded knowledge data and, based on the chunking strategy and chunk parameters specified by the user, routes it to the corresponding chunk processor.<\/p>\n<pre>class ChunkManager:\r\n    \"\"\"Manager for chunks.\"\"\"\r\n\r\n    def __init__(\r\n        self,\r\n        knowledge: Knowledge,\r\n        chunk_parameter: Optional[ChunkParameters] = None,\r\n        extractor: Optional[Extractor] = None,\r\n    ):\r\n        \"\"\"\r\n        Create a new ChunkManager with the given knowledge.\r\n\r\n        Args:\r\n            knowledge: (Knowledge) Knowledge datasource.\r\n            chunk_parameter: (Optional[ChunkParameters]) Chunk parameter.\r\n            extractor: (Optional[Extractor]) Extractor to use for summarization.\r\n        \"\"\"\r\n        self._knowledge = knowledge\r\n        self._extractor = extractor\r\n        self._chunk_parameters = chunk_parameter or ChunkParameters()\r\n        self._chunk_strategy = (\r\n            chunk_parameter.chunk_strategy\r\n            if chunk_parameter and chunk_parameter.chunk_strategy\r\n            else self._knowledge.default_chunk_strategy().name\r\n        )\r\n        self._text_splitter = self._chunk_parameters.text_splitter\r\n        self._splitter_type = 
self._chunk_parameters.splitter_type<\/pre>\n<p>&nbsp;<\/p>\n<p>How to extend: if you want a new, custom chunking strategy to be available in the UI:<\/p>\n<ul>\n<li>Add a new chunk strategy<\/li>\n<li>Add the Splitter implementation logic<\/li>\n<\/ul>\n<pre>class ChunkStrategy(Enum):\r\n    \"\"\"Chunk Strategy Enum.\"\"\"\r\n\r\n    CHUNK_BY_SIZE: _STRATEGY_ENUM_TYPE = (\r\n        RecursiveCharacterTextSplitter,\r\n        [\r\n            {\r\n                \"param_name\": \"chunk_size\",\r\n                \"param_type\": \"int\",\r\n                \"default_value\": 512,\r\n                \"description\": \"The size of the data chunks used in processing.\",\r\n            },\r\n            {\r\n                \"param_name\": \"chunk_overlap\",\r\n                \"param_type\": \"int\",\r\n                \"default_value\": 50,\r\n                \"description\": \"The amount of overlap between adjacent data chunks.\",\r\n            },\r\n        ],\r\n        \"chunk size\",\r\n        \"split document by chunk size\",\r\n    )\r\n\r\n    CHUNK_BY_PAGE: _STRATEGY_ENUM_TYPE = (\r\n        PageTextSplitter,\r\n        [],\r\n        \"page\",\r\n        \"split document by page\",\r\n    )\r\n\r\n    CHUNK_BY_PARAGRAPH: _STRATEGY_ENUM_TYPE = (\r\n        ParagraphTextSplitter,\r\n        [\r\n            {\r\n                \"param_name\": \"separator\",\r\n                \"param_type\": \"string\",\r\n                \"default_value\": \"\\n\",\r\n                \"description\": \"paragraph separator\",\r\n            }\r\n        ],\r\n        \"paragraph\",\r\n        \"split document by paragraph\",\r\n    )\r\n\r\n    CHUNK_BY_SEPARATOR: _STRATEGY_ENUM_TYPE = (\r\n        SeparatorTextSplitter,\r\n        [\r\n            {\r\n                \"param_name\": \"separator\",\r\n                \"param_type\": \"string\",\r\n                \"default_value\": \"\\n\",\r\n                \"description\": \"chunk separator\",\r\n            },\r\n            {\r\n                \"param_name\": \"enable_merge\",\r\n                \"param_type\": \"boolean\",\r\n                \"default_value\": False,\r\n                \"description\": (\r\n                    \"Whether to merge according to the chunk_size after \"\r\n                    \"splitting by the separator.\"\r\n                ),\r\n            },\r\n        ],\r\n        \"separator\",\r\n        \"split document by separator\",\r\n    )\r\n\r\n    CHUNK_BY_MARKDOWN_HEADER: _STRATEGY_ENUM_TYPE = (\r\n        MarkdownHeaderTextSplitter,\r\n        [],\r\n        \"markdown header\",\r\n        \"split document by 
markdown header\",\r\n    )<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li>Knowledge extraction<\/li>\n<li>Vector extraction -&gt; embedding, implemented through the <code>Embeddings<\/code> interface<\/li>\n<\/ul>\n<pre>@abstractmethod\r\ndef embed_documents(self, texts: List[str]) -&gt; List[List[float]]:\r\n    \"\"\"Embed search docs.\"\"\"\r\n\r\n@abstractmethod\r\ndef embed_query(self, text: str) -&gt; List[float]:\r\n    \"\"\"Embed query text.\"\"\"\r\n\r\nasync def aembed_documents(self, texts: List[str]) -&gt; List[List[float]]:\r\n    \"\"\"Asynchronous Embed search docs.\"\"\"\r\n    return await asyncio.get_running_loop().run_in_executor(\r\n        None, self.embed_documents, texts\r\n    )\r\n\r\nasync def aembed_query(self, text: str) -&gt; List[float]:\r\n    \"\"\"Asynchronous Embed query text.\"\"\"\r\n    return await asyncio.get_running_loop().run_in_executor(\r\n        None, self.embed_query, text\r\n    )<\/pre>\n<p>&nbsp;<\/p>\n<pre># EMBEDDING_MODEL=proxy_openai\r\n# proxy_openai_proxy_server_url=https:\/\/api.openai.com\/v1\r\n# proxy_openai_proxy_api_key={your-openai-sk}\r\n# proxy_openai_proxy_backend=text-embedding-ada-002\r\n\r\n## qwen embedding model, See dbgpt\/model\/parameter.py\r\n# EMBEDDING_MODEL=proxy_tongyi\r\n# proxy_tongyi_proxy_backend=text-embedding-v1\r\n# proxy_tongyi_proxy_api_key={your-api-key}\r\n\r\n## qianfan embedding model, See dbgpt\/model\/parameter.py\r\n# EMBEDDING_MODEL=proxy_qianfan\r\n# proxy_qianfan_proxy_backend=bge-large-zh\r\n# proxy_qianfan_proxy_api_key={your-api-key}\r\n# proxy_qianfan_proxy_api_secret={your-secret-key}<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li>Knowledge graph extraction -&gt; knowledge graph<\/li>\n<\/ul>\n<pre>class TripletExtractor(LLMExtractor):\r\n    \"\"\"TripletExtractor class.\"\"\"\r\n\r\n    def __init__(self, llm_client: LLMClient, model_name: str):\r\n        \"\"\"Initialize the TripletExtractor.\"\"\"\r\n        super().__init__(llm_client, model_name, TRIPLET_EXTRACT_PT)\r\n\r\nTRIPLET_EXTRACT_PT = (\r\n    \"Some text is provided below. 
Given the text, \"\r\n    \"extract up to knowledge triplets as more as possible \"\r\n    \"in the form of (subject, predicate, object).\\n\"\r\n    \"Avoid stopwords. The subject, predicate, object can not be none.\\n\"\r\n    \"---------------------\\n\"\r\n    \"Example:\\n\"\r\n    \"Text: Alice is Bob's mother.\\n\"\r\n    \"Triplets:\\n(Alice, is mother of, Bob)\\n\"\r\n    \"Text: Alice has 2 apples.\\n\"\r\n    \"Triplets:\\n(Alice, has 2, apple)\\n\"\r\n    \"Text: Alice was given 1 apple by Bob.\\n\"\r\n    \"Triplets:(Bob, gives 1 apple, Alice)\\n\"\r\n    \"Text: Alice was pushed by Bob.\\n\"\r\n    \"Triplets:(Bob, pushes, Alice)\\n\"\r\n    \"Text: Bob's mother Alice has 2 apples.\\n\"\r\n    \"Triplets:\\n(Alice, is mother of, Bob)\\n(Alice, has 2, apple)\\n\"\r\n    \"Text: A Big monkey climbed up the tall fruit tree and picked 3 peaches.\\n\"\r\n    \"Triplets:\\n(monkey, climbed up, fruit tree)\\n(monkey, picked 3, peach)\\n\"\r\n    \"Text: Alice has 2 apples, she gives 1 to Bob.\\n\"\r\n    \"Triplets:\\n\"\r\n    \"(Alice, has 2, apple)\\n(Alice, gives 1 apple, Bob)\\n\"\r\n    \"Text: Philz is a coffee shop founded in Berkeley in 1982.\\n\"\r\n    \"Triplets:\\n\"\r\n    \"(Philz, is, coffee shop)\\n(Philz, founded in, Berkeley)\\n\"\r\n    \"(Philz, founded in, 1982)\\n\"\r\n    \"---------------------\\n\"\r\n    \"Text: {text}\\n\"\r\n    \"Triplets:\\n\"\r\n)<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Inverted index extraction -&gt; keyword tokenization\n<ul>\n<li>You can use the default Elasticsearch analyzer, or customize tokenization through the Elasticsearch plugin mechanism.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li>Knowledge storage<\/li>\n<\/ul>\n<p>All knowledge persistence goes through the <code>IndexStoreBase<\/code> interface, which currently has three kinds of implementations: vector database, graph database, and full-text index.<\/p>\n<p><img decoding=\"async\" 
src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/0e65020b817fc8e.png\" \/><\/p>\n<ul>\n<li>VectorStore: the main logic of a vector database lives in load_document(), including index schema creation and batched writes of vector data.<\/li>\n<\/ul>\n<pre># Base class hierarchy\r\n- VectorStoreBase\r\n  - ChromaStore\r\n  - MilvusStore\r\n  - OceanbaseStore\r\n  - ElasticsearchStore\r\n  - PGVectorStore\r\n\r\n# Base class definition\r\nclass VectorStoreBase(IndexStoreBase, ABC):\r\n    \"\"\"\r\n    Vector store base class.\r\n    \"\"\"\r\n\r\n    @abstractmethod\r\n    def load_document(self, chunks: List[Chunk]) -&gt; List[str]:\r\n        \"\"\"\r\n        Load document in index database.\r\n        \"\"\"\r\n        pass\r\n\r\n    @abstractmethod\r\n    async def aload_document(self, chunks: List[Chunk]) -&gt; List[str]:\r\n        \"\"\"\r\n        Load document in index database asynchronously.\r\n        \"\"\"\r\n        pass\r\n\r\n    @abstractmethod\r\n    def similar_search_with_scores(\r\n        self,\r\n        text: str,\r\n        topk: int,\r\n        score_threshold: float,\r\n        filters: Optional[MetadataFilters] = None,\r\n    ) -&gt; List[Chunk]:\r\n        \"\"\"\r\n        Perform a similar search with scores in the index database.\r\n        \"\"\"\r\n        pass\r\n\r\n    def similar_search(\r\n        self,\r\n        text: str,\r\n        topk: int,\r\n        filters: Optional[MetadataFilters] = None,\r\n    ) -&gt; List[Chunk]:\r\n        \"\"\"\r\n        Perform a similar search in the index database.\r\n        \"\"\"\r\n        return self.similar_search_with_scores(text, topk, 1.0, filters)<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li>GraphStore: the concrete graph store implements triplet writes, usually by calling the query language of the underlying graph database. For example, <code>TuGraphStore<\/code> generates the corresponding Cypher statements from the triplets and executes them.<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: 
none;\">\n<ul>\n<li>The graph store interface GraphStoreBase provides a unified graph storage abstraction. <code>MemoryGraphStore<\/code> and <code>TuGraphStore<\/code> are built in, and a Neo4j interface is also available for developers to plug in.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<pre># GraphStoreBase -&gt; TuGraphStore -&gt; Neo4jStore\r\ndef insert_triplet(self, subj: str, rel: str, obj: str) -&gt; None:\r\n    \"\"\"Add triplet.\"\"\"\r\n    # Create queries to merge nodes and relationship\r\n    subj_query = f\"MERGE (n1:{self._node_label} {{id:'{subj}'}})\"\r\n    obj_query = f\"MERGE (n2:{self._node_label} {{id:'{obj}'}})\"\r\n    rel_query = (\r\n        f\"MERGE (n1:{self._node_label} {{id:'{subj}'}})\"\r\n        f\"-[r:{self._edge_label} {{id:'{rel}'}}]-&gt;\"\r\n        f\"(n2:{self._node_label} {{id:'{obj}'}})\"\r\n    )\r\n    # Execute queries\r\n    self.conn.run(query=subj_query)\r\n    self.conn.run(query=obj_query)\r\n    self.conn.run(query=rel_query)<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>FullTextStore: builds an Elasticsearch index, tokenizes text with the built-in Elasticsearch analyzer, and lets Elasticsearch build the keyword-&gt;doc_id inverted index.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<pre>{\r\n    \"analysis\": {\r\n        \"analyzer\": {\r\n            \"default\": {\r\n                \"type\": \"standard\"\r\n            }\r\n        }\r\n    },\r\n    \"similarity\": {\r\n        \"custom_bm25\": {\r\n            \"type\": \"BM25\",\r\n            \"k1\": self._k1,\r\n            \"b\": self._b\r\n        }\r\n    }\r\n}\r\n\r\nself._es_mappings = {\r\n    \"properties\": {\r\n        \"content\": {\r\n            \"type\": \"text\",\r\n            \"similarity\": \"custom_bm25\"\r\n        },\r\n        \"metadata\": {\r\n            \"type\": \"keyword\"\r\n        }\r\n    }\r\n}\r\n\r\n# FullTextStoreBase\r\n# ElasticDocumentStore\r\n# OpenSearchStore<\/pre>\n<p>&nbsp;<\/p>\n<p><strong>2. Knowledge retrieval<\/strong><\/p>\n<p>question -&gt; 
rewrite -&gt; similarity_search -&gt; rerank -&gt; context_candidates<\/p>\n<p>Next comes knowledge retrieval. The retrieval logic in the community edition currently runs in the following steps. If you have set the query-rewrite parameter, the LLM first rewrites the question once. The query is then routed to the retriever that matches how the knowledge was processed: knowledge processed into vectors is retrieved through EmbeddingRetriever, and knowledge built as a knowledge graph is retrieved through the graph. If you have configured a rerank model, the coarsely filtered candidates are re-ranked so that they are more relevant to the user's question.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/ae4453d80a3a59d.png\" \/><\/p>\n<ul>\n<li>EmbeddingRetriever<\/li>\n<\/ul>\n<pre>class EmbeddingRetriever(BaseRetriever):\r\n    \"\"\"Embedding retriever.\"\"\"\r\n\r\n    def __init__(\r\n        self,\r\n        index_store: IndexStoreBase,\r\n        top_k: int = 4,\r\n        query_rewrite: Optional[QueryRewrite] = None,\r\n        rerank: Optional[Ranker] = None,\r\n        retrieve_strategy: Optional[RetrieverStrategy] = RetrieverStrategy.EMBEDDING,\r\n    ):\r\n        pass\r\n\r\n    async def _aretrieve_with_score(\r\n        self,\r\n        query: str,\r\n        score_threshold: float,\r\n        filters: Optional[MetadataFilters] = None,\r\n    ) -&gt; List[Chunk]:\r\n        \"\"\"\r\n        Retrieve knowledge chunks with score.\r\n\r\n        Args:\r\n            query (str): Query text.\r\n            score_threshold (float): Score threshold.\r\n            filters: Metadata 
filters.\r\n\r\n        Returns:\r\n            List[Chunk]: List of chunks with score.\r\n        \"\"\"\r\n        queries = [query]\r\n\r\n        # Abridged excerpt: context is not defined in this snippet.\r\n        new_queries = await self._query_rewrite.rewrite(\r\n            origin_query=query,\r\n            context=context,\r\n            nums=1\r\n        )\r\n        queries.extend(new_queries)\r\n\r\n        candidates_with_score = [\r\n            self._similarity_search_with_score(\r\n                query,\r\n                score_threshold,\r\n                filters,\r\n                root_tracer.get_current_span_id()\r\n            )\r\n            for query in queries\r\n        ]\r\n\r\n        # Abridged excerpt: the step that turns candidates_with_score into\r\n        # new_candidates_with_score is not shown in this snippet.\r\n        new_candidates_with_score = await self._rerank.arank(\r\n            new_candidates_with_score,\r\n            query\r\n        )\r\n\r\n        return new_candidates_with_score<\/pre>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>index_store: the concrete vector database<\/li>\n<li>top_k: the number of candidate chunks to return<\/li>\n<li>query_rewrite: the query rewrite function<\/li>\n<li>rerank: the rerank function<\/li>\n<li>query: the original query<\/li>\n<li>score_threshold: the score threshold; by default, context whose similarity score is below the threshold is filtered out<\/li>\n<li>filters: <code>Optional[MetadataFilters]<\/code>, a metadata filter, mainly used to screen out non-matching candidates up front based on attribute values.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<pre>from enum import Enum\r\nfrom typing import Union, List\r\nfrom pydantic import BaseModel, Field\r\n\r\nclass FilterCondition(str, Enum):\r\n    \"\"\"Vector Store Meta data filter conditions.\"\"\"\r\n    AND = \"and\"\r\n    OR = \"or\"\r\n\r\nclass MetadataFilter(BaseModel):\r\n    \"\"\"Meta data filter.\"\"\"\r\n    key: str = Field(\r\n        ...,\r\n        description=\"The key of metadata to filter.\"\r\n    )\r\n    operator: FilterOperator = Field(\r\n        default=FilterOperator.EQ, 
\r\n        description=\"The operator of metadata filter.\"\r\n    )\r\n    value: Union[str, int, float, List[str], List[int], List[float]] = Field(\r\n        ...,\r\n        description=\"The value of metadata to filter.\"\r\n    )<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li>Graph <a href=\"https:\/\/www.kdjingpai.com\/rag\/\">RAG<\/a><\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/6c25e3cf7353d51.png\" \/><\/p>\n<p>First, keywords are extracted by a model. Tokenization can be done with traditional NLP techniques or with an LLM. The keywords are then expanded with synonyms to produce a candidate keyword list. Finally, the explore method is called with that candidate list to recall a local subgraph.<\/p>\n<pre>KEYWORD_EXTRACT_PT = (\r\n    \"A question is provided below. Given the question, extract up to \"\r\n    \"keywords from the text. 
Focus on extracting the keywords that we can use \"\r\n    \"to best lookup answers to the question.\\n\"\r\n    \"Generate as more as possible synonyms or alias of the keywords \"\r\n    \"considering possible cases of capitalization, pluralization, \"\r\n    \"common expressions, etc.\\n\"\r\n    \"Avoid stopwords.\\n\"\r\n    \"Provide the keywords and synonyms in comma-separated format.\"\r\n    \"Formatted keywords and synonyms text should be separated by a semicolon.\\n\"\r\n    \"---------------------\\n\"\r\n    \"Example:\\n\"\r\n    \"Text: Alice is Bob's mother.\\n\"\r\n    \"Keywords:\\nAlice,mother,Bob;mummy\\n\"\r\n    \"Text: Philz is a coffee shop founded in Berkeley in 1982.\\n\"\r\n    \"Keywords:\\nPhilz,coffee shop,Berkeley,1982;coffee bar,coffee house\\n\"\r\n    \"---------------------\\n\"\r\n    \"Text: {text}\\n\"\r\n    \"Keywords:\\n\"\r\n)\r\n\r\ndef explore(\r\n    self,\r\n    subs: List[str],\r\n    direct: Direction = Direction.BOTH,\r\n    depth: Optional[int] = None,\r\n    fan: Optional[int] = None,\r\n    limit: Optional[int] = None,\r\n) -&gt; Graph:\r\n    \"\"\"Explore on graph.\"\"\"<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li><code>DBSchemaRetriever<\/code>: schema-linking retrieval for the ChatData scenario\n<p>It uses schema linking with a two-stage similarity search: it first finds the most relevant tables, then the most relevant fields.<\/p>\n<p>Advantage: this two-stage retrieval also addresses the community feedback about the poor experience with very wide tables.<\/p><\/li>\n<\/ul>\n<pre>def _similarity_search(self, query, filters: Optional[MetadataFilters] = None) -&gt; List[Chunk]:\r\n    \"\"\"Similar search.\"\"\"\r\n\r\n    # Perform similarity search with scores\r\n    table_chunks = self._table_vector_store_connector.similar_search_with_scores(\r\n        query, self._top_k, 0, filters\r\n    )\r\n\r\n    # Filter out chunks with 
'separated' metadata\r\n    not_sep_chunks = [\r\n        chunk for chunk in table_chunks if not chunk.metadata.get(\"separated\")\r\n    ]\r\n    separated_chunks = [\r\n        chunk for chunk in table_chunks if chunk.metadata.get(\"separated\")\r\n    ]\r\n\r\n    # If no separated chunks, return the non-separated chunks\r\n    if not separated_chunks:\r\n        return not_sep_chunks\r\n\r\n    # Create tasks list for retrieving fields from separated chunks\r\n    tasks = [\r\n        lambda c=chunk: self._retrieve_field(c, query) for chunk in separated_chunks\r\n    ]\r\n\r\n    # Run tasks concurrently with a concurrency limit of 3\r\n    separated_result = run_tasks(tasks, concurrency_limit=3)\r\n\r\n    # Combine and return results\r\n    return not_sep_chunks + separated_result<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>table_vector_store_connector: retrieves the most relevant tables.<\/li>\n<li>field_vector_store_connector: retrieves the most relevant fields.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h2>2. Optimization ideas for knowledge processing and knowledge retrieval<\/h2>\n<p>RAG question-answering applications currently have several pain points:<\/p>\n<ul>\n<li>As the knowledge base accumulates more documents, retrieval becomes noisy and recall accuracy drops.<\/li>\n<li>Recall is incomplete.<\/li>\n<li>The recalled content has little to do with the intent of the user's question.<\/li>\n<li>Only static data can be answered; knowledge cannot be fetched dynamically, which makes the Q&amp;A application feel rigid and unintelligent.<\/li>\n<\/ul>\n<h3><strong>1<\/strong><strong>. 
Knowledge processing optimization<\/strong><\/h3>\n<p>How unstructured, semi-structured, and structured data is processed and prepared determines the upper limit of a RAG application. A large amount of fine-grained ETL work is therefore needed first, in the knowledge processing and indexing stages. The main directions are:<\/p>\n<ul>\n<li>Unstructured -&gt; structured: organize knowledge in an orderly way.<\/li>\n<li>Extract richer and more diverse semantic information.<\/li>\n<\/ul>\n<h4>\u00a01.1 Knowledge loading<\/h4>\n<p>Goal: parse documents precisely and recognize the different types of data they contain.<\/p>\n<p>Optimization suggestions:<\/p>\n<ul>\n<li>Convert docx, txt, and other text formats to PDF or Markdown beforehand, so that recognition tools can extract the various kinds of content more reliably.<\/li>\n<li>Extract the tables in the text.<\/li>\n<li>Preserve the heading hierarchy of Markdown and PDF documents, in preparation for index structures such as the hierarchy tree described later.<\/li>\n<li>Preserve image links, formulas, and similar information, and normalize them to Markdown as well.<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/9cd2228638b0a2b.png\" \/><\/p>\n<h4>\u00a01.2 
Keep chunks as complete as possible<\/h4>\n<p>Goal: preserve contextual completeness and relevance, which directly affects answer accuracy.<\/p>\n<p>Chunking also keeps the input within the context limit of the LLM, so the text passed to it never exceeds its token limit.<\/p>\n<p>Optimization suggestions:<\/p>\n<ul>\n<li>Extract images and tables into separate chunks, and keep the table and image captions in the chunk metadata.<\/li>\n<li>Split document content by heading level or Markdown header wherever possible, to keep each chunk as complete as possible.<\/li>\n<li>If the document has a custom separator, split on that separator.<\/li>\n<\/ul>\n<h4>\u00a01.3 Diversified information extraction<\/h4>\n<p>Besides embedding extraction, other kinds of information extraction can augment the document data and noticeably improve RAG recall.<\/p>\n<ul>\n<li>Knowledge graph\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Advantages: 1. 
It addresses the lack of completeness in Native RAG and its lingering hallucination problem. It improves the accuracy of knowledge, including the completeness of knowledge boundaries and the clarity of knowledge structure and semantics, and it serves as a semantic complement to similarity search.<\/li>\n<li>Applicable scenarios: rigorous professional domains (healthcare, operations, etc.) where knowledge preparation must be constrained and clear hierarchical relationships can be established between pieces of knowledge.<\/li>\n<li>How to implement:\n<p>1. Rely on an LLM to extract (entity, relation, entity) triplets.<\/p>\n<p>2. Rely on high-quality, structured knowledge preparation up front (cleaning and extraction), and build the knowledge graph from business rules, either manually or through a custom SOP process.<\/p><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/94f329cc1153add.png\" \/><\/p>\n<ul>\n<li>Doc Tree\n<ul>\n<li style=\"list-style-type: 
none;\">\n<ul>\n<li>Applicable scenarios: addresses insufficient context completeness; matching relies entirely on semantics and keywords, which reduces noise.<\/li>\n<li>How to implement: build a tree of chunk nodes from the heading hierarchy, forming a multi-way tree. Each intermediate node stores only a document heading, and the leaf nodes store the actual text. With a tree traversal, if the user's question hits a relevant non-leaf heading node, the data of its child nodes can be recalled together, so chunk completeness is no longer lost.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/c20e02d00657e7a.png\" \/><\/p>\n<p>We will also release this feature to the community early next year.<\/p>\n<ul>\n<li>QA-pair extraction: QA pairs need to be extracted beforehand, either predefined or extracted by a model.<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Applicable scenarios:<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"list-style-type: 
none;\">\n<ul>\n<li>The question can be hit directly during retrieval and its answer recalled, returning exactly the answer the user wants. Suitable for FAQ scenarios and for scenarios where recall is incomplete.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>How to implement:<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Predefined: add questions to each chunk in advance.<\/li>\n<li>Model extraction: give the model a context and let it extract QA pairs.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul>\n<li>Metadata extraction<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>How to implement: based on the characteristics of your own business data, extract and keep data features as metadata attributes such as tags, category, time, and version.<\/li>\n<li>Applicable scenarios: at retrieval time, most noise can be filtered out in advance using metadata attributes.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li>Summary extraction<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: 
none;\">\n<ul>\n<li>Applicable scenarios: global questions such as <code>What is this article about?<\/code> or <code>Summarize it<\/code>.<\/li>\n<li>How to implement: extract section by section, for example with MapReduce, using a model to produce summary information for each chunk.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/d4f45f835350a71.png\" \/><\/p>\n<h4>\u00a01.4 Knowledge processing workflow<\/h4>\n<p>The <a href=\"https:\/\/www.kdjingpai.com\/db-gpt\/\">DB-GPT<\/a> knowledge base currently provides knowledge processing capabilities such as document upload -&gt; parsing -&gt; chunking -&gt; Embedding -&gt; knowledge graph triple extraction -&gt; vector database storage -&gt; graph database storage, but it cannot perform complex, customized information extraction on documents. We therefore want to build knowledge processing workflow templates to support complex, visual, user-definable knowledge extraction, transformation, and processing pipelines.<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/ec3a68936096d57.png\" \/><\/h3>\n<h4>Knowledge processing workflow:<\/h4>\n<p>https:\/\/www.yuque.com\/eosphoros\/dbgpt-docs\/vg2gsfyf3x9fuglf<\/p>\n<p>2. 
RAG process optimization<\/p>\n<p>We further divide RAG process optimization into RAG over static documents and RAG over dynamically fetched data. Most RAG work today only covers static assets in the form of unstructured documents, but in many real business scenarios a question is answered by combining dynamic data obtained through tools with static knowledge. This requires retrieving not only the static knowledge, but also the tool information in the tool asset library, and then executing the tools to fetch the dynamic data.<\/p>\n<h4>\u00a02.1 Static knowledge RAG optimization<\/h4>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/8e5e069ed7a504d.png\" \/><\/p>\n<h5>(1) Original question processing<\/h5>\n<p>Goal: clarify the user's semantics by turning the original question from a vague query with unclear intent into a richer, retrievable Query.<\/p>\n<ul>\n<li>Original question classification. Possible approaches:<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>LLM classification (<code>LLMExtractor<\/code>)<\/li>\n<li>Build a two-tower model with embedding + logistic regression, see text2nlu:\u00a0DB-GPT-Hub\/src\/dbgpt-hub-nlu\/README.zh.md at main \u00b7 eosphoros-ai\/DB-GPT-Hub<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"list-style-type: 
none;\">\n<ul>\n<li>Tip: a high-quality Embedding model is required; bge-v1.5-large is recommended<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/f290e69a1653c91.png\" \/><\/p>\n<ul>\n<li>Ask the user back: if the semantics are unclear, hand the question back to the user for clarification over multiple rounds of interaction<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Use a hot-search term library to recommend, by semantic relevance, a candidate list of questions the user may have in mind<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li>Slot extraction: obtain the key slot information in the user's question, such as intent and business attributes<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>LLM extraction (<code>LLMExtractor<\/code>)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li>Question rewriting<\/li>\n<\/ul>\n<ul>\n<li style=\"list-style-type: 
none;\">\n<ul>\n<li>Rewrite using the hot-search term library<\/li>\n<li>Multi-round interaction<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h5>(2) Metadata filtering<\/h5>\n<p>When we split the index into many chunks and store them all in the same knowledge space, retrieval efficiency becomes a problem. For example, when a user asks about &#8220;Zhejiang Wowu Technology Company&#8221;, they do not want information about other companies to be recalled. If we can first filter on the company-name metadata attribute, both efficiency and relevance improve greatly.<\/p>\n<pre>async def aretrieve(\r\n    self,\r\n    query: str,\r\n    filters: Optional[MetadataFilters] = None\r\n) -&gt; List[Chunk]:\r\n    \"\"\"\r\n    Retrieve knowledge chunks.\r\n\r\n    Args:\r\n        query (str): async query text.\r\n        filters (Optional[MetadataFilters]): metadata filters.\r\n\r\n    Returns:\r\n        List[Chunk]: list of chunks\r\n    \"\"\"\r\n    return await self._aretrieve(query, filters)<\/pre>\n<p>&nbsp;<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/4d263e07ddd5539.png\" \/><\/p>\n<h5>(3) Multi-strategy hybrid recall<\/h5>\n<ul>\n<li>Priority-based recall: define a priority for each retriever and return as soon as one of them retrieves content\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Define different retrievers, such as qa_retriever and doc_tree_retriever, and put them into a queue; 
the queue's first-in-first-out order then implements priority recall.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<pre>from concurrent.futures import Executor, ThreadPoolExecutor\r\nfrom typing import List, Optional\r\n\r\n\r\n# BaseRetriever is DB-GPT's retriever base class.\r\nclass RetrieverChain(BaseRetriever):\r\n    \"\"\"Retriever chain class.\"\"\"\r\n\r\n    def __init__(\r\n        self,\r\n        retrievers: Optional[List[BaseRetriever]] = None,\r\n        executor: Optional[Executor] = None,\r\n    ):\r\n        \"\"\"Create retriever chain instance.\"\"\"\r\n        self._retrievers = retrievers or []\r\n        self._executor = executor or ThreadPoolExecutor()\r\n\r\n    async def retrieve(self, query: str, score_threshold: float, filters: Optional[dict] = None):\r\n        \"\"\"Perform retrieval with the given query, score threshold, and filters.\"\"\"\r\n        # Try retrievers in priority order and return the first non-empty result.\r\n        for retriever in self._retrievers:\r\n            candidates_with_scores = await retriever.aretrieve_with_scores(\r\n                query=query, score_threshold=score_threshold, filters=filters\r\n            )\r\n\r\n            if candidates_with_scores:\r\n                return candidates_with_scores\r\n        # No retriever produced candidates.\r\n        return []<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li>Parallel recall across multiple knowledge indexes\/spaces\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Recall candidates in parallel from the different index forms of the knowledge to ensure recall completeness<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/ac388cab1c28a17.png\" \/><\/p>\n<h5>(4) Post-filtering<\/h5>\n<p>After coarse screening produces a candidate list, how do we filter out noise with fine screening?<\/p>\n<ul>\n<li>Remove irrelevant candidate chunks\n<ul>\n<li style=\"list-style-type: 
none;\">\n<ul>\n<li>Remove by timeliness<\/li>\n<li>Remove candidates that do not satisfy business attributes<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li>Top-k deduplication<\/li>\n<li>Reranking: coarse-screening recall alone is not enough. We need strategies to rerank the retrieved results, for example readjusting them by a combination of factors such as relevance and match degree, to obtain an ordering that better fits our business scenario. Because the results are sent to the LLM for final processing right after this step, the output of this stage is very important.\n<p>&nbsp;<\/p>\n<ul>\n<li>Use a reranking model for fine screening; this can be an open-source model or a model fine-tuned on business semantics.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<pre>## Rerank model\r\n# RERANK_MODEL = bce-reranker-base\r\n\r\n#### If you do not set RERANK_MODEL_PATH, DB-GPT will read the model path from EMBEDDING_MODEL_CONFIG based on the RERANK_MODEL.\r\n# RERANK_MODEL_PATH = \/Users\/chenketing\/Desktop\/project\/DB-GPT-NEW\/DB-GPT\/models\/bce-reranker-base_v1\r\n\r\n#### The number of rerank results to return\r\n# RERANK_TOP_K = 5<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Score the content recalled from the different indexes with a business-weighted RRF (Reciprocal Rank Fusion) combined score and drop low-scoring candidates<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<pre>score = 0.0\r\n\r\nfor q in queries:\r\n    if d in result(q):\r\n        score += 1.0 \/ (k + rank(result(q), d))\r\n\r\nreturn score\r\n\r\n# where:\r\n# k is a 
ranking constant\r\n# q is a query in the set of queries\r\n# d is a document in the result set of q\r\n# result(q) is the result set of q\r\n# rank(result(q), d) is d's rank within the result(q) starting from 1<\/pre>\n<p>&nbsp;<\/p>\n<h5>(5) Display optimization + fallback responses\/topic guidance<\/h5>\n<ul>\n<li>Have the model produce its output in markdown format<\/li>\n<\/ul>\n<pre>Based on the known information given below, follow the constraints and answer the user's question professionally and concisely.\r\n\r\nConstraints:\r\n1. If the known information contains images, links, tables, code blocks, or other content in special markdown syntax, make sure the answer includes these original images, links, tables, and code blocks; do not drop or modify them. For example:\r\n- Image format: `![image.png](xxx)`\r\n- Link format: `[xxx](xxx)`\r\n- Table format: `|xxx|xxx|xxx|`\r\n- Code format: ```xxx```.\r\n2. If the answer cannot be obtained from the provided content, say: \u201cThe content provided in the knowledge base is not sufficient to answer this question\u201d. Do not make things up.\r\n3. When answering, preferably summarize as numbered points 1. 2. 3. 
and present the result in Markdown format.<\/pre>\n<p>&nbsp;<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/ebafac6b50de02b.png\" \/><\/p>\n<h4>\u00a02.2 Dynamic knowledge RAG optimization<\/h4>\n<p>Document-based knowledge is relatively static and cannot answer personalized or dynamic questions; those can only be answered with the help of third-party platform tools. For this we need dynamic RAG methods that obtain dynamic data through tool asset definition -&gt; tool selection -&gt; tool validation -&gt; tool execution.<\/p>\n<h5>(1) Tool asset library<\/h5>\n<p>Build an enterprise domain tool asset library that consolidates the tool APIs and tool scripts scattered across platforms, so that agents can use them end to end. For example, in addition to the static knowledge base, tools can be brought in by importing a tool library.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/5e4a24544b5c987.png\" \/><\/p>\n<h5>(2) Tool recall<\/h5>\n<p>Tool recall follows the same approach as RAG recall for static knowledge, and then obtains the tool's result through the complete tool execution lifecycle.<\/p>\n<p><img decoding=\"async\" 
src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/b2ad9fe206e3dfa.png\" \/><\/p>\n<ul>\n<li>Slot extraction: parse the user's question with traditional NLP or an LLM to extract items such as common business types, environment labels, and domain model parameters<\/li>\n<li>Tool selection: recall follows the static RAG approach and has two levels, tool-name recall and tool-parameter recall.\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Tool-parameter recall is similar to the TableRAG approach: recall the table name first, then the field names.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li>Parameter filling: match the recalled tool parameter definitions against the parameters obtained from slot extraction\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Filling can be done in code or by the model.<\/li>\n<li>Optimization idea: the same parameter is often named differently across platform tools and is hard to govern, so we suggest first doing a round of domain model data expansion; once the whole domain model is available, all the required parameters will be present.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li>Parameter validation\n<ul>\n<li style=\"list-style-type: 
none;\">\n<ul>\n<li>Completeness check: verify that the number of parameters is complete<\/li>\n<li>Parameter rule check: validate rules such as parameter name and type, parameter value, and enumerations.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li>Parameter correction\/alignment: this mainly reduces the number of interactions with the user by automatically correcting errors in user-supplied parameters, including case rules, enumeration rules, and so on. e.g.:<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/7a23319db973b5e.png\" \/><\/p>\n<h4>\u00a02.3 RAG evaluation<\/h4>\n<p>When evaluating an intelligent question-answering pipeline, recall relevance accuracy and the relevance of the model's answers should be evaluated separately and then considered together, to determine where the RAG pipeline still needs improvement.<\/p>\n<p>Evaluation metrics:<\/p>\n<pre>EvaluationMetric\r\n\u251c\u2500\u2500 LLMEvaluationMetric\r\n\u2502 \u251c\u2500\u2500 AnswerRelevancyMetric\r\n\u251c\u2500\u2500 RetrieverEvaluationMetric\r\n\u2502 \u251c\u2500\u2500 RetrieverSimilarityMetric\r\n\u2502 \u251c\u2500\u2500 RetrieverMRRMetric\r\n\u2502 \u2514\u2500\u2500 RetrieverHitRateMetric<\/pre>\n<p>&nbsp;<\/p>\n<ul>\n<li><code><strong>RAG<\/strong><\/code> recall metrics (RetrieverEvaluationMetric):\n<ul>\n<li style=\"list-style-type: 
none;\">\n<ul>\n<li><code>RetrieverHitRateMetric<\/code>: hit rate measures the proportion of queries for which the relevant content appears in the top-k documents returned by the RAG <code>retriever<\/code>.<\/li>\n<li><code>RetrieverMRRMetric<\/code>: <code>Mean Reciprocal Rank<\/code> evaluates each query by the rank of the most relevant document in the retrieval results. More precisely, it is the average over all queries of the reciprocal rank of the relevant document. For example, if the most relevant document is ranked first, its reciprocal rank is 1; if ranked second, it is 1\/2; and so on.<\/li>\n<li><code>RetrieverSimilarityMetric<\/code>: similarity metric, which computes the similarity between the recalled content and the predicted content.\n<p>&nbsp;<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><code>Model-generated<\/code> answer metrics:<\/p>\n<p>&nbsp;<\/p>\n<ul>\n<li><code>AnswerRelevancyMetric<\/code>: agent answer relevance metric, which measures how well the agent's answer matches the user's question. A highly relevant answer requires the model not only to understand the user's question but also to generate an answer closely related to it. This directly affects user satisfaction and the model's usefulness.<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h1>\u00a0 3.<strong>RAG case studies<\/strong><\/h1>\n<h3><strong>1. 
RAG in the data infrastructure domain<\/strong><\/h3>\n<h4>\u00a01.1 Background of the operations agent<\/h4>\n<p>In the data infrastructure domain there are many operations SREs who receive a large number of alerts every day. Much of their time therefore goes into responding to emergency incidents, diagnosing the faults, reviewing them afterwards, and then distilling the experience. Another part of their time goes into answering user inquiries, which draws on their knowledge and their experience with tools.<\/p>\n<p>&nbsp;<\/p>\n<p>We therefore want to build a general-purpose agent for data infrastructure to handle alert diagnosis and answering user questions.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/0fa7ddb06d03d04.png\" \/><\/p>\n<h4>\u00a01.2 Rigorous and professional RAG<\/h4>\n<p>Traditional RAG + Agent technology can handle general, single-step task scenarios where determinism requirements are not very high. In the specialized scenarios of data infrastructure, however, the whole retrieval process must be deterministic, professional, and truthful, and it requires step-by-step reasoning.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/76d86d0363bc157.png\" 
\/><\/p>\n<p>On the right is a generic summary produced by NativeRAG. It may be useful to a consumer-side user who is not familiar with the specialized domain knowledge, but for professionals such an answer has little value.<\/p>\n<p>&nbsp;<\/p>\n<p>We therefore compared how general-purpose agents and data infrastructure agents differ in RAG:<\/p>\n<ul>\n<li>General-purpose agents: traditional RAG has relatively low requirements on the rigor and professionalism of knowledge, and suits business scenarios such as customer service, travel, and platform question-answering bots.<\/li>\n<li>Data infrastructure agents: the RAG process is rigorous and professional and needs a dedicated RAG workflow whose context covers (alert -&gt; localization -&gt; mitigation -&gt; recovery). The question-and-answer records and emergency experience accumulated by experts must also be extracted in structured form, with hierarchical relationships established. We therefore chose a knowledge graph as the data carrier.<\/li>\n<\/ul>\n<h3>\u00a01.3 
Knowledge processing<\/h3>\n<p>Given the determinism and special nature of data infrastructure, we chose a knowledge graph as the knowledge carrier for diagnostic and emergency experience. Combining the emergency troubleshooting knowledge accumulated by SREs with the emergency review process, we built a knowledge graph driven by DB emergency events. Taking DB jitter as an example, several events affect DB jitter, including slow SQL problems and capacity problems, and we established relationships among these emergency events.<\/p>\n<p>&nbsp;<\/p>\n<p>Finally, by standardizing the emergency event rules, we built up step by step a standardized knowledge processing system: multi-source knowledge -&gt; structured knowledge extraction -&gt; emergency relationship extraction -&gt; expert review -&gt; knowledge storage.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/0a59fb72ec9966d.png\" \/><\/p>\n<p>&nbsp;<\/p>\n<h4>\u00a01.4 
Knowledge retrieval<\/h4>\n<p>In the agent retrieval stage we use GraphRAG as the carrier for static knowledge retrieval. Once a DB jitter anomaly is identified, the nodes related to the DB jitter anomaly node are located and used as the basis for analysis. During knowledge extraction, each node also retains some metadata about its event, including the event name, event description, related tools, and tool parameters.<\/p>\n<p>&nbsp;<\/p>\n<p>We can therefore run the tool execution lifecycle to obtain results and use that dynamic data as evidence for emergency diagnosis. Compared with purely naive RAG recall, this hybrid recall that combines dynamic and static data ensures the determinism, professionalism, and rigor of the data infrastructure agent's execution.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/5f60eb85e6265c6.png\" \/><\/p>\n<h4>\u00a01.5 AWEL + Agent<\/h4>\n<p>Finally, using the community's AWEL + Agent technology and the agent-orchestration paradigm, we built a chain of agents: intent expert -&gt; emergency diagnosis expert -&gt; 
diagnostic root cause analysis expert.<\/p>\n<p>&nbsp;<\/p>\n<p>Each Agent has a different role. The intent expert identifies and parses the user's intent and recognizes alert information. The diagnosis expert uses GraphRAG to locate the root cause nodes that need to be analyzed and to obtain the specific root cause information. The analysis expert combines the data of each root cause node with historical analysis and review reports to generate a diagnostic analysis report.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/1bc351017296eb9.png\" \/><\/p>\n<h3><strong>2. RAG in the financial report analysis domain<\/strong><\/h3>\n<p>Latest practice: how to build a financial report analysis assistant based on DB-GPT?<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/12\/90d10849214bd3b.png\" \/><\/p>\n<p>Each domain can build its own domain asset library, including knowledge assets, tool assets, and knowledge graph assets.<\/p>\n<ul>\n<li>Domain assets: domain assets include knowledge bases, APIs, and tool scripts.<\/li>\n<li>Asset processing: the whole asset data pipeline covers domain asset processing, domain asset retrieval, and domain asset evaluation.\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Unstructured -&gt; 
structured: classify knowledge in an orderly way and organize it correctly.<\/li>\n<li>Extract richer semantic information.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li>Asset retrieval:\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Retrieval should be hierarchical and prioritized rather than a single flat retrieval<\/li>\n<li>Post-filtering is important; ideally, filter with rules based on business semantics.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Preface: Over the past two years, Retrieval-Augmented Generation (RAG) has gradually become a core component for enhancing agents. By combining retrieval and generation, RAG can bring in external knowledge, opening up more possibilities for applying large models in complex scenarios&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[34],"tags":[],"class_list":["post-16854","post","type-post","status-publish","format-standard","hentry","category-knowledge"],"_links":{"self":[{"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/posts\/16854","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replie
s":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/comments?post=16854"}],"version-history":[{"count":0,"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/posts\/16854\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/media?parent=16854"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/categories?post=16854"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/tags?post=16854"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}