{"id":28609,"date":"2025-03-14T00:27:09","date_gmt":"2025-03-13T16:27:09","guid":{"rendered":"https:\/\/www.aisharenet.com\/?p=28609"},"modified":"2025-03-14T00:27:09","modified_gmt":"2025-03-13T16:27:09","slug":"shiyong-ollamalangch","status":"publish","type":"post","link":"https:\/\/www.kdjingpai.com\/ja\/shiyong-ollamalangch\/","title":{"rendered":"Building a Local RAG Application with Ollama+LangChain"},"content":{"rendered":"<p>This tutorial assumes you are already familiar with the following concepts:<\/p>\n<ul>\n<li>Chat Models<\/li>\n<li>Chaining runnables<\/li>\n<li>Embeddings<\/li>\n<li>Vector stores<\/li>\n<li>Retrieval-augmented generation<\/li>\n<\/ul>\n<p>Many popular projects, such as <a href=\"https:\/\/www.kdjingpai.com\/llamacpp\/\">llama.cpp<\/a>, <a href=\"https:\/\/www.kdjingpai.com\/ollama\/\">Ollama<\/a>, and <a href=\"https:\/\/www.kdjingpai.com\/llamafilejianhuall\/\">llamafile<\/a>, demonstrate the importance of being able to run large language models locally.<\/p>\n<p>LangChain integrates with many <a href=\"https:\/\/python.langchain.com\/v0.2\/docs\/how_to\/local_llms\">open-source LLM providers<\/a> that can run locally, and Ollama is one of them.<\/p>\n<p>&nbsp;<\/p>\n<h2>Environment Setup<\/h2>\n<p>First, we need to set up the environment.<\/p>\n<p>The Ollama GitHub repository provides detailed instructions; in brief:<\/p>\n<ul>\n<li>Download and run the Ollama application<\/li>\n<li>From the command line, consult the Ollama model list and the <a 
href=\"https:\/\/python.langchain.com\/v0.2\/docs\/integrations\/text_embedding\/\">text embedding model list<\/a> to pull models. In this tutorial, we use <code>llama3.1:8b<\/code> and <code>nomic-embed-text<\/code> as examples:\n<ul>\n<li>Run <code>ollama pull llama3.1:8b<\/code> from the command line to pull the general-purpose open-source large language model <code>llama3.1:8b<\/code><\/li>\n<li>Run <code>ollama pull nomic-embed-text<\/code> to pull the <a href=\"https:\/\/ollama.com\/search?c=embedding\">text embedding model<\/a> <code>nomic-embed-text<\/code><\/li>\n<\/ul>\n<\/li>\n<li>While the application is running, all models are served automatically on <code>localhost:11434<\/code><\/li>\n<li>Note that your choice of model must match your local hardware; this tutorial assumes <code>GPU Memory &gt; 8GB<\/code><\/li>\n<\/ul>\n<p>Next, install the packages required for local embeddings, vector storage, and model inference.<\/p>\n<pre><code># langchain_community\r\n%pip install -qU langchain langchain_community\r\n# Chroma\r\n%pip install -qU langchain_chroma\r\n# Ollama\r\n%pip install -qU langchain_ollama\r\n<\/code><\/pre>\n<pre><code>Note: you may need to restart the kernel to use updated packages.\r\n<\/code><\/pre>\n<p>You can also <a href=\"https:\/\/github.com\/datawhalechina\/handy-ollama\/blob\/main\/docs\/integrations\/text_embedding\">see this page<\/a> for a full list of available embeddings 
models<\/p>\n<p>&nbsp;<\/p>\n<h2>Document Loading<\/h2>\n<p>Now let's load and split an example document.<\/p>\n<p>We will use Lilian Weng's <a href=\"https:\/\/lilianweng.github.io\/posts\/2023-06-23-agent\/\">blog post<\/a> about agents as the example.<\/p>\n<pre><code>from langchain.text_splitter import RecursiveCharacterTextSplitter\r\nfrom langchain_community.document_loaders import WebBaseLoader\r\n\r\n# Load the blog post and split it into 500-character chunks\r\nloader = WebBaseLoader(\"https:\/\/lilianweng.github.io\/posts\/2023-06-23-agent\/\")\r\ndata = loader.load()\r\ntext_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\r\nall_splits = text_splitter.split_documents(data)\r\n<\/code><\/pre>\n<p>Next, initialize the vector store. The text embedding model we use is <a href=\"https:\/\/ollama.com\/library\/nomic-embed-text\"><code>nomic-embed-text<\/code><\/a>.<\/p>\n<pre><code>from langchain_chroma import Chroma\r\nfrom langchain_ollama import OllamaEmbeddings\r\n\r\nlocal_embeddings = OllamaEmbeddings(model=\"nomic-embed-text\")\r\nvectorstore = Chroma.from_documents(documents=all_splits, embedding=local_embeddings)\r\n<\/code><\/pre>\n<p>We now have a local vector database! Let's run a quick similarity-search test:<\/p>\n<pre><code>question = \"What are the approaches to Task Decomposition?\"\r\ndocs = vectorstore.similarity_search(question)\r\nlen(docs)\r\n<\/code><\/pre>\n<pre><code>4\r\n<\/code><\/pre>\n<pre><code>docs[0]\r\n<\/code><\/pre>\n<pre><code>Document(metadata={'description': 'Building agents with LLM (large language model) as its core controller is a cool concept. 
Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\\nAgent System Overview In a LLM-powered autonomous agent system, LLM functions as the agent\u2019s brain, complemented by several key components:', 'language': 'en', 'source': 'https:\/\/lilianweng.github.io\/posts\/2023-06-23-agent\/', 'title': \"LLM Powered Autonomous Agents | Lil'Log\"}, page_content='Task decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.')\r\n<\/code><\/pre>\n<p>Next, instantiate the large language model <code>llama3.1:8b<\/code> and check that inference works correctly:<\/p>\n<pre><code>from langchain_ollama import ChatOllama\r\n\r\nmodel = ChatOllama(\r\n    model=\"llama3.1:8b\",\r\n)\r\n<\/code><\/pre>\n<pre><code>response_message = model.invoke(\r\n    \"Simulate a rap battle between Stephen Colbert and John Oliver\"\r\n)\r\nprint(response_message.content)\r\n<\/code><\/pre>\n<pre><code>**The scene is set: a packed arena, the crowd on their feet. In the blue corner, we have Stephen Colbert, aka \"The O'Reilly Factor\" himself. In the red corner, the challenger, John Oliver. The judges are announced as Tina Fey, Larry Wilmore, and Patton Oswalt. 
The crowd roars as the two opponents face off.**\r\n**Stephen Colbert (aka \"The Truth with a Twist\"):**\r\nYo, I'm the king of satire, the one they all fear\r\nMy show's on late, but my jokes are clear\r\nI skewer the politicians, with precision and might\r\nThey tremble at my wit, day and night\r\n**John Oliver:**\r\nHold up, Stevie boy, you may have had your time\r\nBut I'm the new kid on the block, with a different prime\r\nTime to wake up from that 90s coma, son\r\nMy show's got bite, and my facts are never done\r\n**Stephen Colbert:**\r\nOh, so you think you're the one, with the \"Last Week\" crown\r\nBut your jokes are stale, like the ones I wore down\r\nI'm the master of absurdity, the lord of the spin\r\nYou're just a British import, trying to fit in\r\n**John Oliver:**\r\nStevie, my friend, you may have been the first\r\nBut I've got the skill and the wit, that's never blurred\r\nMy show's not afraid, to take on the fray\r\nI'm the one who'll make you think, come what may\r\n**Stephen Colbert:**\r\nWell, it's time for a showdown, like two old friends\r\nLet's see whose satire reigns supreme, till the very end\r\nBut I've got a secret, that might just seal your fate\r\nMy humor's contagious, and it's already too late!\r\n**John Oliver:**\r\nBring it on, Stevie! I'm ready for you\r\nI'll take on your jokes, and show them what to do\r\nMy sarcasm's sharp, like a scalpel in the night\r\nYou're just a relic of the past, without a fight\r\n**The judges deliberate, weighing the rhymes and the flow. Finally, they announce their decision:**\r\nTina Fey: I've got to go with John Oliver. His jokes were sharper, and his delivery was smoother.\r\nLarry Wilmore: Agreed! But Stephen Colbert's still got that old-school charm.\r\nPatton Oswalt: You know what? It's a tie. Both of them brought the heat!\r\n**The crowd goes wild as both opponents take a bow. 
The rap battle may be over, but the satire war is just beginning...\r\n<\/code><\/pre>\n<p>&nbsp;<\/p>\n<h2>Building the Chain Expression<\/h2>\n<p>We can build a <code>summarization chain<\/code> by passing in the retrieved documents together with a simple prompt.<\/p>\n<p>It formats the prompt template with the provided input key values and passes the formatted string to the specified model:<\/p>\n<pre><code>from langchain_core.output_parsers import StrOutputParser\r\nfrom langchain_core.prompts import ChatPromptTemplate\r\n\r\nprompt = ChatPromptTemplate.from_template(\r\n    \"Summarize the main themes in these retrieved docs: {docs}\"\r\n)\r\n\r\n# Convert the retrieved documents into a single string\r\ndef format_docs(docs):\r\n    return \"\\n\\n\".join(doc.page_content for doc in docs)\r\n\r\nchain = {\"docs\": format_docs} | prompt | model | StrOutputParser()\r\nquestion = \"What are the approaches to Task Decomposition?\"\r\ndocs = vectorstore.similarity_search(question)\r\nchain.invoke(docs)\r\n<\/code><\/pre>\n<pre><code>'The main themes in these documents are:\\n\\n1. **Task Decomposition**: The process of breaking down complex tasks into smaller, manageable subgoals is crucial for efficient task handling.\\n2. **Autonomous Agent System**: A system powered by Large Language Models (LLMs) that can perform planning, reflection, and refinement to improve the quality of final results.\\n3. 
**Challenges in Planning and Decomposition**:\\n\\t* Long-term planning and task decomposition are challenging for LLMs.\\n\\t* Adjusting plans when faced with unexpected errors is difficult for LLMs.\\n\\t* Humans learn from trial and error, making them more robust than LLMs in certain situations.\\n\\nOverall, the documents highlight the importance of task decomposition and planning in autonomous agent systems powered by LLMs, as well as the challenges that still need to be addressed.'\r\n<\/code><\/pre>\n<p>&nbsp;<\/p>\n<h2>Simple QA<\/h2>\n<p>Next, combine the retrieved context with a RAG prompt to answer a question about the documents:<\/p>\n<pre><code>from langchain_core.runnables import RunnablePassthrough\r\n\r\nRAG_TEMPLATE = \"\"\"\r\nYou are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\r\n&lt;context&gt;\r\n{context}\r\n&lt;\/context&gt;\r\nAnswer the following question:\r\n{question}\"\"\"\r\nrag_prompt = ChatPromptTemplate.from_template(RAG_TEMPLATE)\r\nchain = (\r\n    RunnablePassthrough.assign(context=lambda input: format_docs(input[\"context\"]))\r\n    | rag_prompt\r\n    | model\r\n    | StrOutputParser()\r\n)\r\nquestion = \"What are the approaches to Task Decomposition?\"\r\ndocs = vectorstore.similarity_search(question)\r\n# Run\r\nchain.invoke({\"context\": docs, \"question\": question})\r\n<\/code><\/pre>\n<pre><code>'Task decomposition can be done through (1) simple prompting using LLM, (2) task-specific instructions, or (3) human inputs. This approach helps break down large tasks into smaller, manageable subgoals for efficient handling of complex tasks. 
It enables agents to plan ahead and improve the quality of final results through reflection and refinement.'\r\n<\/code><\/pre>\n<p>&nbsp;<\/p>\n<h2>QA with Retrieval<\/h2>\n<p>Finally, here is the QA application with semantic retrieval (a local <a href=\"https:\/\/www.kdjingpai.com\/rag\/\">RAG<\/a> application), which automatically retrieves the semantically closest document chunks from the vector database based on the user's question:<\/p>\n<pre><code>retriever = vectorstore.as_retriever()\r\n\r\nqa_chain = (\r\n    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\r\n    | rag_prompt\r\n    | model\r\n    | StrOutputParser()\r\n)\r\n<\/code><\/pre>\n<pre><code>question = \"What are the approaches to Task Decomposition?\"\r\nqa_chain.invoke(question)\r\n<\/code><\/pre>\n<pre><code>'Task decomposition can be done through (1) simple prompting in Large Language Models (LLM), (2) using task-specific instructions, or (3) with human inputs. 
This process involves breaking down large tasks into smaller, manageable subgoals for efficient handling of complex tasks.'\r\n<\/code><\/pre>\n<p>&nbsp;<\/p>\n<h2>Summary<\/h2>\n<p>Congratulations! You have now implemented a complete RAG application built on the LangChain framework and locally hosted models. Building on this tutorial, you can swap in other local models to compare their capabilities, extend the application to enrich what it can do, or add more useful and interesting features.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This tutorial assumes you are already familiar with the following concepts: Chat Models Chaining runnables Embeddings Vector stores Retrieval-augmented generation 
Many popular projects such as llama.cp&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[182],"tags":[],"class_list":["post-28609","post","type-post","status-publish","format-standard","hentry","category-shicao"],"_links":{"self":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/posts\/28609","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/comments?post=28609"}],"version-history":[{"count":0,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/posts\/28609\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/media?parent=28609"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/categories?post=28609"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/ja\/wp-json\/wp\/v2\/tags?post=28609"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}