A Step-by-Step Guide to Building a PDF Chatbot with Langchain and Ollama

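Introduction

At a time when technology keeps changing how we interact with information, a PDF chatbot makes working with documents more convenient and efficient. This article walks through building a PDF chatbot with Langchain and Ollama, which gives you access to open-source models with minimal configuration. You don't have to wrestle with choosing a framework or tuning model parameters.

You will learn how to install Ollama, download a model, and build a PDF chatbot that answers your questions about a document.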

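Learning Objectives

Understand how to install Ollama on your computer.

Understand how to download and run an open-source model with Ollama.

Learn the process of building a PDF chatbot with Langchain and Ollama.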

Table of Contents

Prerequisites

What is Ollama?

How to Install Ollama?

Downloading a Model

Creating the Chatbot

Frequently Asked Questions

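Prerequisites

To follow this article, you need:

A good understanding of Python, and

Basic knowledge of Langchain, i.e. chains, vector stores, etc.

Langchain provides many kinds of functionality for building LLM applications, and it deserves an article of its own. If you don't know what Langchain is, I suggest reading some articles or tutorials about it first. You can also watch this video: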

https://youtu.be/DXmiJKrQIvg?si=SzHzQ_T1BXjHw0o4

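What is Ollama?

Ollama lets you download open-source models and run them locally. It automatically fetches models from the best source. If your computer has a dedicated GPU, Ollama runs the model with GPU (https://www.analyticsvidhya.com/blog/2023/03/cpu-vs-gpu/) acceleration, and you don't need to set that up manually. You can even customize a model by changing its prompt (yes, you don't need Langchain for that). Ollama is also available as a Docker image, so you can deploy your own model as a Docker container. Now let's see how to install Ollama on your computer.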

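How to Install Ollama?

Unfortunately, at the time of writing, Ollama is only available for macOS and Linux. But don't worry: Windows users can still use Ollama through WSL2. If you don't have WSL2 on your computer, read this article (https://open.substack.com/pub/srang992/p/how-to-create-a-wsl-development-environment?r=2c5vg4&utm_campaign=post&utm_medium=web).

There, I explain everything about WSL2 and how to use it in VS Code. If you have already installed it, open Ubuntu and run the following command in the terminal.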

curl https://ollama.ai/install.sh | sh

This installs Ollama in WSL2. If you are on macOS, download it from here (https://ollama.ai/download/mac). You can now use Ollama to download models. Keep the terminal open; we are not done yet.

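Downloading a Model

Ollama offers a range of models: llama2, llama2-uncensored, codellama, orca-mini, and more. To see all available models, visit the library page (https://ollama.ai/library). Here you will download the orca-mini 3b model. It is a Llama model trained on an Orca-style dataset created with the approach defined in the Orca paper.

Although this model is small (1.9GB), it still gives decent responses. To download it, run the following command: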

ollama run orca-mini

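This command downloads the orca-mini model and runs it in the terminal. Make sure your computer has at least 8GB of RAM before running it.

Once the model is running, ask it a few questions and see how it responds. Here is an example:

As the example above shows, it sometimes gives irrelevant information. We can fix this by changing the prompt, which we will do when creating the chatbot.

Now that the model works, let's use it to create the PDF chatbot.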

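Creating the Chatbot

Setting Up the Project Directory

If you don't know how to create a project in WSL2, see the last part of this article (https://open.substack.com/pub/srang992/p/how-to-create-a-wsl-development-environment?r=2c5vg4&utm_campaign=post&utm_medium=web), where I explain everything. Here we use VS Code as the IDE.

Installing the Necessary Libraries

Before we start, let's install the required libraries. Create a file named requirements.txt and list the dependencies below in it.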

langchain

pymupdf

huggingface-hub

faiss-cpu

sentence-transformers

After that, open the terminal in VS Code and run the following command.

pip install -r requirements.txt

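This will take some time. You don't need any Python package for Ollama, since it runs as a separate application. Now we can start writing the code for the chatbot.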
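
If you want to confirm that the installation worked before moving on, a small check like the one below can help. It is only a sketch: the mapping from pip package names to import names reflects my understanding (pymupdf is imported as fitz, faiss-cpu as faiss), and the helper name find_missing is hypothetical.

```python
import importlib.util

# Assumed mapping from pip package name to the module name you import.
REQUIRED_MODULES = {
    "langchain": "langchain",
    "pymupdf": "fitz",
    "huggingface-hub": "huggingface_hub",
    "faiss-cpu": "faiss",
    "sentence-transformers": "sentence_transformers",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the pip package names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in modules.items()
        if importlib.util.find_spec(module_name) is None
    ]


# An empty list means every dependency can be imported.
print("Missing packages:", find_missing())
```

Any name it prints is a package that pip did not install into the environment you are currently using.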

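Creating the Necessary Functions

To create a chatbot in Langchain, we follow these steps:

Read the PDF file using any of Langchain's PDF loaders.

If the document is large, split it into smaller parts, called chunks. That way the model receives only the passages relevant to your question instead of being overwhelmed with the whole document, and it uses fewer resources. Every model has a token limit, after all.

Next, convert the chunks into vector embeddings so that the model can work with the data more effectively. Then create a vector store that holds these embeddings and retrieves them efficiently when needed. Here we use the FAISS vector store.

Now we are ready to query the PDF file. For this we use Langchain's RetrievalQA, which takes the vector store as a retriever along with the model we want to use. A few other parameters are set according to our needs.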
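
To make the steps above concrete, here is a toy version of the chunk, embed, and retrieve idea that uses only the standard library. It is a sketch under a big assumption: the "embedding" is just a bag-of-words count, a stand-in for a real embedding model, and the function names are made up for illustration. The real pipeline in this article uses all-MiniLM-L6-v2 and FAISS instead.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "Random forests combine many decision trees.",
    "Gradient descent minimizes a loss function.",
    "A voting classifier aggregates several classifiers.",
]
top = retrieve("What is a voting classifier?", chunks)
print(top[0])
```

Only the retrieved chunk would then be passed to the model together with the question, which is what keeps the prompt within the token limit.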

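Importing the Necessary Packages

These are the steps for a basic PDF chatbot. A more sophisticated chatbot needs some extra steps, such as adding memory or routing. Now we create a function for each step, so that we don't have to repeat code while testing. First, we import the necessary packages: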

# Importing the necessary packages

from langchain.embeddings import HuggingFaceEmbeddings

from langchain.document_loaders import PyMuPDFLoader

from langchain.text_splitter import RecursiveCharacterTextSplitter

from langchain.vectorstores import FAISS

from langchain.chains import RetrievalQA

import textwrap

After that, we write the first function, which loads the PDF file. Here you will use Langchain's PyMuPDFLoader to read it.

# This will load the PDF file
def load_pdf_data(file_path):
    # Creating a PyMuPDFLoader object with file_path
    loader = PyMuPDFLoader(file_path=file_path)

    # loading the PDF file
    docs = loader.load()

    # returning the loaded document
    return docs

Then we have to split the documents into several chunks. Here we use Langchain's RecursiveCharacterTextSplitter, one of the most popular tools for splitting text.

# Responsible for splitting the documents into several chunks
def split_docs(documents, chunk_size=1000, chunk_overlap=20):
    # Initializing the RecursiveCharacterTextSplitter with
    # chunk_size and chunk_overlap
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap
    )

    # Splitting the documents into chunks
    chunks = text_splitter.split_documents(documents=documents)

    # returning the document chunks
    return chunks
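
To see what chunk_size and chunk_overlap mean, a simplified illustration may help. The sketch below is not how RecursiveCharacterTextSplitter works internally (it prefers to split on separators such as paragraphs and sentences); it is a plain fixed-width sliding window, and sliding_chunks is a hypothetical helper name.

```python
def sliding_chunks(text, chunk_size=10, chunk_overlap=2):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


pieces = sliding_chunks("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=2)
print(pieces)  # ['abcdefghij', 'ijklmnopqr', 'qrstuvwxyz']
```

Each chunk starts with the last two characters of the previous one. That small overlap is what keeps a sentence that straddles a boundary from being lost to both chunks.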

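Now it's time for the embeddings. First we load the embedding model with Langchain's HuggingFaceEmbeddings, then we create the vector store with FAISS. The embedding model is all-MiniLM-L6-v2, as you will see later.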

# function for loading the embedding model
def load_embedding_model(model_path, normalize_embedding=True):
    return HuggingFaceEmbeddings(
        model_name=model_path,
        model_kwargs={'device': 'cpu'},  # here we will run the model with CPU only
        encode_kwargs={
            'normalize_embeddings': normalize_embedding  # keep True to compute cosine similarity
        }
    )


# Function for creating embeddings using FAISS
def create_embeddings(chunks, embedding_model, storing_path="vectorstore"):
    # Creating the embeddings using FAISS
    vectorstore = FAISS.from_documents(chunks, embedding_model)

    # Saving the vector store locally under storing_path
    vectorstore.save_local(storing_path)

    # returning the vectorstore
    return vectorstore
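
The comment "keep True to compute cosine similarity" deserves a short illustration. Once vectors are scaled to unit length, their dot product equals their cosine similarity, and ranking by Euclidean distance gives the same order as ranking by cosine. The sketch below checks both facts on two made-up 2-D vectors; real embeddings from all-MiniLM-L6-v2 have many more dimensions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Two made-up vectors standing in for embeddings.
a, b = [3.0, 4.0], [4.0, 3.0]
a_n, b_n = normalize(a), normalize(b)

cos_ab = cosine(a, b)
dot_normalized = dot(a_n, b_n)
squared_l2 = sum((x - y) ** 2 for x, y in zip(a_n, b_n))

print(cos_ab, dot_normalized, squared_l2)
```

For unit vectors the squared distance is 2 - 2 * cosine, so a smaller distance always means a higher cosine similarity.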

Let's create a custom prompt template so that the chatbot works as expected. The default prompt of the orca-mini model is shown below.

prompt = """
### System:
You are an AI Assistant that follows instructions extremely well.
Help as much as you can.

### User:
{prompt}

### Response:

"""

Keeping this prompt format therefore gives us better responses. Here is the modified prompt template:

template = """
### System:
You are a respectful and honest assistant. You have to answer the user's
questions using only the context provided to you. If you don't know the answer,
just say you don't know. Don't try to make up an answer.

### Context:
{context}

### User:
{question}

### Response:

"""

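Now we have to create the question-answering chain. Here we use Langchain's RetrievalQA. RetrievalQA does not give the chatbot memory: it just answers your question and does not remember the previous conversation.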

# Creating the chain for Question Answering
def load_qa_chain(retriever, llm, prompt):
    return RetrievalQA.from_chain_type(
        llm=llm,
        retriever=retriever,  # here we are using the vectorstore as a retriever
        chain_type="stuff",
        return_source_documents=True,  # including source documents in output
        chain_type_kwargs={'prompt': prompt}  # customizing the prompt
    )
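
It may help to see what the "stuff" chain type does with the prompt. As I understand it, the retrieved chunks are simply joined into one string, which fills the {context} placeholder, while your query fills {question}. The sketch below imitates that with plain string formatting. The shortened template, the blank-line separator, and the helper names are assumptions for illustration, not Langchain's actual code.

```python
# A shortened stand-in for the article's template, so the sketch is self-contained.
short_template = (
    "### System:\n"
    "Answer using only the context provided.\n\n"
    "### Context:\n{context}\n\n"
    "### User:\n{question}\n\n"
    "### Response:\n"
)


def stuff_documents(chunk_texts, separator="\n\n"):
    """Join every retrieved chunk into one context string, in retrieval order."""
    return separator.join(text.strip() for text in chunk_texts)


def build_prompt(chunk_texts, question):
    """Fill the template's two placeholders."""
    return short_template.format(
        context=stuff_documents(chunk_texts), question=question
    )


# Made-up chunks standing in for what the retriever would return.
retrieved = [
    "Random forests are ensembles of decision trees. ",
    " They are trained with bagging.",
]
final_prompt = build_prompt(retrieved, "What is random forest?")
print(final_prompt)
```

Because everything is stuffed into a single prompt, the number and size of the retrieved chunks must stay within the model's token limit.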

We will also create a function that prettifies the response. Here is the code:

# Prettifying the response
def get_response(query, chain):
    # Getting response from chain
    response = chain({'query': query})

    # Wrapping the text for better output in Jupyter Notebook
    wrapped_text = textwrap.fill(response['result'], width=100)
    print(wrapped_text)
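
Since the chain is created with return_source_documents=True, the response also carries the chunks the answer was based on. A helper along these lines could list them. This is a sketch: format_sources is a hypothetical name, the 'source' and 'page' metadata keys are what I would expect from PyMuPDFLoader, and the response below is a fake stand-in so that the example runs without a model.

```python
from types import SimpleNamespace


def format_sources(response):
    """Return one 'file (page N)' line per source document in the response."""
    lines = []
    for doc in response.get("source_documents", []):
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page", "?")
        lines.append(f"{source} (page {page})")
    return lines


# Fake response imitating the chain's output shape; the page number is made up.
fake_response = {
    "result": "Random Forest is an ensemble of decision trees.",
    "source_documents": [
        SimpleNamespace(
            page_content="...",
            metadata={"source": "data/ml_book.pdf", "page": 12},
        ),
    ],
}
print(format_sources(fake_response))  # ['data/ml_book.pdf (page 12)']
```

Showing the sources next to the answer makes it easier to check whether the model really used the PDF.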

Now that all the necessary functions are in place, it's time to create the chatbot.

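Creating the Jupyter Notebook

First, create a Jupyter Notebook in your directory by creating a new file with the .ipynb extension. Also, don't forget to install jupyter with pip; otherwise the notebook will not run. Run the following command to install it.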

pip install jupyter

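If VS Code does not detect a kernel for the Jupyter Notebook, you will see a "Select Kernel" option in the top-right corner. Choosing it brings up some options.

Select Python Environments. It will again give you some choices.

Select the starred environment. After that, you are ready to use the Jupyter Notebook.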

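Importing the Necessary Libraries

Now we import the necessary libraries. We import three things:

The Python script in which we wrote all the functions. I named the file lang_funcs.py.

Ollama from langchain.llms, and

PromptTemplate from langchain.

Let's import them: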

from lang_funcs import *

from langchain.llms import Ollama

from langchain import PromptTemplate

Loading the Models

Now we load the orca-mini model and the embedding model named all-MiniLM-L6-v2. This embedding model is small but effective.

# Loading orca-mini from Ollama

llm = Ollama(model="orca-mini", temperature=0)

# Loading the Embedding Model

embed = load_embedding_model(model_path="all-MiniLM-L6-v2")

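Ollama serves models locally on port 11434. We don't have to specify it, because it is already set in Langchain's Ollama() class. If the embedding model has not been downloaded to your computer, it will be fetched automatically from Hugging Face. Just wait a little while for it to finish.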
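
To see what talking to that local port looks like without Langchain, here is a minimal sketch using only the standard library. It assumes Ollama's REST endpoint /api/generate on localhost:11434 and that a non-streaming reply carries the text in a "response" field. The helper names are hypothetical, and ask() only works while the Ollama server is running.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="orca-mini", url=OLLAMA_URL):
    """Build (but do not send) a POST request asking Ollama for a completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the prompt to a running Ollama server and return its text reply."""
    with urllib.request.urlopen(build_generate_request(prompt)) as resp:
        return json.loads(resp.read())["response"]


request = build_generate_request("Why is the sky blue?")
print(request.full_url)
```

Calling ask("Why is the sky blue?") in the notebook is a quick way to confirm the server is reachable before building the chain.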

Loading the Data and Creating the Vector Store

It's time to load the data and create the embeddings. Here we use a PDF book on ML.

# loading and splitting the documents

docs = load_pdf_data(file_path="data/ml_book.pdf")

documents = split_docs(documents=docs)

# creating vectorstore

vectorstore = create_embeddings(documents, embed)

# converting vectorstore to a retriever

retriever = vectorstore.as_retriever()

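Creating the embeddings takes some time. Be patient.

You're almost there!

I know you have come a long way. Now we have to create the chain and start testing our chatbot, so let's do that.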

# Creating the prompt from the template which we created before

prompt = PromptTemplate.from_template(template)

# Creating the chain

chain = load_qa_chain(retriever, llm, prompt)

Now we are ready to test our chatbot.

Testing the Chatbot

Now let's ask our chatbot some questions. Here are some examples:

>>>get_response("What is random forest?", chain)

>>> Random Forest is an ensemble learning technique that uses decision trees as base classifiers to

create a new classifier. It is a type of ensemble learning method that combines multiple decision

trees to improve the accuracy and reduce overfitting of the final model. The Random Forest algorithm

introduces extra randomness when growing trees, instead of searching for the very best feature when

splitting a node (like regular Decision Trees do). This results in a greater tree diversity, which

trades a higher bias for a lower variance. It is also more computationally efficient than other

ensemble learning methods like Bagging and Boosting.

>>>get_response("What is Voting Classifier?", chain)

>>> A voting classifier is a type of machine learning algorithm that predicts the class of a given

input data point based on the majority vote of multiple classifiers. In other words, it takes the

predictions of several different classifiers and predicts the class that gets the most votes. This

approach can be used to improve the accuracy of machine learning models by using the diversity of

predictions from different classifiers. The hard voting classifier is a type of voting classifier

that predicts the class with the highest majority vote.

>>>get_response("What is the difference between voting classifier and random forest?", chain)

>>> Voting classifiers and Random Forests are two different approaches to creating an ensemble of

classifiers. Voting classifiers use majority vote to predict the class that receives the most

votes from the classifiers in the ensemble. This approach is based on the idea that the majority

opinion is likely to be correct, and that by aggregating the predictions of multiple classifiers, we

can improve our accuracy. The hard voting classifier is a simple implementation of this approach.

On the other hand, Random Forests is an ensemble learning method that uses decision trees as base

classifiers. It is trained via the bagging method, where each tree is grown from a randomly selected

subset of the training data. The Random Forest classifier is more convenient and optimized for

Decision Trees than voting classifiers, and it generally yields an overall better model.

It gives really good responses. You can ask a few more questions and see how it responds.

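Conclusion

I hope you now have a clear understanding of how to create a PDF chatbot with Langchain and Ollama. Ollama is a newcomer in this space, and it really makes our lives easier.

You have seen how we initialized the orca-mini model in a single line. Without Ollama, you would have to use Langchain's HuggingFacePipeline.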

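Key Takeaways

Ollama simplifies model deployment: Ollama makes open-source models easier to deploy by providing a simple way to download and run them on your local machine.

PDF chatbot development: Learn the steps involved in creating a PDF chatbot, including loading the PDF document, splitting it into chunks, and creating the chatbot chain.

Customization for better responses: Learn how to customize prompts and templates to improve the chatbot's responses.

All the code used in this article: https://github.com/srang992/Ollama-Chatbot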

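Frequently Asked Questions

Q1. How do I know which model to use?

A. It depends entirely on your use case. In my experience, if you are building a chatbot like the one in this article, try a relatively small model; otherwise the chatbot will not perform well. Here we use orca-mini because it is not very large yet gives good responses.

Q2. Which PDF loader should I use?

A. If you want to read single-column PDFs, you can use PyPDFLoader. If you are reading multi-column PDFs, you should use PyMuPDFLoader. This is just my preference; you can try other PDF loaders too.

Q3. What are the requirements for running WSL2 on my computer?

A. To run WSL2, your computer needs at least 4GB of RAM. If you have Windows 11 installed, the process will be smoother.

Q4. Even with Ollama, building the chatbot still looks tedious. Why is that?

A. Langchain offers a great deal of customization, and more customization means more code to write. You can use LlamaIndex instead, which lets you create a chatbot more easily.

Q5. Can I create this chatbot with Embedchain?

A. At the time of writing, Embedchain does not support Ollama. But if you want to create this chatbot without Ollama, you can do so.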
