ConversationalRetrieval Chain

Category: AI/Data, Framework

Subcategory: LangChain

Type: LangChain, RAG

Subtype: Legacy Chains Migrating Package

Main references:

https://vijaykumarkartha.medium.com/beginners-guide-to-conversational-retrieval-chain-using-langchain-3ddf1357f371

https://blog.dheerajinampudi.com/retrieval-chains-enhancing-rags-with-different-retrieval-techniques-c6071f1a0ff3

Last edited: 2024/11/03 11:50

Created: 2024/11/02 17:41


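Conversational Retrieval Chain

• Don't be put off by the long name; think of it simply as Practice 1 and Practice 2 combined.

• The official documentation likewise describes it as an all-in-one approach that combines retrieval-augmented generation (Retriever) with chat history (Conversation).

• Advantages (of migrating to the LCEL implementation)

◦ Clearer internals.

ConversationalRetrievalChain hides an entire question-rephrasing step that dereferences the initial query against the chat history.

→ In other words, the class contains two sets of configurable prompts, LLMs, and so on.

◦ Source documents are easier to return.

◦ Runnable methods such as streaming and async operations are supported.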

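How It Works

1. User Question

• The user enters a question into the system.

• In the later steps, a response is generated from the chat history and the relevant documents.

2. Search Embeddings

• The user's question is converted into a vector and used to search the Vector Store for relevant information.

* The retrieved information consists of document chunks closely related to the user's question, which can then serve as Context.

• Searching for chunks similar to the question is what lets the system find the best-matching information quickly.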

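3. Chat History

• Check whether a chat history exists.

If it does, the current question can be combined with the earlier conversation to produce a context-aware response.

• The chat history is consulted so that responses stay consistent.

4. If Chat History Provided (conditional branch)

• Decide whether to refine the question using the earlier conversation.

• The flow branches on whether a chat history exists.

◦ Yes: if there is a chat history, ask the LLM to rephrase the question.

◦ No: if there is none, the LLM generates a response from the current question and the relevant document chunks alone.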

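5. Request LLM to Rephrase the Question

• Rephrase the question based on the chat history.

• For example, when the user asks a specific follow-up to an earlier answer, the LLM can fold the earlier context into the current question to make it unambiguous.

• Having the LLM generate a question that accounts for prior context keeps the conversation consistent.

6. LLM

• Generate an answer from the rephrased (or original) question together with the relevant documents (Context).

• The response reflects both the user's question and the context of the relevant documents.

7. LLM Response to the User

• The final generated response is delivered to the user.

• Because it reflects the conversation's context, the response is more accurate and consistent.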
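The branching in steps 3 to 5 can be easier to follow as code. The sketch below is a plain-Python illustration of the flow, not LangChain's implementation: `run_turn`, `toy_rephrase`, `toy_retrieve`, `toy_generate`, and `DOCS` are all hypothetical names, and the toy functions stand in for the LLM and the vector store. It also assumes retrieval runs on the rephrased question when a history exists.

```python
from typing import Callable, Dict, List, Tuple

History = List[Tuple[str, str]]  # (role, text) pairs, e.g. ("human", "...")


def run_turn(
    question: str,
    chat_history: History,
    rephrase: Callable[[str, History], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> Dict[str, object]:
    """One conversational-retrieval turn following steps 1-7 above."""
    # Steps 3-5: rephrase only when there is chat history to resolve against.
    if chat_history:
        standalone = rephrase(question, chat_history)
    else:
        standalone = question
    # Step 2: look up context for the (possibly rephrased) question.
    context = retrieve(standalone)
    # Steps 6-7: generate the answer from question + context and return it.
    answer = generate(standalone, context)
    return {"question": standalone, "context": context, "answer": answer}


# Hypothetical stand-ins so the sketch runs without an LLM or vector store.
DOCS = ["agents plan tasks", "memory stores history", "tools call external APIs"]


def toy_rephrase(question: str, history: History) -> str:
    # Append the most recent human message as crude "context resolution".
    last_human = [text for role, text in history if role == "human"][-1]
    return f"{question} (regarding: {last_human})"


def toy_retrieve(query: str) -> List[str]:
    # Keyword overlap instead of embedding similarity.
    words = set(query.lower().split())
    return [d for d in DOCS if words & set(d.split())]


def toy_generate(question: str, context: List[str]) -> str:
    return f"Q: {question} | context: {context}"


first = run_turn("what do agents do", [], toy_rephrase, toy_retrieve, toy_generate)
followup = run_turn(
    "what about memory",
    [("human", "what do agents do"), ("ai", str(first["answer"]))],
    toy_rephrase,
    toy_retrieve,
    toy_generate,
)
```

On the first turn the question passes through unchanged; on the follow-up, the rephrased question pulls in both the "agents" and the "memory" chunks, which is the effect the rephrasing step is meant to have.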

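Basic Setup

Installation

•

This walkthrough runs on CPU, so the faiss-cpu package is used. beautifulsoup4 is included because WebBaseLoader uses it to parse web pages.

!pip install -U langchain langchain-community langchain-core langchain-openai langgraph faiss-cpu beautifulsoup4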

Register the API Key

•

Set the key as shown below, or define it in a .env file.

import os

os.environ['OPENAI_API_KEY'] = 'enter-your-api-key-here'
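For the .env route, the usual tool is the python-dotenv package and its `load_dotenv()` function. The following is only a simplified stdlib sketch of the same idea, in case an extra dependency is unwanted; `load_env_file` and `DEMO_API_KEY` are hypothetical names, and the parser handles only simple `KEY=VALUE` lines.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path: str) -> None:
    """Copy KEY=VALUE pairs from a .env-style file into os.environ."""
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        # Existing environment variables win over the file.
        os.environ.setdefault(key.strip(), value.strip().strip("'\""))


# Demo with a throwaway file and a hypothetical variable name.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("# demo\nDEMO_API_KEY='abc123'\n", encoding="utf-8")
    load_env_file(str(env_path))
```

In a real project the file would hold `OPENAI_API_KEY=...` and should be kept out of version control.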

Document Loading & Importing Modules

•

Use sample data.

# Load docs
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai.chat_models import ChatOpenAI
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

Split

•

Split the data loaded above into a list of Document chunks (see the RetrievalQA notes).

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
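To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a deliberately simplified fixed-width splitter. It is not the algorithm RecursiveCharacterTextSplitter uses (that one tries to split on separators such as paragraphs and sentences first); `split_fixed` is a hypothetical helper that only shows how the two numbers interact.

```python
from typing import List


def split_fixed(text: str, chunk_size: int, chunk_overlap: int = 0) -> List[str]:
    """Cut text into windows of chunk_size characters, each starting
    (chunk_size - chunk_overlap) characters after the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "abcdefghij"
no_overlap = split_fixed(sample, chunk_size=4, chunk_overlap=0)
with_overlap = split_fixed(sample, chunk_size=4, chunk_overlap=2)
```

With no overlap, neighboring chunks share nothing, which matches the `chunk_overlap=0` setting above; a non-zero overlap repeats the tail of each chunk at the head of the next so that sentences cut at a boundary still appear whole somewhere.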

Store Split

•

Embed the split chunks and insert them into the vector DB.

vectorstore = FAISS.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
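What the vector store does at query time can be sketched without any external service. The example below is a toy, assuming word counts as "embeddings" and cosine similarity as the metric; real embeddings are dense float vectors from a model, and the metric a given FAISS index uses may differ. `embed`, `cosine`, and `top_k` are hypothetical helpers.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, int]:
    """Toy 'embedding': word counts (a real model returns dense floats)."""
    return Counter(text.lower().split())


def cosine(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Score every chunk against the query and keep the k best."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


chunks = [
    "agents use planning to break down tasks",
    "memory lets agents recall past steps",
    "the weather is sunny today",
]
results = top_k("how do agents plan tasks", chunks, k=2)
```

The two agent-related chunks rank above the unrelated one, which is the behavior the retriever relies on when it selects Context for the LLM.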

Define the LLM Model

•

Use the gpt-4o-mini model.

llm = ChatOpenAI(model="gpt-4o-mini")

LCEL

•

Create the chain.

from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate  # needed for the prompts below

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
    ]
)

qa_chain = create_stuff_documents_chain(llm, qa_prompt)

condense_question_system_template = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as is."
)

condense_question_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", condense_question_system_template),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
    ]
)

history_aware_retriever = create_history_aware_retriever(
    llm, vectorstore.as_retriever(), condense_question_prompt
)

convo_qa_chain = create_retrieval_chain(history_aware_retriever, qa_chain)

•

Check the response.

response = convo_qa_chain.invoke(
    {
        "input": "What are autonomous agents?",
        "chat_history": [],
    }
)

print(response.keys())
# dict_keys(['input', 'chat_history', 'context', 'answer'])

print(response["answer"])
# Autonomous agents are systems that can perform tasks independently, often using advanced technologies like artificial intelligence and machine learning. They can plan, make decisions, and execute actions without human intervention, which allows them to handle complex processes such as scientific experiments. An example of this is LLM-empowered agents that can browse the Internet, execute code, and leverage other models for tasks like drug development.
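The call above passes an empty `chat_history`, so the rephrasing branch never runs. A follow-up turn needs the earlier question and answer fed back in. The sketch below shows one way to carry that state; `ask` and `FakeChain` are hypothetical, and `FakeChain` exists only so the example runs without an API key. It assumes the chain returns a dict with an `answer` key, as seen in the output above, and that the `chat_history` placeholder accepts `(role, text)` tuples; if it does not, message objects such as `HumanMessage` and `AIMessage` from `langchain_core.messages` would be the alternative.

```python
from typing import Any, Dict, List, Tuple

History = List[Tuple[str, str]]  # (role, text) pairs


def ask(chain: Any, question: str, chat_history: History) -> Tuple[str, History]:
    """Invoke the chain once and return the answer plus the extended history."""
    response: Dict[str, Any] = chain.invoke(
        {"input": question, "chat_history": chat_history}
    )
    answer = response["answer"]
    # Build a new list so the caller's history is not mutated.
    updated = chat_history + [("human", question), ("ai", answer)]
    return answer, updated


class FakeChain:
    """Hypothetical stand-in exposing the same invoke() shape as convo_qa_chain."""

    def invoke(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        turns = len(inputs["chat_history"]) // 2
        return {**inputs, "context": [], "answer": f"answer #{turns + 1}"}


history: History = []
first_answer, history = ask(FakeChain(), "What are autonomous agents?", history)
second_answer, history = ask(FakeChain(), "How do they plan?", history)
```

With the real chain, replacing `FakeChain()` with `convo_qa_chain` should let the second question ("How do they plan?") be rephrased against the first turn before retrieval.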