Model(Hugging Face)

Category: AI/Data, Framework
Subcategory: LangChain
Type: LangChain
Subtype: Introduction LangChain
Last edited: 2024/10/29 05:24
Created: 2024/10/29 02:51

Hugging Face Local Pipelines

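• Hugging Face models can be run locally.

• The Hugging Face Model Hub hosts more than 120k models, 20k datasets, and 50k demo apps (Spaces), all open source and publicly available, on an online platform where people can easily collaborate and build ML together.

• These models can be called from LangChain either through the local pipeline wrapper or by calling their hosted inference endpoints through the HuggingFaceHub class.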

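Hugging Face Pipeline

HuggingFacePipeline

• When running on a machine with a GPU, you can pass the device=n parameter to place the model on the specified device. The default is -1, which means CPU inference.

• If you have multiple GPUs, or the model is too large for a single GPU, you can pass device_map="auto" instead. This requires the Accelerate library, which is then used to decide automatically how the model weights are loaded.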
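
The two placement options above can be sketched as a small decision helper. This is only an illustration: `placement_kwargs` is a hypothetical function, not part of LangChain, and it assumes that `device` and `device_map` should never be passed together. The returned dictionary is meant to be unpacked into `HuggingFacePipeline.from_model_id`.

```python
from typing import Any, Dict


def placement_kwargs(num_gpus: int, fits_on_one_gpu: bool = True) -> Dict[str, Any]:
    """Hypothetical helper: choose device placement arguments for from_model_id."""
    if num_gpus <= 0:
        return {"device": -1}  # CPU inference
    if num_gpus == 1 and fits_on_one_gpu:
        return {"device": 0}  # place the whole model on GPU 0
    # Several GPUs, or a model too large for one GPU: let Accelerate decide.
    return {"device_map": "auto"}


cpu_kwargs = placement_kwargs(num_gpus=0)
single_gpu_kwargs = placement_kwargs(num_gpus=1)
sharded_kwargs = placement_kwargs(num_gpus=2)

# Intended use (not executed here, since it downloads the model):
# hf = HuggingFacePipeline.from_model_id(
#     model_id="beomi/KoAlpaca-Polyglot-5.8B",
#     task="text-generation",
#     **sharded_kwargs,
# )
```

Returning a dictionary keeps the choice in one place, so the call site never ends up with both arguments set.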

from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline

# Download the Hugging Face model and wrap it as a LangChain LLM
hf = HuggingFacePipeline.from_model_id(
    model_id='beomi/KoAlpaca-Polyglot-5.8B',  # ID of the model to use
    task="text-generation",  # task to run; here, text generation
    # Extra arguments passed to the pipeline; here, a cap on the number of generated tokens
    pipeline_kwargs={"max_new_tokens": 512},
    device=0,  # replace with device_map="auto" to use the accelerate library.
)

Prompt

from langchain.prompts import PromptTemplate

template = """
Answer the following question in Korean.

#Question:
{question}

#Answer: """  # template that defines the question-and-answer format
prompt = PromptTemplate.from_template(template)  # build a prompt object from the template

> prompt
PromptTemplate(input_variables=['question'], input_types={}, partial_variables={}, template='\nAnswer the following question in Korean.\n\n#Question:\n{question}\n\n#Answer: ')
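
To see the exact text the model receives, the template can be rendered by hand. The sketch below uses plain `str.format` as a stand-in for `PromptTemplate.format`, on the assumption that the default f-string template format behaves the same way for a single `{question}` placeholder, so no LangChain import is needed.

```python
template = """
Answer the following question in Korean.

#Question:
{question}

#Answer: """

question = "대한민국의 수도는 어디야?"  # "What is the capital of South Korea?"

# Substitute the placeholder to obtain the final prompt string.
rendered = template.format(question=question)
print(rendered)
```

The rendered string ends with `#Answer: `, which is the point where the model is expected to continue.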

Chain

# Connect the prompt and the language model to build a chain
chain = prompt | hf

question = "대한민국의 수도는 어디야?"  # the question: "What is the capital of South Korea?"
response = chain.invoke({"question": question})

> print(response)
Answer the following question in Korean.

#Question:
대한민국의 수도는 어디야?

#Answer: 
대한민국의 수도는 서울입니다.
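
The response printed above starts with the full prompt, because the text-generation pipeline returns the prompt together with the completion. A minimal sketch for keeping only the answer follows. `strip_prompt` is a hypothetical helper, not a LangChain function, and it assumes the response begins with the exact rendered prompt; the sample response is reconstructed from the output shown above.

```python
def strip_prompt(response: str, prompt_text: str) -> str:
    """Return only the generated continuation if the response echoes the prompt."""
    if response.startswith(prompt_text):
        response = response[len(prompt_text):]
    return response.strip()


prompt_text = (
    "\nAnswer the following question in Korean.\n\n"
    "#Question:\n대한민국의 수도는 어디야?\n\n#Answer: "
)
# Sample response: the echoed prompt followed by the model's answer.
response = prompt_text + "\n대한민국의 수도는 서울입니다."

answer = strip_prompt(response, prompt_text)
print(answer)
```

With the real chain, the same helper could be called as `strip_prompt(response, prompt.format(question=question))`.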

HuggingFace Model

import torch, gc

# Flush memory
del chain, prompt, hf

gc.collect()
torch.cuda.empty_cache()

LLM

from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = 'beomi/KoAlpaca-Polyglot-5.8B'  # ID of the model to use
tokenizer = AutoTokenizer.from_pretrained(
    model_id
)  # load the tokenizer for the given model
model = AutoModelForCausalLM.from_pretrained(model_id)  # load the given model

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
> device
device(type='cuda')

> model.to(device)
GPTNeoXForCausalLM(
  (gpt_neox): GPTNeoXModel(
    (embed_in): Embedding(30080, 4096)
    (emb_dropout): Dropout(p=0.0, inplace=False)
    (layers): ModuleList(
      (0-27): 28 x GPTNeoXLayer(
        (input_layernorm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
        (post_attention_layernorm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
        (post_attention_dropout): Dropout(p=0.0, inplace=False)
        (post_mlp_dropout): Dropout(p=0.0, inplace=False)
        (attention): GPTNeoXSdpaAttention(
          (rotary_emb): GPTNeoXRotaryEmbedding()
          (query_key_value): Linear(in_features=4096, out_features=12288, bias=True)
          (dense): Linear(in_features=4096, out_features=4096, bias=True)
          (attention_dropout): Dropout(p=0.0, inplace=False)
        )
        (mlp): GPTNeoXMLP(
          (dense_h_to_4h): Linear(in_features=4096, out_features=16384, bias=True)
          (dense_4h_to_h): Linear(in_features=16384, out_features=4096, bias=True)
          (act): GELUActivation()
        )
      )
    )
    (final_layer_norm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
  )
  (embed_out): Linear(in_features=4096, out_features=30080, bias=False)
)
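
The module summary printed above is enough to estimate how much memory the weights need. The sketch below recomputes the parameter count from the printed layer shapes and converts it to bytes. It assumes 4 bytes per parameter for float32 and 2 bytes for float16, and it ignores activations, buffers, and the generation cache, so the figures are a lower bound rather than a measurement.

```python
def linear_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Parameter count of a Linear layer: weight matrix plus optional bias."""
    return in_features * out_features + (out_features if bias else 0)


def layernorm_params(width: int) -> int:
    """Parameter count of an affine LayerNorm: one weight and one bias vector."""
    return 2 * width


hidden, vocab, n_layers = 4096, 30080, 28  # values read from the printed summary

# Dropout, GELU, and the rotary embedding contribute no trainable parameters.
per_layer = (
    2 * layernorm_params(hidden)           # input_layernorm, post_attention_layernorm
    + linear_params(hidden, 3 * hidden)    # attention.query_key_value
    + linear_params(hidden, hidden)        # attention.dense
    + linear_params(hidden, 4 * hidden)    # mlp.dense_h_to_4h
    + linear_params(4 * hidden, hidden)    # mlp.dense_4h_to_h
)
total = (
    vocab * hidden                               # embed_in
    + n_layers * per_layer                       # 28 GPTNeoXLayer blocks
    + layernorm_params(hidden)                   # final_layer_norm
    + linear_params(hidden, vocab, bias=False)   # embed_out
)

GIB = 1024 ** 3
print(f"parameters: {total:,}")
print(f"float32 weights: {total * 4 / GIB:.1f} GiB")
print(f"float16 weights: {total * 2 / GIB:.1f} GiB")
```

The count comes out at roughly 5.9 billion parameters, in line with the "5.8B" in the model name, which is why a single consumer GPU may not hold the model in full precision.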

Pipeline

# Create a text-generation pipeline and cap the number of newly generated tokens at 512.
pipe = pipeline("text-generation", model=model,
                tokenizer=tokenizer, max_new_tokens=512)

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
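
The warning above appears because no `device` argument reaches `pipeline(...)`. One possible fix, sketched below on the assumption that a single GPU at index 0 should be used whenever CUDA is available, is to compute a device index first and pass it on. `pick_device_index` is a hypothetical helper, and the `torch` import is guarded so the snippet also runs on a machine without PyTorch.

```python
def pick_device_index(cuda_available: bool, gpu_index: int = 0) -> int:
    """Return a pipeline device index: the GPU index, or -1 for CPU."""
    return gpu_index if cuda_available else -1


try:
    import torch
    cuda_available = torch.cuda.is_available()
except ImportError:  # PyTorch not installed: fall back to CPU
    cuda_available = False

device_index = pick_device_index(cuda_available)

# Intended use with the objects defined above (not executed here):
# pipe = pipeline("text-generation", model=model, tokenizer=tokenizer,
#                 max_new_tokens=512, device=device_index)
```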

HuggingFacePipeline

# Create a HuggingFacePipeline object and pass in the pipeline created above.
hf = HuggingFacePipeline(pipeline=pipe)

PromptTemplate

from langchain.prompts import PromptTemplate

template = """
Answer the following question in Korean.

#Question:
{question}

#Answer: """  # template that defines the question-and-answer format
prompt = PromptTemplate.from_template(template)  # build a prompt object from the template

Chain

# Connect the prompt and the language model to build a chain
chain = prompt | hf

question = "대한민국의 수도는 어디야?"  # the question: "What is the capital of South Korea?"
response = chain.invoke({"question": question})

> print(response)
Answer the following question in Korean.

#Question: 
대한민국의 수도는 어디야?

#Answer: 
서울입니다.