
complete prompt is appended at the start of my response generated by llama3 #24437

Open · 5 tasks done
ibtsamraza opened this issue Jul 19, 2024 · 3 comments
Labels: 🤖:bug (Related to a bug, vulnerability, unexpected error with an existing feature) · 🔌: huggingface (Primarily related to HuggingFace integrations) · investigate (Flagged for investigation) · stale (Issue has not had recent activity or appears to be solved; stale issues will be automatically closed)


@ibtsamraza

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferWindowMemory
from langchain.prompts import PromptTemplate
from langchain.retrievers import ContextualCompressionRetriever
from langchain_cohere import CohereRerank
from langchain_community.llms import HuggingFaceHub

# Llama 3 chat prompt with the model's special tokens included verbatim
prompt = PromptTemplate(
    template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question and give response from the context given to you as truthfully as you can.
Do not add anything from you and If you don't know the answer, just say that you don't know.
<|eot_id|>
<|start_header_id|>user<|end_header_id|>
Question: {question}
Context: {context}
Chat History: {chat_history}
Answer: <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
    input_variables=["question", "context", "chat_history"],
)

global memory
memory = ConversationBufferWindowMemory(
    k=4, memory_key="chat_history", return_messages=True, output_key="answer"
)

# LLM served through the Hugging Face Inference API
llm = HuggingFaceHub(
    repo_id="meta-llama/Meta-Llama-3-8B-Instruct",
    huggingfacehub_api_token=api_key,  # api_key is defined elsewhere
    model_kwargs={"temperature": 0.1, "max_length": 300, "max_new_tokens": 300},
)

# Rerank retrieved documents with Cohere; retriever3 is defined elsewhere
compressor = CohereRerank()
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever3
)

global chain_with_memory

# Create the custom chain
chain_with_memory = ConversationalRetrievalChain.from_llm(
    llm=llm,
    memory=memory,
    retriever=compression_retriever,
    combine_docs_chain_kwargs={"prompt": prompt},
    return_source_documents=True,
)

Error Message and Stack Trace (if applicable)

llm_reponse before guardrails {'question': 'how many F grade a student can have in bachelor', 'chat_history': [], 'answer': "<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an assistant for question-answering tasks.\n Use the following pieces of retrieved context to answer the question and give response from the context given to you as truthfully as you can.\n Do not add anything from you and If you don't know the answer, just say that you don't know.\n <|eot_id|>\n <|start_header_id|>user<|end_header_id|>\n Question: how many F grade a student can have in bachelor\n Context:

Description

I am building a RAG pipeline that worked fine in my local environment, but when I deployed it on a server the prompt template was appended at the start of the LLM response. When I compared the two environments, the only difference was that the server was running langchain 0.2.9 together with langchain-community, while my local setup was running langchain 0.2.6. Has anyone faced the same issue or found a solution?
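[Editor's note: while the fix discussed below lands, one possible workaround is a minimal sketch along these lines, assuming the langchain-huggingface package is installed and that the Inference Endpoint honors the return_full_text parameter. It swaps the deprecated HuggingFaceHub wrapper for HuggingFaceEndpoint, which can ask the API to return only the newly generated tokens instead of the prompt plus the completion:]

from langchain_huggingface import HuggingFaceEndpoint

# Sketch of a workaround, not the official fix from the linked PR:
# return_full_text=False asks the Hugging Face Inference API to return
# only the generated continuation rather than echoing the prompt.
# api_key is assumed to be defined as in the example code above.
llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Meta-Llama-3-8B-Instruct",
    huggingfacehub_api_token=api_key,
    temperature=0.1,
    max_new_tokens=300,
    return_full_text=False,
)

[If switching wrappers is not an option, the echoed prompt can also be stripped from the answer string in post-processing, since the log above shows the response begins with a verbatim copy of the rendered prompt.]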

System Info

langchain==0.2.9
langchain-cohere==0.1.9
langchain-community==0.2.7
langchain-core==0.2.21
langchain-experimental==0.0.62
langchain-text-splitters==0.2.2

efriis (Member) commented Jul 19, 2024

@Jofthomas from huggingface can help here!


Soumil32 commented Aug 8, 2024

The pull request I submitted should fix this! #25136


dosubot bot commented Dec 21, 2024

Hi, @ibtsamraza. I'm Dosu, and I'm helping the LangChain team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • You reported a bug where the Llama3 model appends the complete prompt at the start of its response.
  • The issue persists even after updating to the latest version.
  • Example code was provided to demonstrate the issue.
  • @Soumil32 submitted a pull request (Fix for issue #19933: langchain_huggingface #25136) to resolve this problem.

Next Steps:

  • Please confirm if this issue is still relevant to the latest version of the LangChain repository. If so, you can keep the discussion open by commenting on the issue.
  • Otherwise, the issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!
