
ChatOllama with_structured_output not honoured by langchain. Works fine using direct ollama chat() call. #29410

Open
@jonmach

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

Package versions are:

langchain 0.3.15
langchain-community 0.3.15
langchain-core 0.3.31
langchain-ollama 0.2.2
ollama 0.4.7

Running Ollama server 0.5.7 (pip install -U ollama did not take the Python client beyond 0.4.7).

Using with_structured_output() seems to work for a very simple example such as the following:

from langchain_ollama import ChatOllama
from typing import Optional
from pydantic import BaseModel, Field

class Person(BaseModel):
    name: str
    age: int

llm = ChatOllama(
    model="qwen2.5:1.5b",
    temperature=0,
).with_structured_output(Person)
llm.invoke("Erick 27")
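
When it works, this returns a populated instance, e.g. Person(name='Erick', age=27).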

However, for a more complex schema such as the one below, it fails and the call returns None.

from pydantic import BaseModel, Field
from typing import Optional
from langchain_ollama import ChatOllama

# Define the output model
class Experience(BaseModel):
    company: str = Field(..., description="The name of the company.")
    position: str = Field(..., description="The job title held at the company.")
    start_date: str = Field(..., description="The date when you started working at the company.")
    end_date: str = Field(..., description="The date when you left the company. If still employed, use 'Present'.")

class Education(BaseModel):
    institution_name: str = Field(..., description="The name of the educational institution.")
    degree: str = Field(..., description="The degree obtained from the institution.")
    start_date: str = Field(..., description="The date when you started attending school at the institution.")
    end_date: str = Field(..., description="The date when you graduated. If still enrolled, use 'Present'.")

class Resume(BaseModel):
    full_name: str = Field(..., description="The full name of the person on the resume.")
    contact_email: str = Field(..., description="The email address for contacting the person.")
    phone_number: str = Field(..., description="The phone number for contacting the person.")
    summary: str = Field(..., description="A brief summary of the person's career highlights.")
    experience: Optional[list[Experience]] = Field([], description="List of experiences held by the person.")
    education: Optional[list[Education]] = Field([], description="List of educational institutions attended by the person.")

with open('CVs/resume.md', 'r') as file:
    resume_data = file.read()

verbose=True
model = "qwen2.5:14b"

prompt = f""" 
Analyse the following resume from the content between the triple backticks below:  For the resume below, identify the following information:
    1) Their personal details, including name, email, phone number and anything else they provide.
    2) An overall summary of their experience to provide a general background.
    3) A list of the companies they have worked for. This should include the company name, the dates they started and and ended working for the company, and the tasks and activities they carried out.
    4) A list of universities or colleges that the person went to. This should include the name of the college the title of the qualification, and the dates they started and ended.

    The raw data is here: 
    
       ```{resume_data}```
"""

llm = ChatOllama(
    model=model,
    num_ctx=32000,
    timeout=600,
    temperature=0.0,
    verboseness=verbose,
    response="json",
)

structured_llm = llm.with_structured_output(Resume)
print("Calling LLM")
response = structured_llm.invoke(prompt)
print(response)

It also fails without the response = "json" argument included.

I just get a None response.

Oddly, this is not consistent. Sometimes I get a response back, but it fails validation against the Resume type because no education items are found, even though education is an optional field in the Resume class.
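
A quick sanity check (a sketch using the Resume model defined above, with placeholder values) confirms that Pydantic itself accepts a payload with no experience or education entries, since both fields default to empty lists:

# Placeholder values for illustration only; the optional lists default to [].
minimal = Resume.model_validate({
    "full_name": "Jane Doe",
    "contact_email": "jane@example.com",
    "phone_number": "000-000-0000",
    "summary": "Placeholder summary.",
})
print(minimal.education)  # -> []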

Many thanks to @rick-github for help with the RCA for this.

The request sent to ollama was as follows:

{
  "model": "qwen2.5:14b",
  "stream": true,
  "options": {
    "num_ctx": 32000,
    "temperature": 0.0
  },
  "messages": [
    {
      "role": "user",
      "content": " \nAnalyse the following ... or collaboration.\n```\n"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "Resume",
        "description": "",
        "parameters": {
          "type": "object",
          "required": [
            "full_name",
            "contact_email",
            "phone_number",
            "summary"
          ],
          "properties": {
            "full_name": {
              "type": "string",
              "description": "The full name of the person on the resume."
            },
            "contact_email": {
              "type": "string",
              "description": "The email address for contacting the person."
            },
            "phone_number": {
              "type": "string",
              "description": "The phone number for contacting the person."
            },
            "summary": {
              "type": "string",
              "description": "A brief summary of the person's career highlights."
            },
            "experience": {
              "description": "List of experiences held by the person."
            },
            "education": {
              "description": "List of educational institutions attended by the person."
            }
          }
        }
      }
    }
  ]
}

The schema hasn't been recursively expanded and is passed as a tool definition, so the model doesn't know what the nested Experience and Education entries should contain. It does its best and fills in a bunch of details, but the returned data fails validation.
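
For comparison, the nested definitions are present in the Pydantic model's own JSON schema; a quick check (using the Resume class defined above) shows the $defs and $ref entries that the flattened tool parameters are missing:

import json

schema = Resume.model_json_schema()
# The model's own schema carries the nested definitions ...
print(sorted(schema["$defs"]))  # ['Education', 'Experience']
# ... and the experience property points at them via $ref.
print(json.dumps(schema["properties"]["experience"], indent=2))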

If the request is sent directly to Ollama, the results are better: a direct call via the ollama chat() function works perfectly.

import ollama, json

# Pass the full (recursed) JSON schema for Resume via the format parameter.
response = ollama.chat(
    model=model,
    messages=[{"role": "user", "content": prompt}],
    options={"num_ctx": 32000, "temperature": 0.0},
    format=Resume.model_json_schema(),
)
print(json.dumps(json.loads(response.message.content), indent=4))
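
If the Pydantic object itself is needed rather than raw JSON, the content from the direct call can be validated in one step (a small sketch reusing the Resume model):

# Parse and validate the returned JSON string against the Resume model.
resume = Resume.model_validate_json(response.message.content)
print(resume.full_name)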

This time the request sent to Ollama carries the full schema in the format field:

{
  "model": "qwen2.5:14b",
  "stream": false,
  "options": {
    "num_ctx": 32000,
    "temperature": 0.0
  },
  "format": {
    "$defs": {
      "Education": {
        "properties": {
          "institution_name": {
            "description": "The name of the educational institution.",
            "title": "Institution Name",
            "type": "string"
          },
          "degree": {
            "description": "The degree obtained from the institution.",
            "title": "Degree",
            "type": "string"
          },
          "start_date": {
            "description": "The date when you started attending school at the institution.",
            "title": "Start Date",
            "type": "string"
          },
          "end_date": {
            "description": "The date when you graduated. If still enrolled, use 'Present'.",
            "title": "End Date",
            "type": "string"
          }
        },
        "required": [
          "institution_name",
          "degree",
          "start_date",
          "end_date"
        ],
        "title": "Education",
        "type": "object"
      },
      "Experience": {
        "properties": {
          "company": {
            "description": "The name of the company.",
            "title": "Company",
            "type": "string"
          },
          "position": {
            "description": "The job title held at the company.",
            "title": "Position",
            "type": "string"
          },
          "start_date": {
            "description": "The date when you started working at the company.",
            "title": "Start Date",
            "type": "string"
          },
          "end_date": {
            "description": "The date when you left the company. If still employed, use 'Present'.",
            "title": "End Date",
            "type": "string"
          }
        },
        "required": [
          "company",
          "position",
          "start_date",
          "end_date"
        ],
        "title": "Experience",
        "type": "object"
      }
    },
    "properties": {
      "full_name": {
        "description": "The full name of the person on the resume.",
        "title": "Full Name",
        "type": "string"
      },
      "contact_email": {
        "description": "The email address for contacting the person.",
        "title": "Contact Email",
        "type": "string"
      },
      "phone_number": {
        "description": "The phone number for contacting the person.",
        "title": "Phone Number",
        "type": "string"
      },
      "summary": {
        "description": "A brief summary of the person's career highlights.",
        "title": "Summary",
        "type": "string"
      },
      "experience": {
        "anyOf": [
          {
            "items": {
              "$ref": "#/$defs/Experience"
            },
            "type": "array"
          },
          {
            "type": "null"
          }
        ],
        "default": [],
        "description": "List of experiences held by the person.",
        "title": "Experience"
      },
      "education": {
        "anyOf": [
          {
            "items": {
              "$ref": "#/$defs/Education"
            },
            "type": "array"
          },
          {
            "type": "null"
          }
        ],
        "default": [],
        "description": "List of educational institutions attended by the person.",
        "title": "Education"
      }
    },
    "required": [
      "full_name",
      "contact_email",
      "phone_number",
      "summary"
    ],
    "title": "Resume",
    "type": "object"
  },
  "messages": [
    {
      "role": "user",
      "content": " \nAnalyse the following ... or collaboration.\n```\n"
    }
  ],
  "tools": []
}

This was a trace from one run that provided a relatively comprehensive answer:

{
    "full_name": "Not provided in the resume",
    "contact_email": "Not explicitly provided, but a placeholder is given: Feel free to reach out via email",
    "phone_number": "Not provided in the resume",
    "summary": "The individual has over two decades of experience in software engineering and architecture roles. They have worked at several companies including Tech Innovators Inc., NextGen Solutions, Alpha Development Corp., and CodeSphere LLC. Their career highlights include designing scalable microservices architectures, leading development teams, integrating AI/ML capabilities into legacy systems, and automating internal processes to reduce operational costs.",
    "experience": [
        {
            "company": "Tech Innovators Inc.",
            "position": "Senior Software Engineer",
            "start_date": "June 2015",
            "end_date": "Present"
        },
        {
            "company": "NextGen Solutions",
            "position": "Software Architect",
            "start_date": "March 2010",
            "end_date": "May 2015"
        },
        {
            "company": "Alpha Development Corp.",
            "position": "Lead Developer",
            "start_date": "January 2005",
            "end_date": "February 2010"
        },
        {
            "company": "CodeSphere LLC",
            "position": "Software Engineer",
            "start_date": "June 2000",
            "end_date": "December 2004"
        }
    ],
    "education": [
        {
            "institution_name": "Massachusetts Institute of Technology",
            "degree": "Master of Science in Computer Science",
            "start_date": "August 1998",
            "end_date": "May 2000"
        },
        {
            "institution_name": "University of California, Berkeley",
            "degree": "Bachelor of Science in Computer Science",
            "start_date": "August 1994",
            "end_date": "May 1998"
        }
    ]
}

For some reason, I cannot upload the small resume file, so here it is in cleartext:

Professional Experience

Senior Software Engineer

Tech Innovators Inc.
June 2015 – Present

  • Designed and implemented scalable microservices architecture for a SaaS platform, improving performance by 30%.
  • Led a team of 12 engineers, mentoring junior developers and conducting regular code reviews.
  • Integrated AI/ML capabilities into legacy systems, increasing operational efficiency by 20%.
  • Championed DevOps practices, reducing deployment times from days to hours.

Software Architect

NextGen Solutions
March 2010 – May 2015

  • Architected and delivered a real-time analytics platform for financial services, handling millions of transactions daily.
  • Migrated a monolithic system to a distributed microservices-based architecture, enabling faster feature delivery.
  • Partnered with product managers to define technical requirements and roadmap, aligning business goals with engineering efforts.

Lead Developer

Alpha Development Corp.
January 2005 – February 2010

  • Built a high-availability e-commerce platform that handled over 500,000 daily users.
  • Created APIs to integrate third-party payment gateways, enhancing user experience and reducing downtime.
  • Conducted performance optimizations that improved application speed by 40%.

Software Engineer

CodeSphere LLC
June 2000 – December 2004

  • Developed enterprise-grade web applications using Java and C++.
  • Automated internal processes, saving the company 15% in operational costs annually.
  • Collaborated with cross-functional teams to deliver projects on time and within budget.

Education

Master of Science in Computer Science

Massachusetts Institute of Technology
August 1998 – May 2000

Bachelor of Science in Computer Science

University of California, Berkeley
August 1994 – May 1998


Skills

  • Programming Languages: Python, Java, C++, JavaScript
  • Cloud Platforms: AWS, Azure, Google Cloud
  • Architecture: Microservices, Distributed Systems, RESTful APIs
  • Tools: Docker, Kubernetes, Terraform
  • Agile Development, DevOps, AI/ML Integration

Certifications

  • AWS Certified Solutions Architect – Professional
  • Certified Kubernetes Administrator (CKA)
  • Certified ScrumMaster (CSM)

Contact

Feel free to reach out via email or phone for opportunities or collaboration.

Error Message and Stack Trace (if applicable)

No response

Description

I'm trying to use the LangChain ChatOllama call to return a Pydantic class with well-defined fields filled in as the result of a context search.
I expect that, where the required fields are contained within the context, the Pydantic fields will be populated.
It seems that the format is not being picked up by LangChain when using the ChatOllama call, but it does work when the native ollama library is used.
This results in no valid response being returned by the ChatOllama call.
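
A possible workaround, not verified on the versions listed below: newer langchain-ollama releases are reported to accept a full JSON schema dict for ChatOllama's format parameter (0.2.2 may only accept "json"), which would let the recursed schema reach Ollama the same way the direct chat() call does. A sketch, reusing the Resume model and prompt from the example above:

from langchain_ollama import ChatOllama

# Assumes a langchain-ollama release whose format field accepts a JSON schema
# dict (an assumption, not verified for 0.2.2); the schema then travels in the
# request's format field, as in the direct ollama.chat() call.
llm = ChatOllama(
    model="qwen2.5:14b",
    num_ctx=32000,
    temperature=0.0,
    format=Resume.model_json_schema(),
)
raw = llm.invoke(prompt)
print(Resume.model_validate_json(raw.content))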

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 24.2.0: Fri Dec 6 18:56:34 PST 2024; root:xnu-11215.61.5~2/RELEASE_ARM64_T6020
Python Version: 3.11.6 (v3.11.6:8b6ee5ba3b, Oct 2 2023, 11:18:21) [Clang 13.0.0 (clang-1300.0.29.30)]

Package Information

langchain_core: 0.3.31
langchain: 0.3.15
langchain_community: 0.3.15
langsmith: 0.1.128
langchain_anthropic: 0.3.0
langchain_chroma: 0.2.0
langchain_cohere: 0.3.1
langchain_experimental: 0.3.2
langchain_google_genai: 2.0.4
langchain_groq: 0.2.0
langchain_mistralai: 0.2.2
langchain_nomic: 0.1.3
langchain_ollama: 0.2.2
langchain_openai: 0.2.14
langchain_pinecone: 0.2.2
langchain_tests: 0.3.8
langchain_text_splitters: 0.3.5
langchainhub: 0.1.15

Optional packages not installed

langserve

Other Dependencies

aiohttp: 3.10.11
anthropic: 0.42.0
async-timeout: 5.0.1
chromadb: 0.6.3
cohere: 5.13.11
dataclasses-json: 0.6.7
defusedxml: 0.7.1
fastapi: 0.115.7
google-generativeai: 0.8.3
groq: 0.15.0
httpx: 0.28.1
httpx-sse: 0.4.0
jsonpatch: 1.33
nomic: 3.1.3
numpy: 1.26.4
ollama: 0.4.7
openai: 1.59.7
orjson: 3.10.3
packaging: 23.2
pandas: 2.2.3
pillow: 10.4.0
pinecone: 5.4.2
pydantic: 2.10.5
pydantic-settings: 2.7.1
pytest: 8.3.4
pytest-asyncio: 0.25.2
pytest-socket: 0.7.0
PyYAML: 6.0.2
requests: 2.32.3
SQLAlchemy: 2.0.32
syrupy: 4.8.1
tabulate: 0.9.0
tenacity: 9.0.0
tiktoken: 0.8.0
tokenizers: 0.21.0
types-requests: 2.32.0.20241016
typing-extensions: 4.12.2


    Labels

    Ɑ: core (Related to langchain-core)
