PydanticUserError: SQLDatabaseToolkit is not fully defined; you should define BaseCache, then call SQLDatabaseToolkit.model_rebuild(). #28284

Closed
5 tasks done
yadav-shivani opened this issue Nov 22, 2024 · 5 comments
Assignees
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

@yadav-shivani

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_community.agent_toolkits import create_sql_agent
from langchain_community.agent_toolkits.sql.toolkit import SQLDatabaseToolkit
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_openai import ChatOpenAI

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = ChatOpenAI()  # stands in for my llm_chat() helper; use the LLM of your choice

# Initialize the SQLDatabaseToolkit (this is the call that raises the error)
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent_executor = create_sql_agent(
    llm,
    toolkit=toolkit,
    max_iterations=11,
    verbose=True,
    handle_parsing_errors=True,
)

Error Message and Stack Trace (if applicable)

[Screenshot of the stack trace; the error text is the PydanticUserError quoted in the issue title.]

Description

  1. I am trying to build a SQL agent that can query tabular data, using create_sql_agent and SQLDatabaseToolkit from LangChain.
  2. My code was working fine until the day before yesterday. Since yesterday, SQLDatabaseToolkit has been throwing this error.
  3. I plan to use the agent to query a database, but right now I cannot even build the toolkit. I am new to this and not sure where to make changes. I believe it is related to a package update.

Any help is highly appreciated :)

System Info

System Information

OS: Linux
OS Version: #1 SMP Tue Oct 22 16:38:23 UTC 2024
Python Version: 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0]

Package Information

langchain_core: 0.3.19
langchain: 0.3.7
langchain_community: 0.3.7
langsmith: 0.1.144
langchain_aws: 0.2.7
langchain_text_splitters: 0.3.2

Optional packages not installed

langgraph
langserve

Other Dependencies

aiohttp: 3.11.7
async-timeout: Installed. No version info available.
boto3: 1.34.131
dataclasses-json: 0.6.7
httpx: 0.27.0
httpx-sse: 0.4.0
jsonpatch: 1.33
numpy: 1.26.4
orjson: 3.10.11
packaging: 23.2
pydantic: 2.10.1
pydantic-settings: 2.6.1
PyYAML: 6.0.1
requests: 2.31.0
requests-toolbelt: 1.0.0
SQLAlchemy: 2.0.31
tenacity: 8.4.1
typing-extensions: 4.12.2

@dosubot dosubot bot added the 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature label Nov 22, 2024
@ksteimel

I had a similar issue and downgrading pydantic to 2.9.2 resolved it for me.

@ahmed33033

#28257 mentions a very similar issue!

@eyurtsev eyurtsev self-assigned this Nov 22, 2024
@eyurtsev
Collaborator

This is breaking because of the new pydantic release: https://github.com/pydantic/pydantic/releases/tag/v2.10.1. Investigating.

@yadav-shivani
Author

Quoting @ksteimel: "I had a similar issue and downgrading pydantic to 2.9.2 resolved it for me."

Thanks a lot! It is working for me now.

@Viicos

Viicos commented Nov 24, 2024

As discussed on Slack: #28297 fixed the issue, however there might be a more robust way to fix it.

SQLDatabaseToolkit is a Pydantic model with an llm field annotated as BaseLanguageModel. That class, also a Pydantic model, defines a cache field:

class BaseLanguageModel(
    RunnableSerializable[LanguageModelInput, LanguageModelOutputVar], ABC
):
    """Abstract base class for interfacing with language models.
    All language model wrappers inherited from BaseLanguageModel.
    """
    cache: Union[BaseCache, bool, None] = Field(default=None, exclude=True)

However, BaseCache is imported in an if TYPE_CHECKING: block, so the name does not exist at runtime and Pydantic cannot resolve it (the same goes for Callbacks, used in another field):

if TYPE_CHECKING:
    from langchain_core.caches import BaseCache
    from langchain_core.callbacks import Callbacks
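
To make the mechanism concrete, here is a minimal standard-library sketch of why a TYPE_CHECKING-only import leaves an annotation unresolvable at runtime. It uses typing.get_type_hints as a rough analogue of the resolution step; Pydantic's own logic is more involved. Settings, runtime_namespace and try_resolve are hypothetical names, and Decimal merely stands in for BaseCache.

```python
from typing import TYPE_CHECKING, Optional, get_type_hints

if TYPE_CHECKING:
    # Only type checkers follow this import; at runtime the name is never bound.
    from decimal import Decimal


class Settings:
    # Quoted so that defining the class does not need Decimal at runtime.
    limit: "Optional[Decimal]" = None


def runtime_namespace():
    """Import the missing name for real and hand it back in a mapping."""
    from decimal import Decimal

    return {"Decimal": Decimal}


def try_resolve(cls, extra_names=None):
    """Return the resolved hints, or the NameError text if a name is missing."""
    try:
        return get_type_hints(cls, globalns=globals(), localns=extra_names)
    except NameError as exc:
        return str(exc)


unresolved = try_resolve(Settings)  # fails: Decimal is unknown at runtime
resolved = try_resolve(Settings, runtime_namespace())  # succeeds
```

The first call comes back with the NameError text, and the second resolves once the missing name is supplied explicitly.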

For every Pydantic model using BaseLanguageModel, the current fix from #28297 adds a .model_rebuild() call and imports the missing symbols. While this works, it is quite confusing as is. It works because model_rebuild() uses the namespace of the module where it is called to resolve annotations that previously failed to resolve, hence the added BaseCache and Callbacks imports in the aforementioned PR. Here are the options I would consider instead:

  • For every model_rebuild() call, provide the missing annotations using the _types_namespace argument, e.g. SQLDatabaseToolkit.model_rebuild(_types_namespace={'BaseCache': BaseCache, 'Callbacks': Callbacks}). This is more explicit, as we at least know why the two imports were added even though they are not used directly in the module (I believe that's why you had to use the import BaseCache as BaseCache form, originally used by typing stub files, to avoid having the import marked as unused).
  • Fix the root issue: instead of rebuilding every model making use of BaseLanguageModel, I would strongly recommend having BaseLanguageModel defined properly in the first place. It seems like moving BaseCache and Callbacks outside of the if TYPE_CHECKING: block does not cause any circular import issues.
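
The first option can be sketched in isolation as follows. This is a simplified stand-in, not the LangChain code: Toolkit, instantiate_error and rebuild_explicitly are hypothetical names, Decimal plays the role of BaseCache, and the field is declared directly on the model rather than on a nested one. It assumes Pydantic v2.

```python
from typing import TYPE_CHECKING

from pydantic import BaseModel, PydanticUserError

if TYPE_CHECKING:
    # Stand-in for BaseCache: visible to type checkers, absent at runtime.
    from decimal import Decimal


class Toolkit(BaseModel):
    """Hypothetical model with a field whose type is a TYPE_CHECKING-only import."""

    limit: "Decimal"


def instantiate_error():
    """Return the error raised when building a Toolkit, or None if it works."""
    try:
        Toolkit(limit=1)
    except PydanticUserError as exc:
        return exc
    return None


def rebuild_explicitly():
    """Rebuild the model, naming the missing symbol explicitly."""
    from decimal import Decimal

    Toolkit.model_rebuild(_types_namespace={"Decimal": Decimal})


error_before = instantiate_error()  # "... is not fully defined ..." error
rebuild_explicitly()
error_after = instantiate_error()  # None once the model has been rebuilt
```

With the second option, the import would simply move out of the if TYPE_CHECKING: block and no rebuild call would be needed at all.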

Additionally, it seems like BaseToolkit (which SQLDatabaseToolkit inherits from) is defined as a Pydantic model. SQLDatabaseToolkit has only one field that Pydantic can validate (BaseLanguageModel, itself a Pydantic model), meaning you had to use arbitrary_types_allowed=True. Is validation necessary for these kinds of classes? Pydantic's primary use is to validate data; having most of the fields use arbitrary types (meaning no validation is performed) defeats the initial purpose of the library [1].
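
For reference, here is a small sketch of what arbitrary_types_allowed=True amounts to in practice, assuming Pydantic v2. Engine and Holder are hypothetical names. As far as I can tell, such a field only gets an isinstance check, so validation is minimal rather than the usual parsing and coercion.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class Engine:
    """Hypothetical non-Pydantic class, e.g. a database handle."""


class Holder(BaseModel):
    # Without this setting, Pydantic refuses to build a schema for Engine.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    engine: Engine


ok = Holder(engine=Engine())  # accepted: the value is an Engine instance

try:
    Holder(engine="not an engine")
    rejected = False
except ValidationError:
    # Only the instance check runs; nothing inside Engine is inspected.
    rejected = True
```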

Footnotes

  1. Note that I'm not familiar with this library; there might be a valid reason for BaseLanguageModel/SQLDatabaseToolkit to be Pydantic models.
