[PydanticV2] Add parameter to use Python Regex Engine in order to support look-around #2232

Open
@ilovelinux

Description

Is your feature request related to a problem? Please describe.

  1. I wrote a valid JSON Schema with many properties
  2. Some properties' patterns use look-ahead and look-behind, which the JSON Schema specification supports.
    See: JSON Schema supported patterns
  3. datamodel-code-generator generated the PydanticV2 models.
  4. PydanticV2 doesn't support look-around (look-ahead and look-behind) by default (see pydantic/pydantic#7058, "regex cannot work from v1 in v2 in fields"), so importing the generated models fails:
ImportError while loading [...]: in <module>
[...]
.venv/lib/python3.12/site-packages/pydantic/_internal/_model_construction.py:205: in __new__
    complete_model_class(
.venv/lib/python3.12/site-packages/pydantic/_internal/_model_construction.py:552: in complete_model_class
    cls.__pydantic_validator__ = create_schema_validator(
.venv/lib/python3.12/site-packages/pydantic/plugin/_schema_validator.py:50: in create_schema_validator
    return SchemaValidator(schema, config)
E   pydantic_core._pydantic_core.SchemaError: Error building "model" validator:
E     SchemaError: Error building "model-fields" validator:
E     SchemaError: Field "version":
E     SchemaError: Error building "str" validator:
E     SchemaError: regex parse error:
E       ^((?!0[0-9])[0-9]+(\.(?!$)|)){2,4}$
E         ^^^
E   error: look-around, including look-ahead and look-behind, is not supported
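
For reference, the pattern from the traceback is accepted by Python's own `re` module, which is why switching the engine helps. The snippet below is only a minimal sketch using the standard library; the names `VERSION_PATTERN` and `is_valid_version` are made up for illustration and do not come from the generated models.

```python
import re

# Pattern copied from the traceback above. It relies on negative look-ahead,
# which Python's `re` supports but the default Rust regex engine rejects.
VERSION_PATTERN = re.compile(r"^((?!0[0-9])[0-9]+(\.(?!$)|)){2,4}$")


def is_valid_version(value: str) -> bool:
    """Return True if `value` matches the schema's version pattern."""
    return VERSION_PATTERN.match(value) is not None


print(is_valid_version("1.2.3"))  # dotted version without leading zeros
print(is_valid_version("01.2"))   # leading zero is rejected by (?!0[0-9])
print(is_valid_version("1.2."))   # trailing dot is rejected by (?!$)
```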

Describe the solution you'd like
PydanticV2 supports look-around (look-ahead and look-behind) when Python's `re` is used as the regex engine: pydantic/pydantic#7058 (comment)

I'd like to have a configuration parameter that makes the generated PydanticV2 models use python-re as the regex engine.

Describe alternatives you've considered
Workaround:

  1. Create a custom BaseModel:
from pydantic import BaseModel, ConfigDict


class _BaseModel(BaseModel):
    model_config = ConfigDict(regex_engine='python-re')
  2. Use that class as the base class:
datamodel-codegen --base-class "module.with.basemodel._BaseModel"
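
To illustrate what the workaround achieves, here is a rough sketch of a hand-written model that inherits from the custom base class. It assumes a Pydantic V2 release that supports `regex_engine` in `ConfigDict`; the `Release` model and the `is_accepted` helper are hypothetical and only stand in for whatever datamodel-code-generator would actually emit.

```python
from typing import Annotated

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class _BaseModel(BaseModel):
    # Use Python's `re` instead of the default Rust regex engine.
    model_config = ConfigDict(regex_engine="python-re")


class Release(_BaseModel):
    # Hypothetical model; the pattern is the one from the traceback.
    # Without the base class above, defining this field raises SchemaError.
    version: Annotated[str, Field(pattern=r"^((?!0[0-9])[0-9]+(\.(?!$)|)){2,4}$")]


def is_accepted(value: str) -> bool:
    """Return True if `value` passes validation for Release.version."""
    try:
        Release(version=value)
    except ValidationError:
        return False
    return True
```

With this in place, `is_accepted("1.2.3")` should pass validation while `is_accepted("01.2")` should not, because `model_config` is inherited by every model derived from `_BaseModel`.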

EDIT:

Configuration used:

[tool.datamodel-codegen]
# Options
input = "<project>/data/schemas/"
input-file-type = "jsonschema"
output = "<project>/models/"
output-model-type = "pydantic_v2.BaseModel"
# Typing customization
base-class = "<project>.models._base_model._BaseModel"
enum-field-as-literal = "all"
use-annotated = true
use-standard-collections = true
use-union-operator = true
# Field customization
collapse-root-models = true
snake-case-field = true
use-field-description = true
# Model customization
disable-timestamp = true
enable-faux-immutability = true
target-python-version = "3.12"
use-schema-description = true
# OpenAPI-only options
#
# We may not want to use these options as we are not generating from OpenAPI schemas
# but this is a workaround to avoid `type | None` when we have a default value.
#
# The author of the tool doesn't know why he flagged this option as OpenAPI only.
# Reference: https://github.com/koxudaxi/datamodel-code-generator/issues/1441
strict-nullable = true
