[FEATURE] Have custom Regexp timeout configuration for logging #331

@nina-instrumentl

Scope check

  • This is core LLM communication (not application logic)
  • This benefits most users (not just my use case)
  • This can't be solved in application code with current RubyLLM
  • I read the Contributing Guide

Due diligence

  • I searched existing issues
  • I checked the documentation

What problem does this solve?

When processing complex files through ruby_llm, the application encounters a timeout error during request logging:

Regexp::TimeoutError: regexp match timeout.

This happens because the logger filter builds its regular expressions without a timeout of their own, so they fall under the global Regexp.timeout: https://github.com/crmne/ruby_llm/blob/main/lib/ruby_llm/connection.rb#L65

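This becomes a problem when ruby_llm is the only part of the application that handles large or complex payloads: Regexp.timeout is global, so raising it to accommodate these log filters also loosens the limit for every other regular expression in the application.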
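To illustrate the difference between the global setting and a per-object one, here is a minimal sketch using only core Ruby (3.2 or newer). The values 1.0 and 10.0 and the sample payload are made up for the example; the pattern is the base64 one from the log filter.

```ruby
# Requires Ruby 3.2+, where Regexp.timeout and the timeout: keyword exist.
previous = Regexp.timeout

# Global limit: applies to every regexp that has no per-object timeout.
Regexp.timeout = 1.0

# Per-object limit: this regexp uses its own timeout and ignores the global one.
scoped = Regexp.new('[A-Za-z0-9+/=]{100,}', timeout: 10.0)

# Illustrative payload only: a long run of base64-looking characters.
payload = 'A' * 5_000
redacted = payload.gsub(scoped, '[BASE64 DATA]')

puts "global timeout: #{Regexp.timeout}"
puts "scoped timeout: #{scoped.timeout}"
puts redacted

# Restore whatever the process had before, so the sketch has no side effects.
Regexp.timeout = previous
```

The rest of the application keeps its strict global limit, and only the regexp that needs more time gets it.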

Proposed solution

Introduce a configurable log_regexp_timeout option in ruby_llm. This would allow consumers to set a custom timeout value for regex operations used within logging filters, separate from the global Regexp.timeout.

Update the filters in https://github.com/crmne/ruby_llm/blob/main/lib/ruby_llm/connection.rb#L60 to use this custom timeout instead of the global Ruby timeout:

 logger.filter(Regexp.new('[A-Za-z0-9+/=]{100,}', timeout: @config.log_regexp_timeout), 'data":"[BASE64 DATA]"')
 logger.filter(Regexp.new('[-\\d.e,\\s]{100,}', timeout: @config.log_regexp_timeout), '[EMBEDDINGS ARRAY]')

If log_regexp_timeout is not provided, fall back to Regexp.timeout to preserve default behavior.
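A rough sketch of how the fallback could look, written as standalone Ruby so it runs without ruby_llm or Faraday. `SketchConfig`, `build_log_filters`, and `scrub` are hypothetical names for illustration only, not existing RubyLLM APIs; `scrub` just mimics a log filter by substituting each match.

```ruby
# Requires Ruby 3.2+ for the timeout: keyword on Regexp.new.

# Hypothetical stand-in for the RubyLLM configuration object; it models only
# the proposed log_regexp_timeout option.
SketchConfig = Struct.new(:log_regexp_timeout, keyword_init: true)

# Build the two log-filter patterns with a per-object timeout.
# When the option is unset, fall back to the global Regexp.timeout
# (which may itself be nil, meaning no limit).
def build_log_filters(config)
  timeout = config.log_regexp_timeout || Regexp.timeout
  [
    [Regexp.new('[A-Za-z0-9+/=]{100,}', timeout: timeout), 'data":"[BASE64 DATA]"'],
    [Regexp.new('[-\\d.e,\\s]{100,}', timeout: timeout), '[EMBEDDINGS ARRAY]']
  ]
end

# Apply each filter in order, replacing every match with its placeholder.
def scrub(text, filters)
  filters.reduce(text) { |out, (pattern, replacement)| out.gsub(pattern, replacement) }
end

filters = build_log_filters(SketchConfig.new(log_regexp_timeout: 5.0))
puts scrub("payload #{'A' * 120} end", filters)
puts scrub("[#{'0.12, -0.5e3, ' * 20}]", filters)
```

Passing `timeout: nil` to `Regexp.new` is valid and leaves the regexp under the global setting, so an unset option behaves exactly like the current code.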

Why this belongs in RubyLLM

  • Regexp.timeout was introduced in Ruby 3.2 to mitigate ReDoS (regular expression denial-of-service) attacks. However, a single global value is too rigid when an application processes both simple and complex content.
  • ruby_llm is the only component in many applications that deals with large structured data (e.g., base64, embedding vectors), making it a natural place to encapsulate this configuration.
  • Since ruby_llm defines its own Faraday logging filters, it's best positioned to support fine-tuned timeout handling without affecting the rest of the application.

Labels: enhancement
