[Enhancement] Add Pattern Caching to Rex Command Functions #4235

## Description
The `RexExtractFunction` and `RexExtractMultiFunction` currently compile regex patterns on every invocation, which can be a performance bottleneck for queries processing large datasets. We should implement pattern caching to improve performance.

## Current Implementation
From https://github.com/opensearch-project/sql/pull/4109, the current code in `RexExtractFunction.java:54-56` and `RexExtractMultiFunction.java`:
```java
public static String extractGroup(String text, String pattern, int groupIndex) {
  try {
    Pattern compiledPattern = Pattern.compile(pattern);  // Compiled every time
    Matcher matcher = compiledPattern.matcher(text);
    // ...
  }
}
```

## Problem
1. **Pattern compilation is expensive**: The `Pattern.compile()` operation involves parsing the regex string and building an internal state machine, which is computationally expensive.

2. **Repeated compilation**: In a typical query like `source=logs | rex field=message "(?<user>\w+)@(?<domain>\w+\.com)"`, the same pattern is compiled once for every row in the dataset.

3. **Impact at scale**: For a dataset with millions of rows, this results in millions of redundant pattern compilations of the exact same regex.
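
The cost described above can be checked locally with a rough timing sketch. The class and method names below (`CompileCostSketch`, `countCompilingPerRow`, `countCompilingOnce`, `timeNanos`) are hypothetical and not part of the plugin; the timing helper does no JIT warm-up, so treat any numbers it produces as indicative only. A JMH benchmark would be the more reliable tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CompileCostSketch {
  // Same regex as the example query above.
  static final String REGEX = "(?<user>\\w+)@(?<domain>\\w+\\.com)";

  /** Counts matching rows, recompiling the pattern for every row (current behavior). */
  static int countCompilingPerRow(String[] rows) {
    int matches = 0;
    for (String row : rows) {
      Matcher m = Pattern.compile(REGEX).matcher(row);
      if (m.find()) {
        matches++;
      }
    }
    return matches;
  }

  /** Counts matching rows, compiling the pattern once and reusing it. */
  static int countCompilingOnce(String[] rows) {
    Pattern p = Pattern.compile(REGEX);
    int matches = 0;
    for (String row : rows) {
      if (p.matcher(row).find()) {
        matches++;
      }
    }
    return matches;
  }

  /** Returns elapsed nanoseconds for a task (rough wall-clock measurement). */
  static long timeNanos(Runnable task) {
    long start = System.nanoTime();
    task.run();
    return System.nanoTime() - start;
  }
}
```

Calling `timeNanos(() -> countCompilingPerRow(rows))` and `timeNanos(() -> countCompilingOnce(rows))` on the same large array gives a first impression of how much of the per-row cost is compilation.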

## Proposed Solution
Implement a pattern cache similar to Apache Calcite's approach:

```java
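import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.util.concurrent.UncheckedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// ImplementorUDF comes from the existing plugin codebase (import omitted here).
public final class RexExtractFunction extends ImplementorUDF {
  // Cache compiled patterns with max size and expiration
  private static final LoadingCache<String, Pattern> PATTERN_CACHE =
    CacheBuilder.newBuilder()
      .maximumSize(256)  // Limit cache size to prevent memory issues
      .expireAfterAccess(1, TimeUnit.HOURS)  // Expire unused patterns
      .build(CacheLoader.from(pattern -> {
        try {
          return Pattern.compile(pattern);
        } catch (PatternSyntaxException e) {
          throw new IllegalArgumentException("Invalid regex pattern: " + pattern, e);
        }
      }));

  public static String extractGroup(String text, String pattern, int groupIndex) {
    try {
      // The loader throws only unchecked exceptions, so getUnchecked() is sufficient
      Pattern compiledPattern = PATTERN_CACHE.getUnchecked(pattern);  // Reuse compiled pattern
      Matcher matcher = compiledPattern.matcher(text);
      // ...
    } catch (UncheckedExecutionException e) {
      // Guava wraps unchecked loader failures; e.getCause() is the IllegalArgumentException
      // Handle cache loading exception
    }
  }
}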
```
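
For experimenting without the Guava dependency, the same idea can be sketched with the standard library alone. This is a simplified stand-in, not the proposed implementation: the class name `CachedExtractSketch`, the helper `cachedPatternCount`, and the choice to return `null` when there is no match or the group index is out of range are all assumptions made for illustration, and the map here is unbounded, unlike the 256-entry cache proposed above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CachedExtractSketch {
  // Unbounded for brevity; a production cache should cap its size.
  private static final ConcurrentMap<String, Pattern> CACHE = new ConcurrentHashMap<>();

  /** Returns the requested group of the first match, or null (assumed behavior) if none. */
  static String extractGroup(String text, String pattern, int groupIndex) {
    if (text == null || pattern == null) {
      return null;
    }
    // computeIfAbsent compiles at most once per distinct pattern string.
    // A PatternSyntaxException propagates unwrapped and nothing is cached.
    Pattern compiled = CACHE.computeIfAbsent(pattern, Pattern::compile);
    Matcher matcher = compiled.matcher(text); // a Matcher is created per call, never shared
    if (matcher.find() && groupIndex >= 0 && groupIndex <= matcher.groupCount()) {
      return matcher.group(groupIndex);
    }
    return null;
  }

  /** Number of distinct patterns compiled so far. */
  static int cachedPatternCount() {
    return CACHE.size();
  }
}
```

Repeated calls with the same pattern string reuse one `Pattern` instance, which is the behavior the Guava-based proposal aims for, minus eviction and expiration.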

## Benefits
1. **Performance improvement**: Pattern caching removes compilation from the per-row path; an improvement of 10-100x for repeated patterns is expected and should be confirmed by benchmarking
2. **Reduced CPU usage**: Eliminates redundant compilation work
3. **Better scalability**: More efficient processing of large datasets
4. **Memory bounded**: Cache size limits prevent unbounded memory growth

## Implementation Considerations / Exit Criteria
1. **Cache configuration**: Need to determine optimal cache size and expiration settings
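2. **Thread safety**: Guava's LoadingCache is thread-safe by default, and compiled `Pattern` instances are immutable and safe to share; `Matcher` instances are not and must be created per call
3. **Error handling**: Need to properly handle cache loading exceptions; Guava wraps unchecked loader failures such as `PatternSyntaxException` in `UncheckedExecutionException`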
4. **Monitoring**: Consider adding metrics for cache hit/miss rates
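
The considerations above can be explored with a small standard-library stand-in before settling on Guava settings. The sketch below is an assumption-laden illustration, not the proposed implementation: `BoundedPatternCacheSketch` and its methods are hypothetical names, it uses a synchronized access-ordered `LinkedHashMap` as an LRU cache, it has no time-based expiration, and it lets `PatternSyntaxException` propagate directly instead of wrapping it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.regex.Pattern;

final class BoundedPatternCacheSketch {
  private final Map<String, Pattern> lru;
  private final AtomicLong hits = new AtomicLong();
  private final AtomicLong misses = new AtomicLong();

  BoundedPatternCacheSketch(int maxSize) {
    // accessOrder=true keeps entries ordered from least to most recently used.
    this.lru = new LinkedHashMap<String, Pattern>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Pattern> eldest) {
        return size() > maxSize; // evict the least recently used entry when over capacity
      }
    };
  }

  /** Returns a compiled pattern; throws PatternSyntaxException for an invalid regex. */
  synchronized Pattern get(String regex) {
    Pattern cached = lru.get(regex);
    if (cached != null) {
      hits.incrementAndGet();
      return cached;
    }
    misses.incrementAndGet();
    Pattern compiled = Pattern.compile(regex); // on failure nothing is cached
    lru.put(regex, compiled);
    return compiled;
  }

  long hitCount() {
    return hits.get();
  }

  long missCount() {
    return misses.get();
  }

  synchronized int size() {
    return lru.size();
  }
}
```

With Guava, comparable counters are available by calling `recordStats()` on the `CacheBuilder` and reading `stats()` from the cache, which avoids maintaining them by hand.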

## References
- Apache Calcite pattern caching implementation: [SqlFunctions.java#L461-L475](https://github.com/apache/calcite/blob/44b5798/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java#L461-L475)
- Similar optimization in Elasticsearch Grok processor
- Java Pattern compilation performance analysis

## Priority
Medium - This is a performance optimization that becomes more important as data volumes grow. While not blocking functionality, it can significantly improve query performance for production workloads.

## Affected Components
- `RexExtractFunction.java`
- `RexExtractMultiFunction.java`
- Potentially other regex-based functions in the codebase

## Estimated Effort
Small - 1-2 days including implementation, testing, and benchmarking
