
Adding Index Setting for Data compression #20355

Merged

bharath-techie merged 5 commits into opensearch-project:feature/datafusion from darjisagar7:IndexSettingCompression on Feb 24, 2026

Conversation

@darjisagar7 commented Jan 2, 2026

Description

Adds index-level settings for Parquet data compression (e.g. `index.compression.enabled`, `index.parquet.compression_type`, `index.parquet.page_row_limit`) and plumbs them through to the Rust Parquet writer.

>....                                                                                                                                                                                                              
        "compression.enabled": "true",
        "optimized.enabled": "true",
        "index.parquet.page_row_limit": 50000,
        "index.parquet.compression_type": "ZSTD"
    },
    "mappings": {
        "properties": {
            "id": {
                "type": "keyword"
            },
            "name": {
                "type": "keyword"
            },
            "age": {
                "type": "integer"
            },
            "salary": {
                "type": "long"
            },
            "score": {
                "type": "double"
            },
            "active": {
                "type": "boolean"
            },
            "created_date": {
                "type": "date"
            }
        }
    }
}'

{"acknowledged":true,"shards_acknowledged":true,"index":"index-7"}
parquet-tools inspect OpenSearch/build/testclusters/runTask-0/data/nodes/0/indices/jhYkYQ8PQNGg-UAsLQEq8g/0/parquet/_parquet_file_generation_16.parquet

############ file meta data ############
created_by: parquet-rs version 53.4.1
num_columns: 15
num_rows: 6
num_row_groups: 1
format_version: 1.0
serialized_size: 2793


############ Columns ############
_routing
_doc_count
active
salary
_ignored
_seq_no
score
name
_id
id
created_date
_version
age
___row_id
_primary_term

############ Column(_routing) ############
name: _routing
path: _routing
max_definition_level: 1
max_repetition_level: 0
physical_type: BYTE_ARRAY
logical_type: String
converted_type (legacy): UTF8
compression: ZSTD (space_saved: -47%)

############ Column(_doc_count) ############
name: _doc_count
path: _doc_count
max_definition_level: 1
max_repetition_level: 0
physical_type: INT64
logical_type: None
converted_type (legacy): NONE
compression: ZSTD (space_saved: -47%)

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

coderabbitai bot commented Jan 2, 2026

Review skipped: auto reviews are disabled on base/target branches other than the default branch. To trigger a single review, invoke the @coderabbitai review command.


github-actions bot commented Jan 2, 2026

❌ Gradle check result for f60cfd8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?


// Enhanced native methods that handle validation and provide better error reporting
public static native void createWriter(String file, long schemaAddress) throws IOException;
public static native void createWriter(String file, long schemaAddress, boolean isCompressionEnabled) throws IOException;
Member:

The change makes this very specific to compression. Can we instead pass a map here that can be read and carries attribute names? Also, there are certain settings we may want to apply only to certain columns (some may be handled via mappings as well). Can you look into whether the argument passed to Rust can be structured to handle both index-level and field-level settings?
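To illustrate the reviewer's suggestion, here is a minimal Java sketch of a map-based settings argument replacing the boolean flag. All names (`WriterSettings`, the key strings, the `column.<name>.<setting>` convention) are hypothetical, not the PR's actual API; the per-column key shape borrows from Iceberg's `write.parquet.bloom-filter-*.column.col1` style quoted later in the thread.

```java
import java.util.Map;

// Hypothetical sketch: the native layer receives a map of attribute names to
// values instead of one boolean per feature, so adding a new index- or
// field-level setting needs no new JNI signature.
public class WriterSettings {
    private final Map<String, Object> attrs;

    public WriterSettings(Map<String, Object> attrs) {
        this.attrs = Map.copyOf(attrs);
    }

    public boolean compressionEnabled() {
        return Boolean.parseBoolean(String.valueOf(attrs.getOrDefault("compression.enabled", "false")));
    }

    public String compressionType() {
        return String.valueOf(attrs.getOrDefault("compression_type", "ZSTD"));
    }

    // Field-level overrides nest under a per-column key, e.g.
    // "column.<name>.<setting>", mirroring Iceberg's bloom-filter properties.
    public Object columnSetting(String column, String setting, Object fallback) {
        return attrs.getOrDefault("column." + column + "." + setting, fallback);
    }
}
```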

Contributor:

Iceberg configuration for reference:

  • write.parquet.row-group-size-bytes | 134217728 (128 MB) | Parquet row group size
  • write.parquet.page-size-bytes | 1048576 (1 MB) | Parquet page size
  • write.parquet.page-row-limit | 20000 | Parquet page row limit
  • write.parquet.dict-size-bytes | 2097152 (2 MB) | Parquet dictionary page size
  • write.parquet.compression-codec | zstd | Parquet compression codec: zstd, brotli, lz4, gzip, snappy, uncompressed
  • write.parquet.compression-level | null | Parquet compression level
  • write.parquet.bloom-filter-enabled.column.col1 | (not set) | Hint to parquet to write a bloom filter for the column 'col1'
  • write.parquet.bloom-filter-max-bytes | 1048576 (1 MB) | The maximum number of bytes for a bloom filter bitset
  • write.parquet.bloom-filter-fpp.column.col1 | 0.01 | The false positive probability for a bloom filter applied to 'col1' (must be > 0.0 and < 1.0)
  • write.format.default | parquet | Default file format for the table; parquet, avro, or orc
  • write.delete.format.default | data file format | Default delete file format for the table; parquet, avro, or orc
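The Iceberg defaults quoted above can be mirrored as a simple defaults table on the Java side. This is a sketch only: the `index.parquet.*` key names are hypothetical stand-ins (the PR's actual setting names may differ), and the values are taken directly from the reference table.

```java
import java.util.Map;

// Sketch: defaults mirroring the Iceberg reference values, keyed by
// hypothetical index.parquet.* setting names.
public class ParquetDefaults {
    public static final Map<String, Object> DEFAULTS = Map.of(
        "index.parquet.row_group_size_bytes", 134_217_728L, // 128 MB
        "index.parquet.page_size_bytes", 1_048_576L,        // 1 MB
        "index.parquet.page_row_limit", 20_000,
        "index.parquet.dict_size_bytes", 2_097_152L,        // 2 MB
        "index.parquet.compression_type", "zstd"
    );

    // Return the index's explicit value if set, otherwise the default.
    @SuppressWarnings("unchecked")
    public static <T> T valueOrDefault(Map<String, Object> indexSettings, String key) {
        return (T) indexSettings.getOrDefault(key, DEFAULTS.get(key));
    }
}
```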

);

public static final Setting<Boolean> INDEX_COMPRESSION_ENABLED_SETTING = Setting.boolSetting(
"index.compression.enabled",
Member:

Should we use the existing codec setting here? We can validate which values to support in the Parquet plugin itself.

Member:

Also, some settings may come from Parquet directly. Any setting that is specific to a data format should be declared in the plugin itself:

    public List<Setting<?>> getSettings() {
        return Collections.emptyList();
    }
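The reviewer's point is that the plugin should override the default `getSettings()` (shown above returning an empty list) and register its own format-specific settings. A minimal self-contained sketch follows; the `Setting` record here is a deliberately simplified stand-in for OpenSearch's real `Setting` class, and the registered keys are illustrative.

```java
import java.util.List;

// Simplified stand-in for OpenSearch's Setting class (sketch only).
record Setting<T>(String key, T defaultValue) {}

class SketchParquetPlugin {
    // Format-specific settings declared by the plugin itself, not the server.
    static final Setting<String> COMPRESSION_TYPE =
        new Setting<>("index.parquet.compression_type", "zstd");
    static final Setting<Integer> PAGE_ROW_LIMIT =
        new Setting<>("index.parquet.page_row_limit", 20_000);

    // Instead of Collections.emptyList(), the plugin registers its settings.
    public List<Setting<?>> getSettings() {
        return List.of(COMPRESSION_TYPE, PAGE_ROW_LIMIT);
    }
}
```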

darjisagar7 force-pushed the IndexSettingCompression branch from f60cfd8 to 04f4f7d on January 20, 2026 at 06:17
github-actions bot:

❌ Gradle check result for 04f4f7d: FAILURE

github-actions bot:

❌ Gradle check result for c01b518: FAILURE

"index.parquet.row_group_size_bytes",
new ByteSizeValue(128, ByteSizeUnit.MB),
Setting.Property.IndexScope,
Setting.Property.Dynamic
Member:

For now, it's OK for this not to be dynamic.

darjisagar7 force-pushed the IndexSettingCompression branch from c01b518 to 969e025 on February 19, 2026 at 10:42
github-actions bot:

❌ Gradle check result for 969e025: FAILURE

.set_compression(Compression::ZSTD(ZstdLevel::try_new(3).unwrap()))
.build();

let props = WriterPropertiesBuilder::build(&config);
Member:

Handle this for merges as well.
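The concern here is that merge output could otherwise be written with different properties than freshly flushed files. A minimal Java sketch of the idea, with hypothetical names (the PR's actual factory and codec representation may differ): both the write and merge paths resolve their writer properties through one shared lookup.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: one shared per-index codec lookup used by both paths, so merged
// files never silently fall back to a different codec than flushed files.
class WriterPropsFactory {
    private static final Map<String, String> CODEC_BY_INDEX = new ConcurrentHashMap<>();

    static void setCodec(String index, String codec) {
        CODEC_BY_INDEX.put(index, codec);
    }

    static String codecForWrite(String index) { return resolve(index); }
    static String codecForMerge(String index) { return resolve(index); }

    private static String resolve(String index) {
        // Default mirrors the ZSTD level-3 builder shown in the Rust snippet.
        return CODEC_BY_INDEX.getOrDefault(index, "ZSTD(level=3)");
    }
}
```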


// Push current settings to Rust store once on construction, then keep in sync on updates
pushSettingsToRust(indexSettings);
indexSettings.getScopedSettings().addSettingsUpdateConsumer(
Member:

Since the settings are not dynamic for now, we can remove these.

}

// Read configuration from the global settings store
let config: NativeSettings = SETTINGS_STORE.lock().unwrap().clone();
Member:

How is this index-level?
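The reviewer's objection is that a single global settings value cannot be index-level. The later code suggestions show the fix on the Rust side (a `DashMap` keyed by index name, read via `SETTINGS_STORE.get(&index_name)`); a Java analogue of that per-index store, with assumed names, looks roughly like this:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a per-index settings store: each index registers settings under
// its own name, and readers fall back to defaults when no entry exists.
class SettingsStore {
    record NativeSettings(String compressionType, int pageRowLimit) {
        static NativeSettings defaults() { return new NativeSettings("ZSTD", 20_000); }
    }

    private static final Map<String, NativeSettings> STORE = new ConcurrentHashMap<>();

    static void put(String indexName, NativeSettings s) { STORE.put(indexName, s); }

    static void remove(String indexName) { STORE.remove(indexName); }

    static NativeSettings get(String indexName) {
        return STORE.getOrDefault(indexName, NativeSettings.defaults());
    }
}
```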

github-actions bot:

❌ Gradle check result for 178887e: FAILURE

github-actions bot commented Feb 23, 2026

PR Reviewer Guide 🔍

(Review updated until commit ee3dde9)

Here are some key observations to aid the review process:

🧪 No relevant tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Add settings infrastructure and configuration classes

Relevant files:

  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/ParquetSettings.java
  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/bridge/FieldConfig.java
  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/bridge/NativeSettings.java
  • modules/parquet-data-format/src/main/rust/src/field_config.rs
  • modules/parquet-data-format/src/main/rust/src/native_settings.rs
  • modules/parquet-data-format/src/main/rust/src/writer_properties_builder.rs

Sub-PR theme: Integrate settings with execution engine and plugin system

Relevant files:

  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/ParquetDataFormatPlugin.java
  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/engine/ParquetDataFormat.java
  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/engine/ParquetExecutionEngine.java
  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/memory/ArrowBufferPool.java
  • server/src/main/java/org/opensearch/index/engine/exec/composite/CompositeIndexingExecutionEngine.java
  • server/src/main/java/org/opensearch/index/engine/exec/coord/CompositeEngine.java
  • server/src/main/java/org/opensearch/index/engine/exec/text/TextDF.java
  • server/src/main/java/org/opensearch/plugins/DataSourcePlugin.java

Sub-PR theme: Update writer and merge operations to use index-specific settings

Relevant files:

  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/bridge/RustBridge.java
  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/bridge/NativeParquetWriter.java
  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/vsr/VSRManager.java
  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/writer/ParquetWriter.java
  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/merge/ParquetMergeExecutor.java
  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/merge/ParquetMergeStrategy.java
  • modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/merge/RecordBatchMergeStrategy.java
  • modules/parquet-data-format/src/main/rust/src/lib.rs
  • modules/parquet-data-format/src/main/rust/src/parquet_merge.rs
  • modules/parquet-data-format/src/main/rust/tests/parquet_merge_tests.rs
  • modules/parquet-data-format/benchmarks/src/main/java/com/parquet/parquetdataformat/benchmark/ParquetWriterCloseBenchmark.java
  • modules/parquet-data-format/benchmarks/src/main/java/com/parquet/parquetdataformat/benchmark/ParquetWriterCreateBenchmark.java
  • modules/parquet-data-format/benchmarks/src/main/java/com/parquet/parquetdataformat/benchmark/ParquetWriterWriteBenchmark.java

⚡ Recommended focus areas for review

Commented Code

Lines 95-105 contain commented-out code for dynamic settings updates. This should either be implemented, removed, or documented with a clear TODO explaining when it will be enabled.

//        indexSettings.getScopedSettings().addSettingsUpdateConsumer(
//            ignored -> pushSettingsToRust(indexSettings),
//            List.of(
//                ParquetSettings.COMPRESSION_TYPE,
//                ParquetSettings.COMPRESSION_LEVEL,
//                ParquetSettings.PAGE_SIZE_BYTES,
//                ParquetSettings.PAGE_ROW_LIMIT,
//                ParquetSettings.DICT_SIZE_BYTES
//            )
//        );
    }
Unused Import

Line 16 imports WriteableSetting.SettingType.ByteSizeValue but this import appears unused in the file. Verify if this is needed or remove it.

import static org.opensearch.common.settings.WriteableSetting.SettingType.ByteSizeValue;
Error Handling

The pushSettingsToRust method catches all exceptions and only logs them (lines 117-120). Consider whether initialization should fail if settings cannot be pushed to Rust, as this could lead to inconsistent behavior.

    RustBridge.onSettingsUpdate(config);
} catch (Exception e) {
    logger.error("Failed to push Parquet settings to Rust store", e);
}
Error Handling

In Java_com_parquet_parquetdataformat_bridge_RustBridge_createWriter, when settings update fails, the error is logged but execution continues (line 312). This could lead to writers being created with incorrect settings.

    match NativeParquetWriter::create_writer(filename, index_name, schema_address as i64) {
        Ok(_) => 0,
        Err(e) => {
            log_error!("[RUST] create_writer failed: {}", e);
            -1
        }
    }
}

github-actions bot commented Feb 23, 2026

PR Code Suggestions ✨

Latest suggestions up to ee3dde9

Explore these optional code suggestions:

Possible issue
Ensure resource cleanup in finally block

The arrowBufferPool.close() should be called even if removeSettings fails. Wrap the
buffer pool closure in a try-finally block or use try-with-resources pattern to
ensure proper resource cleanup regardless of exceptions.

modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/engine/ParquetExecutionEngine.java [184-190]

 public void close() throws IOException {
     try {
         RustBridge.removeSettings(indexSettings.getIndex().getName());
     } catch (Exception e) {
         logger.warn("Failed to remove Parquet settings from Rust store for index [{}]", indexSettings.getIndex().getName(), e);
+    } finally {
+        arrowBufferPool.close();
     }
-    arrowBufferPool.close();
 }
Suggestion importance[1-10]: 9


Why: Critical resource management issue. Without a finally block, if removeSettings throws an exception, arrowBufferPool.close() won't be called, leading to resource leaks. This is a serious bug that could cause memory leaks in production.

High
Add JNI exception checking

The JNI method calls lack proper exception handling. If a Java exception occurs
during call_method, it should be checked and cleared before returning an error to
prevent JVM crashes or undefined behavior.

modules/parquet-data-format/src/main/rust/src/lib.rs [372-418]

 fn read_native_settings(env: &mut JNIEnv, obj: &JObject) -> Result<NativeSettings, Box<dyn std::error::Error>> {
     macro_rules! get_boxed_int {
         ($method:expr) => {{
-            let jobj = env.call_method(obj, $method, "()Ljava/lang/Integer;", &[])?.l()?;
+            let jobj = env.call_method(obj, $method, "()Ljava/lang/Integer;", &[])?;
+            if env.exception_check()? {
+                env.exception_clear()?;
+                return Err("Java exception during method call".into());
+            }
+            let jobj = jobj.l()?;
             if jobj.is_null() {
                 None
             } else {
                 Some(env.call_method(&jobj, "intValue", "()I", &[])?.i()?)
             }
         }};
     }
     ...
 }
Suggestion importance[1-10]: 7


Why: The suggestion to add JNI exception checking is valid for robustness, but the current code already uses the ? operator which propagates errors. However, explicit exception checking could prevent potential JVM crashes. The impact is moderate since the existing error handling may be sufficient in most cases.

Medium
General
Propagate settings synchronization failures

Silently catching and logging the exception when pushing settings to Rust may lead
to inconsistent state between Java and Rust layers. Consider propagating the
exception or implementing a retry mechanism to ensure settings synchronization.

modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/engine/ParquetExecutionEngine.java [107-121]

-private void pushSettingsToRust(IndexSettings indexSettings) {
+private void pushSettingsToRust(IndexSettings indexSettings) throws IOException {
     NativeSettings config = new NativeSettings();
     config.setIndexName(indexSettings.getIndex().getName());
     config.setCompressionType(indexSettings.getValue(ParquetSettings.COMPRESSION_TYPE));
     config.setCompressionLevel(indexSettings.getValue(ParquetSettings.COMPRESSION_LEVEL));
     config.setPageSizeBytes(indexSettings.getValue(ParquetSettings.PAGE_SIZE_BYTES).getBytes());
     config.setPageRowLimit(indexSettings.getValue(ParquetSettings.PAGE_ROW_LIMIT));
     config.setDictSizeBytes(indexSettings.getValue(ParquetSettings.DICT_SIZE_BYTES).getBytes());
     config.setRowGroupSizeBytes(indexSettings.getValue(ParquetSettings.ROW_GROUP_SIZE_BYTES).getBytes());
-    try {
-        RustBridge.onSettingsUpdate(config);
-    } catch (Exception e) {
-        logger.error("Failed to push Parquet settings to Rust store", e);
-    }
+    RustBridge.onSettingsUpdate(config);
 }
Suggestion importance[1-10]: 6


Why: Valid concern about silent failure handling, but the current approach of logging errors may be intentional to prevent initialization failures. Propagating the exception could cause the entire engine initialization to fail. The suggestion has merit but needs careful consideration of the failure handling strategy.

Low

Previous suggestions

Suggestions up to commit c1ebbbf
General
Handle cleanup failures properly

The close() method should not silently swallow exceptions during cleanup. If
settings removal fails, it may indicate a resource leak in the Rust layer. Consider
rethrowing the exception or at least logging it at ERROR level to ensure visibility
of potential issues.

modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/engine/ParquetExecutionEngine.java [184-188]

 try {
     RustBridge.removeSettings(indexSettings.getIndex().getName());
 } catch (Exception e) {
-    logger.warn("Failed to remove Parquet settings from Rust store for index [{}]", indexSettings.getIndex().getName(), e);
+    logger.error("Failed to remove Parquet settings from Rust store for index [{}]", indexSettings.getIndex().getName(), e);
+    throw new IOException("Failed to cleanup Parquet settings", e);
 }
Suggestion importance[1-10]: 5


Why: The suggestion to change logging level from warn to error and rethrow the exception is reasonable for visibility, but the close() method signature doesn't declare throws IOException in the existing code, making the improved code incompatible. The suggestion has merit for error handling but needs adjustment to match the method signature.

Low
Log when using default settings

Using unwrap_or_default() silently falls back to default settings when an index's
configuration is missing. This could mask configuration errors where settings were
expected but not found. Consider logging a warning when falling back to defaults to
aid debugging.

modules/parquet-data-format/src/main/rust/src/lib.rs [57-61]

 let config: NativeSettings = SETTINGS_STORE
     .get(&index_name)
     .map(|r| r.clone())
-    .unwrap_or_default();
+    .unwrap_or_else(|| {
+        log_info!("[RUST] No settings found for index '{}', using defaults", index_name);
+        NativeSettings::default()
+    });
Suggestion importance[1-10]: 4


Why: Adding logging when falling back to default settings is a minor improvement for debugging and observability. However, this is a relatively low-impact change that only enhances logging without addressing any functional issues or bugs.

Low
Suggestions up to commit 0cd71c7
Possible issue
Propagate settings initialization failures

The method silently swallows exceptions when pushing settings to Rust fails. This
could lead to inconsistent state where Java believes settings are applied but Rust
uses defaults. Consider throwing a runtime exception or returning a boolean to
indicate failure, allowing the caller to handle initialization errors appropriately.

modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/engine/ParquetExecutionEngine.java [107-121]

 private void pushSettingsToRust(IndexSettings indexSettings) {
     NativeSettings config = new NativeSettings();
     config.setIndexName(indexSettings.getIndex().getName());
     config.setCompressionType(indexSettings.getValue(ParquetSettings.COMPRESSION_TYPE));
     config.setCompressionLevel(indexSettings.getValue(ParquetSettings.COMPRESSION_LEVEL));
     config.setPageSizeBytes(indexSettings.getValue(ParquetSettings.PAGE_SIZE_BYTES).getBytes());
     config.setPageRowLimit(indexSettings.getValue(ParquetSettings.PAGE_ROW_LIMIT));
     config.setDictSizeBytes(indexSettings.getValue(ParquetSettings.DICT_SIZE_BYTES).getBytes());
     config.setRowGroupSizeBytes(indexSettings.getValue(ParquetSettings.ROW_GROUP_SIZE_BYTES).getBytes());
     try {
         RustBridge.onSettingsUpdate(config);
     } catch (Exception e) {
         logger.error("Failed to push Parquet settings to Rust store", e);
+        throw new RuntimeException("Failed to initialize Parquet settings for index: " + indexSettings.getIndex().getName(), e);
     }
 }
Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies that exceptions are being swallowed during settings initialization, which could lead to inconsistent state. However, the improved_code is identical to the existing_code except for adding a throw statement, and the issue is about error handling strategy rather than a critical bug. The score reflects moderate importance for improving error handling.

Medium
General
Validate JNI object type

The JNI method calls in read_native_settings lack validation of the Java object type
before invoking methods. If an incorrect object type is passed, this will cause a
JNI error at runtime. Add type checking using env.is_instance_of() at the function
entry to fail fast with a clear error message.

modules/parquet-data-format/src/main/rust/src/lib.rs [372-419]

 fn read_native_settings(env: &mut JNIEnv, obj: &JObject) -> Result<NativeSettings, Box<dyn std::error::Error>> {
+    // Validate object type before accessing methods
+    let class = env.find_class("com/parquet/parquetdataformat/bridge/NativeSettings")?;
+    if !env.is_instance_of(obj, &class)? {
+        return Err("Invalid object type: expected NativeSettings".into());
+    }
+    
     macro_rules! get_boxed_int {
         ($method:expr) => {{
             let jobj = env.call_method(obj, $method, "()Ljava/lang/Integer;", &[])?.l()?;
             if jobj.is_null() {
                 None
             } else {
                 Some(env.call_method(&jobj, "intValue", "()I", &[])?.i()?)
             }
         }};
     }
     ...
 }
Suggestion importance[1-10]: 6


Why: The suggestion proposes adding type validation for JNI objects to fail fast with clear errors. While this is a valid improvement for robustness, it's a defensive programming practice rather than fixing an actual bug. The existing_code snippet doesn't fully match the actual code structure in the diff (which shows the complete function implementation), and this is primarily a type checking enhancement.

Low
Suggestions up to commit 58f7c11
Possible issue
Fix incorrect DashMap usage

The code attempts to lock and dereference SETTINGS_STORE, which is a DashMap, not a
Mutex. DashMap doesn't have a lock() method. This will cause a compilation error.
Use SETTINGS_STORE.insert() instead to add settings to the concurrent map.

modules/parquet-data-format/src/main/rust/src/lib.rs [843-846]

-*SETTINGS_STORE.lock().unwrap() = NativeSettings {
+SETTINGS_STORE.insert("test-index".to_string(), NativeSettings {
     compression_type: Some("ZSTD".to_string()),
     ..Default::default()
-};
+});
Suggestion importance[1-10]: 10

__

Why: The code attempts to call lock() on SETTINGS_STORE, which is a DashMap, not a Mutex. This is a critical compilation error that will prevent the code from building. The suggestion correctly identifies the issue and provides the proper insert() method usage.

High
Add missing index_name parameter

The create_writer method signature was changed to require an index_name parameter,
but this test call is missing it. This will cause a compilation error. Add the
missing index_name parameter to match the updated signature.

modules/parquet-data-format/src/main/rust/src/lib.rs [847-850]

-if NativeParquetWriter::create_writer(filename.clone(), schema_ptr).is_ok() {
+if NativeParquetWriter::create_writer(filename.clone(), "test-index".to_string(), schema_ptr).is_ok() {
     success_count.fetch_add(1, Ordering::SeqCst);
     let _ = NativeParquetWriter::close_writer(filename);
 }
Suggestion importance[1-10]: 10

__

Why: The create_writer method signature was updated to require an index_name parameter, but this test call is missing it. This is a critical compilation error that will prevent the tests from running. The fix is necessary and correct.

High
General
Ensure proper cleanup order

The removeSettings call in the close() method should be executed before closing the
arrowBufferPool to ensure proper cleanup order. If the buffer pool close fails,
settings won't be removed. Move this call before the buffer pool cleanup.

modules/parquet-data-format/src/main/java/com/parquet/parquetdataformat/engine/ParquetExecutionEngine.java [183-190]

-RustBridge.removeSettings(indexSettings.getIndex().getName());
+try {
+    RustBridge.removeSettings(indexSettings.getIndex().getName());
+} catch (Exception e) {
+    logger.warn("Failed to remove Parquet settings from Rust store for index [{}]", indexSettings.getIndex().getName(), e);
+}
+arrowBufferPool.close();
Suggestion importance[1-10]: 4


Why: While the suggestion to reorder cleanup operations has some merit for ensuring settings are removed before buffer pool closure, the current implementation already handles exceptions properly. The impact is minor since both operations are independent and the existing order is acceptable.

Low

github-actions bot:

❌ Gradle check result for 58f7c11: FAILURE

darjisagar7 force-pushed the IndexSettingCompression branch from 58f7c11 to 0cd71c7 on February 23, 2026 at 05:41
github-actions bot:

Persistent review updated to latest commit 0cd71c7

github-actions bot:

❌ Gradle check result for 0cd71c7: FAILURE

Signed-off-by: Sagar Darji <darjisagar7@gmail.com>
darjisagar7 force-pushed the IndexSettingCompression branch from 0cd71c7 to c1ebbbf on February 24, 2026 at 06:19
github-actions bot:

Persistent review updated to latest commit c1ebbbf

github-actions bot:

❌ Gradle check result for c1ebbbf: FAILURE

darjisagar7 force-pushed the IndexSettingCompression branch from c1ebbbf to ee3dde9 on February 24, 2026 at 07:32
github-actions bot:

Persistent review updated to latest commit ee3dde9

bharath-techie merged commit 19e823e into opensearch-project:feature/datafusion on Feb 24, 2026 (8 of 31 checks passed).
github-actions bot:

❌ Gradle check result for ee3dde9: null
