- 
          
 - 
                Notifications
    
You must be signed in to change notification settings  - Fork 248
 
Refactor with pool removal #787
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
          
     Open
      
      
            JohannesLichtenberger
  wants to merge
  81
  commits into
  main
  
    
      
        
          
  
    
      Choose a base branch
      
     
    
      
        
      
      
        
          
          
        
        
          
            
              
              
              
  
           
        
        
          
            
              
              
           
        
       
     
  
        
          
            
          
            
          
        
       
    
      
from
refactor-with-pool-removal
  
      
      
   
  
    
  
  
  
 
  
      
    base: main
Could not load branches
            
              
  
    Branch not found: {{ refName }}
  
            
                
      Loading
              
            Could not load tags
            
            
              Nothing to show
            
              
  
            
                
      Loading
              
            Are you sure you want to change the base?
            Some commits from the old base branch may be removed from the timeline,
            and old review comments may become outdated.
          
          
                
     Open
            
            
          Conversation
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
    - Updated all JSON node types (OBJECT, ARRAY, OBJECT_KEY, STRING_VALUE, NUMBER_VALUE, etc.) to use uniform MemorySegment-based deserialization pattern - Implemented lazy loading for all value types (strings, numbers, booleans, nulls) - Nodes now deserialize using layout-based slicing for better performance - Removed ~100 lines of unused helper methods from NodeKind - Fixed AbstractStringNode hash computation to use toByteArray() instead of getDestination() - All JSON nodes now follow the same pattern as OBJECT and ARRAY for consistency - Build verified successful with no compilation errors
…ialization - Add size prefix (4 bytes) after NodeKind byte to avoid reading variable-sized data - Use 8-byte aligned headers (NodeKind + size + 3-byte padding) for proper alignment - Add end padding to ensure each node's total size is multiple of 8 - Switch all JSON nodes to UNALIGNED VarHandles for compatibility with factory-created nodes - Fix ObjectKeyNode to include 4-byte internal padding before hash field - Fix JsonNodeFactoryImpl to write internal padding when creating ObjectKeyNode - Fix setBooleanValue to handle both BooleanNode and ObjectBooleanNode types - Remove complex size calculation methods (calculateStopBitDataSize, calculateNumberDataSize) Benefits: - No double-reading of variable-sized content (strings, numbers) - Faster deserialization with direct MemorySegment slicing - Simpler, more maintainable code - Tests: PathSummaryTest and JsonNodeTrxGetPreviousRevisionNumberTest passing
…ules The net.openhft.hashing library needs access to sun.nio.ch.DirectBuffer when hashing DirectByteBuffer instances created from MemorySegments. Without these --add-opens flags, tests fail with IllegalAccessError. This fix allows: - Access to sun.nio.ch for DirectBuffer operations - Access to java.nio for ByteBuffer operations Tests now pass successfully.
…dding format - Add NodeKind byte before size prefix - Use 3 bytes padding (total 8 bytes with NodeKind) - Skip NodeKind byte before deserialize - Tests now pass with proper 8-byte alignment
…adding format - Fixed StringNodeTest, NumberNodeTest, BooleanNodeTest, NullNodeTest - Fixed ObjectNumberNodeTest, ObjectStringNodeTest, ObjectBooleanNodeTest, ObjectNullNodeTest, ObjectKeyNodeTest - Corrected serialization order for value nodes (siblings before/after value depending on node type) - All JSON node tests now pass with proper 8-byte alignment
- Created JsonNodeTestHelper with writeHeader(), writeEndPadding(), updateSizePrefix(), and finalizeSerialization() methods - Updated all 11 JSON node tests to use the helper methods - Reduced ~20 lines of duplicated code per test to 1-2 lines - Tests remain fully passing
…izer class - Created JsonNodeSerializer in main source with writeSizePrefix(), readSizePrefix(), writeEndPadding(), updateSizePrefix(), and calculateEndPadding() - Removed duplicate private methods from NodeKind.java - Updated NodeKind.java to use JsonNodeSerializer methods - Updated JsonNodeTestHelper to delegate to JsonNodeSerializer - Eliminated code duplication between production and test code - All tests still pass
- Added NodeKind byte before serialization in all 4 round-trip tests - Added bytesIn.readByte() to skip NodeKind byte before deserialization - Ensures proper 8-byte alignment for MemorySegment access - All 17 tests now pass
- Added serializeNumber() and deserializeNumber() static methods to NodeKind - Added helper methods serializeBigInteger() and deserializeBigInteger() - Updated NUMBER_VALUE and OBJECT_NUMBER_VALUE serialization to use shared methods - Removed duplicate serialization/deserialization code from NumberNode - Removed duplicate serialization/deserialization code from ObjectNumberNode - Both node types now use centralized logic from NodeKind for consistency
…obal() - Updated both constructors to use Arena.ofAuto() for automatic memory management - Arena.ofAuto() automatically releases memory when no longer reachable - Improves memory management by allowing automatic cleanup instead of global lifetime
…rializeNumber() - Changed NumberNode.serializeNumber() to NodeKind.serializeNumber() - Changed ObjectNumberNode.serializeNumber() to NodeKind.serializeNumber() - Fixes compilation errors after refactoring number serialization to NodeKind
…y offset - Changed serializeDelegateWithoutIDs to use putVarLong instead of writeLong - Changed deserializeNodeDelegateWithoutIDs to use getVarLong instead of readLong - This fixes JsonRedBlackTreeIntegrationTest failures - RB nodes (CASRB, PATHRB, NAMERB, RB_NODE_VALUE) need variable-length encoding for efficient storage since parent key offsets are typically small values
- Revert GrowingMemorySegment to use Arena.ofAuto() by default * Nodes store MemorySegment references that outlive BytesOut instances * Arena.ofAuto() allows GC to manage cleanup when segments become unreachable * Prevents premature deallocation bugs - Add Arena parameter constructors for explicit arena control * GrowingMemorySegment(Arena, int) for custom arena * MemorySegmentBytesOut(Arena, int) for custom arena * Enables using confined arenas for temporary buffers with clear lifecycles - Optimize KeyValueLeafPage.processEntries() with Arena.ofConfined() * Use confined arena for temporary serialization buffers * Normal records: data copied to slotMemory, temp buffer freed immediately * Overflow records: explicitly copied to Arena.global() for persistence * Provides immediate memory cleanup for ~99% of serialization operations This hybrid approach balances manual control (where beneficial) with automatic management (where lifecycles are complex). All tests pass.
- Added serializeNumber() and deserializeNumber() static methods to NodeKind - Added helper methods serializeBigInteger() and deserializeBigInteger() - Updated NUMBER_VALUE and OBJECT_NUMBER_VALUE serialization to use shared methods - Removed duplicate serialization/deserialization code from NumberNode - Removed duplicate serialization/deserialization code from ObjectNumberNode - Both node types now use centralized logic from NodeKind for consistency
…obal() - Updated both constructors to use Arena.ofAuto() for automatic memory management - Arena.ofAuto() automatically releases memory when no longer reachable - Improves memory management by allowing automatic cleanup instead of global lifetime
…rializeNumber() - Changed NumberNode.serializeNumber() to NodeKind.serializeNumber() - Changed ObjectNumberNode.serializeNumber() to NodeKind.serializeNumber() - Fixes compilation errors after refactoring number serialization to NodeKind
…y offset - Changed serializeDelegateWithoutIDs to use putVarLong instead of writeLong - Changed deserializeNodeDelegateWithoutIDs to use getVarLong instead of readLong - This fixes JsonRedBlackTreeIntegrationTest failures - RB nodes (CASRB, PATHRB, NAMERB, RB_NODE_VALUE) need variable-length encoding for efficient storage since parent key offsets are typically small values
- Revert GrowingMemorySegment to use Arena.ofAuto() by default * Nodes store MemorySegment references that outlive BytesOut instances * Arena.ofAuto() allows GC to manage cleanup when segments become unreachable * Prevents premature deallocation bugs - Add Arena parameter constructors for explicit arena control * GrowingMemorySegment(Arena, int) for custom arena * MemorySegmentBytesOut(Arena, int) for custom arena * Enables using confined arenas for temporary buffers with clear lifecycles - Optimize KeyValueLeafPage.processEntries() with Arena.ofConfined() * Use confined arena for temporary serialization buffers * Normal records: data copied to slotMemory, temp buffer freed immediately * Overflow records: explicitly copied to Arena.global() for persistence * Provides immediate memory cleanup for ~99% of serialization operations This hybrid approach balances manual control (where beneficial) with automatic management (where lifecycles are complex). All tests pass.
…lization' into refactor-json-nodes-lazy-deserialization # Conflicts: # JsonShredderTest_testChicagoDescendantAxisParallel_2024_08_12_221741.jfr.zip # analysis-5-trxs.jfr
…alization - Replace delegate pattern with MemorySegment-backed storage using VarHandles - Core layout: 68 bytes (16B NodeDelegate + 32B StructNode + 20B NameNode) - Remove typeKey from serialized data (computed on-the-fly as 'xs:untyped' hash) - Add size prefix and padding for proper 8-byte alignment - Update ELEMENT serialization/deserialization in NodeKind - Update XmlNodeFactoryImpl to create MemorySegment-based ElementNode instances - Fix GrowingMemorySegment.grow() buffer overflow bug (copy only valid bytes) - Update ElementNodeTest and NodePageTest for new implementation All XML tests passing (271 tests, 47 skipped)
…rialization - Replace delegate pattern with MemorySegment-backed storage using VarHandles - Core layout: 36 bytes (16B NodeDelegate + 20B NameNode) - Value bytes stored separately (variable length) with compressed flag - Remove typeKey from serialized data (computed on-the-fly as 'xs:untyped' hash) - Add size prefix and padding for proper 8-byte alignment - Update ATTRIBUTE serialization/deserialization in NodeKind - Update XmlNodeFactoryImpl to create MemorySegment-based AttributeNode instances - Update AttributeNodeTest for new implementation All XML tests passing (271 tests, 47 skipped)
…rialization - Replace delegate pattern with MemorySegment-backed storage using VarHandles - Core layout: 36 bytes (16B NodeDelegate + 20B NameNode) - Add size prefix and padding for proper 8-byte alignment - Update NAMESPACE serialization/deserialization in NodeKind - Update XmlNodeFactoryImpl to create MemorySegment-based NamespaceNode instances - Update NamespaceNodeTest for new implementation All XML tests passing
…alization - Replace delegate pattern with MemorySegment-backed storage using VarHandles - Core layout: 32 bytes (16B NodeDelegate + 16B sibling keys) - Value data stored separately (not in MemorySegment) - Update COMMENT serialization/deserialization in NodeKind - Update XmlNodeFactoryImpl.createCommentNode for MemorySegment creation - Update CommentNodeTest for new implementation All XML tests passing (271 tests, 47 skipped)
…tion - Replace delegate pattern with MemorySegment-backed storage using VarHandles - Core layout: 68 bytes (16B NodeDelegate + 32B StructNode + 20B NameNode) - Optional fields: childCount (8B), hash (8B), descendantCount (8B) - Value data stored separately (not in MemorySegment) - Update PROCESSING_INSTRUCTION serialization/deserialization in NodeKind - Update XmlNodeFactoryImpl.createPINode for MemorySegment creation - Update PINodeTest for new implementation All XML tests passing (271 tests, 47 skipped)
…zation - Replace delegate pattern with MemorySegment-backed storage using VarHandles - Core layout: 32 bytes (16B NodeDelegate + 16B sibling keys) - Text nodes cannot have children, so no child-related fields - Value data stored separately (not in MemorySegment) - Update TEXT serialization/deserialization in NodeKind - Update XmlNodeFactoryImpl.createTextNode for MemorySegment creation - Update TextNodeTest for new implementation All XML tests passing (271 tests, 47 skipped)
- Store compressed bytes directly when value is compressed - Decompress only when getRawValue() is called - Clear compressed data after decompression to save memory - Maintain backward compatibility with uncompressed values Benefits: - Reduces memory pressure for compressed text values - Defers decompression cost until value is actually accessed - Improves deserialization performance for nodes that are never read All XML tests passing (271 tests, 47 skipped)
… package - Renamed JsonNodeTestHelper to NodeTestHelper for broader applicability - Moved from io.sirix.node.json to io.sirix.node package - Now can be used by both JSON and XML node tests - Updated all references in JSON test files (11 files) - Updated all references in XML test files (ElementNodeTest, NamespaceNodeTest) - Added missing import statements to all affected test files Benefits: - Better code organization - test helper is now in the parent package - Can be reused across JSON and XML node tests - Reduces code duplication All tests passing
- PINode: Removed unused QNm import (method returns null anyway, but QNm is needed for return type) - CommentNode: Compression import is actually used for decompression, kept it - TextNode: All imports are used - ElementNode: All imports are used Note: During refactoring to MemorySegment-backed storage, some imports became unnecessary but most are still required for the new implementation. All tests passing
…torage - Remove childCount and descendantCount serialization/deserialization from all value nodes (StringNode, NumberNode, BooleanNode, NullNode, ObjectStringNode, ObjectNumberNode, ObjectBooleanNode, ObjectNullNode) as these are always leaf nodes - Fix ObjectKeyNode to properly store and retrieve descendantCount from MemorySegment instead of returning hardcoded value 1, enabling support for complex subtrees as values - Update memory layouts to place fixed-length fields before variable-length content (StringNode, ObjectStringNode, NumberNode, ObjectNumberNode) - Fix VarHandle offset calculation for descendantCount in ObjectKeyNode - Make increment/decrement methods for childCount and descendantCount no-ops in value nodes - Update JsonNodeFactoryImpl and NodeKind serialization to match new layouts - Update test data creation to match new serialization format - All 852 tests passing
- memory-leak-diagnostic.log - *.jfr (Java Flight Recorder files)
- Gradle: 8.10 → 9.1.0 - Java target: 22 → 25 - Kotlin: 2.2.10 → 2.2.20 - Mockito: 5.13.0 → 5.19.0 - ByteBuddy: 1.15.1 → 1.17.5 - Shadow plugin: 7.0.0 → 8.3.3 (com.gradleup.shadow) - Fix Gradle 9 compatibility: mainClassName → mainClass - Apply Shadow plugin only to sirix-rest-api - Update jvmToolchain to 25 in sirix-kotlin-api
- REMOVED: KeyValueLeafPagePool class and all references - REMOVED: KeyValueLeafPagePoolTest - CHANGED: KeyValueLeafPage now allocates segments directly from allocator - CHANGED: Added close() method to KeyValueLeafPage for segment cleanup - CHANGED: Cache removal listeners now call page.close() directly - CHANGED: RecordPageCache and PageCache use removalListener instead of evictionListener - CHANGED: Direct allocation in NodePageTrx, PageKind, PageUtils - FIXED: Allocator initialization in Databases.initAllocator() and freeAllocatedMemory() - UPDATED: Test files to use allocator directly instead of pool Benefits: - ~600 lines of code removed (pool + test + references) - Simpler architecture: Allocate → Use → Cache eviction → close() → GC - Pages own their segments: No complex pooling layer - Memory management via Caffeine cache + allocator, similar to UmbraDB approach
- Align all cache and allocator files with remote branch - Include diagnostic logging infrastructure - Add FFILz4Compressor implementation - Remove BufferManagerImpl.java.bak backup file - All core Java files now match remote branch exactly
- Set Java toolchain to version 25 across all subprojects - Configure Kotlin modules to use Java 25 toolchain (falls back to Java 24 target) - Add jvmTargetValidationMode.set(WARNING) in subprojects to allow Java 25/Kotlin 24 mismatch - Kotlin 2.2.20 doesn't yet fully support Java 25, so it compiles to Java 24 bytecode - Java modules compile with --enable-preview for Java 25 preview features - sirix-kotlin-api, sirix-rest-api, sirix-kotlin-cli now use jvmToolchain(25)
The Chicago dataset test is disabled for now to speed up test runs during development.
String Templates were removed from Java 25, converted to regular string concatenation. Changed STR."..." syntax to standard string concatenation with + operator.
The JUnit Platform Launcher is required for running JUnit 5 tests. Without it, test execution fails with 'Failed to load JUnit Platform' error.
Added JUnit Jupiter API, Engine, and Platform Launcher dependencies. These are required for running JUnit 5 tests with useJUnitPlatform().
  
    Sign up for free
    to join this conversation on GitHub.
    Already have an account?
    Sign in to comment
  
      
  Add this suggestion to a batch that can be applied as a single commit.
  This suggestion is invalid because no changes were made to the code.
  Suggestions cannot be applied while the pull request is closed.
  Suggestions cannot be applied while viewing a subset of changes.
  Only one suggestion per line can be applied in a batch.
  Add this suggestion to a batch that can be applied as a single commit.
  Applying suggestions on deleted lines is not supported.
  You must change the existing code in this line in order to create a valid suggestion.
  Outdated suggestions cannot be applied.
  This suggestion has been applied or marked resolved.
  Suggestions cannot be applied from pending reviews.
  Suggestions cannot be applied on multi-line comments.
  Suggestions cannot be applied while the pull request is queued to merge.
  Suggestion cannot be applied right now. Please check back later.
  
    
  
    
No description provided.