Skip to content

Conversation

@ming1
Copy link
Collaborator

@ming1 ming1 commented Sep 9, 2025

  • support to handle multiple queues/device from single thread context,
    add examples/multi.rs to show the usage, both io and control use async/.await way

  • add MultiQueueManager for simplifying target implementation

  • unify ctrl async vs. non-async

  • refactor, cleanup

ming1 added 30 commits September 7, 2025 01:13
Implement Phase 2 of io.rs cleanup by consolidating documentation
to reduce redundancy and improve maintainability.

Changes:
- Add comprehensive module-level documentation with complete examples
  for ring initialization (basic/advanced) and unified buffer APIs
- Simplify individual method documentation by removing redundant
  examples and referencing module-level docs instead:
  * ublk_init_task_ring(): Remove 48 lines of duplicate examples
  * submit_io_cmd_unified(): Remove verbose usage examples
  * submit_fetch_commands_unified(): Remove redundant buffer examples
  * complete_io_cmd_unified(): Streamline validation/performance docs
- Clean up outdated comments:
  * Remove historical "Previously had separate TASK_URING" comment
  * Remove outdated "todo: apply io_uring flags" comment

Benefits:
- Single source of truth for documentation examples
- Better discoverability with comprehensive module-level docs
- Reduced maintenance burden from duplicate examples
- Improved readability with focused method documentation
- All functionality preserved with better organization

The documentation now follows Rust best practices with comprehensive
module docs and concise method docs that reference the examples.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Update the module-level documentation example for traditional buffer
operations to use IoBuf instead of raw arrays, which is more consistent
with the library's established patterns and best practices.

Changes:
- Replace raw array `[0u8; 4096]` with `IoBuf::<u8>::new(4096)`
- Use `BufDesc::from_io_buf()` helper method for proper conversion
- Add missing `use libublk::helpers::IoBuf;` import

This example now demonstrates the recommended approach for traditional
buffer operations and aligns with how IoBuf is used throughout the
codebase and other examples.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
This commit implements a complete multi-queue system enabling handling of
multiple ublk queues (from same or different devices) within a single thread
context using slab-based queue management and io_uring event routing.

Key Features:
- Thread-local slab storage for queue references with 10-bit addressing
- Enhanced user_data encoding with queue slab keys (bits 48-57)
- Multi-queue event routing and completion handling
- Unified UblkUringOpFuture API with backward compatibility
- Queue registration/unregistration APIs
- Multi-queue manager for coordinated queue handling

API Changes:
- Added UblkUringOpFuture::new_multi(tag, queue_slab_key, tgt_io)
- Enhanced user_data bit layout: tag(0-15), future_key(16-47), queue_key(48-57), target_io(63)
- Added slab_key module with constants and validation functions
- Add MultiQueueManager for managing queues handled in single thread
  context
- Maintained backward compatibility with existing UblkUringOpFuture::new()

Implementation Details:
- Slab key encoding only applied to async/.await operations
- Internal queue operations use simplified user_data encoding
- 10-bit slab key system: 0-1020 valid, 1021 reserved, 1022 control commands, 1023 unused
- Thread-safe queue lookup and routing via slab-based addressing
- Full integration with existing buffer management and control operations

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
…e-thread handling

This commit adds a new API `ublk_handle_ios_in_current_thread()` that serves as the
multi-queue version of `ublk_wait_and_handle_ios()`, enabling handling of multiple
ublk queues within a single thread context using a single per-task io_uring.

Key Features:
- Multi-queue event loop driven by single per-task io_uring
- Automatic queue routing using slab keys from user_data
- Timeout handling with 20-second idle detection for all managed queues
- Error handling with automatic queue cleanup on failures
- CQE processing with async task wakeup and command handler invocation
- Queue lifecycle management with automatic removal when done

API Changes:
- Added ublk_handle_ios_in_current_thread(manager, exe, cmd_handler)
- Added public wrapper methods to UblkQueue for multi-queue support:
  - enter_queue_idle_multi() - for timeout handling
  - exit_queue_idle_multi() - for state management
  - update_state_multi() - for CQE handling
  - queue_is_done_multi() - for lifecycle management
- Added UblkIOCtx::is_io_command_multi() for command detection
- Added private handle_incoming_cqes() helper for cleaner code organization

Implementation Details:
- Follows the same patterns as UblkQueue::__wait_ios() but handles multiple queues
- Uses existing MultiQueueManager for queue registration and routing
- Integrates with existing ublk_wake_task() and async infrastructure
- Maintains full compatibility with existing single-queue APIs
- Provides proper error handling and resource cleanup

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
This commit implements a comprehensive solution for the multi-queue file
registration vs. single io-uring problem where multiple queues couldn't
share a single io_uring instance due to register_files() and
register_buffers_sparse() only being callable once.

- Each UblkQueue::__new() was calling register_files() individually
- io_uring only allows these registration calls once per instance
- Multi-queue scenarios failed when creating subsequent queues

- QueueResourceManager: Thread-local manager accumulating all queue resources
- QueueResourceRange: Tracks each queue's portion of global file/buffer tables
- Batch registration: Single call to register all accumulated resources

- RESOURCE_MANAGER: Thread-local storage for resource accumulation
- Uses existing QUEUE_RING thread-local io_uring instance
- Index translation: Local queue indices mapped to global table indices

- register_resources(): Triggers batch registration after queue creation
- add_queue_files_and_buffers(): Manual resource accumulation method
- Lifecycle management ensures registration before operations

- Conditional registration: Single-queue uses direct registration (backward compatible)
- Multi-queue mode: Accumulates resources in central manager instead
- Index translation methods: translate_file_index() and translate_buffer_index()
- Resource range storage: Each queue knows its allocated range

- ✅ Single registration: Eliminates "can't register twice" problem
- ✅ Resource efficiency: Consolidated file/buffer tables
- ✅ Index transparency: Queues use local indices (0,1,2...)
- ✅ Backward compatibility: Single-queue usage unchanged
- ✅ Thread safety: Leverages existing thread-local patterns

```rust
let mut manager = MultiQueueManager::new();
for q_id in 0..4 {
    let queue = UblkQueue::new_multi(q_id, dev, &mut manager)?;
}
manager.register_resources()?; // Batch register all resources
```

- src/io.rs: Core resource management infrastructure and UblkQueue updates
- src/multi_queue.rs: MultiQueueManager resource registration methods
- examples/multi_queue.rs: Comprehensive usage example

- Added comprehensive test suite covering resource accumulation
- Index translation verification
- MultiQueueManager integration tests
- Example demonstrates working multi-queue creation

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
- Add new src/uring.rs module with UblkUring struct
- Move UringResourceManager from io.rs to uring.rs
- Implement unified thread-local UBLK_URING instance
- Ensure proper Drop order: resource_manager before ring
- Add accessor methods for ring and resource_manager

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
- Remove QUEUE_RING and URING_RESOURCE_MANAGER thread locals
- Remove UringResourceManager struct from io.rs (moved to uring.rs)
- Update all accessor functions to use UBLK_URING
- Update macros with_queue_ring_internal and with_queue_ring_mut_internal
- Update ublk_init_task_ring to use unified structure
- Preserve all existing API compatibility

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
- Convert examples/multi_queue.rs into comprehensive test suite in src/uring.rs
- Add test_multi_queue_resource_manager_integration() with full validation
- Add test_uring_resource_manager_lifecycle() for basic lifecycle testing
- Add test_ublk_uring_drop_ordering() for cleanup verification
- Remove standalone example file - functionality now tested in unit tests

- Multi-queue resource accumulation and batch registration
- Resource range allocation and non-overlap verification
- Index translation functionality testing
- Resource manager state validation through UBLK_URING
- Queue lookup and cleanup lifecycle testing
- Drop ordering and cleanup verification

- Better test coverage for resource manager functionality
- Integration testing within the crate's test suite
- No external example dependencies
- Comprehensive validation of unified UblkUring implementation

All 42 unit tests + 15 integration tests + 11 doc tests pass.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
- Add UBLK_F_AUTO_BUF_REG flag to test_multi_queue_resource_manager_integration
- Set queue depth to 64 and verify buffer resource allocation
- Add comprehensive buffer range validation:
  * Each queue gets exactly 64 buffers (queue_depth)
  * Buffer ranges are sequential and non-overlapping
  * Total buffer count is nr_queues * queue_depth (256)
  * Buffer index translation works correctly
- Verify buffer start indices: queue_id * queue_depth
- Test buffer index translation for multiple local indices (0, 10, 63)
- Add detailed assertions for buffer resource management

✅ Buffer ranges properly allocated: 0-63, 64-127, 128-191, 192-255
✅ Index translation verified: local 0->global 0, local 10->global 10, etc.
✅ Total buffer count: 256 (4 queues × 64 buffers)
✅ No buffer range overlaps between queues
✅ Auto buffer registration working correctly

All uring tests pass with enhanced buffer resource verification.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Major improvements to the unified io_uring and resource management:

- **Streamlined function organization**: Move with_uring_resource_manager to uring.rs module
- **Simplified imports**: Direct imports in io.rs from uring module
- **Enhanced queue creation**: Separate resource registration logic for single vs multi-queue modes
- **Cleaner API boundaries**: Better separation between io.rs and uring.rs responsibilities

- **Optimized resource management**: Use uring module functions directly in MultiQueueManager
- **Improved error handling**: Better validation for resource registration state
- **Enhanced queue lifecycle**: Cleaner resource allocation and cleanup patterns

- **Reduced duplication**: Eliminate redundant helper functions in io.rs
- **Better encapsulation**: Resource management logic properly contained in uring module
- **Improved maintainability**: Clearer separation of concerns between modules

This cleanup builds on the unified UBLK_URING foundation to provide
a more maintainable and efficient implementation while preserving
all existing functionality and API compatibility.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
…ions

- Replace hardcoded Fixed() with translate_file_index() in UblkQueue
- Replace hardcoded buf index with translate_buffer_index() in UblkQueue
- Fix submit_io_cmd_unified() to use proper file descriptor index
- Fix submit_uring_cmd() to use proper file descriptor index
- Ensures correct file descriptor mapping in multi-queue scenarios
- Critical for proper resource isolation between queues

This fixes a bug where multi-queue operations would always use the
first registered file descriptor instead of the queue-specific one.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
This commit consolidates the multi-queue buffer handling and zero-copy
implementation by:

- Refactoring MultiQueueManager to use Slab-based queue registry for stable keys
- Removing thread-local QUEUE_SLAB storage in favor of owned queue management
- Simplifying UblkQueue::new_multi() API by moving resource allocation to manager
- Adding MultiQueueManager::create_queue() for integrated queue creation and registration
- Updating queue lookup patterns to use manager-scoped references
- Improving resource cleanup and lifetime management

This unification reduces complexity in the multi-queue path while maintaining
the same performance characteristics for buffer registration and zero-copy operations.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
- Implement iter(), iter_mut(), values(), keys() methods
- Add IntoIterator trait implementations for borrowed references
- Add comprehensive documentation with iteration examples
- Add test coverage for iterator functionality
- Support multiple iteration patterns (keys+values, values only, keys only)
- Maintain stable slab key iteration

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Replace manual queue lookups with direct iterator usage in
test_multi_queue_resource_manager_integration():

- Use manager.iter() instead of get_queue_keys() + get_queue_by_key()
- Eliminate redundant queue lookups and type conversions
- Improve code readability and reduce error-prone manual indexing
- Leverage u16 slab keys returned by the iterator for consistency

This change takes advantage of the comprehensive iterator support
added to MultiQueueManager, making test code more concise and maintainable.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Apply MultiQueueManager on test_create_ublk_async() for supporting to
handle multiple queues via single pthread.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Break down the monolithic test_multi_queue_resource_manager_integration()
into focused, reusable helper functions:

- verify_queue_creation(): Validates queue creation and resource allocation
- verify_no_resource_overlap(): Checks resource range overlap prevention
- verify_index_translation(): Tests file/buffer index translation
- verify_resource_manager_state(): Verifies UBLK_URING resource manager state
- verify_queue_lookup(): Tests global queue lookup functionality

Benefits:
- Improved readability with clear separation of concerns
- Better maintainability through single-responsibility functions
- Reduced code duplication with reusable verification logic
- Enhanced debuggability with focused test scope per function
- Easier extension for future multi-queue test scenarios

The main test function now follows a clean flow: initialize, create queues,
register resources, and run comprehensive verification tests.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
…nager

Major architectural refactoring to eliminate thread-local dependencies
and improve resource management encapsulation.

- Add embedded `UringResourceManager` field to own resources directly
- Update constructor to initialize embedded resource manager
- Implement resource registration through manager-owned instance
- Add test-only getter methods for internal state access
- Enhanced Drop implementation with resource state warnings

- Remove UringResourceManager from UblkUring structure
- Make UringResourceManager::new() public for manager usage
- Eliminate thread-local resource management complexity
- Remove obsolete helper functions and cleanup methods
- Update tests to use manager-based patterns

- Remove global resource management functions
- Update queue resource addition to use manager methods
- Simplify queue cleanup to rely on MultiQueueManager
- Remove thread-local function dependencies
- Fix test compilation with new architecture

- Add #[ignore] attribute to integration test requiring kernel support
- Update all tests to use manager-owned resource patterns
- Remove direct access to private resource manager fields
- Ensure all tests compile and pass without hanging

- ✅ Eliminated thread-local dependencies
- ✅ Improved encapsulation with clear ownership semantics
- ✅ Enhanced testability through direct manager access
- ✅ Simplified API without hidden global state
- ✅ Better flexibility for different resource policies

- Maintains backward compatibility for public APIs
- No breaking changes for existing user code
- All tests passing (42 unit + 15 integration + 15 doc tests)
- Fixed hanging test issue with proper #[ignore] annotation

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Add MultiQueueManager::unregister_resources() and call it in Drop()
automatically.

Meantime add test_multi_queue_resource_manager_integration() back.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
… coverage

- Modified device_handler_async() to accept eventfd parameter and write completion signal
- Updated test_create_ublk_async() to create eventfd and pass to ublk_block_on_ctrl_tasks()
- Enhanced ublk_block_on_ctrl_tasks() to handle eventfd reads with proper CQE filtering
- Consolidated async task management by replacing ublk_join_tasks() with ublk_block_on_ctrl_tasks()
- Added eventfd handling path coverage for async task execution framework

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Add UblkCtrl::set_thread_affinity() API for setting queue thread
affinity.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Export it, which is still helpful for target code.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Extract common command data preparation logic for ADD_DEV into
prepare_add_cmd() method. Both sync and async variants now use
the same preparation logic, reducing code duplication while
maintaining identical behavior.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Extract common command data preparation logic for DEL_DEV into
prepare_del_cmd() method. All three deletion variants now use
the same preparation logic with a force_async parameter,
reducing code duplication while maintaining identical behavior.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Extract common command data preparation logic for:
- GET_FEATURES: prepare_get_features_cmd()
- GET_DEV_INFO/GET_DEV_INFO2: prepare_read_dev_info_cmd()

Both sync and async variants now use the same preparation logic,
reducing code duplication while maintaining identical behavior.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Extract common command data preparation logic for:
- START_DEV: prepare_start_cmd()
- STOP_DEV: prepare_stop_cmd()
- GET_PARAMS: prepare_get_params_cmd()
- SET_PARAMS: prepare_set_params_cmd()

Both sync and async variants now use the same preparation logic,
reducing code duplication while maintaining identical behavior.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Extract common command data preparation logic for:
- GET_QUEUE_AFFINITY: prepare_get_queue_affinity_cmd()
- START_USER_RECOVERY: prepare_start_user_recover_cmd()
- END_USER_RECOVERY: prepare_end_user_recover_cmd()

Both sync and async variants now use the same preparation logic,
reducing code duplication while maintaining identical behavior.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Extract common JSON building logic into helper methods:
- build_json_internal() and build_json_internal_async() for data generation
- update_json_queue_tids() for shared TID update logic

Both sync and async versions now share validation and data
preparation logic while maintaining separate execution paths.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Extract common parameter validation logic into validate_new_params()
method. Both sync and async constructors now share identical
validation logic, ensuring consistent parameter checking while
maintaining separate initialization paths.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Extract common device info printing logic into dump_device_info()
helper method. Both sync and async dump methods now share identical
formatting and display logic while maintaining separate data
collection paths.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
- async command should run in async environment with UBLK_CTRL_ASYNC_AWAIT
set.

- sync command should run in sync environment without UBLK_CTRL_ASYNC_AWAIT

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
The assertion was incorrectly checking if local_index < buffer_count,
but it should verify that the index is within the valid range starting
from buffer_start_index. This fixes buffer translation for cases where
the buffer range doesn't start at index 0.

Fix incorrect check in case of not having buffer range.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Add multi-device example that demonstrates creating multiple ublk devices
in a single process with shared queue threads. Each queue thread handles
queue `i` across all devices using the MultiQueueManager.

Key features:
- Per-thread queue architecture where each thread manages queue `i` for all devices
- Synchronization between main task (creates queue threads) and minor tasks
- Proper queue affinity setting for all device control instances
- Ensures all devices dump complete queue affinity information

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Remove UblkUring wrapper struct and OnceCell dependency, replacing them
with direct RefCell<Option<IoUring<squeue::Entry>>> thread-local storage.

Key changes:
- Remove UblkUring struct and its methods
- Define UBLK_URING as RefCell<Option<IoUring<squeue::Entry>>> directly
- Add helper functions with_ring(), with_ring_mut(), set_ring()
- Update ublk_init_task_ring() to only initialize when None
- Simplify UblkQueue::new() initialization logic
- Update all usage sites and tests

Benefits:
- Leverages thread_local's built-in lazy initialization
- Custom initialization only allowed when ring is None
- Cleaner semantics: once initialized, ring is used as-is
- Reduced code complexity and better performance

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants