[QDP] Add a Quantum Data Loader and API refactor#1000
guan404ming merged 5 commits into apache:main
Conversation
[Before / After comparison screenshots]
cc @viiccwen |
|
Love the idea! When will the 0.6.0 release be? I think we should land it faster.
|
|
We could prepare it soon, I think. We can iterate on it quickly. |
guan404ming
left a comment
Design looks nice, left some questions about the implementation. Thanks!
_qdp: Optional[object] = None


def _get_qdp():
I'm not really sure if we need this. Could you explain more?
Yeah, you're right, we don't need it in this part.
I had intended the lazy import to defer loading the Rust extension until first use, but with the current design it doesn't actually achieve that.
After some investigation, I found this piece of code that seems to work for this issue. Could you help test it?
from functools import lru_cache

@lru_cache(maxsize=1)
def get_qdp():
    import _qdp
    return _qdp
I tested it; it doesn't seem to work. Maybe we could also try moving the import _qdp into an if TYPE_CHECKING: block and rewriting the type annotations as def func(x: "_qdp.Type"): to see if that works.
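For reference, a minimal sketch of that TYPE_CHECKING pattern (the function name and the `_qdp.QuantumTensor` / `_qdp.encode` names here are placeholders for illustration, not the actual module surface):

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import _qdp  # evaluated only by type checkers, never at runtime

def encode_batch(x: "_qdp.QuantumTensor") -> "_qdp.QuantumTensor":
    # The Rust extension is loaded lazily on the first real call.
    import _qdp
    return _qdp.encode(x)  # placeholder call, just to show the shape
```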
Got it! Thanks for the advice!
|
Please help resolve the CI error, thanks! |
Force-pushed from e833f41 to 27c6400
|
It failed in testing (CI didn't trigger these tests due to the lack of a CUDA environment). |
qdp/qdp-python/src/lib.rs
Outdated
We don't need this since PyTorch's default stream can report cuda_stream as 0.
I don't know whether your change caused this or not, just noticed it.
Then we can pass the testing. :D
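For what it's worth, a quick way to see that behaviour (assuming PyTorch with CUDA available; not part of this PR, just an illustration):

```python
import torch

if torch.cuda.is_available():
    stream = torch.cuda.current_stream()
    # PyTorch reports the default stream's raw CUDA handle as 0.
    print(stream.cuda_stream)
```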
fn create_synthetic_loader(
    &self,
    total_batches: usize,
    batch_size: usize,
    num_qubits: u32,
    encoding_method: &str,
    seed: Option<u64>,
) -> PyResult<PyQuantumLoader> {
    let config = PipelineConfig {
        device_id: 0,
        num_qubits,
        batch_size,
        total_batches,
        encoding_method: encoding_method.to_string(),
        seed,
        warmup_batches: 0,
    };
device_id is hard-coded to 0 here; the config should use the device_id that was actually passed in.
testing/qdp/test_benchmark_api.py
Outdated
@requires_qdp
def test_qdp_benchmark_validation():
    """QdpBenchmark.run_throughput() raises if qubits/batches not set."""
    import api

    with pytest.raises(ValueError, match="qubits and batches"):
        api.QdpBenchmark(device_id=0).run_throughput()
This lacks a test of run_latency() when qubits/batches are not set.
The tests also don't cover device_id propagation.
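A possible shape for those missing tests, as a sketch only (it assumes run_latency() raises the same ValueError and that QdpBenchmark keeps the device_id it was given on an attribute of that name):

```python
@requires_qdp
def test_qdp_benchmark_latency_validation():
    """QdpBenchmark.run_latency() raises if qubits/batches not set."""
    import api

    with pytest.raises(ValueError, match="qubits and batches"):
        api.QdpBenchmark(device_id=0).run_latency()


@requires_qdp
def test_qdp_benchmark_device_id_propagation():
    """device_id passed to QdpBenchmark should be kept rather than hard-coded to 0."""
    import api

    bench = api.QdpBenchmark(device_id=1)
    assert bench.device_id == 1  # assumed attribute name
```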
    batch_size: int = 64,
    total_batches: int = 100,
    encoding_method: str = "amplitude",
    seed: Optional[int] = None,
seed in the Rust interface is seed: Option<u64>, but in Python it is seed: Optional[int] = None.
Maybe we can add a validation guard that raises ValueError with a clear description.
Validating every argument the user enters and raising a proper error might be a good approach.
Updated! Please take another look.
Force-pushed from a4489ae to 445370d
|
Overall lgtm |
guan404ming
left a comment
Looks nice, we could follow up to refine this! This is a great starting point.
|
Thanks for the review @guan404ming, @ryankert01 and @viiccwen! |

Purpose of PR
Introduces a stateful Rust iterator (Quantum Data Loader) that yields one encoded batch (DLPack) per step, so Python can drive encoding with `for qt in loader:` instead of a closed benchmark loop. The same Rust core supports both benchmark (full pipeline, stats only) and data loader (batch-by-batch iteration) modes. The public Python API is moved out of `benchmark/` into a package `qumat_qdp`; benchmark scripts import from `qumat_qdp` and remain the primary consumers.

qdp-core (Rust)
- `pipeline_runner.rs` (new): `PipelineConfig`, `PipelineRunResult`, `run_throughput_pipeline`, `run_latency_pipeline`; shared `vector_len`, `generate_batch`, `fill_sample`. Adds a `DataSource` enum (Synthetic only; File reserved for Phase 2) and `PipelineIterator` with `new_synthetic(engine, config)` and `next_batch(&mut self) -> Result<Option<*mut DLManagedTensor>>`, reusing `encode_batch`.
- `lib.rs`: `QdpEngine` implements `Clone`; re-exports `DataSource`, `PipelineIterator`, `PipelineConfig`, `PipelineRunResult`, `run_throughput_pipeline`, `run_latency_pipeline` (Linux only).
- `gpu/encodings/amplitude.rs`: `run_amplitude_dual_stream_pipeline` and exposure of the dual-stream encode path used by benchmarks.

qdp-python (Rust bindings)
- `SendPtr` wrapper: wraps `*mut DLManagedTensor` so the raw pointer can cross `py.allow_threads` (the closure return must be `Send`). Used only to release the GIL during encode.
- `PyQuantumLoader` (`#[pyclass]`): holds `Option<PipelineIterator>`. Implements the Python iterator protocol: `__iter__` returns `self`; `__next__` takes the iterator out with `take()`, runs `next_batch()` inside `py.allow_threads`, then restores the iterator and returns a `QuantumTensor`. Raises `StopIteration` when exhausted. Stub on non-Linux. (See the sketch after this list.)
- `QdpEngine::create_synthetic_loader(...)`: builds `PipelineConfig`, calls `PipelineIterator::new_synthetic(engine.clone(), config)`, and returns a `PyQuantumLoader`. Stub on non-Linux.
- `run_throughput_pipeline_py`: the pipeline run is executed inside `py.allow_threads`, so the full benchmark loop runs with the GIL released (replacing the previous detach/allow_threads usage as appropriate).

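For illustration, the take()/restore handshake described above corresponds roughly to this Python-level picture (the real implementation is the Rust `PyQuantumLoader`; names in this sketch are assumptions):

```python
class LoaderSketch:
    """Python-level picture of PyQuantumLoader's __iter__/__next__ behaviour."""

    def __init__(self, rust_iterator):
        self._it = rust_iterator        # corresponds to Option<PipelineIterator>

    def __iter__(self):
        return self

    def __next__(self):
        it, self._it = self._it, None   # "take()" the iterator out
        batch = it.next_batch()         # in Rust this runs inside py.allow_threads
        self._it = it                   # restore the iterator
        if batch is None:
            raise StopIteration
        return batch                    # wrapped as a QuantumTensor by the bindings
```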
qumat_qdp (Python package)
- `qumat_qdp/` at project root: `__init__.py` (re-exports `QdpEngine`, `QuantumTensor`, `QdpBenchmark`, `ThroughputResult`, `LatencyResult`, `QuantumDataLoader`, `run_throughput_pipeline_py`), `api.py` (`QdpBenchmark`, `ThroughputResult`, `LatencyResult`; calls `_qdp.run_throughput_pipeline_py`), `loader.py` (`QuantumDataLoader` builder; `__iter__` calls `engine.create_synthetic_loader` and returns the Rust iterator).
- `pyproject.toml`: `python-source = "."` so the root `qumat_qdp` package is included in the wheel.

Benchmark scripts
- Scripts import `from qumat_qdp import ...` (or via re-export). `benchmark/api.py` and `benchmark/loader.py` only re-export from `qumat_qdp` for backward compatibility.
- The project root is added to `sys.path` before importing `qumat_qdp`, so `uv run python benchmark/run_pipeline_baseline.py` (and similar) works without an extra `PYTHONPATH`.
- `benchmark_loader_throughput.py` (new): runs throughput by iterating a `QuantumDataLoader` (`for qt in loader`) and compares it with `QdpBenchmark.run_throughput()` (full Rust pipeline).

Behaviour
- `QdpBenchmark(device_id=0).qubits(16).batches(100, 64).run_throughput()` still runs the full pipeline in Rust with the GIL released via `run_throughput_pipeline_py`.
- `QuantumDataLoader(device_id=0).qubits(16).batches(100, 64).source_synthetic()` then `for qt in loader:` yields one `QuantumTensor` (batch) per iteration; the GIL is released during each `next_batch()` in Rust. (See the usage sketch below.)

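Putting the two entry points together, usage looks roughly like this (imports follow the `qumat_qdp` re-exports listed above; treat it as a sketch rather than final API docs):

```python
from qumat_qdp import QdpBenchmark, QuantumDataLoader

# Benchmark mode: the full pipeline runs in Rust with the GIL released; only stats return.
throughput = QdpBenchmark(device_id=0).qubits(16).batches(100, 64).run_throughput()
print(throughput)

# Data-loader mode: one encoded batch (QuantumTensor / DLPack) per iteration step.
loader = QuantumDataLoader(device_id=0).qubits(16).batches(100, 64).source_synthetic()
for qt in loader:
    pass  # hand each batch to downstream code, e.g. via DLPack
```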
Testing
- `uv run python benchmark/run_pipeline_baseline.py`, `benchmark_throughput.py`, `benchmark_latency.py`.
- `uv run python benchmark/benchmark_loader_throughput.py` (compares loader iteration vs the full pipeline).

Related Issues or PRs
Related to #969
Changes Made
Breaking Changes
Checklist