feat[python]: support full read/write from object storage #6022
CodSpeed Performance Report

Merging this PR will degrade performance by 31.79%

| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| 🆕 | WallTime | 10M_90pct[10000000] | N/A | 371.9 µs | N/A |
| 🆕 | WallTime | 10M_90pct[10000000] | N/A | 365 µs | N/A |
| 🆕 | WallTime | 1M_10pct[100000] | N/A | 45.9 µs | N/A |
| 🆕 | WallTime | 10M_90pct[10000000] | N/A | 198.2 µs | N/A |
| 🆕 | WallTime | 1M_90pct[1000000] | N/A | 57.1 µs | N/A |
| 🆕 | WallTime | 10M_50pct[5000000] | N/A | 281.5 µs | N/A |
| 🆕 | WallTime | 1M_50pct[500000] | N/A | 49.5 µs | N/A |
| 🆕 | WallTime | 1M_10pct[100000] | N/A | 45.8 µs | N/A |
| 🆕 | WallTime | 10M_50pct[5000000] | N/A | 304.9 µs | N/A |
| 🆕 | WallTime | 10M_10pct[1000000] | N/A | 219.2 µs | N/A |
| 🆕 | WallTime | 1M_50pct[500000] | N/A | 22.4 µs | N/A |
| 🆕 | WallTime | 1M_10pct[100000] | N/A | 25.6 µs | N/A |
| 🆕 | WallTime | 1M_50pct[500000] | N/A | 50 µs | N/A |
| 🆕 | WallTime | 10M_10pct[1000000] | N/A | 133.7 µs | N/A |
| 🆕 | WallTime | 10M_10pct[1000000] | N/A | 219.3 µs | N/A |
| 🆕 | WallTime | 1M_90pct[1000000] | N/A | 62.7 µs | N/A |
| 🆕 | WallTime | 10M_50pct[5000000] | N/A | 157.9 µs | N/A |
| 🆕 | WallTime | 1M_90pct[1000000] | N/A | 28.9 µs | N/A |
| ❌ | Simulation | canonical_into_non_nullable[(10000, 1, 0.0)] | 25.7 µs | 28.8 µs | -10.83% |
| ❌ | Simulation | canonical_into_non_nullable[(10000, 10, 0.0)] | 195.5 µs | 286.6 µs | -31.79% |
| ... | ... | ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
Footnotes

1. No successful run was found on develop (ed75f70) during the generation of this report, so 68130ce was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.
2. 1323 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Just pushed a commit with a modified version of the ObjectStoreRegistry that resolves configurations out of the environment in a more consistent way. This is related to apache/arrow-rs-object-store#529, which we stumbled upon last week.
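For readers skimming the thread, here is a minimal Python sketch of what "resolving configurations out of the environment in a more consistent way" could look like. It is not the Rust `ObjectStoreRegistry` from this PR. The class name `EnvStoreRegistry`, the variable precedence, and the `(scheme, bucket)` cache key are all assumptions made for illustration.

```python
from __future__ import annotations

import os
from typing import Mapping
from urllib.parse import urlparse


class EnvStoreRegistry:
    """Hypothetical registry: one resolved config per (scheme, bucket)."""

    # Assumed precedence: earlier names win. These are common AWS variable
    # names, used here only as an illustration.
    REGION_VARS = ("AWS_REGION", "AWS_DEFAULT_REGION")

    def __init__(self, env: Mapping[str, str] | None = None) -> None:
        # Snapshot the environment once so every lookup sees the same values.
        self._env = dict(os.environ if env is None else env)
        self._cache: dict[tuple[str, str], dict[str, str]] = {}

    def resolve(self, url: str) -> dict[str, str]:
        """Return the cached config for the URL's scheme and bucket."""
        parsed = urlparse(url)
        key = (parsed.scheme, parsed.netloc)
        if key not in self._cache:
            self._cache[key] = self._build(parsed.scheme, parsed.netloc)
        return self._cache[key]

    def _build(self, scheme: str, bucket: str) -> dict[str, str]:
        config = {"scheme": scheme, "bucket": bucket}
        if scheme == "s3":
            # Take the first region variable that is set, in precedence order.
            for name in self.REGION_VARS:
                if name in self._env:
                    config["region"] = self._env[name]
                    break
        return config


registry = EnvStoreRegistry(
    {"AWS_DEFAULT_REGION": "us-west-2", "AWS_REGION": "us-east-1"}
)
config = registry.resolve("s3://my-bucket/data/file.vortex")
```

Snapshotting the environment and caching per bucket means two paths in the same bucket always see the same settings, which is one plausible reading of "consistent" here.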
    // NOTE(aduffy): object_store doesn't let us downcast stores, the only way to verify
    // that a configuration was added was to validate that it ends up in the Debug
    // output :/
    let mut debug_str = String::new();
    write!(&mut debug_str, "{store:?}").unwrap();

    assert!(debug_str.contains("us-east-3"));
this is pretty sad but i don't have a better idea
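As a loose Python analogy to the Rust test above (not code from this PR), the same fallback applies whenever an object exposes no accessor for a setting: assert on its string representation instead. `OpaqueStore` and `build_store` are hypothetical stand-ins invented for this sketch.

```python
from __future__ import annotations


class OpaqueStore:
    """Hypothetical stand-in for a store with no public accessor for its settings."""

    def __init__(self, region: str) -> None:
        self.__region = region  # name-mangled attribute, no getter exposed

    def __repr__(self) -> str:
        return f"OpaqueStore(region={self.__region!r})"


def build_store(options: dict[str, str]) -> OpaqueStore:
    # Hypothetical builder under test: applies a region option with a default.
    return OpaqueStore(region=options.get("region", "us-east-1"))


store = build_store({"region": "us-east-3"})

# The only observable evidence of the configuration is the repr output.
debug_str = repr(store)
assert "us-east-3" in debug_str
```

The trade-off is that the test is coupled to the representation's format, so a purely cosmetic change to `__repr__` (or to `Debug` in the Rust case) can break it.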
@@ -0,0 +1,17 @@
Object Store support
Yea I'm so bad with Python/Sphinx tooling. I might just try and get Claude to clean this up in a FLUP if that's ok
This adds the ability to write in PyVortex directly to object storage, along with more ergonomic configuration of object store credentials for the major PyVortex read/write paths: `vx.open`, `vx.io.read`, and `vx.io.write`, as well as `VortexWriteOptions.write`.

This uses the https://docs.rs/pyo3-object_store/latest/pyo3_object_store/ crate, which is what powers the [obstore](https://developmentseed.org/obstore/latest/) Python library.

Notably, we can't reuse `obstore` and all of its types directly. Because of the way it works, you're required to actually link in pyo3-object_store and then expose it as a new module **within your Python bundle**.

There are some other examples of this pattern:

* [nutpie](https://github.com/pymc-devs/nutpie/blob/adb83e24235773d7ff97f86e1858ee9d31db0936/src/wrapper.rs#L1624-L1625)
* [geoarrow](https://github.com/geoarrow/geoarrow-rs/blob/main/python/geoarrow-io/src/lib.rs#L46-L48)

Sadly there's no nicely distributed types package, so I need to copy all of the pyi files here with their original MIT license.

I've added a Python unit test to demonstrate using the new object store builders with a simple local file system object store.

Resolves #5673

---------

Signed-off-by: Andrew Duffy <andrew@a10y.dev>
