v1.5.0
What's Changed
Warning
The LH5 I/O routines have been refactored! Some function names have changed and new methods for loading and viewing data have been added. Read the migration guide below for more details. This release is fully backward compatible, but deprecation warnings will show up when using the old methods. Upgrade to the new recommended syntax to suppress them.
NEW: the package now supports viewing LGDO data (Tables, in particular) as Awkward arrays through the LGDO.view_as() interface. Awkward Array is a library for nested, variable-sized data, including arbitrary-length lists, records, mixed types, and missing data, using NumPy-like idioms.
Please consult the API documentation on https://legend-pydataobj.readthedocs.io to learn about the new methods.
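The deprecation warnings mentioned above can help locate code that still uses the old methods. The following is a minimal sketch, assuming the old methods emit Python's standard DeprecationWarning category. The read_object() function here is a hypothetical stand-in written for illustration, not the lgdo implementation, and uses_deprecated_api() is a helper name invented for this example.

```python
import warnings


def read_object(name, path):
    """Hypothetical stand-in for a deprecated method (not the lgdo code)."""
    warnings.warn(
        "read_object() is deprecated, use read() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return name, path


def uses_deprecated_api(func, *args, **kwargs):
    """Return True if calling func emits a DeprecationWarning."""
    with warnings.catch_warnings():
        # Escalate DeprecationWarning to an exception inside this block only;
        # the previous warning filters are restored on exit.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func(*args, **kwargs)
        except DeprecationWarning:
            return True
    return False


old_style = uses_deprecated_api(read_object, "obj", "file.lh5")
new_style = uses_deprecated_api(len, [1, 2, 3])
```

The same effect can be obtained for a whole script by running the interpreter with -W error::DeprecationWarning, which turns every such warning into a traceback pointing at the call site.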
Migration Guide
Imports
LH5 I/O related routines have been moved to a dedicated subpackage: lgdo.lh5
Old syntax:
from lgdo.lh5_store import LH5Store, ls
store = LH5Store()
ls("file.lh5")
New recommended syntax:
from lgdo import lh5
store = lh5.LH5Store()
lh5.ls("file.lh5")
Read/write LGDOs to disk
Old syntax:
store = LH5Store()
obj, _ = store.read_object("obj", "file.lh5")
store.write_object(obj, "obj", "file.lh5")
New syntax:
store = lh5.LH5Store()
obj, _ = store.read("obj", "file.lh5")
store.write(obj, "obj", "file.lh5")
Convert LGDO to another format
LGDO.view_as() is the new recommended way to view (i.e. without performing a copy) LGDOs in alternative formats (Pandas, NumPy, Awkward...)
Old syntax:
table = Table(...)
table.get_dataframe()
New syntax:
table.view_as("pd")
Old syntax:
from lgdo.lh5_store import load_nda, load_dfs
load_nda("file.lh5", ["obj"])
load_dfs("file.lh5", ["tbl"])
New syntax:
from lgdo import lh5
lh5.read_as("obj", "file.lh5", library="np")
lh5.read_as("tbl", "file.lh5", library="pd")
New syntax (longer alternative):
from lgdo import lh5
store = lh5.LH5Store()
obj, _ = store.read("obj", "file.lh5")
obj.view_as("np")
tbl, _ = store.read("tbl", "file.lh5")
tbl.view_as("pd")
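The difference between a view and a copy, which this section relies on, can be illustrated with plain Python buffers. This sketch uses only the standard library (array and memoryview) as an analogy. It does not use lgdo, and it makes no claim about which view_as() conversions avoid a copy in practice.

```python
from array import array

# Original buffer of three doubles.
data = array("d", [1.0, 2.0, 3.0])

# A view shares the underlying memory with the original.
view = memoryview(data)

# A copy owns an independent buffer.
copy = array("d", data)

# Modify the original in place.
data[0] = 42.0

view_first = view[0]  # follows the change made to data
copy_first = copy[0]  # keeps the value from before the change
```

A view is cheap to create and stays in sync with the underlying data, while a copy costs memory and time but is isolated from later modifications.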
Full list of changes
- Fixed bug in LH5Iterator when number of entries for file is zero by @iguinn in #39
- Refactor of LH5 I/O routines, deprecation of existing methods by @MoritzNeuberger in #24
- Support (environment) variables for tweaking Numba at runtime by @gipert in #44
- Add vectorized operations to VectorOfVectors by @iguinn in #42
- Add LGDO format conversion utilities by @MoritzNeuberger in #30
- Added depth option to show and lh5ls by @iguinn in #52
- Reimplement Table.eval(), now handling VectorOfVectors by @gipert in #53
- Deprecate load_nda() and load_dfs() in favour of .view_as() by @gipert in #56
- Support setting a fill value when "exploding" VectorOfVectors into NumPy arrays in .view_as("np") by @gipert in #57
- Migrate to pyproject.toml, upgrade pre-commit config by @gipert in #59
- Fix for reading just first row of VectorOfVectors by @ggmarshall in #63
- Feature: lh5.read_as() to read LH5 data straight into third party data views by @gipert in #62
- Added warning when adding a column to a table with different length by @MoritzNeuberger in #58
- Add first version of CITATION.cff by @gipert in #64
- Bug fix in LH5Store.read(): check for n_rows longer than idxs before dropping by @ggmarshall in #65
- Bugfix for varlen error msgs and specify nda in view_as "ak" so dtype correctly inferred by @ggmarshall in #67
- Add Patrick to CITATION.cff by @gipert in #68
- Table.view_as() performance fixes by @gipert in #70
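The fill value mentioned in the list above, used when "exploding" variable-length vectors into a rectangular array, can be pictured with a short plain-Python sketch. It assumes the ragged data is stored as one flat buffer plus the cumulative length of each vector. The explode() function and its argument names are hypothetical, chosen for this illustration, and this is not the code from the referenced pull request.

```python
def explode(flattened_data, cumulative_length, fill_value=float("nan")):
    """Pad variable-length vectors to equal width using fill_value.

    flattened_data: all vector elements concatenated in order.
    cumulative_length: running total of elements after each vector.
    """
    rows = []
    start = 0
    for stop in cumulative_length:
        # Slice out one vector from the flat buffer.
        rows.append(list(flattened_data[start:stop]))
        start = stop

    # Width of the rectangular result is the longest vector (0 if empty).
    width = max((len(row) for row in rows), default=0)

    # Right-pad every shorter vector with the fill value.
    return [row + [fill_value] * (width - len(row)) for row in rows]


# Three vectors of lengths 2, 1 and 3, padded with zeros.
padded = explode([1, 2, 3, 4, 5, 6], [2, 3, 6], fill_value=0)
```

The choice of fill value matters downstream: a NaN default keeps padding distinguishable from real floating-point data, whereas integer data needs an explicit sentinel.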
New Contributors
- @MoritzNeuberger made their first contribution in #24
Full Changelog: v1.4.2...v1.5.0