Breaking: Include refdims as columns in DimTable#1119
Conversation
src/tables.jl
Outdated
| end | ||
| function DimTable(xs::Vararg{AbstractDimArray}; layernames=nothing, mergedims=nothing) | ||
| function DimTable( | ||
| xs::Vararg{AbstractDimArray}; |
There was a problem hiding this comment.
It's not clear from the documentation what assumptions this method makes about xs, so it's possible there's a mistake in this method.
|
|
||
| end | ||
|
|
||
| _dims(t::DimTable) = getfield(t, :dims) |
There was a problem hiding this comment.
Is it at all weird that dims(t::DimTable) will return fewer dims than the actual dims included in the table (because it just forwards to the parent)?
There was a problem hiding this comment.
I didn't even know it did that. We can fix these things and merge to breaking instead?
There was a problem hiding this comment.
It didn't do that until this PR, since previously the table's dims were the parent dims, but now additional dims may be included.
There was a problem hiding this comment.
Ahh you mean the refdims? Yeah hkw do we keep that separate.
There was a problem hiding this comment.
2 ways I can think of:
- Keep
dims(::DimTable)forwarding to the parent and add the methodrefdims(t::DimTable) = otherdims(dims(t), _dims(t)). Maybe not the right way to go if the parent is a slice of aDimMatrixwhere both dimensions have the same dim, but that kind of thing in general may not be well supported. - Remove the
dimsfield and add arefdimsandrefdimarraysfield. Thenrefdims(t::DimTable) = getfield(t, :refdims). I started implementing this version originally and abandoned it because it makes the column indexing more complicated, but I could bring it back.
There was a problem hiding this comment.
Can we not just do refdims(dt::DimTable) = refdims(parent(dt)) ? Maybe I'm missing something
There was a problem hiding this comment.
I guess it depends on what dims(::DimTable) and refdims(::DimTable) should mean. With this PR, a user can provide arbitrary refdims to the DimTable, so they may not even be in the parent. But should refdims(::DimTable) return those user-provided refdims (currently only stored in (::DimTable).dims) or return those of the parent? Should dims(::DimTable) return just the dims of the parent, or should it return all dimensions corresponding to columns (with this PR, also includes user-provided refdims).
There was a problem hiding this comment.
Yeah, it's kind of weird that we need getfield(::DimTable, :dims) calls everywhere. I think I'd like to change it so that dims(::DimTable) is the one accessor that doesn't forward to the parent.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## breaking #1119 +/- ##
============================================
+ Coverage 82.43% 82.78% +0.34%
============================================
Files 57 57
Lines 5647 5668 +21
============================================
+ Hits 4655 4692 +37
+ Misses 992 976 -16 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
src/tables.jl
Outdated
| - `refdims`: Additional reference dimensions to add to the table, defaults to the reference | ||
| dimensions of the input. |
There was a problem hiding this comment.
I think I would prefer if this was a bool that determines whether or not refdims are included as columns
There was a problem hiding this comment.
Makes sense, though the current approach would support only including a subset of refdims, which maybe is useful in some cases? What other methods implemented here produce refdims besides slicing?
There was a problem hiding this comment.
I don't mind the current syntax with a tuple as it's the same as the constructor syntax, and its flexible.
But to merge this to main () would need to be the default.
We can then change that on breaking.
There was a problem hiding this comment.
@sethaxen CF standards has its own refdims concept so Rasters.jl Raster can load with refdims already in place.
To make feature non-breaking
|
@sethaxen the conflicts against breaking are pretty extreme, sorry about that. But we should try getting it merged there for the next breaking release (which is very soon) |
Currently the default behavior is to not add refdims to columns. What makes this breaking? And, if this will be considered breaking anyways, should I then restore the default behavior being to include refdims in columns? |
|
Oh right I misunderstood. So yes its fine to merge to main. But: then there will be the other problem that I cant easily do the merge because neither side of the Tables.jl changes are my code, and its a lot. If I merge to main can you commit to merging main into breaking soon afterwards? I will make sure its only your changes to merge. |
Oh I see what you mean. Oof, both PRs rewrite the tables tests. But most of the changes to tests in this PR are I think it will actually be easier to redo this PR onto breaking. Okay with me opening a new PR against breaking with these changes (with default being to include refdims as columns)? |
|
Yes changing the default would be fine on breaking. But this PR is already moved to breaking so no need for a new one. |
* include DataType in CategoricalEltypes (#876) * Breaking: `DimVector` of `NamedTuple` is a `NamedTuple` `DimTable` (#839) * DimVector of NamedTuple is a NamedTuple table * bugfix * remove show * fix ambiguity * Breaking: add `combine` method for `groupby` output, fixing `similar` for `AbstractDimStack` (#903) * add combine method * test groupby and similar * docs entry * Breaking: `preservedims` in tables (#917) * add preservedims keyword to DimTable * add tests * Apply suggestions from code review Co-authored-by: Anshul Singhvi <anshulsinghvi@gmail.com> * tests, and fix DimSlices * better table docs * cleanup * test * indexing overhaul * fix similar and broadcast for basicdimarray * bugfix rebuildsliced * more indexing cleanup * cleanup similar and gubfix indexing * bugfixes * uncomment * fix doctests * just dont doctest unreproducable failures, for now * combine new Tables integrations * bugfix and cleanup show * bugfix and more tests for preservedims and mergedims --------- Co-authored-by: Anshul Singhvi <anshulsinghvi@gmail.com> * Remove deprecations (#1009) * typo * add missing reference docs * fix DimSlices doc * Breaking: skipmissing on a dimstack (#1041) * iterate values where no layer is missing * add tests * add skipmissing to reference * Breaking: Materialize `DimArray` or `DimStack` From a Table (#739) * Table Materializer Methods * Made col Optional for DimArray * Apply suggestions from code review Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Handle coordinates with different loci * replaced At() with Contains() in _coords_to_ords * Added optional selectors and public methods for table materializer * Updated table constructors for DimArray and DimStack * Updated DimArray and DimStack docs to include table materializer methods * Table materializer test cases * export table materializer methods * Added Random to tables.jl test cases * Update src/array/array.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Removed exports * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Replaced selector type with instance. * Table materializer can now infer dimensions from the coordinates. * Update src/stack/stack.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/array/array.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/table_ops.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Added support for guessing the dimension ordering and span for Dates and DateTimes * Replaced LinRange with StepRangeLen in _build_dim * Added Tables.istable check to DimArray constructor * Update src/array/array.jl * merge materialize2 * fix scuffed merge * filter instead of indexing in test for clarity * fix DimSlices doc * fix ambiguities * bugfixes * do checks and call Tables.columns before constructing stack from table * test dimensions are automatically detected when constructing dimstack * comments not docstrings for internals * check for columnaccess if dims are passed * add type argument to dimarray_from_table * allow passing name to DimStack * add a section to the documentation * use Tables.columnnames instead of keys * make DimArray work with all tables that are abstractarrays * do not treat dimvectors as tables * simplify get_column --------- Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> Co-authored-by: Tiem van der Deure <tiemvanderdeure@gmail.com> * start a CHANGELOG * bump minor version to 0.30.0 * document Changelog.jl usage * use rebuild for similar of dimarray with new axes (#1082) * add _similar dispatch for abstractdimarray * update tests * Update src/array/array.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update src/array/array.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> --------- Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Breaking: standardise interface methods and remove `index` (#1083) * standardise interface methods and remove index * update Changelog * cleanup * move const * cleanup * remove index from test * dont export index * last index * tweaks * more tweaks * fix tests --------- Co-authored-by: Raf Schouten <schoutenr@ull-pf39vwmc.mobility.unimelb.net.au> * move abstract constructors to DimArray constructors (#1087) * Forward name keyword in groupby (#1084) * Forward name keyword in groupby * Add test for setting groupby name explicitly * Update src/groupby.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Update test/groupby.jl Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Add Changelog entry * Mention name keyword in docstring --------- Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Remove rtol from At selector (#1062) * Remove rtol from At selector * Remove explicit rtol from test * Remove unused type parameter * fix At constructors --------- Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * fix selector tests * fix At in dimindices * Breaking: remove methods that are hardly uesd and cause many invalidations (#1113) * do 0.6, 0.7.2 broke for us (#1099) * Fix tests on julia 1.12 (#1110) * use isequal instead of === to compare NaN * drop all and broadcast * specify DimensionalData.Dimensions to make reference unique in docs * drop convert method for name to abstractstring * remove `merge` method for dimstack with iterators of pairs * add to changelog --------- Co-authored-by: Lazaro Alonso <lazarus.alon@gmail.com> * Implement `Base.instantiate` - take 2 (#1118) * implement `instantiate` - get rid of BasicDimensionalStyle * fix setindex! for opaquearray to make some error messages clearer * fix materialize! * StandardIndices methods should be last (#1129) * add `broadcastable` for abstractdimstack (#1127) * Add the `D` parameter to `AbstractDimStack` (#1128) * put D parameter in AbstractDimStack * update CHANGELOG.md * remove deprecated LookupArrays (#1132) * Extend matmul with DimUnitRange axes (#1124) * Add dims fallback * Update matmul for mixtures of normal arrays and dimarrays * Fix bugs * Add and remove some comparedims checks * Add tests for matmul with DimUnitRange axes * Test that dims is nothing for an array * Test that dims returns dims if all axes DimUnitRange * Check lookup of anon dims * Breaking: Include refdims as columns in DimTable (#1119) * Complete unfinished docstring example * Support refdims in DimTable * Test refdims in DimTable * By default add no refdims to Tables To make feature non-breaking * Update tables tests * Test getcolumn for DimStack/DimArray * Remove unused variable * Update constructor calls * Correctly compute dimnums * Use all dims to compute colnames * By default include refdims * Make sure data-array is duplicated if necessary * Add preservedims/refdims test * Fix some bugs * Add more joint compatibility tests with refdims * Support AoG selection of refdims * Only extended array with dims if needed * Test AoG with refdims * refdims doc line --------- Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * Add a DimStackArray generator (#1131) * add DimStackArray * export DimStackArray * add tests * add a docstring * drop inner constructor Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * fix missing } * fix DimStackArray type definition * add tests for broadcast over stack * add to the docs --------- Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> * uncomment tests * unccomend dimindices tests Co-authored-by: Felix Cremer <felix.cremer@dlr.de> * Swap dims test and value test in == (#1111) * Swap dims test and value test in == * Add isequal test * Add broken test for dimarray isequal with different axes * Make the same swap in == also for DimStack * Add isequal for AbstractDimArray and AbstractDimstack * Fix stack test * Update test/stack.jl Co-authored-by: Tiem van der Deure <tiemvanderdeure@gmail.com> * Add Changelog entry --------- Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> Co-authored-by: Tiem van der Deure <tiemvanderdeure@gmail.com> * Breaking: Make reduction methods error when passed missing dimensions (#1162) * Fix NearestNeighbor extension Description of the fix from Claude: The issue was in ext/DimensionalDataNearestNeighborsExt.jl at line 35: Problem: The distance vector was being created with the wrong element type: distvec = Vector{NN.get_T(eltype(points))}(undef, 1) NN.get_T(eltype(points)) was returning SVector{2, Float64} (the point type) instead of Float64 (the scalar distance type). This caused knn! to fail with: ArgumentError: dists must have eltype Float64, got StaticArraysCore.SVector{2, Float64} Fix: Changed to use eltype(eltype(points)) which correctly extracts the scalar type: distvec = Vector{eltype(eltype(points))}(undef, 1) * Disable broken inference tests Claude blames the implementation of `_sortdims()`, which is called by `_dims()` and is apparently not type-stable. * Disable CondaPkg verbosity This otherwise spams the terminal. * Revert "fix `sum(da; dims = :notadim)` (#1116)" This reverts commit 2245221. * Improve error messages for reduction methods with missing dims This uses the new `_missingdims()` helper function to select all the missing dimensions, and we now don't call `basetypeof` in `_extradimsmsg()` since `extradims` may not all be `Dim`s. * Fix Makie tests * Breaking: split `set` into `unsafe_set` and `set` (#926) * tweaks * some ambiguities * more set * bugfix unsafe and reorder * set tweaks * Update src/Dimensions/set.jl * Update src/set.jl * Update src/set.jl * Update src/Dimensions/set.jl * Add a few test cases * Try to fix some ambiguities and some wrong variable names * Fix ambiguities * Reimport _astuple * Add roundtrip tests * more set tests * merge breaking * more tests * checkaxes => checkaxis * fix doctests * doc unsafe set --------- Co-authored-by: Felix Cremer <fcremer@bgc-jena.mpg.de> * fix ambiguities * update CHANGELOG.md for set changes * Breaking: Extent passthrough for multidimensional lookups (#991) * hasmultipledimensions * expand the definition of extent * bump cairomakie compat * Add tests for hasmultipledimensions trait and extent passthrough (PR #991) - Add comprehensive tests for hasmultipledimensions trait in merged.jl - Test that regular lookups return false for hasmultipledimensions - Test that MergedLookup returns true for hasmultipledimensions - Test extent passthrough for merged dimensions - Test mixed regular and merged dimensions - Test that operations preserve the hasmultipledimensions trait - Add fallback methods to handle edge cases: - hasmultipledimensions(::Any) = false for non-Lookup types - bounds(x::AbstractArray) for raw arrays - Import Extents in merged.jl test file - Fix bounds ambiguity with explicit module qualification * Refactor to `hasinternaldimensions` * Add changelog * Tweak metadata on dimensions.md docs * Add yet more tests * Fix StackOverflow in mergedims for single dimension Wrap old_dims in _astuple() when constructing MergedLookup so that a bare Dimension doesn't cause infinite recursion in combinedims. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix BoundsError in bounds for empty MergedLookup Add isempty guard so that bounds on an empty MergedLookup returns () instead of throwing when calling first on an empty collection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix multidimensional_lookups test assertions - Wrap merged_dims in tuple for Extents.extent call - Assert ArgumentError for unsupported Near selector - Use Z type instead of Dim{:Z} for @dim-declared dimension Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Update CHANGELOG.md * Add docstring for `hasinternaldimensions` * Describe what is going on in new extent code * Add hasinternaldimensions docs to doc pages --------- Co-authored-by: Rafael Schouten <rafaelschouten@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * Add removal of LookupArrays into Changelog and remove LookupArrays reference in docs (#1173) --------- Co-authored-by: Tiem van der Deure <tiemvanderdeure@gmail.com> Co-authored-by: Anshul Singhvi <anshulsinghvi@gmail.com> Co-authored-by: Felix Cremer <fcremer@bgc-jena.mpg.de> Co-authored-by: Joshua Billson <61667893+JoshuaBillson@users.noreply.github.com> Co-authored-by: Raf Schouten <schoutenr@ull-pf39vwmc.mobility.unimelb.net.au> Co-authored-by: Lazaro Alonso <lazarus.alon@gmail.com> Co-authored-by: Seth Axen <seth@sethaxen.com> Co-authored-by: Felix Cremer <felix.cremer@dlr.de> Co-authored-by: James Wrigley <JamesWrigley@users.noreply.github.com>
This PR adds a keyword
refdimstoDimTable, defaulting to the refdims of the input. All refdims are included as table columns. Fixes #884Potentially breaking changes:
dimsfield added toDimTablerefdimscolumns included by defaultExample