diff --git a/docs/src/broadcasts.md b/docs/src/broadcasts.md index 897cae49d..3279510f5 100644 --- a/docs/src/broadcasts.md +++ b/docs/src/broadcasts.md @@ -1,21 +1,20 @@ # Dimensional broadcasts with `@d` and `broadcast_dims` Broadcasting over AbstractDimArray works as usual with Base Julia broadcasts, -except that dimensions are checked for compatibility with eachother, and that -values match. Strict checks can be turned of globally with +except that dimensions are checked for compatibility with each other, and that +values match. Strict checks can be turned off globally with `strict_broadcast!(false)`. To avoid even dimension name checks, broadcast over `parent(dimarray)`. -The [`@d`](@ref) macro is a dimension-aware extension to regular dot brodcasting. -[`broadcast_dims`](@ref) and [`broadcast_dims`](@ref) are analagous to Base -julia `broadcast`. +The [`@d`](@ref) macro is a dimension-aware extension to regular dot broadcasting. +[`broadcast_dims`](@ref) is analogous to Base Julia's `broadcast`. -Because we know the names of the dimensions, there is no ambiguity in which one +Because we know the names of the dimensions, there is no ambiguity in which ones we mean to broadcast together. This means we can permute and reshape dims so that broadcasts that would fail with a regular `Array` just work with a `DimArray`. -As an added bonus, `broadcast_dims` even works on `DimStack`s. Currently `@d` +As an added bonus, `broadcast_dims` even works on `DimStack`s. Currently, `@d` does not work on `DimStack`. 
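As a minimal sketch of the name-based matching described above (the arrays `A`, `B`, and `s` and their lookup values are made up for illustration), `broadcast_dims` should combine arrays whose dimensions are permuted or missing, where a plain `Array` broadcast would throw a dimension mismatch:

```julia
using DimensionalData

# Illustrative arrays; the names and lookup values are assumptions, not from the docs.
A = rand(X(1:3), Y(1:4))   # 3×4, dims (X, Y)
B = rand(Y(1:4), X(1:3))   # 4×3, same dimension names in permuted order
s = rand(Y(1:4))           # length 4, dims (Y,)

# Dimensions are matched by name, so no manual permutedims/reshape is needed.
C = broadcast_dims(*, A, B)
D = broadcast_dims(*, A, s)

size(C), size(D)
```

The equivalent plain-array code needs an explicit `permutedims(parent(B))` or `reshape(parent(s), 1, :)` to line the axes up by hand.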
## Example: scaling along the time dimension diff --git a/docs/src/cuda.md b/docs/src/cuda.md index 993ae9026..5b2a026f9 100644 --- a/docs/src/cuda.md +++ b/docs/src/cuda.md @@ -13,7 +13,7 @@ A = rand(Float32, X(1.0:1000.0), Y(1.0:2000.0)) cuA = modify(CuArray, A) ``` -The result of a GPU broadcast is still a DimArray: +The result of a GPU broadcast is still a `DimArray`: ```julia-repl julia> cuA2 = cuA .* 2 @@ -50,20 +50,20 @@ CuArray{Float32, 2, CUDA.Mem.DeviceBuffer} DimensionalData.jl has two GPU-related goals: -1. Work seamlessly with Base julia broadcasts and other operations that already +1. Work seamlessly with `Base` Julia broadcasts and other operations that already work on GPU. 2. Work as arguments to custom GPU kernel functions. This means any `AbstractDimArray` must be automatically moved to the GPU and its -fields converted to GPU friendly forms whenever required, using [Adapt.jl](https://github.com/JuliaGPU/Adapt.jl)). +fields converted to GPU-friendly forms whenever required, using [Adapt.jl](https://github.com/JuliaGPU/Adapt.jl). -- The array data must converts to the correct GPU array backend +- The array data must convert to the correct GPU array backend when `Adapt.adapt(dimarray)` is called. - All DimensionalData.jl objects, except the actual parent array, need to be immutable `isbits` or convertible to them. This is one reason DimensionalData.jl uses `rebuild` and a functional style, rather than in-place modification of fields. -- Symbols need to be moved to the type system `Name{:layer_name}()` replaces `:layer_name` -- Metadata dicts need to be stripped, they are often too difficult to convert, +- Symbols need to be moved to the type system, so `Name{:layer_name}()` replaces `:layer_name`. +- Metadata dictionaries need to be stripped, as they are often too difficult to convert and not needed on GPU. 
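The point about moving symbols into the type system can be sketched in plain Julia. `TypeName` and `getname` below are hypothetical stand-ins written for this illustration, not DimensionalData.jl's actual `Name` implementation:

```julia
# Hypothetical stand-in for a type-level name; not the package's real `Name` type.
struct TypeName{N} end

# Recover the symbol from the type parameter.
getname(::TypeName{N}) where {N} = N

sym = :layer_name
tn = TypeName{sym}()

# A `Symbol` value is not `isbits`, so it cannot be passed into a GPU kernel as-is.
# A field-less immutable struct is `isbits`; the symbol lives only in the type.
(isbits(sym), isbits(tn), getname(tn))
```

Because the name is carried by the type parameter, nothing needs to be stored in the object itself at runtime.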
As an example, [DynamicGrids.jl](https://github.com/cesaraustralia/DynamicGrids.jl) uses `AbstractDimArray` for auxiliary diff --git a/docs/src/dimarrays.md b/docs/src/dimarrays.md index 66da94c26..d187e104e 100644 --- a/docs/src/dimarrays.md +++ b/docs/src/dimarrays.md @@ -84,7 +84,7 @@ order of our objects axes. These are the same: da[X(2), Y(1)] == da[Y(1), X(2)] ``` -We also can use Tuples of dimensions like `CartesianIndex`, +We also can use `Tuples` of dimensions, like `CartesianIndex`, but they don't have to be in order of consecutive axes. ```@ansi dimarray diff --git a/docs/src/dimensions.md b/docs/src/dimensions.md index d9a8ac43c..0fd71bf8e 100644 --- a/docs/src/dimensions.md +++ b/docs/src/dimensions.md @@ -11,7 +11,7 @@ X(1) X(1), Y(2), Z(3) ``` -You can also make [`Dim`](@ref) dimensions with any name: +You can also create [`Dim`](@ref) dimensions with any name: ```@ansi dimensions Dim{:a}(1), Dim{:b}(1) @@ -25,14 +25,14 @@ val(X(1)) DimensionalData.jl uses `Dimensions` everywhere: -- `Dimension` are returned from `dims` to specify the names of the dimensions of an object -- they wrap [`Lookups`](@ref) to associate the lookups with those names -- to index into these objects, they wrap indices like `Int` or a `Selector` +- `Dimension`s are returned from `dims` to specify the names of the dimensions of an object +- They can wrap [`Lookups`](@ref) to associate the lookups with those names +- To index into these objects, they can wrap indices like `Int` or a `Selector` -This symmetry means we can ignore how data is organised, +This symmetry means we can ignore how data is organized, and label and access it by name, letting DD work out the details for us. 
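A small sketch of that symmetry, assuming an illustrative array `A` with made-up lookup values: the same `Dimension` wrappers name the axes returned by `dims`, hold the lookups, and wrap indices or selectors when indexing.

```julia
using DimensionalData

# Illustrative array; the lookup values are assumptions chosen for this example.
A = rand(X(10:10:30), Y([:a, :b, :c]))

# `dims` returns a Tuple of Dimensions that wrap the lookups.
ds = dims(A)
names = map(name, ds)

# Dimensions wrapping `Int` indices: axis order does not matter.
same_by_name = A[Y(2), X(1)] == A[X(1), Y(2)]

# Dimensions wrapping selectors: look up by value instead of position.
by_value = A[X(At(20)), Y(At(:b))]
```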
-Dimensions are defined in the [`Dimensions`](@ref) submodule, some +Dimensions are defined in the [`Dimensions`](@ref) submodule, and some Dimension-specific methods can be brought into scope with: ```julia diff --git a/docs/src/diskarrays.md b/docs/src/diskarrays.md index 257600829..9125e8727 100644 --- a/docs/src/diskarrays.md +++ b/docs/src/diskarrays.md @@ -10,13 +10,13 @@ It is rarely used directly, but is present in most disk and cloud based spatial data packages in julia, including: -ArchGDAL.jl, NetCDF.jl, Zarr.jl, NCDatasets.lj, GRIBDatasets.jl and CommonDataModel.jl +ArchGDAL.jl, NetCDF.jl, Zarr.jl, NCDatasets.jl, GRIBDatasets.jl and CommonDataModel.jl -The combination of DiskArrays.jl and DimensionalData.jl is Julias answer to -pythons [xarray](https://xarray.dev/). Rasters.jl and YAXArrays.jl are user-facing +The combination of DiskArrays.jl and DimensionalData.jl is Julia's answer to +Python's [xarray](https://xarray.dev/). Rasters.jl and YAXArrays.jl are user-facing tools building on this combination. -They have no direct dependency relationships, with but are intentionally +They have no direct dependency relationships, but are intentionally designed to integrate via both adherence to Julia's `AbstractArray` interface, and by coordination during development of both packages. diff --git a/docs/src/extending_dd.md b/docs/src/extending_dd.md index d94705b2c..a4dcc5cce 100644 --- a/docs/src/extending_dd.md +++ b/docs/src/extending_dd.md @@ -2,9 +2,9 @@ Nearly everything in DimensionalData.jl is designed to be extensible. -- `AbstractDimArray` are easily extended to custom array types. `Raster` or +- `AbstractDimArray` is easily extended to custom array types. `Raster` or `YAXArray` are examples from other packages. -- `AbstractDimStack` are easily extended to custom mixed array dataset. +- `AbstractDimStack` is easily extended to custom mixed array datasets. `RasterStack` or `ArViZ.Dataset` are examples. - `Lookup` can have new types added, e.g. 
to `AbstractSampled` or `AbstractCategorical`. `Rasters.Projected` is a lookup that knows @@ -20,7 +20,7 @@ a `Tuple` of constructed `Dimension`s from `dims(obj)`. ### `Dimension` axes -Dimensions return from `dims` should hold a `Lookup` or in some cases +Dimensions returned from `dims` should hold a `Lookup` or in some cases just an `AbstractArray` (like with `DimIndices`). When attached to multi-dimensional objects, lookups must be the _same length_ as the axis of the array it represents, and `eachindex(A, i)` and `eachindex(dim)` must diff --git a/docs/src/get_info.md b/docs/src/get_info.md index a0db5a036..524fdb63d 100644 --- a/docs/src/get_info.md +++ b/docs/src/get_info.md @@ -2,9 +2,9 @@ DimensionalData.jl defines consistent methods to retrieve information from objects like `DimArray`, `DimStack`, `Tuple`s of `Dimension`, -`Dimension` and `Lookup`. +`Dimension`, and `Lookup`. -First we will define an example `DimArray`. +First, we will define an example `DimArray`. ```@example getters using DimensionalData @@ -22,8 +22,8 @@ A = rand(x, y) `dims` retrieves dimensions from any object that has them. -What makes it so useful is you can filter which dimensions -you want in what order, using any `Dimension`, `Type{Dimension}` +What makes it so useful is that you can filter which dimensions +you want, and specify in what order, using any `Dimension`, `Type{Dimension}` or `Symbol`. ```@ansi getters @@ -113,8 +113,8 @@ span(lookup(A, Y)) Get the locus of a `Lookup`, or a `Tuple` from a `DimArray` or `DimTuple`. -(locus is our term for distinguishing if an lookup value -specifies the start, center or end of an interval) +(`locus` is our term for distinguishing if a lookup value +specifies the start, center, or end of an interval) ```@ansi getters locus(A) @@ -166,7 +166,7 @@ extent(dims(A, Y)) ## Predicates These always return `true` or `false`. With multiple -dimensions, `fale` means `!all` and `true` means `all`. 
+dimensions, `false` means `!all` and `true` means `all`. `dims` and all other methods listed above can use predicates to filter the returned dimensions. diff --git a/docs/src/groupby.md b/docs/src/groupby.md index 433ba1f22..b3326745c 100644 --- a/docs/src/groupby.md +++ b/docs/src/groupby.md @@ -1,19 +1,19 @@ # Group By DimensionalData.jl provides a `groupby` function for dimensional -grouping. This guide will cover: +grouping. This guide covers: - simple grouping with a function - grouping with `Bins` -- grouping with another existing `AbstractDimArry` or `Dimension` +- grouping with another existing `AbstractDimArray` or `Dimension` ## Grouping functions -Lets look at the kind of functions that can be used to group `DateTime`. +Let's look at the kind of functions that can be used to group `DateTime`. Other types will follow the same principles, but are usually simpler. -First load some packages: +First, load some packages: ````@example groupby using DimensionalData @@ -29,9 +29,9 @@ Now create a demo `DateTime` range tempo = range(DateTime(2000), step=Hour(1), length=365*24*2) ```` -Lets see how some common functions work. +Let's see how some common functions work. 
-The `hour` function will transform values to hour of the day - the integers `0:23` +The `hour` function will transform values to the hour of the day - the integers `0:23` :::tabs @@ -86,7 +86,7 @@ yearmonthday.(tempo) == custom -We can create our own function that return tuples +We can create our own function that returns tuples ````@example groupby yearday(x) = (year(x), dayofyear(x)) @@ -104,7 +104,7 @@ yearday.(tempo) ## Grouping and reducing -Lets define an array with a time dimension of the times used above: +Let's define an array with a time dimension of the times used above: ````@ansi groupby A = rand(X(1:0.01:2), Ti(tempo)) @@ -195,7 +195,7 @@ mean.(groupby(A, Ti=>Bins(month, [1, 3, 5]))) == bin groups We can also specify an `AbstractArray` of grouping `AbstractArray`: -Her we group by month, and bin the summer and winter months: +Here we group by month, and bin the summer and winter months: ````@ansi groupby groupby(A, Ti => Bins(month, [[12, 1, 2], [6, 7, 8]]; labels=x -> string.(x))) @@ -203,7 +203,7 @@ groupby(A, Ti => Bins(month, [[12, 1, 2], [6, 7, 8]]; labels=x -> string.(x))) == range bins -First, lets see what [`ranges`](@ref) does: +First, let's see what [`ranges`](@ref) does: ````@ansi groupby ranges(1:8:370) @@ -285,7 +285,7 @@ We can also select by `Dimension`s and any objects with `dims` methods. 
== groupby dims -Trivially, grouping by an objects own dimension is similar to `eachslice`: +Trivially, grouping by an object's own dimension is similar to `eachslice`: ````@ansi groupby groupby(A, dims(A, Ti)) @@ -293,7 +293,7 @@ groupby(A, dims(A, Ti)) == groupby AbstractDimArray -But we can also group by other objects dimensions: +But we can also group by other objects' dimensions: ````@ansi groupby B = A[:, 1:3:100] diff --git a/docs/src/integrations.md b/docs/src/integrations.md index a67d5156a..70f726db4 100644 --- a/docs/src/integrations.md +++ b/docs/src/integrations.md @@ -2,33 +2,22 @@ ## Rasters.jl -[Rasters.jl](https://rafaqz.github.io/Rasters.jl/stable) extends DD -for geospatial data manipulation, providing file load/save for -a wide range of raster data sources and common GIS tools like -polygon rasterization and masking. `Raster` types are aware -of `crs` and their `missingval` (which is often not `missing` -for performance and storage reasons). +[Rasters.jl](https://rafaqz.github.io/Rasters.jl/stable) extends DimensionalData for geospatial data manipulation, providing file load/save capabilities for a wide range of raster data sources and common GIS tools like polygon rasterization and masking. `Raster` types are aware of their `crs` and their `missingval` (which is often not `missing` for performance and storage reasons). -Rasters.jl is also the reason DimensionalData.jl exists at all! -But it always made sense to separate out spatial indexing from -GIS tools and dependencies. +Rasters.jl is also the reason DimensionalData.jl exists at all! But it always made sense to separate out spatial indexing from GIS tools and dependencies. -A `Raster` is a `AbstractDimArray`, a `RasterStack` is a `AbstractDimStack`, -and `Projected` and `Mapped` are `AbstractSample` lookups. +A `Raster` is an `AbstractDimArray`, a `RasterStack` is an `AbstractDimStack`, and `Projected` and `Mapped` are `AbstractSampled` lookups. 
## YAXArrays.jl -[YAXArrays.jl](https://juliadatacubes.github.io/YAXArrays.jl/dev/) is another -spatial data package aimed more at (very) large datasets. It's functionality -is slowly converging with Rasters.jl (both wrapping DiskArray.jl/DimensionalData.jl) -and we work closely with the developers. +[YAXArrays.jl](https://juliadatacubes.github.io/YAXArrays.jl/dev/) is another spatial data package aimed more at (very) large datasets. Its functionality is slowly converging with Rasters.jl (both wrapping DiskArrays.jl/DimensionalData.jl) and we work closely with the developers. `YAXArray` is a `AbstractDimArray` and inherits its behaviours. ## ClimateBase.jl [ClimateBase.jl](https://juliaclimate.github.io/ClimateBase.jl/dev/) -Extends DD with methods for analysis of climate data. +Extends DimensionalData.jl with methods for analysis of climate data. ## ArviZ.jl diff --git a/docs/src/object_modification.md b/docs/src/object_modification.md index 27d9f5e8b..bbf5c48d5 100644 --- a/docs/src/object_modification.md +++ b/docs/src/object_modification.md @@ -1,4 +1,4 @@ -# Modifying objects +# Modifying Objects DimensionalData.jl objects are all `struct` rather than `mutable struct`. The only things you can modify in-place @@ -28,7 +28,7 @@ parent(A_mod) == stack -For a stack this applied to all layers, and is where `modify` +For a stack, this applies to all layers, and is where `modify` starts to be more powerful: ````@ansi helpers diff --git a/docs/src/selectors.md b/docs/src/selectors.md index d3de21cd5..78aac4bf9 100644 --- a/docs/src/selectors.md +++ b/docs/src/selectors.md @@ -1,9 +1,8 @@ # Selectors -As well as choosing dimensions by name, we can also select values in them. +In addition to choosing dimensions by name, we can also select values within them. 
-First, we can create `DimArray` with lookup values as well as -dimension names: +First, we can create a `DimArray` with lookup values as well as dimension names: ````@example selectors using DimensionalData @@ -13,13 +12,13 @@ using DimensionalData A = rand(X(1.0:0.2:2.0), Y([:a, :b, :c])) ```` -Then we can use [`Selector`](@ref) to select values from the array: +Then we can use the [`Selector`](@ref) to select values from the array: ::: tabs == At -[`At(x)`](@ref) gets the index or indices exactly matching the passed in value/s. +The [`At(x)`](@ref) selector gets the index or indices exactly matching the passed in value(s). ````@ansi selectors A[X=At(1.2), Y=At(:c)] @@ -39,8 +38,7 @@ A[X=At(1.2:0.2:1.5), Y=At([:a, :c])] == Near -[`Near(x)`](@ref) gets the closest index to the passed in value(s), -indexing with an `Int`. +The [`Near(x)`](@ref) selector gets the closest index to the passed in value(s), indexing with an `Int`. ````@ansi selectors A[X=Near(1.245)] @@ -54,9 +52,9 @@ A[X=Near(1.1:0.25:1.5)] == Contains -[`Contains(x)`](@ref) get indices where the value x falls within an interval in the lookup. +The [`Contains(x)`](@ref) selector gets indices where the value x falls within an interval in the lookup. -First set the `X` axis to be `Intervals`: +First, set the `X` axis to be `Intervals`: ````@ansi selectors using DimensionalData.Lookups @@ -64,13 +62,13 @@ A_intervals = set(A, X => Intervals(Start())) intervalbounds(A_intervals, X) ```` -With a single value it is like indexing with `Int` +With a single value, it is like indexing with `Int` ````@ansi selectors A_intervals[X=Contains(1.245)] ```` -`Contains` can also take vectors and ranges, which is lick indexing with `Vector{Int}` +`Contains` can also take vectors and ranges, which is like indexing with `Vector{Int}` ````@ansi selectors A_intervals[X=Contains(1.1:0.25:1.5)] @@ -78,7 +76,7 @@ A_intervals[X=Contains(1.1:0.25:1.5)] == .. 
-`..` or `IntervalSets.Interval` selects a range of values: +The `..` or `IntervalSets.Interval` selector selects a range of values: `..` is like indexing with a `UnitRange`: ````@ansi selectors @@ -96,10 +94,9 @@ A[X=Interval{:close,:open}(1.2 .. 1.6)] == Touches -[`Touches`](@ref) is like `..`, but for `Intervals` it will include -intervals touched by the selected interval, not inside it. +The [`Touches`](@ref) selector is like `..`, but for `Intervals`, it will include intervals touched by the selected interval, not inside it. -This usually means including zero, one or two cells more than `..` +This usually means including zero, one, or two cells more than `..` `Touches` is like indexing with a `UnitRange` ````@ansi selectors @@ -109,7 +106,7 @@ A_intervals[X=1.1 .. 1.5] == Where -[`Where(f)`](@ref) filter the array axis by a function of the dimension index values. +The [`Where(f)`](@ref) selector filters the array axis by a function of the dimension index values. `Where` is like indexing with a `Vector{Bool}`: ````@ansi selectors @@ -118,7 +115,7 @@ A[X=Where(>=(1.5)), Y=Where(x -> x in (:a, :c))] == Not -`Not(x)` get all indices _not_ selected by `x`, which can be another selector. +The `Not(x)` selector gets all indices _not_ selected by `x`, which can be another selector. `Not` is like indexing with a `Vector{Bool}`. ````@ansi selectors @@ -130,9 +127,7 @@ A[X=Not(Near(1.3)), Y=Not(Where(in((:a, :c))))] ## Lookups Selectors find indices in the `Lookup` of each dimension. -Lookups wrap other `AbstractArray` (often `AbstractRange`) but add -additional traits to facilitate fast lookups or specifying point or interval -behaviour. These are usually detected automatically. +Lookups wrap other `AbstractArray` (often `AbstractRange`) but add additional traits to facilitate fast lookups or specifying point or interval behaviour. These are usually detected automatically. 
````@example selectors @@ -142,18 +137,17 @@ using DimensionalData.Lookups == Sampled lookups -[`Sampled(x)`](@ref) lookups hold values sampled along an axis. +The [`Sampled(x)`](@ref) lookup holds values sampled along an axis. They may be `Ordered`/`Unordered`, `Intervals`/`Points`, and `Regular`/`Irregular`. Most of these properties are usually detected automatically, -but here we create a [`Sampled`](@ref) lookup manually: +but here we create a `Sampled` lookup manually: ````@ansi selectors l = Sampled(10.0:10.0:100.0; order=ForwardOrdered(), span=Regular(10.0), sampling=Intervals(Start())) ```` -To specify `Irregular` `Intervals` we should include the outer bounds of the -lookup, as we cant determine them from the vector. +To specify `Irregular` `Intervals`, we should include the outer bounds of the lookup, as we can't determine them from the vector. ````@ansi selectors l = Sampled([13, 8, 5, 3, 2, 1]; order=ForwardOrdered(), span=Irregular(1, 21), sampling=Intervals(Start())) @@ -161,7 +155,7 @@ l = Sampled([13, 8, 5, 3, 2, 1]; order=ForwardOrdered(), span=Irregular(1, 21), == Categorical lookup -[`Categorical(x)`](@ref) a categorical lookup that holds categories, +The [`Categorical(x)`](@ref) lookup is a categorical lookup that holds categories, and may be ordered. Create a [`Categorical`](@ref) lookup manually @@ -172,7 +166,7 @@ l = Categorical(["mon", "tue", "weds", "thur", "fri", "sat", "sun"]; order=Unord == Cyclic lookups -[`Cyclic(x)`](@ref) an `AbstractSampled` lookup for cyclical values. +The [`Cyclic(x)`](@ref) lookup is an `AbstractSampled` lookup for cyclical values. Create a [`Cyclic`](@ref) lookup that cycles over 12 months. @@ -181,8 +175,7 @@ using Dates l = Cyclic(DateTime(2000):Month(1):DateTime(2000, 12); cycle=Month(12), sampling=Intervals(Start())) ```` -There is a shorthand to make a `DimArray` from a `Dimension` with a function -of the lookup values. 
Here we convert the values to the month names: +There is a shorthand to make a `DimArray` from a `Dimension` with a function of the lookup values. Here we convert the values to the month names: ````@ansi selectors A = DimArray(monthabbr, X(l)) @@ -197,9 +190,9 @@ A[At(DateTime(3047, 9))] == NoLookup -[`NoLookup(x)`](@ref) no lookup values provided, so `Selector`s will not work. +The [`NoLookup(x)`](@ref) lookup has no lookup values provided, so `Selector`s will not work. When you create a `DimArray` without a lookup array, `NoLookup` will be used. -It is also not show in REPL printing. +It is also not shown in REPL printing. Here we create a [`NoLookup`](@ref): @@ -230,10 +223,10 @@ This array has a `Sampled` lookup with `ForwardOrdered` `Regular` Most lookup types and properties are detected automatically like this from the arrays and ranges used. -- Arrays and ranges of `String`, `Symbol` and `Char` are set to `Categorical` lookup. - - `order` is detected as `Unordered`, `ForwardOrdered` or `ReverseOrdered` -- Arrays and ranges of `Number`, `DateTime` and other things are set to `Sampled` lookups. - - `order` is detected as `Unordered`, `ForwardOrdered` or `ReverseOrdered`. +- Arrays and ranges of `String`, `Symbol`, and `Char` are set to `Categorical` lookup. + - `order` is detected as `Unordered`, `ForwardOrdered`, or `ReverseOrdered` +- Arrays and ranges of `Number`, `DateTime`, and other things are set to `Sampled` lookups. + - `order` is detected as `Unordered`, `ForwardOrdered`, or `ReverseOrdered`. - `sampling` is set to `Points()` unless the values are `IntervalSets.Interval`, then `Intervals(Center())` is used. - `span` is detected as `Regular(step(range))` for `AbstractRange` and @@ -247,8 +240,8 @@ from the arrays and ranges used. ## `DimSelector` We can also index with arrays of selectors [`DimSelectors`](@ref). -These are like `CartesianIndices` or [`DimIndices`](@ref) but holding -`Selectors` `At`, `Near` or `Contains`. 
+These are like `CartesianIndices` or [`DimIndices`](@ref), but holding +the `Selectors` `At`, `Near`, or `Contains`. ````@ansi selectors A = rand(X(1.0:0.2:2.0), Y(10:2:20)) @@ -266,11 +259,10 @@ And we can simply select values from `B` with selectors from `A`: B[DimSelectors(A)] ```` -If the lookups aren't aligned we can use `Near` instead of `At`, -which like doing a nearest neighbor interpolation: +If the lookups aren't aligned, we can use `Near` instead of `At`, +which is like doing a nearest neighbor interpolation: ````@ansi selectors C = rand(X(1.0:0.007:2.0), Y(10.0:0.9:30)) C[DimSelectors(A; selectors=Near)] ```` - diff --git a/docs/src/stacks.md b/docs/src/stacks.md index 8358f8b0d..129b37c8e 100644 --- a/docs/src/stacks.md +++ b/docs/src/stacks.md @@ -2,7 +2,7 @@ An `AbstractDimStack` represents a collection of `AbstractDimArray` layers that share some or all dimensions. For any two layers, a dimension -of the same name must have the identical lookup - in fact only one is stored +of the same name must have the identical lookup - in fact, only one is stored for all layers to enforce this consistency. @@ -12,8 +12,8 @@ x, y = X(1.0:10.0), Y(5.0:10.0) st = DimStack((a=rand(x, y), b=rand(x, y), c=rand(y), d=rand(x))) ```` -The behaviour of a `DimStack` is at times like a `NamedTuple` of -`DimArray` and, others an `AbstractArray` of `NamedTuple`. +The behavior of a `DimStack` is at times like a `NamedTuple` of +`DimArray` and, at other times, an `AbstractArray` of `NamedTuple`. ## NamedTuple-like indexing @@ -191,7 +191,7 @@ dropdims(sum_st; dims=Y) ::: -`broadcast_dims` broadcasts functions over any mix of `AbstractDimStack` and +[`broadcast_dims`](@ref) broadcasts functions over any mix of `AbstractDimStack` and `AbstractDimArray` returning a new `AbstractDimStack` with layers the size of the largest layer in the broadcast. This will work even if dimension permutation does not match in the objects. 
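A hedged sketch of the stack case, assuming a two-layer stack and an array `A` whose dimensions are permuted relative to the layers (the layer names and lookup values are illustrative): `broadcast_dims` should apply the function layer by layer and return a new stack.

```julia
using DimensionalData

# Illustrative data; layer names and lookup values are assumptions.
x, y = X(1.0:10.0), Y(5.0:10.0)
st = DimStack((a=rand(x, y), b=rand(x, y)))
A = rand(y, x)   # dims (Y, X), permuted relative to the stack layers

# The function is applied to each layer together with `A`,
# matching dimensions by name rather than by position.
out = broadcast_dims(+, st, A)

size(out[:a])
```

Each layer of `out` keeps the layer's own dimension order, so `A` is effectively permuted to `(X, Y)` before the addition.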
@@ -254,8 +254,8 @@ PermutedDimsArray(st, (Y, X)) ## Performance -Indexing stack is fast - indexing a single value return a `NamedTuple` from all -layers is usually measures in nanoseconds, and no slower than manually indexing +Indexing a stack is fast - indexing a single value and returning a `NamedTuple` from all +layers is usually measured in nanoseconds, and no slower than manually indexing into each parent array directly. There are some compilation overheads to this though, and stacks with very many diff --git a/docs/src/tables.md b/docs/src/tables.md index bab1e985b..d872ccce8 100644 --- a/docs/src/tables.md +++ b/docs/src/tables.md @@ -1,18 +1,10 @@ # Tables and DataFrames -[Tables.jl](https://github.com/JuliaData/Tables.jl) provides an -ecosystem-wide interface to tabular data in Julia, giving interoperability with -[DataFrames.jl](https://dataframes.juliadata.org/stable/), -[CSV.jl](https://csv.juliadata.org/stable/) and hundreds of other -packages that implement the standard. +[Tables.jl](https://github.com/JuliaData/Tables.jl) provides an ecosystem-wide interface to tabular data in Julia, ensuring interoperability with [DataFrames.jl](https://dataframes.juliadata.org/stable/), [CSV.jl](https://csv.juliadata.org/stable/), and hundreds of other packages that implement the standard. -DimensionalData.jl implements the Tables.jl interface for -`AbstractDimArray` and `AbstractDimStack`. `DimStack` layers -are unrolled so they are all the same size, and dimensions -loop to match the length of the largest layer. +DimensionalData.jl implements the Tables.jl interface for `AbstractDimArray` and `AbstractDimStack`. `DimStack` layers are unrolled so they are all the same size, and dimensions loop to match the length of the largest layer. -Columns are given the [`name`](@ref) or the array or the stack layer key. -`Dimension` columns use the `Symbol` version (the result of `DD.name(dimension)`). 
+Columns are given the [`name`](@ref) of the array or stack layer, and the result of `DD.name(dimension)` for `Dimension` columns. Looping of dimensions and stack layers is done _lazily_, and does not allocate unless collected. @@ -33,13 +25,13 @@ x, y, c = X(1:10), Y(1:10), Dim{:category}('a':'z') ::: tabs -== create a `DimArray` +== Create a `DimArray` ````@ansi dataframe A = rand(x, y, c; name=:data) ```` -== create a `DimStack` +== Create a `DimStack` ````@ansi dataframe st = DimStack((data1 = rand(x, y), data2=rand(x, y, c))) @@ -51,7 +43,7 @@ st = DimStack((data1 = rand(x, y), data2=rand(x, y, c))) ::: tabs -== array default +== Array Default Arrays will have columns for each dimension, and only one data column @@ -59,10 +51,9 @@ Arrays will have columns for each dimension, and only one data column DataFrame(A) ```` -== stack default +== Stack Default -Stacks will become a table with a column for each dimension, -and one for each layer: +Stacks will become a table with a column for each dimension, and one for each layer: ````@ansi dataframe DataFrame(st) @@ -91,8 +82,7 @@ DataFrame(DimTable(st; mergedims=(:X, :Y)=>:XY)) ## Converting to CSV -We can also write arrays and stacks directly to CSV.jl, or -any other data type supporting the Tables.jl interface. +We can also write arrays and stacks directly to CSV.jl, or any other data type supporting the Tables.jl interface. ````@example dataframe using CSV