What is the recommended approach for storing and plotting complicated layers and stacks of data? #826

Closed
kapple19 opened this issue Oct 21, 2024 · 2 comments


kapple19 commented Oct 21, 2024

Sorry to tag you again @jkrumbiegel but we may need your insights on this as well, particularly question 3.

I have datasets that are layered, with dimension values that vary from one layer to the next.

Here's some code that demonstrates the structure.
Basically, each bearing θ has its own set of range r values,
and each range r has its own set of depth z values.

using Base: range as linrange
using DimensionalData

function rands_sorted_dim(dim::Symbol; N::Int = rand(5:12))
    return [0; rand(N)] |> sort! |> unique! |> Dim{dim}
end

"Set of depth-sound speed value pairs."
function rand_1D_profile()
    z = rands_sorted_dim(:z)
    return rebuild(
        rand(z);
        name = "1D Profile"
    )
end

"One 1D profile per range."
function rand_2D_profile()
    r = rands_sorted_dim(:r)
    return DimArray(
        [
            rand_1D_profile()
            for r_ in r
        ],
        r;
        name = "2D Profile"
    )
end

"One 2D profile per bearing."
function rand_3D_profile()
    θ = Dim{:θ}(linrange(0, step = 45, length = 8))
    return DimArray(
        [
            rand_2D_profile()
            for θ_ in θ
        ],
        θ;
        name = "3D Profile"
    )
end

prof3D = rand_3D_profile()

I'm opening this issue to ask the following questions:

  1. Inspection: Running prof3D in the REPL to take a look at its structure prints out a lot of information. Is there a better way to view a concise summary of its structure? Even typeof(prof3D) still prints a lot. I currently use keys(prof3D). Maybe an AbstractTrees.jl interface?

  2. Structure: Some of my data (not demonstrated above) would be like a DimStack, but the layers would not share the same values over the same dimensions, e.g. sound speed could be for depths 0 : 100 : 200, but density could be for depths 0 : 50 : 200. What's the best way to store such data?

  3. Visualisation: Do I have to use AlgebraOfGraphics.pregrouped for all this data? Or will there be supported syntax conveniences for such layers and stacks of data? I've found DimArrays and DimStacks work nicely with AlgebraOfGraphics.data so I don't have to use pregrouped. So it's just the semantics for nested DimArrays and DimStacks that I suppose I'm asking about. Especially for syntax like mapping(layout = :θ, color = :r).

I'm happy to try and contribute. Primarily asking to know if any thoughts on this already exist, and if so, what direction, and if I can help.

Owner

rafaqz commented Oct 21, 2024

I don't quite get the storing part in 2...
DimStacks can have any mix of dimensions, but the ones that are shared have to be identical, that's kind of the definition of what they are! Imagine how insane I would go programming here if selectors like Near had to work on possibly mixed dimensions for each layer.
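
To make that constraint concrete, here is a minimal sketch of one possible workaround for layers whose z values differ: keep them in a plain NamedTuple of separately dimensioned DimArrays rather than a DimStack. The values and the `layers` name are made up for illustration, and this is an assumption about what might suit the use case, not a DimensionalData feature.

```julia
using DimensionalData

# Hypothetical values: sound speed sampled every 100 m, density every 50 m.
c = DimArray([1500.0, 1480.0, 1520.0], Dim{:z}(0:100:200); name = :c)
ρ = DimArray([1025.0, 1026.0, 1027.0, 1028.0, 1029.0], Dim{:z}(0:50:200); name = :ρ)

# A NamedTuple keeps the layers together without requiring identical lookups.
layers = (; c, ρ)

# Selectors still work, but each layer resolves them against its own z values.
c_near = layers.c[z = Near(120)]
ρ_near = layers.ρ[z = Near(120)]
```

The trade-off is that nothing here is indexed jointly: a selector has to be applied to each layer in turn, which is exactly the work a DimStack avoids by requiring shared dimensions to be identical.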

Plotting DimStack with Makie just isn't implemented, partly because multi-axis plots are not easy in Makie recipes, and multi-axis is often what you would expect to get. And again, I don't know AlgebraOfGraphics at all!

For 1, yes, nested DimArray show kinda sucks. In my packages like DynamicGrids and Rasters, where there are single layers of nesting, I actually hack show to fix it. (I kind of want to know why you need more than one layer of nesting now!)

We could do that more generically here, contributions appreciated.
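
As a rough idea of what a more generic summary could look like, here is a minimal sketch that prints one line per nesting level. The `nested_summary` function is hypothetical and not part of DimensionalData; it only assumes that `dims`, `DimensionalData.name` and `first` behave as they do for ordinary DimArrays, and it looks at the first element of each level only.

```julia
using DimensionalData

# Hypothetical helper, not part of DimensionalData: one line per nesting level.
function nested_summary(io::IO, A::AbstractDimArray, depth::Int = 0)
    # List each dimension name with its length, e.g. "r (2)".
    dimtext = join(("$(DimensionalData.name(d)) ($(length(d)))" for d in dims(A)), ", ")
    println(io, "  "^depth, DimensionalData.name(A), ": ", dimtext)
    # Descend into the first element only; sibling elements may differ in size.
    if !isempty(A) && first(A) isa AbstractDimArray
        nested_summary(io, first(A), depth + 1)
    end
    return nothing
end
nested_summary(A::AbstractDimArray) = nested_summary(stdout, A)

# Small made-up example with one level of nesting.
inner = DimArray(rand(3), Dim{:z}([0.0, 50.0, 200.0]); name = :profile)
outer = DimArray([inner, inner], Dim{:r}([0.0, 100.0]); name = :section)
nested_summary(outer)
```

Because only the first element is inspected, the lengths reported for inner levels are representative rather than exhaustive when the nested arrays are ragged.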

Author

kapple19 commented Oct 22, 2024

> DimStacks can have any mix of dimensions, but the ones that are shared have to be identical, that's kind of the definition of what they are! Imagine how insane I would go programming here if selectors like Near had to work on possibly mixed dimensions for each layer.

I figured that was the case, and just thought I'd ask anyway. I'm not asking for DimensionalData to accommodate mixed dimensions for each layer; I'm just checking whether there are any features I've missed or misunderstood that would help my case.

> I don't quite get the storing part in 2...

> (I kind of want to know why you need more than one layer of nesting now!)

Starting from a single lat-lon, my data's first layer is a set of bearings θ.
For each bearing, there is a series of distances r from the lat-lon in the direction of the bearing.
The innermost layer consists of pairs of depth z and sound speed c.
Here's a minimal example of my data.

[
    (; θ = 0) => [
        (; r = 0) => (; z = [0, 50, 200], c = [1500, 1480, 1520])
        (; r = 100) => (; z = [0, 100, 150, 210], c = [1510, 1490, 1515, 1520])
        (; r = 300) => (; z = [0, 120], c = [1510, 1495])
    ],
    (; θ = 45) => [
        (; r = 0) => (; z = [0, 205], c = [1505, 1525])
        (; r = 50) => (; z = [0, 100, 190], c = [1510, 1490, 1515])
        (; r = 200) => (; z = [0, 90, 173, 225], c = [1515, 1486, 1505, 1515])
    ],
    (; θ = 90) => [
        (; r = 0) => (; z = [0, 50, 200, 210], c = [1499, 1490, 1505, 1506])
        (; r = 90) => (; z = [0, 265], c = [1510, 1490])
        (; r = 250) => (; z = [0, 110, 300], c = [1525, 1495, 1500])
    ],
]

I hope I've explained it clearly.
I need the multiple layers because e.g. the range values r aren't the same for each bearing,
and the depth values z aren't the same across all bearings and ranges.
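
For plotting, one option is to flatten the nesting into a long table with one row per (θ, r, z) sample. The sketch below uses a hypothetical subset of the data above and plain Julia only; the names `nested`, `rows` and `rng` are illustrative.

```julia
# Hypothetical subset of the nested data above, with every element comma-separated.
nested = [
    (; θ = 0) => [
        (; r = 0) => (; z = [0, 50, 200], c = [1500, 1480, 1520]),
        (; r = 100) => (; z = [0, 100, 150, 210], c = [1510, 1490, 1515, 1520]),
    ],
    (; θ = 45) => [
        (; r = 0) => (; z = [0, 205], c = [1505, 1525]),
    ],
]

# Flatten to one NamedTuple row per depth sample, carrying θ and r along.
rows = [
    (; θ = bearing.θ, r = rng.r, z, c)
    for (bearing, sections) in nested
    for (rng, profile) in sections
    for (z, c) in zip(profile.z, profile.c)
]
```

A vector of NamedTuples is a Tables.jl row table, so `rows` could presumably be handed to AlgebraOfGraphics.data and grouped by θ and r through mapping, although that step is an untested assumption here.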

> where there are single layers of nesting I actually hack show to fix it.

> We could do that more generically here, contributions appreciated.

I'll see what I can do while keeping the spirit of the existing show method 😊
