Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some small printing upgrades #2344

Merged
merged 5 commits into from
Oct 26, 2024
Merged

Some small printing upgrades #2344

merged 5 commits into from
Oct 26, 2024

Conversation

mcabbott
Copy link
Member

@mcabbott mcabbott commented Oct 13, 2023

This is intended for FluxML/Fluxperimental.jl#20 but probably a good idea anyway. Motivating example is something like this (where bigger arrays will print pages of numbers):

julia> struct Tmp2; x; y; end; Flux.@functor Tmp2

julia> Chain(Tmp2([Dense(2,3), randn(3,4)'], (x=1:3, y=Dense(3,4), z=rand(3))))
Chain(
  Tmp2(
    Array(
      Dense(2 => 3),                    # 9 parameters
      [0.351978391016603 0.6408681372462821 -1.326533184688648; 0.09481930831795712 
1.430103476272605 0.7250467613675332; 2.03372151428719 -0.015879812799495713 
1.9499692162118236; -1.6346846180722918 -0.8364610153059454 -1.2907265737483433],  # 12 parameters
    ),
    NamedTuple(
      1:3,                              # 3 parameters
      Dense(3 => 4),                    # 16 parameters
      [0.9666158193429335, 0.01613900990539574, 0.0205920186127464],  # 3 parameters
    ),
  ),
)                   # Total: 7 arrays, 43 parameters, 644 bytes.

Notice that Array() and NamedTuple() aren't actually valid syntax. Also that it prints whole arrays if they aren't inside a layer the way it expects. After this PR:

julia> Chain(Tmp2([Dense(2,3), randn(3,4)'], (x=1:3, y=Dense(3,4), z=rand(3))))
Chain(
  Tmp2(
    [
      Dense(2 => 3),                    # 9 parameters
      4×3 Adjoint,                      # 12 parameters
    ],
    (;
      x = 3-element UnitRange,          # 3 parameters
      y = Dense(3 => 4),                # 16 parameters
      z = 3-element Array,              # 3 parameters
    ),
  ),
)                   # Total: 7 arrays, 43 parameters, 644 bytes.

@ToucheSir
Copy link
Member

Is having the summary instead of the actual contents important for @compact? If not, my preference would be to keep the more accurate printing but drop that part.

@mcabbott
Copy link
Member Author

mcabbott commented Oct 14, 2023

The idea is that printing the values of a huge matrix isn't helpful, but showing the size is. This is what the present printing of e.g. Dense achieves.

But @compact seems to often want to store the arrays themselves without some layer.

(It's not exactly summary as this gets quite long for e.g. CuArrays)

@ToucheSir
Copy link
Member

Revisiting this, it would be nice to retain some more info such as eltype if the contents aren't printed. Though it's far from perfect, could we avoid a lot of complexity by setting the :limit and :compact IOContext properties when showing these leaf nodes?

@mcabbott
Copy link
Member Author

mcabbott commented Nov 28, 2023

Latest commit keeps eltype as suggested (or really, the first type parameter):

julia> struct Tmp2; x; y; end; Flux.@functor Tmp2

julia> Chain(Tmp2([Dense(2,3), randn(3,4)'], (x=1:3, y=Dense(3,4), z=rand(3))))
Chain(
  Tmp2(
    [
      Dense(2 => 3),                    # 9 parameters
      4×3 Adjoint{Float64,...},         # 12 parameters
    ],
    (;
      x = 3-element UnitRange{Int64},   # 3 parameters
      y = Dense(3 => 4),                # 16 parameters
      z = 3-element Vector{Float64},    # 3 parameters
    ),
  ),
)                   # Total: 7 arrays, 43 parameters, 780 bytes.

Note also that no model using existing layers is affected by this. The goal is mainly to make @compact layers print more like the existing ones, by describing their parameter arrays, not printing the actual numbers.

@darsnack
Copy link
Member

darsnack commented Dec 1, 2023

We were discussing this during our call this week. The consensus was that using IOContext(..., :limit => true) is preferable to a custom summary for arrays. This would keep things succinct while still matching Base standards/expectations.

There are times when the type (e.g. Adjoint) is desired. This case might be better addressed by offering something like Flax's tabulate.

Copy link

codecov bot commented Oct 23, 2024

Codecov Report

Attention: Patch coverage is 57.14286% with 6 lines in your changes missing coverage. Please review.

Project coverage is 71.06%. Comparing base (c9bab66) to head (3e09cb3).

Files with missing lines Patch % Lines
src/layers/show.jl 53.84% 6 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #2344       +/-   ##
===========================================
+ Coverage   33.46%   71.06%   +37.60%     
===========================================
  Files          31       31               
  Lines        1829     1911       +82     
===========================================
+ Hits          612     1358      +746     
+ Misses       1217      553      -664     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mcabbott mcabbott merged commit 93e1de7 into FluxML:master Oct 26, 2024
6 of 9 checks passed
@mcabbott mcabbott deleted the show_array branch October 26, 2024 02:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants