Checkpointer cannot serialize functions to disk with JLD. #141

@ali-ramadhan

I cannot get the checkpointing test in PR #140 to run because JLD cannot serialize a model that contains forcing functions to disk. We could go back to forcing arrays, but I think that's a bad idea because we should avoid increasing GPU memory usage.

I believe that JLD2.jl might be able to serialize functions to disk, but it's not actively maintained anymore and its README says "If your tolerance for data loss is low, JLD may be a better choice at this time."

If we can fix this and figure out how to serialize functions to disk, then we may also be able to serialize the FFTW and CuFFT plans to disk (although we might still want to reconstruct them in case the model is restored on a different computer with a different architecture).
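One possible workaround, sketched here as an illustration rather than as what Oceananigans does: write only plain data to the checkpoint, record each forcing function by name, and rebuild anything that cannot be written to disk when restoring (the same idea would apply to FFT plans). `ToyModel`, `ToyForcing`, `FORCING_REGISTRY`, `checkpoint`, and `restore` are all hypothetical names, and the sketch uses Julia's `Serialization` standard library instead of JLD to stay self-contained.

```julia
using Serialization  # Julia standard library (file-path methods need Julia >= 1.1)

# Hypothetical forcing functions standing in for user-supplied ones.
zero_func(args...) = 0.0
damping(u) = -0.1 * u

# Hypothetical, heavily simplified stand-ins for the real model types.
struct ToyForcing{F}
    u::F
end

struct ToyModel{F}
    data::Vector{Float64}
    forcing::ToyForcing{F}
end

# Assumption: every forcing function that may appear in a checkpoint is
# registered here under a stable name.
const FORCING_REGISTRY = Dict{Symbol,Function}(
    :zero_func => zero_func,
    :damping   => damping,
)

# Write only plain data plus the *name* of the forcing function.
function checkpoint(path::AbstractString, model::ToyModel, forcing_name::Symbol)
    serialize(path, (data = model.data, forcing_name = forcing_name))
end

# Read the plain data back and rebuild the function-carrying parts.
function restore(path::AbstractString)
    state = deserialize(path)
    f = FORCING_REGISTRY[state.forcing_name]
    return ToyModel(state.data, ToyForcing(f))
end
```

The trade-off is that the restoring session must define the same functions under the same names, but the file itself never has to encode a function type.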

Stacktrace:

Deserializing model from disk: test_model_checkpoint_5.jld
error parsing type string Oceananigans.Forcing{Oceananigans.#zero_func,Oceananigans.#zero_func,Oceananigans.#zero_func,Oceananigans.#zero_func,Oceananigans.#zero_func}
Checkpointing: Error During Test at D:\Home\Git\Oceananigans.jl\test\runtests.jl:246
  Got exception outside of a @test
  syntax: incomplete: premature end of input
  Stacktrace:
   [1] eval at .\boot.jl:328 [inlined]
   [2] eval at C:\Users\Ali\.julia\packages\JLD\1BoSz\src\JLD.jl:3 [inlined]
   [3] _julia_type(::String) at C:\Users\Ali\.julia\packages\JLD\1BoSz\src\JLD.jl:983
   [4] julia_type(::String) at C:\Users\Ali\.julia\packages\JLD\1BoSz\src\JLD.jl:30
   [5] jldatatype(::JLD.JldFile, ::HDF5.HDF5Datatype) at C:\Users\Ali\.julia\packages\JLD\1BoSz\src\jld_types.jl:701
   [6] read(::JLD.JldDataset) at C:\Users\Ali\.julia\packages\JLD\1BoSz\src\JLD.jl:370
   [7] read_ref(::JLD.JldFile, ::HDF5.HDF5ReferenceObj) at C:\Users\Ali\.julia\packages\JLD\1BoSz\src\JLD.jl:502
   [8] jlconvert(::Type{Model}, ::JLD.JldFile, ::Ptr{UInt8}) at C:\Users\Ali\.julia\packages\JLD\1BoSz\src\jld_types.jl:387
   [9] read_scalar(::JLD.JldDataset, ::HDF5.HDF5Datatype, ::Type) at C:\Users\Ali\.julia\packages\JLD\1BoSz\src\JLD.jl:398
   [10] read(::JLD.JldDataset) at C:\Users\Ali\.julia\packages\JLD\1BoSz\src\JLD.jl:370
   [11] read(::JLD.JldFile, ::String) at C:\Users\Ali\.julia\packages\JLD\1BoSz\src\JLD.jl:346
   [12] restore_from_checkpoint(::String) at D:\Home\Git\Oceananigans.jl\src\output_writers.jl:77
   [13] run_basic_checkpointer_tests() at D:\Home\Git\Oceananigans.jl\test\test_output_writers.jl:34
   [14] top-level scope at D:\Home\Git\Oceananigans.jl\test\runtests.jl:247
   [15] top-level scope at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Test\src\Test.jl:1083
   [16] top-level scope at D:\Home\Git\Oceananigans.jl\test\runtests.jl:247
   [17] top-level scope at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Test\src\Test.jl:1083
   [18] top-level scope at D:\Home\Git\Oceananigans.jl\test\runtests.jl:244
   [19] top-level scope at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Test\src\Test.jl:1083
   [20] top-level scope at D:\Home\Git\Oceananigans.jl\test\runtests.jl:243
   [21] include at .\boot.jl:326 [inlined]
   [22] include_relative(::Module, ::String) at .\loading.jl:1038
   [23] include(::Module, ::String) at .\sysimg.jl:29
   [24] include(::String) at .\client.jl:403
   [25] top-level scope at none:0
   [26] eval(::Module, ::Any) at .\boot.jl:328
   [27] exec_options(::Base.JLOptions) at .\client.jl:243
   [28] _start() at .\client.jl:436
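
A likely reading of the trace, offered as a guess rather than a confirmed diagnosis: the stored type string contains `#zero_func`, and `#` starts a line comment in Julia, so when the string is parsed back as Julia code (frames [1] to [3] show an `eval` inside `_julia_type`) everything after the first `#` is dropped and the `{` is never closed. The sketch below checks that with `Meta.parse` alone, without JLD; `parses_cleanly` is a hypothetical helper.

```julia
# Type string from the error message above, shortened to one type parameter.
type_string = "Oceananigans.Forcing{Oceananigans.#zero_func}"

# Everything after `#` is treated as a comment, so the parser only sees
# "Oceananigans.Forcing{Oceananigans." and reaches the end of input early.
function parses_cleanly(s::AbstractString)
    try
        ex = Meta.parse(s)
        # Incomplete input may come back as Expr(:incomplete, ...) or Expr(:error, ...).
        return !(ex isa Expr && ex.head in (:incomplete, :error))
    catch
        # Depending on the Julia version, the parser may throw instead.
        return false
    end
end

println(parses_cleanly(type_string))
println(parses_cleanly("Oceananigans.Forcing{Oceananigans.zero_func}"))
```

If this reading is right, the failure happens while reconstructing the type name on load, before any function body is read.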
