Skip to content

Let nx-datasets respect XDG cache locations #77

@tmattio

Description

@tmattio

Why it matters

Right now Dataset_utils.get_cache_dir always expands to ~/.cache/ocaml-nx/datasets/... (see nx-datasets/lib/dataset_utils.ml:54-62). That ignores both XDG_CACHE_HOME and any custom cache path users set for sandboxed environments. Outreachy applicants working on shared machines or CI systems need to redirect downloads to writable directories without patching the code.

How to reproduce

  1. Run XDG_CACHE_HOME=/tmp/outreachy-cache dune utop nx-datasets.
  2. Evaluate Nx_datasets.Dataset_utils.get_cache_dir "iris";; — it still prints a path under ~/.cache/ocaml-nx/ instead of /tmp/outreachy-cache/....
  3. The behaviour is the same even if you set NX_DATASETS_CACHE, because the code never checks that variable.

Your task

  • Update nx-datasets/lib/dataset_utils.ml so cache resolution prefers, in order:
    1. NX_DATASETS_CACHE (if set)
    2. XDG_CACHE_HOME (if set)
    3. Fall back to $HOME/.cache
  • Make sure the calculation happens each time we call get_cache_dir (a helper function returning a string is fine) so tests can override the env at runtime.
  • Keep the trailing slash in the returned path, and keep creating directories the same way.
  • Document the new lookup order in nx-datasets/lib/dataset_utils.mli.
  • Add an Alcotest case (e.g. test_dataset_utils.ml) that temporarily sets the env vars with Unix.putenv, calls get_cache_dir, and asserts the path prefix matches the env you set. Remember to restore the old values at the end of the test so other cases are not polluted.

Tips

  • Sys.getenv_opt (available since OCaml 4.06) is handy for optional environment variables.
  • Prefer Filename.concat when stitching parts so Windows paths keep working.
  • In tests, wrap env changes with Fun.protect to restore the original values even if the assertion fails.
  • Run dune runtest nx-datasets to make sure the new test and the existing generator suite stay green.

Done when

  • Calling get_cache_dir honours NX_DATASETS_CACHE/XDG_CACHE_HOME when they are set, and still falls back to the old behaviour otherwise.
  • The updated .mli explains the precedence clearly.
  • The new unit test fails on the current main branch and passes after your changes.

Metadata

Metadata

Assignees

No one assigned

    Labels

    good first issueGood for newcomersoutreachyIssues targeted at Outreachy applicants

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions