Switch back to lazy loading data #110

Closed
egouldo opened this issue Aug 14, 2024 · 2 comments · Fixed by #133
Labels
upkeep maintenance, infrastructure, and similar

Comments

@egouldo
Owner

egouldo commented Aug 14, 2024

From https://r-pkgs.org/data.html

If the DESCRIPTION contains LazyData: true, then datasets will be lazily loaded. This means that they won’t occupy any memory until you use them. The following example shows memory usage before and after loading the nycflights13 package. You can see that memory usage doesn’t change significantly until you inspect the flights dataset stored inside the package.

lobstr::mem_used()
#> 58.74 MB
library(nycflights13)
lobstr::mem_used()
#> 60.68 MB

invisible(flights)
lobstr::mem_used()
#> 101.39 MB

We recommend that you include LazyData: true in your DESCRIPTION if you are shipping .rda files below data/.

I think package loading was taking forever previously because I hadn't compressed the data. Now that we have written the data with usethis::use_data(compress = "gzip"), package load time might be OK. See also #90, #91.
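A rough way to see what the compression setting does to the files on disk is sketched below. It uses base `save()`, whose `compress` argument is what `usethis::use_data(compress = ...)` passes through. The `toy_data` object is a made-up stand-in, not one of the package's datasets, and the sketch only compares file sizes; whether smaller files actually translate into faster package loading is not measured here.

```r
# Hypothetical stand-in for a package dataset (not real package data)
set.seed(1)
toy_data <- data.frame(
  id    = seq_len(1e5),
  group = sample(letters, 1e5, replace = TRUE),
  value = round(rnorm(1e5), 2)
)

raw_path <- tempfile(fileext = ".rda")
gz_path  <- tempfile(fileext = ".rda")

# Same object written twice: once uncompressed, once gzip-compressed
save(toy_data, file = raw_path, compress = FALSE)
save(toy_data, file = gz_path,  compress = "gzip")

# File sizes in bytes
sizes <- c(
  uncompressed = file.size(raw_path),
  gzip         = file.size(gz_path)
)
print(sizes)
```

Running the same comparison against the real files under data/ would give a more honest picture than this toy object.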

@egouldo egouldo added the upkeep maintenance, infrastructure, and similar label Aug 14, 2024
@egouldo egouldo added this to the Software Manuscript Submit milestone Aug 14, 2024
@egouldo egouldo self-assigned this Aug 14, 2024
@egouldo
Owner Author

egouldo commented Aug 14, 2024

  • Also update vignettes, examples, and usethis::use_data() scripts to call data objects using the namespace rather than data(object).
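As a minimal sketch of the difference between the two access styles, the example below uses `mtcars` from the base `datasets` package as a stand-in, since the package's own objects are not available here. `data()` copies the object into an environment as a side effect, while the namespace-qualified form just returns it.

```r
# Style 1: data() copies the object into an environment as a side effect.
# A fresh environment is used here so the global workspace stays untouched.
env <- new.env()
data("mtcars", package = "datasets", envir = env)
exists("mtcars", envir = env, inherits = FALSE)

# Style 2: namespace-qualified access, no side effects.
# With lazy-loaded data this works without a prior data() call.
cars <- datasets::mtcars
head(cars, 3)
```

In vignettes and examples the second style also makes it obvious which package each object comes from.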

@egouldo
Owner Author

egouldo commented Aug 29, 2024

  • Consider using butcher:: to reduce package data size (i.e. of models); see https://butcher.tidymodels.org/index.html. Also consider reducing duplicate storage of models across multiple objects (i.e. ManyEcoEvo_*viz and ManyEcoEvo_results).
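A minimal sketch of what butchering a model could look like, assuming the butcher package is installed. The `lm()` fit on `mtcars` is only a stand-in, not one of the package's models, and `object.size()` is a rough base-R size estimate that ignores environments, so `butcher::weigh()` is likely the better tool for deciding which components to drop.

```r
# Stand-in model (the real package models are not available here)
fit <- lm(mpg ~ wt + hp, data = datasets::mtcars)
size_before <- as.numeric(object.size(fit))

if (requireNamespace("butcher", quietly = TRUE)) {
  # butcher() applies all available axe_*() methods for this model class
  slim_fit <- butcher::butcher(fit)
} else {
  # Fall back to the original fit if butcher is not installed
  slim_fit <- fit
}
size_after <- as.numeric(object.size(slim_fit))

# Approximate sizes in bytes, before and after
print(c(before = size_before, after = size_after))
```

Butchering removes components that some downstream functions rely on, so it would be worth checking that the plotting and summary code still works on the slimmed objects before replacing the stored ones.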

@egouldo egouldo closed this as completed Aug 29, 2024
This was referenced Aug 29, 2024
@egouldo egouldo linked a pull request Aug 29, 2024 that will close this issue
egouldo added a commit to egouldo/ManyAnalysts that referenced this issue Sep 1, 2024