This repository was archived by the owner on Nov 22, 2022. It is now read-only.

Skip loading extra data for export #1051

Closed
wants to merge 1 commit into from


mwu1993
Contributor

@mwu1993 mwu1993 commented Oct 14, 2019

Summary: A standalone model export needs only one batch of data to pass through the model. Currently that batch comes from `data.batches(...)`, which may use `PoolingBatcher` (loading many examples up front) or may run with `in_memory=True` (loading all examples). On a large dataset this can take hours, especially since the data is also tokenized. Add a flag to `batches()` that forcibly skips both loading all data into memory and pooling.
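
To illustrate the idea, here is a minimal, self-contained sketch. It is not the PyText implementation: `ToyData`, `CountingSource`, `pool_num_batches`, and the flag name `skip_preload` are hypothetical names chosen for this example, and the pooling logic is a simplified stand-in for what a pooling batcher might do. It only shows why bypassing the in-memory load and the pooling step lets a caller obtain a single batch after reading just `batch_size` examples.

```python
from itertools import islice
from typing import Iterable, Iterator, List

Row = List[str]


def tokenize(text: str) -> Row:
    """Stand-in for an expensive tokenizer."""
    return text.lower().split()


class CountingSource:
    """Hypothetical data source that counts how many examples were read."""

    def __init__(self, texts: List[str]) -> None:
        self.texts = texts
        self.read = 0

    def __iter__(self) -> Iterator[str]:
        for text in self.texts:
            self.read += 1
            yield text


class ToyData:
    """Toy data handler for illustration only (not the PyText class)."""

    def __init__(self, source: Iterable[str], batch_size: int = 2,
                 in_memory: bool = False, pool_num_batches: int = 1) -> None:
        self.source = source
        self.batch_size = batch_size
        self.in_memory = in_memory
        self.pool_num_batches = pool_num_batches

    def _rows(self) -> Iterator[Row]:
        # Lazily read and tokenize one example at a time.
        for text in self.source:
            yield tokenize(text)

    def _pooled(self, rows: Iterator[Row]) -> Iterator[Row]:
        # Read a pool of several batches, sort it by length, then emit it.
        pool_size = self.batch_size * self.pool_num_batches
        while True:
            pool = list(islice(rows, pool_size))
            if not pool:
                return
            yield from sorted(pool, key=len)

    def _batched(self, rows: Iterator[Row]) -> Iterator[List[Row]]:
        # Group rows into lists of at most batch_size.
        while True:
            batch = list(islice(rows, self.batch_size))
            if not batch:
                return
            yield batch

    def batches(self, skip_preload: bool = False) -> Iterator[List[Row]]:
        rows = self._rows()
        if not skip_preload:  # hypothetical flag: bypass both eager steps
            if self.in_memory:
                rows = iter(list(rows))  # reads and tokenizes everything
            if self.pool_num_batches > 1:
                rows = self._pooled(rows)  # reads a whole pool up front
        return self._batched(rows)


# Export-style usage: only one batch is needed.
source = CountingSource([f"example number {i}" for i in range(1000)])
data = ToyData(source, in_memory=True, pool_num_batches=10)
first_batch = next(data.batches(skip_preload=True))
print(source.read)  # 2: only one batch worth of examples was read
```

Without the flag, the same call would read and tokenize all 1000 examples before producing the first batch, because `in_memory=True` materializes the full dataset.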

Differential Revision: D17920179

fbshipit-source-id: b878c9594d4f41752a8f0fb835d706602e68c1eb
@facebook-github-bot facebook-github-bot added the CLA Signed label Oct 14, 2019
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D17920179

@facebook-github-bot
Contributor

This pull request has been merged in a0f6002.

Labels: CLA Signed, Merged
2 participants