Make use of data more by devising subsampling #426
Comments
The information is based on the runs on the dev branch.
mem_limit, N, D = 3000, 500, 10000 # 3GB
mem_limit, N, D = 4000, 2500, 10000 # 4GB
mem_limit, N, D = 5000, 4500, 10000 # 5GB
mem_limit, N, D = 6000, 6500, 10000 # 6GB
mem_limit, N, D = 7000, 8500, 10000 # 7GB
mem_limit, N, D = 8000, 11500, 10000 # 8GB
mem_limit, N, D = 9000, 15000, 10000 # 9GB
Note that since neural networks become larger when using a larger input size, I used a large fixed
Since when we use
memory_allocation = (memory_limit - 3000) / 1000.0 * 160 + 40
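The dev-branch measurements and the linear rule above can be compared with a quick calculation. This is only a sketch under stated assumptions: the dataset is taken to be a dense N x D array of 8-byte floats and 1 MB is taken to be 10^6 bytes, neither of which is stated in the thread, and the helper names are hypothetical.

```python
# Assumption (not stated in the thread): dense float64 data, 1 MB = 10**6 bytes.
BYTES_PER_ELEMENT = 8


def dataset_size_mb(n_samples, n_features):
    """Approximate in-memory size of an N x D array, in MB."""
    return n_samples * n_features * BYTES_PER_ELEMENT / 1e6


def memory_allocation(memory_limit):
    """Linear rule quoted for the dev branch; memory_limit is in MB."""
    return (memory_limit - 3000) / 1000.0 * 160 + 40


# (mem_limit, N, D) triples reported for the dev branch.
runs = [
    (3000, 500, 10000),
    (4000, 2500, 10000),
    (5000, 4500, 10000),
    (6000, 6500, 10000),
    (7000, 8500, 10000),
    (8000, 11500, 10000),
    (9000, 15000, 10000),
]

for mem_limit, n, d in runs:
    # Print the estimated dataset size next to the value given by the rule.
    print(mem_limit, dataset_size_mb(n, d), memory_allocation(mem_limit))
```

Under these assumptions the rule reproduces the estimated dataset size from 3GB through 7GB and falls below it at 8GB and 9GB (840 vs. 920 MB and 1000 vs. 1200 MB), so it behaves as a conservative fit there.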
The information is based on the runs on the common modification branch.
mem_limit, N, D = 3000, 500, 10000 # 3GB
mem_limit, N, D = 4000, 3000, 10000 # 4GB
mem_limit, N, D = 5000, 5500, 10000 # 5GB
mem_limit, N, D = 6000, 8500, 10000 # 6GB
mem_limit, N, D = 7000, 11500, 10000 # 7GB
mem_limit, N, D = 8000, 15500, 10000 # 8GB
The training ratio was 0.75, so we might need to take 75% of those values.
memory_allocation = (memory_limit - 3000) / 1000.0 * 150 + 30
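To make the 75% adjustment concrete, the following sketch scales each reported N by the training ratio before estimating the size of the training split. It assumes dense 8-byte floats and 1 MB = 10^6 bytes (assumptions, not stated in the thread); the function names are hypothetical.

```python
# Assumption (not stated in the thread): dense float64 data, 1 MB = 10**6 bytes.
BYTES_PER_ELEMENT = 8
TRAIN_RATIO = 0.75  # training ratio mentioned in the comment above


def train_split_size_mb(n_samples, n_features):
    """Approximate size in MB of the training split of an N x D dataset."""
    return n_samples * TRAIN_RATIO * n_features * BYTES_PER_ELEMENT / 1e6


def memory_allocation(memory_limit):
    """Linear rule quoted for this branch; memory_limit is in MB."""
    return (memory_limit - 3000) / 1000.0 * 150 + 30


# (mem_limit, N, D) triples reported for this branch.
runs = [
    (3000, 500, 10000),
    (4000, 3000, 10000),
    (5000, 5500, 10000),
    (6000, 8500, 10000),
    (7000, 11500, 10000),
    (8000, 15500, 10000),
]

for mem_limit, n, d in runs:
    # Print the estimated training-split size next to the rule's value.
    print(mem_limit, train_split_size_mb(n, d), memory_allocation(mem_limit))
```

With these assumptions the rule equals the estimated training-split size at 3GB, 4GB and 5GB (30, 180 and 330 MB) and stays below it from 6GB upward (for example 480 vs. 510 MB at 6GB).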
When we use a certain memory_allocation [1] in subsampling, we reduce the number of samples until we reach the memory limit. However, we need to come up with an appropriate value for it: when we set it too high, the training fails with a memory error, and when we set it too low, we waste memory.
For now, we circumvent this issue by measuring the memory consumption when using the default config.
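As an illustration of reducing the number of samples until the data fits, here is a minimal sketch. It is not the project's implementation: the function names are hypothetical, and it assumes dense 8-byte floats with 1 MB = 10^6 bytes.

```python
import random

# Assumption (not from the thread): dense float64 data, 1 MB = 10**6 bytes.
BYTES_PER_ELEMENT = 8


def max_samples_that_fit(n_features, memory_allocation_mb):
    """Largest number of rows whose total size stays within the allocation."""
    bytes_per_row = n_features * BYTES_PER_ELEMENT
    return int(memory_allocation_mb * 1e6 // bytes_per_row)


def subsample_indices(n_samples, n_features, memory_allocation_mb, seed=0):
    """Pick a random subset of row indices that fits into the allocation."""
    n_keep = min(n_samples, max_samples_that_fit(n_features, memory_allocation_mb))
    rng = random.Random(seed)
    # Sample without replacement and keep the original row order.
    return sorted(rng.sample(range(n_samples), n_keep))


# 15000 x 10000 dataset with a 200 MB allocation.
kept = subsample_indices(15000, 10000, 200)
print(len(kept))
```

If the allocation is set too low, `n_keep` shrinks and usable data is discarded; if it is set too high, more rows are kept than the training process can actually hold, which is the trade-off described above.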
Footnotes
1. The definition of memory_allocation is the following: absolute memory in MB, e.g. 10MB is "memory_allocation": 10. The memory used by the dataset is checked after each reduction method is performed. If the dataset fits into the allocated memory, any further methods listed in "methods" will not be performed.
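The early-exit behaviour described in this footnote can be sketched as follows. This is a hypothetical illustration rather than the library's code: the method names "precision" and "subsample", the dictionary-based dataset description, and the 8-byte starting element size are all assumptions.

```python
def size_mb(ds):
    """Approximate dataset size in MB (assumes 1 MB = 10**6 bytes)."""
    return ds["n_samples"] * ds["n_features"] * ds["bytes_per_element"] / 1e6


def reduce_precision(ds, memory_allocation_mb):
    """Hypothetical reduction: halve the element size, but not below 4 bytes."""
    return {**ds, "bytes_per_element": max(4, ds["bytes_per_element"] // 2)}


def subsample(ds, memory_allocation_mb):
    """Hypothetical reduction: keep only as many rows as fit the allocation."""
    row_bytes = ds["n_features"] * ds["bytes_per_element"]
    n_keep = int(memory_allocation_mb * 1e6 // row_bytes)
    return {**ds, "n_samples": min(ds["n_samples"], n_keep)}


REDUCTIONS = {"precision": reduce_precision, "subsample": subsample}


def compress(ds, config):
    """Apply the listed methods in order, stopping once the dataset fits."""
    applied = []
    for name in config["methods"]:
        # Size check before each method: once the data fits, skip the rest.
        if size_mb(ds) <= config["memory_allocation"]:
            break
        ds = REDUCTIONS[name](ds, config["memory_allocation"])
        applied.append(name)
    return ds, applied


dataset = {"n_samples": 6500, "n_features": 10000, "bytes_per_element": 8}
config = {"memory_allocation": 300, "methods": ["precision", "subsample"]}
reduced, applied = compress(dataset, config)
print(applied, size_mb(reduced))
```

In this example the precision step alone brings the data under 300 MB, so the subsampling step is skipped; with a 200 MB allocation both steps would run.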