Description
I'm running out of memory trying to fit an ExplainableBoostingClassifier.
The number of features is on the order of 10; the number of samples is …
The docstring of estimate_mem says: "Estimate memory usage of the model."
Is this the memory needed to store the model (unlikely, since X and y are arguments), the memory needed to fit the model (most likely, since y is an argument), or the memory needed to make predictions (unlikely, since predictions aren't memory-intensive)?
However, ExplainableBoostingClassifier.estimate_mem indicates tiny memory usage, while in reality I run out of memory (more than 100 GiB needed). The function reports memory usage independent of the number of classes, while for 'small' numbers of 1 to 100 classes I observe memory consumption to be roughly an affine function of the number of classes. For larger numbers of classes it seems to saturate; I guess the system starts swapping to disk.
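For reference, this is roughly how I call it (X and y as arguments, matching the docstring; I am not sure about the units of the returned value):

ebm = ExplainableBoostingClassifier(interactions=0)
print(ebm.estimate_mem(X, y))  # reports a tiny value, far below the ~100 GiB I actually see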
So far I have used resource.getrusage(resource.RUSAGE_SELF).ru_maxrss to measure memory consumption, so I am not really experienced in measuring memory. For one, I am not sure whether this includes the memory of the parallel processes that are used by default for each bag.
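A sketch of what I use to also look at child processes (assuming Linux, where ru_maxrss is in KiB; as far as I understand, RUSAGE_CHILDREN only covers children that have already terminated and been waited for, so live loky workers are still missed):

import resource

def peak_rss_gib():
    # Peak resident set size of this process (ru_maxrss is KiB on Linux)
    self_gib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024**2
    # Largest peak RSS among terminated, waited-for children;
    # still-running worker processes are not included
    child_gib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss / 1024**2
    return self_gib, child_gib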
But for both variants

ebm = ExplainableBoostingClassifier(interactions=0)
with joblib.parallel_config(backend="loky"):
    ebm.fit(X, y)

and

ebm = ExplainableBoostingClassifier(interactions=0)
with joblib.parallel_config(backend="threading"):
    ebm.fit(X, y)

I observed a rather linear increase of memory consumption with the number of classes.
This does not surprise me, as a shape function is fit for each class.
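Assuming that affine model holds, a hypothetical helper (my own, not part of interpret) to extrapolate from two measured peaks to a target class count:

def extrapolate_mem(k1, m1, k2, m2, k_target):
    # mem(K) ~= a + b*K, fitted from two (class count, peak GiB) measurements
    b = (m2 - m1) / (k2 - k1)  # GiB per additional class
    a = m1 - b * k1            # class-independent baseline
    return a + b * k_target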
The question is threefold: Is estimate_mem correct? How much memory is required to fit a classifier? And what can I do to fit the model with limited memory (the largest machine I can get has around 512 GiB)?
Is it best to use threading instead of the default multiprocessing backend, or do you already make use of shared memory or memory-map the data from disk?
Example script for minimal tests:
import resource

import joblib
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier

backend = "loky"  # or "threading"
n_samples = 10_000
n_features = 10
n_classes = 100

# Synthetic data: uniform features, uniformly distributed class labels
rng = np.random.default_rng(42)
X = rng.random([n_samples, n_features], dtype=np.float32)
y = rng.integers(n_classes, size=[n_samples], dtype=np.int32)

ebm = ExplainableBoostingClassifier(interactions=0, random_state=42, max_rounds=3, n_jobs=16)  # small max_rounds for fast testing

# ru_maxrss is reported in KiB on Linux, hence the 1024**-2 factor to get GiB
print(f"Pre: {resource.getrusage(resource.RUSAGE_SELF).ru_maxrss * 1024**-2:.3f} GiB")
with joblib.parallel_config(backend=backend):
    ebm.fit(X, y)
print(f"Post: {resource.getrusage(resource.RUSAGE_SELF).ru_maxrss * 1024**-2:.3f} GiB")