
Profiling Code Runtime


Setup

Follow these steps to profile the code and visualize the results with pytest-profiling.

  1. Install the plugin: pip install pytest-profiling
  2. Register the plugin via pytest_plugins = ['pytest_profiling'] (not needed for NeuralProphet, as we use setuptools entry points)
  3. Create a test file that runs a sample model with a configuration of your choice, e.g. bottlenecks.py (a minimal sketch follows this list)
  4. Run the file in your terminal with pytest path/to/file/bottlenecks.py --profile --profile-svg to profile the code (i.e. measure the runtimes of the different code sections) and create a visualization (helpful for identifying bottlenecks)
  5. The pstats files (one per test item) are retained in the prof directory for later analysis, along with the combined.prof and combined.svg files
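
A minimal sketch of what such a test file could look like (the test name, data, and configuration here are illustrative, not the exact file used for the profiles below; pytest collects any function prefixed with test_):

    import numpy as np
    import pandas as pd
    from neuralprophet import NeuralProphet

    def test_fit_predict_runtime():
        # Small synthetic hourly series so the profile focuses on library overhead
        date_range = pd.date_range(start="2019-01-01", periods=24 * 30, freq="H")
        df = pd.DataFrame({"ds": date_range, "y": np.random.rand(len(date_range))})
        m = NeuralProphet(epochs=2)
        m.fit(df, freq="H")
        m.predict(df)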

Profiling NeuralProphet

One possible model configuration (single or panel time series) to be profiled:

    import numpy as np
    import pandas as pd
    from neuralprophet import NeuralProphet

    # Two years of hourly data with random integer targets
    start_date = "2019-01-01"
    end_date = "2021-01-01"
    date_range = pd.date_range(start=start_date, end=end_date, freq="H")
    y = np.random.randint(0, 1000, size=(len(date_range),))
    df = pd.DataFrame({"ds": date_range, "y": y, "ID": "df1"})
    df_c = df.copy()
    # Uncomment to turn the single series into a panel of 100 IDs:
    # for i in range(2, 101):
    #     df_c["ID"] = f"df{i}"
    #     df = pd.concat((df, df_c))

    m = NeuralProphet(
        n_forecasts=24,
        n_lags=7 * 24,
        weekly_seasonality=True,
        daily_seasonality=True,
        yearly_seasonality=True,
        num_hidden_layers=4,
        d_hidden=64,
        epochs=10,
        batch_size=448,
        learning_rate=0.001,
    )
    # Rolling means of y serve as lagged regressors
    df["A"] = df["y"].rolling(7, min_periods=1).mean()
    df["B"] = df["y"].rolling(30, min_periods=1).mean()
    df["C"] = df["y"].rolling(24, min_periods=1).mean()
    m = m.add_lagged_regressor(names="A", n_lags=24)
    m = m.add_lagged_regressor(names="B")
    m = m.add_lagged_regressor(names="C", n_lags=24, num_hidden_layers=2, d_hidden=48)
    metrics_df = m.fit(df, freq="H", num_workers=64, minimal=True)
    forecast = m.predict(df)
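
Beyond the SVG call graph, the retained pstats files can be inspected programmatically with the standard-library pstats module (the path assumes pytest-profiling's default prof output directory):

    import pstats

    # Load the combined profile and print the 20 most expensive functions
    stats = pstats.Stats("prof/combined.prof")
    stats.sort_stats("cumulative").print_stats(20)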

Profile with one ID

[Profile visualization: 1_LBA]

Profile with 10 IDs

[Profile visualization: 10_LBA]

Profile with 100 IDs

[Profile visualization: 100_LBA]

Findings

  • Runtime increases linearly with the number of time series in the dataset (as of April 16, 2023).
  • TimeDataset and its subfunctions are slow.
  • drop_nan_after_init is always called, even without specifying drop_missing=True.
  • Consider vectorization (or multiprocessing?) wherever m.fit() and m.predict() loop over for df_name, df in df.groupby('ID').
  • Consider a FastTensorDataLoader instead of the PyTorch DataLoader (see the sketch after this list): https://towardsdatascience.com/better-data-loading-20x-pytorch-speed-up-for-tabular-data-e264b9e34352
  • Regardless of the normalization type selected ('global' or 'local'), both sets of normalization parameters are always computed.
  • pd.concat() in m.predict() is very slow for large datasets.
  • The TimeNet covariate modules are more time-consuming than the AR net, and they are called one after another. Vectorize?
  • Investigate the num_workers parameter, as the maximum number of CPU cores does not seem to be the best choice in all cases.
  • Preprocessing for large datasets takes a long time; the reason is unknown.
  • Implement Ray Lightning? https://speakerdeck.com/anyscale/faster-time-series-forecasting-using-ray-and-anyscale
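
For reference, a minimal sketch of the FastTensorDataLoader pattern from the linked article (not NeuralProphet code): it batches by slicing pre-built tensors, which skips the per-item Dataset.__getitem__ and collate_fn overhead of torch.utils.data.DataLoader.

    import torch

    class FastTensorDataLoader:
        """Batches by slicing whole tensors instead of indexing one item
        at a time, avoiding Dataset.__getitem__ and collate_fn overhead."""

        def __init__(self, *tensors, batch_size=32, shuffle=False):
            assert all(t.shape[0] == tensors[0].shape[0] for t in tensors)
            self.tensors = tensors
            self.dataset_len = tensors[0].shape[0]
            self.batch_size = batch_size
            self.shuffle = shuffle

        def __iter__(self):
            if self.shuffle:
                # One random permutation per epoch, applied to all tensors
                r = torch.randperm(self.dataset_len)
                self.tensors = [t[r] for t in self.tensors]
            self.i = 0
            return self

        def __next__(self):
            if self.i >= self.dataset_len:
                raise StopIteration
            batch = tuple(t[self.i:self.i + self.batch_size] for t in self.tensors)
            self.i += self.batch_size
            return batch

        def __len__(self):
            return (self.dataset_len + self.batch_size - 1) // self.batch_size

    # Usage sketch with pre-assembled feature and target tensors:
    # loader = FastTensorDataLoader(X, y, batch_size=448, shuffle=True)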