Add DetectSeasonality as a Helper function in TimeSeries ExtensionDialog #5231

Merged: 17 commits merged into dotnet:master on Jun 22, 2020

Conversation

lisahua
Contributor

@lisahua lisahua commented Jun 11, 2020

This PR is part of feature request #5230: Add Seasonality Detection for Time-Series Data

Description

In time-series data, seasonality is the presence of variations that recur at specific regular intervals of less than a year, such as weekly, monthly, or quarterly.

This PR adds seasonality detection support for time-series data, based on Fourier analysis.
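
To illustrate the general idea of Fourier-based period detection, here is a minimal sketch. It is not the PR's SeasonalityDetector: the class name, the naive DFT, and the peak-to-median significance heuristic (including its default ratio of 10) are assumptions chosen for illustration only.

```csharp
using System;
using System.Linq;

// Hypothetical illustration, not the SeasonalityDetector added by this PR.
public static class SeasonalitySketch
{
    // Estimates the dominant period of a series from its periodogram.
    // Returns -1 when no frequency bin stands out from the rest.
    public static int EstimatePeriod(double[] values, double peakToMedianRatio = 10.0)
    {
        int n = values.Length;
        if (n < 4) return -1;

        double mean = values.Average();

        // A (nearly) constant series has no seasonality to detect.
        double energy = values.Sum(v => (v - mean) * (v - mean));
        if (energy <= 1e-12 * n * (1 + mean * mean)) return -1;

        int half = n / 2;
        var power = new double[half + 1];

        // Naive O(n^2) discrete Fourier transform of the mean-centered series.
        for (int k = 1; k <= half; k++)
        {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++)
            {
                double angle = 2 * Math.PI * k * t / n;
                re += (values[t] - mean) * Math.Cos(angle);
                im -= (values[t] - mean) * Math.Sin(angle);
            }
            power[k] = re * re + im * im;
        }

        // Locate the strongest non-zero frequency bin.
        int best = 1;
        for (int k = 2; k <= half; k++)
        {
            if (power[k] > power[best]) best = k;
        }

        // Assumed significance test: the peak must clearly exceed the median power.
        double[] sorted = power.Skip(1).OrderBy(p => p).ToArray();
        double median = sorted[sorted.Length / 2];
        if (power[best] <= peakToMedianRatio * median) return -1;

        // Frequency bin k corresponds to a period of n / k samples.
        return (int)Math.Round((double)n / best);
    }
}
```

For a sine wave with period 8 sampled over 64 points, the peak falls exactly in bin 8 and the sketch returns 8; a constant series returns -1.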

  • DetectSeasonality API:
        /// <summary>
        /// Detects the seasonality period of a time series using spectral analysis, which is
        /// based on Fourier analysis. Returns -1 if there is no significant period; otherwise
        /// returns the detected period.
        /// </summary>
        /// <param name="catalog">The anomaly detection catalog.</param>
        /// <param name="input">Input DataView. The data is an instance of <see cref="Microsoft.ML.IDataView"/>.</param>
        /// <param name="inputColumnName">Name of the column to process. The column data must be <see cref="System.Double"/>.</param>
        /// <param name="seasonalityWindowSize">An upper bound on the largest relevant seasonality in the input time series.
        /// When set to -1, the whole input is used to fit the model; when set to a positive integer, that number is used as the batch size.
        /// Default value is -1.</param>
        /// <returns>The detected period if a seasonality period exists; otherwise -1.</returns>
        public static int DetectSeasonality(this AnomalyDetectionCatalog catalog, IDataView input, string inputColumnName, int seasonalityWindowSize = -1)
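
A minimal usage sketch of the API above, assuming the Microsoft.ML and Microsoft.ML.TimeSeries packages are referenced and that the extension is reached through mlContext.AnomalyDetection (it extends AnomalyDetectionCatalog, per the signature). The SeasonalInput class, the DetectSeasonalityUsage wrapper, and the sine-wave data are hypothetical and not taken from the PR's sample.

```csharp
using System;
using System.Linq;
using Microsoft.ML;

// Hypothetical input row type; the processed column must hold double values.
public class SeasonalInput
{
    public double Value { get; set; }
}

public static class DetectSeasonalityUsage
{
    public static int Run()
    {
        var mlContext = new MLContext();

        // Illustrative data: a sine wave sampled at integer steps.
        var rows = Enumerable.Range(0, 100)
            .Select(x => new SeasonalInput { Value = Math.Sin(2 * Math.PI + x) });

        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // seasonalityWindowSize is left at its default of -1 (use the whole input).
        int period = mlContext.AnomalyDetection.DetectSeasonality(
            data, nameof(SeasonalInput.Value));

        Console.WriteLine(period == -1
            ? "No significant period detected."
            : $"Detected period: {period}");
        return period;
    }
}
```

Callers should treat -1 as "no significant period" rather than as an error, since that is the documented return value for non-seasonal input.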

This PR introduced:

  1. The DetectSeasonality API, added to ExtensionsCatalog in the TimeSeries project
  2. An internal class, SeasonalityDetector, that implements the actual logic based on the Fourier transform
  3. A DetectSeasonality sample in the docs/sample/timeseries folder
  4. Unit tests in the TimeSeriesDirectApi file
  5. MedianDblAggregator marked as BestFriend so that SeasonalityDetector can use it

@lisahua lisahua requested a review from a team as a code owner June 11, 2020 16:55
@dnfadmin

dnfadmin commented Jun 11, 2020

CLA assistant check
All CLA requirements met. #Resolved

@mstfbl mstfbl linked an issue Jun 11, 2020 that may be closed by this pull request
@codecov

codecov bot commented Jun 15, 2020

Codecov Report

Merging #5231 into master will decrease coverage by 4.16%.
The diff coverage is 72.94%.

@@            Coverage Diff             @@
##           master    #5231      +/-   ##
==========================================
- Coverage   73.47%   69.30%   -4.17%     
==========================================
  Files        1010      771     -239     
  Lines      187988   145018   -42970     
  Branches    20262    18461    -1801     
==========================================
- Hits       138118   100502   -37616     
+ Misses      44385    39343    -5042     
+ Partials     5485     5173     -312     
Flag          Coverage Δ
#Debug        69.30% <72.94%> (-4.17%) ⬇️
#production   69.30% <72.94%> (+0.01%) ⬆️
#test         ?

Impacted Files                                            Coverage Δ
src/Microsoft.ML.TimeSeries/SeasonalityDetector.cs        71.77% <71.77%> (ø)
...Microsoft.ML.Data/Transforms/NormalizeColumnDbl.cs     70.72% <100.00%> (ø)
src/Microsoft.ML.TimeSeries/ExtensionsCatalog.cs          93.61% <100.00%> (-1.39%) ⬇️
src/Microsoft.ML.AutoML/Utils/BestResultUtil.cs           53.84% <0.00%> (-3.69%) ⬇️
....ML.AutoML/PipelineSuggesters/PipelineSuggester.cs     79.83% <0.00%> (-3.37%) ⬇️
src/Microsoft.ML.TimeSeries/RootCauseAnalyzer.cs          54.20% <0.00%> (-2.25%) ⬇️
...or/CodeGenerator/CSharp/TrainerGeneratorFactory.cs     81.25% <0.00%> (-2.09%) ⬇️
...rc/Microsoft.ML.Featurizers/DateTimeTransformer.cs     87.15% <0.00%> (-1.81%) ⬇️
...crosoft.ML.TimeSeries/RootCauseLocalizationType.cs     49.42% <0.00%> (-1.77%) ⬇️
... and 273 more

1. Change the randomness threshold to the [0, 1] range as a confidence interval and map it to the inverse normal cumulative distribution
2. Update unit tests to use sin(2pi + x)
3. Fix other formatting issues
Contributor

@harishsk harishsk left a comment

:shipit:

@harishsk harishsk merged commit bb13d62 into dotnet:master Jun 22, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022
Development

Successfully merging this pull request may close these issues.

[Feature Request] Add Seasonality Detection for Time-Series Data
5 participants