Ambiguity between `n_samples` and `n_timestamps` #74
Comments
Hi, thank you for your interest in pyts. The usual input is a set of time series, represented as a 2D-array with shape
If I understood correctly, you have a single time series of geometric Brownian motion with 1000 time points, so the expected shape is Regarding the documentation of the I hope this answers your question!
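To make the shape point concrete, here is a minimal NumPy sketch. It assumes the 2D layout is `(n_samples, n_timestamps)`, as the parameter names in this thread suggest, and that the check on `dimension` compares it against the second axis. The simulated series and its parameters (`mu`, `sigma`, `dt`, the seed) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical geometric Brownian motion with 1000 time points
# (mu, sigma and dt are illustrative values, not taken from the issue).
n_points, mu, sigma, dt = 1000, 0.05, 0.2, 1 / 252
log_returns = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_points)
x = np.exp(np.cumsum(log_returns))

# A column vector is read as 1000 samples with 1 timestamp each ...
X_column = x.reshape(-1, 1)
# ... whereas a single series with 1000 timestamps is one row.
X_row = x.reshape(1, -1)

dimension = 20
for name, X in [("column", X_column), ("row", X_row)]:
    n_samples, n_timestamps = X.shape
    print(name, X.shape, "dimension <= n_timestamps:", dimension <= n_timestamps)
```

Under these assumptions, only the row layout leaves enough timestamps for `dimension=20`, so passing `X_row` to `RecurrencePlot` should get past the `n_timestamps` check.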
Thanks for the quick reply! OK, I think I understand now. So the library is equipped to handle multivariate time series of shape
Yes and no ^^ A dataset of multivariate time series is represented as a 3D-array with shape
The library is mainly focused on time series classification. To train an algorithm, you usually need several samples (time series) from each class, which is why we always consider a set of time series and not a single time series. The corresponding axis is always the first axis. We consider a multivariate time series to be different from several univariate time series because a multivariate time series has a single class, while several univariate time series have one class each. But in the case of recurrence plots, we don't use the classes to perform the transformation. For multivariate time series, you can have a look at Cross and Joint Recurrence Plots. pyts provides only an implementation of joint recurrence plots: pyts.multivariate.image.JointRecurrencePlot
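The distinction can be sketched with array shapes alone. The 3D layout `(n_samples, n_features, n_timestamps)` used here is an assumption about the convention, and the data and labels are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_timestamps = 6, 3, 100

# Several univariate time series: one class label per series.
X_univariate = rng.standard_normal((n_samples, n_timestamps))
y_univariate = rng.integers(0, 2, size=n_samples)

# A dataset of multivariate time series: each sample bundles n_features
# series that share a single class label.
X_multivariate = rng.standard_normal((n_samples, n_features, n_timestamps))
y_multivariate = rng.integers(0, 2, size=n_samples)

# In both cases the first axis indexes the samples.
print(X_univariate.shape, y_univariate.shape)
print(X_multivariate.shape, y_multivariate.shape)
```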
I have a dataset of multivariate time series, and I know that when the values of some features exceed a threshold, they are strongly associated with a specific class. I am trying to incorporate that information into the threshold and percentage arguments of JointRecurrencePlot(), but I do not clearly understand what the distance and dimension really mean.
There is a small description (for the univariate case, but the idea is similar in the multivariate case) of a recurrence plot in the user guide (https://pyts.readthedocs.io/en/stable/modules/image.html#recurrence-plot). The idea of a recurrence plot is to compare trajectories in a time series. A trajectory is defined by:
If you only want to compare single time points, you just have to set the dimension to 1 (which is the default value). Regarding the threshold used to binarize the image, you can set the value (by providing a float) or you can automatically compute the threshold given a strategy. For instance:
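As a rough illustration of these ideas, the following NumPy sketch builds trajectories by time-delay embedding, computes pairwise Euclidean distances between them, and binarizes the result. It is not the pyts implementation: the helper names (`trajectories`, `recurrence_plot`), the `<=` comparison, and the percentile-based threshold are assumptions made for this example.

```python
import numpy as np


def trajectories(x, dimension=1, time_delay=1):
    """Stack delayed copies of x: row i is (x[i], x[i + time_delay], ...)."""
    n_trajectories = x.size - (dimension - 1) * time_delay
    idx = np.arange(n_trajectories)[:, None] + time_delay * np.arange(dimension)
    return x[idx]


def recurrence_plot(x, dimension=1, time_delay=1, threshold=None):
    """Pairwise distances between trajectories, optionally binarized."""
    traj = trajectories(x, dimension, time_delay)
    diff = traj[:, None, :] - traj[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    if threshold is None:
        return dist
    return (dist <= threshold).astype(int)


x = np.sin(np.linspace(0, 4 * np.pi, 50))

# dimension=1 compares single time points.
rp_points = recurrence_plot(x, dimension=1)

# Fixed threshold given as a float.
rp_fixed = recurrence_plot(x, dimension=3, time_delay=2, threshold=0.5)

# Threshold derived from the data: keep roughly the 10% smallest distances.
dist = recurrence_plot(x, dimension=3, time_delay=2)
rp_auto = recurrence_plot(x, dimension=3, time_delay=2,
                          threshold=np.percentile(dist, 10))

print(rp_points.shape, rp_fixed.shape, rp_auto.shape)
```

Note that a larger `dimension` or `time_delay` shortens the image, since fewer complete trajectories fit into the series.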
Finally, a joint recurrence plot is simply the Hadamard (element-wise) product between all the recurrence plots (one recurrence plot for each feature). You can set different values for the Let me know if this is clearer now!
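The Hadamard-product definition can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the pyts code: `binary_recurrence_plot` is a hypothetical helper that compares single time points (dimension 1), and the per-feature thresholds are arbitrary.

```python
import numpy as np


def binary_recurrence_plot(x, threshold):
    """Binarized recurrence plot for single time points (dimension 1)."""
    dist = np.abs(x[:, None] - x[None, :])
    return (dist <= threshold).astype(int)


rng = np.random.default_rng(0)
n_features, n_timestamps = 3, 40

# One multivariate time series: n_features rows, n_timestamps columns.
X = rng.standard_normal((n_features, n_timestamps))

# A different (arbitrary) threshold for each feature.
thresholds = [0.5, 1.0, 0.25]

plots = np.stack([binary_recurrence_plot(x, t) for x, t in zip(X, thresholds)])

# Element-wise product across features: a pixel is 1 only if the
# recurrence occurs in every feature at the same pair of time points.
joint = plots.prod(axis=0)

print(joint.shape, joint.sum())
```

Because the plots are binary, the product acts as a logical AND, so a feature with a strict threshold limits which recurrences survive in the joint plot.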
Hi, thanks for this great library.
I am trying to create recurrence plots for a time series of geometric Brownian motion, but however I try to set the parameters in `RecurrencePlot`, I keep getting errors.

I have `dimension` as an integer, it is set to 20, so it is greater than or equal to 1. What I don't understand is how 20 isn't lower than `n_timestamps`, because I do not fully understand what `n_timestamps` is.

As far as I understand my data `x`, it is shaped `(10000, 1)`, which is `(n_samples, n_features)`, i.e. each row is a unique 'sample' ordered chronologically, and it only has one column (one 'feature') as the time series is univariate. In addition, the documentation for `RecurrencePlot.fit_transform` states that the input `X` must have shape `[n_samples, n_features]`, which, as far as I understand, my data does have.

What am I doing wrong here? What is the difference between `n_samples` and `n_timestamps`? Thanks in advance.