Issues getting WEASEL transform of just one single time series #27

Open
robotdude17 opened this issue Aug 2, 2019 · 1 comment

@robotdude17

I'm trying to get the WEASEL transform of a single time series and am running into the error shown below.

Please advise.

Thanks.

import numpy as np
import matplotlib.pyplot as plt
from pyts.transformation import WEASEL

# Parameters
n_samples, n_timestamps = 1, 100
n_classes = 1

# Toy dataset: one random time series with a single class label
rng = np.random.RandomState(41)
X = rng.randn(n_samples, n_timestamps)
y = rng.randint(n_classes, size=n_samples)

# WEASEL transformation
weasel = WEASEL(word_size=2, n_bins=2, window_sizes=[12, 36])
# X_weasel = weasel.fit_transform(X, y).toarray()
X_weasel = weasel.fit_transform(X, y)  # raises the ValueError shown below
# X_weasel = weasel.fit_transform(np.array(X), np.array(y)).toarray()

# Visualize the transformation for the first time series
plt.figure(figsize=(12, 8))
vocabulary_length = len(weasel.vocabulary_)
width = 0.3
plt.bar(np.arange(vocabulary_length) - width / 2, X_weasel[0],
        width=width, label='First time series')
plt.xticks(np.arange(vocabulary_length),
           np.vectorize(weasel.vocabulary_.get)(np.arange(X_weasel[0].size)),
           fontsize=12, rotation=60)
plt.yticks(np.arange(np.max(X_weasel[:2] + 1)), fontsize=12)
plt.xlabel("Words", fontsize=18)
plt.ylabel("Frequencies", fontsize=18)
plt.title("WEASEL transformation", fontsize=20)
plt.legend(loc='best')
plt.show()


ValueError Traceback (most recent call last)
in
13 weasel = WEASEL(word_size = 2, n_bins = 2, window_sizes=[12, 36])
14 # X_weasel = weasel.fit_transform(X, y).toarray()
---> 15 X_weasel = weasel.fit_transform(X, y)
16 # X_weasel = weasel.fit_transform(np.array(X), np.array(y)).toarray()
17

~/anaconda3/envs/tf36/lib/python3.6/site-packages/pyts/transformation/weasel.py in fit_transform(self, X, y)
258 )
259 y_repeated = np.repeat(y, n_windows)
--> 260 X_sfa = sfa.fit_transform(X_windowed, y_repeated)
261
262 X_word = np.asarray([''.join(X_sfa[i])

~/anaconda3/envs/tf36/lib/python3.6/site-packages/pyts/approximation/sfa.py in fit_transform(self, X, y)
157 )
158 self.pipeline = Pipeline([('dft', dft), ('mcb', mcb)])
--> 159 X_sfa = self.pipeline.fit_transform(X, y)
160 self.support_ = self.pipeline.named_steps['dft'].support_
161 self.bin_edges_ = self.pipeline.named_steps['mcb'].bin_edges_

~/anaconda3/envs/tf36/lib/python3.6/site-packages/sklearn/pipeline.py in fit_transform(self, X, y, **fit_params)
391 return Xt
392 if hasattr(last_step, 'fit_transform'):
--> 393 return last_step.fit_transform(Xt, y, **fit_params)
394 else:
395 return last_step.fit(Xt, y, **fit_params).transform(Xt)

~/anaconda3/envs/tf36/lib/python3.6/site-packages/sklearn/base.py in fit_transform(self, X, y, **fit_params)
554 else:
555 # fit method of arity 2 (supervised transformation)
--> 556 return self.fit(X, y, **fit_params).transform(X)
557
558

~/anaconda3/envs/tf36/lib/python3.6/site-packages/pyts/approximation/mcb.py in fit(self, X, y)
113 self.check_constant(X)
114 self.bin_edges_ = self._compute_bins(
--> 115 X, y, n_timestamps, self.n_bins, self.strategy)
116 return self
117

~/anaconda3/envs/tf36/lib/python3.6/site-packages/pyts/approximation/mcb.py in _compute_bins(self, X, y, n_timestamps, n_bins, strategy)
207 )
208 else:
--> 209 bins_edges = self._entropy_bins(X, y, n_timestamps, n_bins)
210 return bins_edges
211

~/anaconda3/envs/tf36/lib/python3.6/site-packages/pyts/approximation/mcb.py in _entropy_bins(self, X, y, n_timestamps, n_bins)
221 "The number of bins is too high for feature {0}. "
222 "Try with a smaller number of bins or remove "
--> 223 "this feature.".format(i)
224 )
225 bins[i] = threshold

ValueError: The number of bins is too high for feature 0. Try with a smaller number of bins or remove this feature.

@johannfaouzi
Owner

Hi,

The WEASEL transformation is not suited for one single time series: it uses a binning procedure, and binning is pointless when there is one single data point. You need more samples to make it work.

Here is the paper describing the Symbolic Fourier Approximation, which is used in WEASEL. Figure 2 shows the binning process. It cannot work with a single time series.

I hope that it helps you.
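
To illustrate why binning needs more than one sample, here is a minimal pure-Python sketch of per-feature binning across samples. It is not the pyts implementation: the helper names `quantile_bin_edges` and `fit_bin_edges` are hypothetical, and it uses simple quantile edges rather than the entropy-based strategy that appears in the traceback. The point is only that bin edges are learned column by column over many samples, so a single row gives the procedure nothing to split.

```python
def quantile_bin_edges(column, n_bins):
    """Return the n_bins - 1 interior edges that split `column` into bins.

    Hypothetical helper for illustration only (not part of pyts).
    Raises ValueError when the column has fewer distinct values than bins.
    """
    values = sorted(column)
    if len(set(values)) < n_bins:
        raise ValueError(
            "Cannot build {0} bins from {1} distinct value(s).".format(
                n_bins, len(set(values))
            )
        )
    n = len(values)
    edges = []
    for k in range(1, n_bins):
        # Linearly interpolate between neighbouring order statistics.
        pos = k * (n - 1) / n_bins
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        edges.append(values[lo] + (pos - lo) * (values[hi] - values[lo]))
    return edges


def fit_bin_edges(X, n_bins):
    """Compute bin edges for every feature (column) of a list of samples."""
    return [quantile_bin_edges(list(col), n_bins) for col in zip(*X)]


if __name__ == "__main__":
    # Several samples: each feature has enough values to place an edge.
    several = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]]
    print(fit_bin_edges(several, n_bins=2))

    # One sample: each feature holds a single value, so binning fails.
    single = [[0.3, -1.2]]
    try:
        fit_bin_edges(single, n_bins=2)
    except ValueError as exc:
        print("Binning failed:", exc)
```

In this sketch the fix is the same as the one suggested above: provide more samples so that each feature has enough distinct values to place the bin edges.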
