Description
I was looking at the internals of CWT to understand why it takes the integral of the wavelet here:
Lines 125 to 126 in 20f67ab
while SciPy's implementation does not. This appears to be because the PyWavelets implementation is doing things the way Matlab's original cwt
function was implemented. Specifically it is following the algorithm listed in this old version of their toolbox documentation (The algorithm does not appear to be listed in the current version of Matlab's online documentation).
Here is a concrete example comparing to scipy.signal.cwt
with the morlet2
wavelet from scipy/scipy#7076 to illustrate the issue:
import numpy as np
import matplotlib.pyplot as plt
import pywt
time, sst = pywt.data.nino()
dt = time[1] - time[0]
# Taken from http://nicolasfauchereau.github.io/climatecode/posts/wavelet-analysis-in-python/
fb = 2.0
fc = 1.0
wavelet = pywt.ContinuousWavelet('cmor{}-{}'.format(fb, fc))
scales = np.arange(1, 64)
[cfs, frequencies] = pywt.cwt(sst, scales, wavelet, dt)
power = (abs(cfs)) ** 2
from functools import partial
wav = partial(morlet2, w=2*np.pi * wavelet.center_frequency)
cfs_scipy = cwt(sst, wav, widths=scales, dtype=np.complex128)
fig, axes = plt.subplots(2, 1)
axes[0].imshow(np.abs(cfs))
axes[1].imshow(np.abs(cfs_scipy))
This gives the following results:
Note that there is quite a bit of zipper-like artifact in the pywt.cwt
output. It seems that increasing the precision
argument in the call to intwave
resolves the issue. I think the value of 10 highlighted above was chosen to match Matlab, but I think we should probably switch to a larger value. For most signals, the call to intwave is probably substantially shorter than the convolution itself, so I think it should not be problematic from a computation time standpoint to increase it a bit. (The length of int_psi
will be 2**precision
, but this will not change the eventual downsampled int_psi_scale
signal that is used during the convolutions.
The reason the zipper-like artifact occurs seems to be because this int_psi
waveform is computed once, but then integer indices are used to get versions at different scales:
Lines 149 to 154 in 20f67ab
The actual indices corresponding to the scales would actually be floating point, not int, so rounding them to integers gives a non-uniform step size across int_psi
when computing int_psi_scale
. The more int_psi
is oversampled, the less this is an issue.
Increasing precision
from 10 to 12 reduces the artifact:
and further increasing to 14 makes it no longer visible:
A separate issue from the artifact is the normalization convention used
In the figures above, the overall pattern looks the same, but there is some intensity difference across scales. I think this may be due to a different convention used for normalization of the wavelets. PyWavelets (and Matlab) use a normalization constant chosen to give unit L1 norm of the wavelet, while SciPy and some textbook/literature definitions use unit L2 norm. Matlab explains the rationale for their choice here. So, I wouldn't say either toolbox is "wrong", it just seems to be a matter of the convention used. We should probably make this a bit more explicit in the docs, though.