Fixing some bugs, updating the docs and requirements #312

Merged: 6 commits, Mar 15, 2024
33 changes: 19 additions & 14 deletions README.md
@@ -113,7 +113,13 @@ Considering the future workload, PyPOTS tutorials are released in a single repo,
and you can find them in [BrewPOTS](https://github.com/WenjieDu/BrewPOTS).
Take a look at it now, and learn how to brew your POTS datasets!

☕️ Welcome to the universe of PyPOTS. Enjoy it and have fun!
<p align="center">
<a href="https://pypots.com/ecosystem/">
<img src="https://pypots.com/figs/pypots_logos/Ecosystem/PyPOTS_Ecosystem_Pipeline.png" width="95%"/>
</a>
<br>
<b> ☕️ Welcome to the universe of PyPOTS. Enjoy it and have fun!</b>
</p>


## ❖ Installation
@@ -165,7 +171,7 @@ X = X.reshape(num_samples, 48, -1)
X_ori = X # keep X_ori for validation
X = mcar(X, 0.1) # randomly hold out 10% observed values as ground truth
dataset = {"X": X} # X for model input
print(X.shape) # (11988, 48, 37), 11988 samples, 48 time steps, 37 features
print(X.shape) # (11988, 48, 37), 11988 samples and each sample has 48 time steps, 37 features

# Model training. This is PyPOTS showtime.
saits = SAITS(n_steps=48, n_features=37, n_layers=2, d_model=256, d_inner=128, n_heads=4, d_k=64, d_v=64, dropout=0.1, epochs=10)
@@ -213,6 +219,16 @@ This functionality is implemented with the [Microsoft NNI](https://github.com/mi


## ❖ Citing PyPOTS
> [!TIP]
> **[Updates in Feb 2024]** 😎 Our survey paper [Deep Learning for Multivariate Time Series Imputation: A Survey](https://arxiv.org/abs/2402.04059) has been released on arXiv.
The code is open source in the GitHub repo [Awesome_Imputation](https://github.com/WenjieDu/Awesome_Imputation).
We comprehensively review the literature of the state-of-the-art deep-learning imputation methods for time series,
provide a taxonomy for them, and discuss the challenges and future directions in this field.
>
> **[Updates in Jun 2023]** 🎉 A short version of the PyPOTS paper is accepted by the 9th SIGKDD international workshop on
Mining and Learning from Time Series ([MiLeTS'23](https://kdd-milets.github.io/milets2023/)).
**Additionally**, PyPOTS has been included as a [PyTorch Ecosystem](https://pytorch.org/ecosystem/) project.

The paper introducing PyPOTS is available on arXiv at [this URL](https://arxiv.org/abs/2305.18811),
and we are pursuing to publish it in prestigious academic venues, e.g. JMLR (track for
[Machine Learning Open Source Software](https://www.jmlr.org/mloss/)). If you use PyPOTS in your work,
@@ -222,7 +238,7 @@ There are scientific research projects using PyPOTS and referencing in their pap
Here is [an incomplete list of them](https://scholar.google.com/scholar?as_ylo=2022&q=%E2%80%9CPyPOTS%E2%80%9D&hl=en>).

``` bibtex
@article{du2023PyPOTS,
@article{du2023pypots,
title={{PyPOTS: a Python toolbox for data mining on Partially-Observed Time Series}},
author={Wenjie Du},
year={2023},
@@ -239,17 +255,6 @@ doi={10.48550/arXiv.2305.18811},
> arXiv, abs/2305.18811.https://arxiv.org/abs/2305.18811


> [!TIP]
> **[Updates in Feb 2024]** 😎 Our survey paper [Deep Learning for Multivariate Time Series Imputation: A Survey](https://arxiv.org/abs/2402.04059) has been released on arXiv.
The code is open source in the GitHub repo [Awesome_Imputation](https://github.com/WenjieDu/Awesome_Imputation).
We comprehensively review the literature of the state-of-the-art deep-learning imputation methods for time series,
provide a taxonomy for them, and discuss the challenges and future directions in this field.
>
> **[Updates in Jun 2023]** 🎉 A short version of the PyPOTS paper is accepted by the 9th SIGKDD international workshop on
Mining and Learning from Time Series ([MiLeTS'23](https://kdd-milets.github.io/milets2023/)).
Besides, PyPOTS has been included as a [PyTorch Ecosystem](https://pytorch.org/ecosystem/) project.


## ❖ Contribution
You're very welcome to contribute to this exciting project!

12 changes: 4 additions & 8 deletions docs/install.rst
@@ -26,13 +26,14 @@ Alternatively, you can install from the latest source code which may be not offi
Required Dependencies
"""""""""""""""""""""
* Python >=3.7
* h5py
* numpy
* scipy
* pandas
* matplotlib
* tensorboard
* scikit-learn
* pandas <2.0.0
* torch >=1.10.0
* tensorboard
* h5py
* tsdb >=0.2
* pygrinder >=0.2

@@ -56,11 +57,6 @@ In addition, note that Python v.3.7 has also been in the end-of-life status sinc
Hence, we will raise the minimum support Python version to v3.8 in the future.
Please use Python v3.8 or above if possible also for the security of your development environment.

* **Why we need pandas <2.0.0?**

Because v2 may cause ``ModuleNotFoundError: No module named 'pandas.core.indexes.numeric'``,
see https://stackoverflow.com/questions/75953279/modulenotfounderror-no-module-named-pandas-core-indexes-numeric-using-metaflo.

* **Why we need PyTorch >=1.10?**

Because of pytorch_sparse, please refer to https://github.com/rusty1s/pytorch_sparse/issues/207#issuecomment-1065549338.
27 changes: 13 additions & 14 deletions environment-dev.yml
@@ -7,21 +7,20 @@ channels:
- nodefaults

dependencies:
## basic
#- conda-forge::python
#- conda-forge::pip
#- conda-forge::scipy
#- conda-forge::numpy
#- conda-forge::scikit-learn
#- conda-forge::pandas <2.0.0
#- conda-forge::h5py
#- conda-forge::tensorboard
#- conda-forge::pygrinder >=0.4
#- conda-forge::tsdb >=0.2
#- conda-forge::matplotlib
#- pytorch::pytorch >=1.10.0
# basic
- conda-forge::pip
- conda-forge::h5py
- conda-forge::numpy
- conda-forge::scipy
- conda-forge::python
- conda-forge::pandas
- conda-forge::matplotlib
- conda-forge::tensorboard
- conda-forge::scikit-learn
- conda-forge::pygrinder >=0.4
- conda-forge::tsdb >=0.2
- pytorch::pytorch >=1.10.0
## Below we install the latest pypots because we need pypots-cli in it for development.
## PyPOTS itself already includes all basic dependencies.
- conda-forge::pypots

# optional
10 changes: 8 additions & 2 deletions pypots/utils/visual/data.py
@@ -88,7 +88,10 @@ def plot_data(
plt.setp(axes[row, 0], ylabel="value")
if row == -1:
plt.setp(axes[-1, col], xlabel="time")
plt.show()

logger.info(
"Plotting finished. Please invoke matplotlib.pyplot.show() to display the plot."
)


def plot_missingness(
@@ -166,4 +169,7 @@ def plot_missingness(
axes[1].set_xlabel(r"Sequence length", fontsize=7)
axes[1].set_ylabel("Frequency", fontsize=7)
axes[1].tick_params(axis="both", labelsize=7)
plt.show()

logger.info(
"Plotting finished. Please invoke matplotlib.pyplot.show() to display the plot."
)
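
With `plt.show()` removed from both plotting helpers, displaying or saving the figure becomes the caller's job, as the new log message says. The sketch below illustrates that calling pattern only. `plot_series` is a hypothetical stand-in written for this example, not the signature of `plot_data` or `plot_missingness`, and the `Agg` backend is chosen here just so the snippet also runs on a machine without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def plot_series(X):
    """Hypothetical helper: draws every feature of X but does NOT call plt.show()."""
    fig, ax = plt.subplots()
    for feature in range(X.shape[1]):
        ax.plot(X[:, feature], label=f"feature {feature}")
    ax.set_xlabel("time")
    ax.set_ylabel("value")
    ax.legend()
    return fig


# 48 time steps and 3 features of synthetic data.
X = np.random.default_rng(0).normal(size=(48, 3))
fig = plot_series(X)

# The caller now decides what happens to the figure. In an interactive
# session this is where matplotlib.pyplot.show() would be invoked;
# here the figure is written to an in-memory PNG instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

Leaving the final step to the caller means the same helper can be used in notebooks, scripts, and headless test runs without blocking on a window.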
12 changes: 6 additions & 6 deletions requirements.txt
@@ -1,13 +1,13 @@
# This requirements.txt file only includes the basic dependencies for PyPOTS.
# Please refer to setup.cfg for more dependency details.

h5py
numpy
scikit-learn
matplotlib
scipy
torch>=1.10.0
pandas
matplotlib
tensorboard
pandas<2.0.0
pygrinder>=0.4
scikit-learn
torch>=1.10.0
tsdb>=0.2
h5py
pygrinder>=0.4
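
The minimum-version pins in this file (`torch>=1.10.0`, `tsdb>=0.2`, `pygrinder>=0.4`) can be checked against an existing environment with a small standard-library sketch like the one below. It is not part of this PR; the helper names are hypothetical, `importlib.metadata` needs Python 3.8 or newer, and the version parser is deliberately simplified (it ignores pre-release and local suffixes), so treat it as an illustration rather than a replacement for pip's resolver.

```python
import re
from importlib import metadata  # available from Python 3.8

# Minimum versions copied from the requirements.txt hunk above.
MINIMUMS = {"torch": "1.10.0", "tsdb": "0.2", "pygrinder": "0.4"}


def version_tuple(version):
    """Turn '1.10.0+cu118' into (1, 10, 0); stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare numerically, padding with zeros so '0.2' equals '0.2.0'."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Return a {package: status} report for the current environment."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

Comparing integer tuples rather than raw strings matters here: as strings, `"1.9.1"` sorts after `"1.10.0"`, which would wrongly accept a PyTorch release older than the pin.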
17 changes: 11 additions & 6 deletions setup.py
@@ -23,6 +23,7 @@
"Download": "https://github.com/WenjieDu/PyPOTS/archive/main.zip",
},
keywords=[
"data science",
"data mining",
"neural networks",
"machine learning",
@@ -31,6 +32,7 @@
"time-series analysis",
"time series",
"imputation",
"interpolation",
"classification",
"clustering",
"forecasting",
@@ -44,16 +46,16 @@
packages=find_packages(exclude=["tests"]),
include_package_data=True,
install_requires=[
"h5py",
"numpy",
"scikit-learn",
"matplotlib",
"scipy",
"torch>=1.10.0",
"pandas",
"matplotlib",
"tensorboard",
"pandas<2.0.0",
"pygrinder>=0.4",
"scikit-learn",
"torch>=1.10.0",
"tsdb>=0.2",
"h5py",
"pygrinder>=0.4",
],
python_requires=">=3.7.0",
setup_requires=["setuptools>=38.6.0"],
@@ -63,13 +65,16 @@
"Intended Audience :: Developers",
"Intended Audience :: Education",
"Intended Audience :: Science/Research",
"Intended Audience :: Healthcare Industry",
"License :: OSI Approved :: BSD License",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Topic :: Scientific/Engineering :: Artificial Intelligence",
"Topic :: Software Development :: Libraries :: Application Frameworks",
],
)