Wenjie Updates #4

Merged
merged 83 commits on Apr 11, 2023
Changes from 1 commit
Commits
83 commits
425fc81
feat: enable auto reply on PRs created by new contributors;
WenjieDu Nov 29, 2022
144c4bf
feat: simplify requirements to speed up the installation process of P…
WenjieDu Dec 1, 2022
905ee65
feat: remove torch_geometric from the setup file as well to speed up …
WenjieDu Dec 1, 2022
3f119f8
doc: update README to add the usage example;
WenjieDu Dec 4, 2022
c9955a2
feat: print all outputs during test with pytest;
WenjieDu Dec 20, 2022
0d2f36a
Merge pull request #28 from WenjieDu/dev
WenjieDu Dec 20, 2022
fee98c0
feat: add MANIFEST.in to remove the test dir from the released package;
WenjieDu Dec 21, 2022
87ff2a1
fix: the bug of separating the code-coverage report;
WenjieDu Dec 21, 2022
2d478fd
fix: capture the error caused by singular matrix existence in VaDER;
WenjieDu Dec 21, 2022
154d014
doc: update the documentation;
WenjieDu Dec 21, 2022
196d1e9
doc: add the doc of all implemented modules;
WenjieDu Dec 25, 2022
601bd81
fix: add the dependencies of PyPOTS into the doc building requirement…
WenjieDu Dec 29, 2022
c53f6fb
doc: update README;
WenjieDu Jan 8, 2023
11e3ac7
Merge pull request #29 from WenjieDu/dev
WenjieDu Jan 13, 2023
f8da4f6
feat: add the lazy-loading strategy for BaseDataset;
WenjieDu Jan 16, 2023
456293a
doc: update README;
WenjieDu Feb 8, 2023
e0bb1b7
feat: add limitations on lib dependencies;
WenjieDu Jan 17, 2023
4116399
Merge pull request #33 from WenjieDu/dev
WenjieDu Feb 9, 2023
cf51ce2
feat: add class Logger to help present logs better;
WenjieDu Feb 12, 2023
7ec0f03
feat: replace print with logger;
WenjieDu Feb 12, 2023
5509128
feat: add the func create_dir_if_not_exist() in pypots.utils.files;
WenjieDu Feb 15, 2023
82126a2
fix: TypeError when using logger with mistake;
WenjieDu Feb 15, 2023
8e70636
refactor: update the logger;
WenjieDu Feb 15, 2023
cf0acde
feat: add the test cases for logging;
WenjieDu Feb 16, 2023
7c76e3a
feat: add the attribute __all__ into __init__ files;
WenjieDu Feb 16, 2023
8c584b5
doc: update README;
WenjieDu Feb 18, 2023
df2414b
feat: add the file lazy-loading strategy for classes derived from Bas…
WenjieDu Feb 19, 2023
831e9d4
doc: fix the reference;
WenjieDu Feb 24, 2023
dc3c005
fix: update the dependencies;
WenjieDu Mar 9, 2023
818e7ef
Merge pull request #37 from WenjieDu/dev
WenjieDu Mar 20, 2023
e7b72bd
doc: update README to add pypots installation with conda;
WenjieDu Mar 28, 2023
0611df1
feat: separate the input data assembling functions of training, valid…
WenjieDu Mar 29, 2023
7cdd393
Merge pull request #38 from WenjieDu/dev
WenjieDu Mar 29, 2023
19c5bb3
doc: update the reference info;
WenjieDu Mar 29, 2023
343c8d8
Merge branch 'lazy_loading_dataset' into dev
WenjieDu Mar 30, 2023
3c56ce2
fix: imputation models applying MIT do not need use DatasetForMIT on …
WenjieDu Mar 30, 2023
5927909
fix: only import h5py when needed;
WenjieDu Mar 30, 2023
4a9c5be
feat: move check_input() to BaseDataset;
WenjieDu Mar 30, 2023
c71c8fa
fix: correct mistaken operator from & to ^;
WenjieDu Mar 30, 2023
af4586a
fix: turn imputation to numpy.ndarray in the validation stage;
WenjieDu Mar 30, 2023
fababb1
feat: update the data given and input logic to support loading datase…
WenjieDu Mar 30, 2023
7dfbf87
fix: bugs in Dataset classes' functions with lazy-loading strategy;
WenjieDu Mar 31, 2023
fdc1459
fix: update the dependencies;
WenjieDu Mar 31, 2023
ee5270a
feat: add testing cases for lazy-loading datasets;
WenjieDu Mar 31, 2023
8a4f682
doc: update README;
WenjieDu Mar 31, 2023
0fb57d4
feat: v0.0.10 is ready;
WenjieDu Mar 31, 2023
72eaf20
fix: running testing cases for forecasting models and lazy-loading da…
WenjieDu Mar 31, 2023
fa5f5b6
fix: running testing cases for logging;
WenjieDu Mar 31, 2023
e9aea74
fix: try to fix the BlockingIOError, see below message for details;
WenjieDu Mar 31, 2023
46fca41
refactor: test scripts;
WenjieDu Mar 31, 2023
13a7cd1
fix: use annotation @pytest.mark.xdist_group to help pytest-dist exec…
WenjieDu Mar 31, 2023
9ad9c7e
fix: fix some warnings while running VaDER;
WenjieDu Mar 31, 2023
e7bee57
fix: move dataset saving into test steps;
WenjieDu Mar 31, 2023
235c607
fix: the error file name of test_data.py;
WenjieDu Mar 31, 2023
f64dda9
Merge pull request #39 from WenjieDu/dev
WenjieDu Mar 31, 2023
f7fa13e
doc: update the documentation;
WenjieDu Apr 4, 2023
634c25a
doc: update the documentation;
WenjieDu Apr 4, 2023
3ac3185
Merge `dev` into `main` to update the documentation and add doc-gener…
WenjieDu Apr 4, 2023
8a856d8
refactor: preprocessing functions of specific dataset now move to mod…
WenjieDu Apr 6, 2023
88780e6
fix: solve the problem of circular import;
WenjieDu Apr 6, 2023
5def912
refactor: don't save data into h5 files if the datasets already exist;
WenjieDu Apr 6, 2023
6a103c7
feat: add issue templates of bug report, feature request, and model a…
WenjieDu Apr 7, 2023
4654961
Add issue templates (#41)
WenjieDu Apr 7, 2023
629ad6c
feat: turn the given device (str or torch.device) into torch.device;
WenjieDu Apr 7, 2023
e887390
feat: enable save training logs into `tb_file_saving_path` in BaseMod…
WenjieDu Apr 7, 2023
37b54ea
feat: enable set num_workers of DataLoader and typing annotation;
WenjieDu Apr 8, 2023
acd262e
feat: add typing annotations in the functions in `data` and `utils`;
WenjieDu Apr 8, 2023
034a298
feat: add python version 3.11 of all three platforms in the testing w…
WenjieDu Apr 8, 2023
11a7529
fix: numpy.float is deprecated;
WenjieDu Apr 8, 2023
ebe9cec
Merge branch 'main' into dev
WenjieDu Apr 8, 2023
fc01480
Decrease testing python version 3.11 to 3.10, and remove fixed depend…
WenjieDu Apr 8, 2023
4646f5c
Merge pull request #42 from WenjieDu/dev
WenjieDu Apr 8, 2023
823f0af
feat: add daily testing workflow;
WenjieDu Apr 9, 2023
5c6dfa4
feat: make imputation models val_X_intact and val_indicating_mask sho…
WenjieDu Apr 9, 2023
4aff213
fix: invalid attribute;
WenjieDu Apr 9, 2023
0b89440
fix: invalid `cron` attribute, 7 is not standard, should use 0 to rep…
WenjieDu Apr 9, 2023
8ca4952
doc: update README, split the table of the available algos according …
WenjieDu Apr 9, 2023
e55325b
Merge pull request #44 from WenjieDu/dev
WenjieDu Apr 9, 2023
538b4c3
refactor: move gene_incomplete_random_walk_dataset and gene_physionet…
WenjieDu Apr 10, 2023
52dc756
fix: correct the mistaken path to environment_for_pip_test.txt;
WenjieDu Apr 10, 2023
a8066ee
fix: fix the error caused by renaming file `test_logging` to `test_ut…
WenjieDu Apr 10, 2023
12d0a2a
feat: remove `pull_request` trigger to avoid duplicate CI running;
WenjieDu Apr 10, 2023
8a91b7b
Merge pull request #45 from WenjieDu/dev
WenjieDu Apr 10, 2023
feat: replace print with logger;
WenjieDu committed Feb 12, 2023
commit 7ec0f031f1f5ba5590d44940a7ab663cdf3528b8
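The central change in this commit is routing console output through a shared `logger` object imported from `pypots.logging`. That module itself is not shown in this diff, so the following is only a plausible stand-in, not the actual PyPOTS implementation — the logger name and message format are assumptions:

```python
import logging

# Hypothetical stand-in for pypots.logging: a package-level logger
# configured once, then imported wherever print() used to be called.
logger = logging.getLogger("PyPOTS")
logger.setLevel(logging.INFO)

handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(asctime)s [%(levelname)s]: %(message)s")
)
logger.addHandler(handler)

# Modules then replace bare print(...) with leveled calls:
logger.info("Model initialized successfully.")
logger.warning("File exists. Overwriting now...")
```

One migration pitfall worth noting: unlike `print`, the `logging` methods do not join extra positional arguments — a call like `print("device:", device)` must become `logger.info(f"device: {device}")`, otherwise logging treats the extra argument as a %-format substitution and fails.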
44 changes: 34 additions & 10 deletions pypots/base.py
Original file line number Diff line number Diff line change
@@ -4,11 +4,15 @@

# Created by Wenjie Du <wenjay.du@gmail.com>
# License: GPL-v3

import os
from abc import ABC

import numpy as np
import torch
from torch.utils.tensorboard import SummaryWriter

from pypots.logging import logger
from pypots.utils.check import create_dir_if_not_exist


class BaseModel(ABC):
@@ -24,7 +28,7 @@ def __init__(self, device):
if torch.cuda.is_available() and torch.cuda.device_count() > 0
else "cpu"
)
print("No given device, using default device:", self.device)
logger.info(f"No given device, using default device: {self.device}")
else:
self.device = device

@@ -136,21 +140,41 @@ def save_logs_to_tensorboard(self, saving_path):
# tb_summary_writer = SummaryWriter(saving_path)
# tb_summary_writer.add_custom_scalars(self.logger)
# tb_summary_writer.close()
# print(f'Log saved successfully to {saving_path}.')
# logger.info(f'Log saved successfully to {saving_path}.')

def save_model(self, saving_path):
def save_model(self, saving_dir, name, overwrite=False):
"""Save the model to a disk file.

A .pypots extension will be appended to the filename if it does not already have one.
The extension is not strictly necessary; it only marks the file as a model saved by the PyPOTS framework so it is easy to recognize.

Parameters
----------
saving_path : str,
The given path to save the model.
saving_dir : str,
The given directory to save the model.

name : str,
The file name of the model to be saved.

overwrite : bool,
Whether to overwrite the model file if one already exists at the saving path.
"""
name = name + ".pypots" if name.split(".")[-1] != "pypots" else name
saving_path = os.path.join(saving_dir, name)
if os.path.exists(saving_path):
if overwrite:
logger.warning(
f"File {saving_path} exists. Argument `overwrite` is True. Overwriting now..."
)
else:
logger.error(f"File {saving_path} exists. Saving operation aborted.")
return
try:
create_dir_if_not_exist(saving_dir)
torch.save(self.model, saving_path)
logger.info(f"Saved successfully to {saving_path}.")
except Exception as e:
print(e)
print(f"Saved successfully to {saving_path}.")
raise RuntimeError(f'{e} Failed to save the model to "{saving_path}"!')

def load_model(self, model_path):
"""Load the saved model from a disk file.
@@ -174,7 +198,7 @@ def load_model(self, model_path):
self.model = loaded_model.model
except Exception as e:
raise e
print(f"Model loaded successfully from {model_path}.")
logger.info(f"Model loaded successfully from {model_path}.")


class BaseNNModel(BaseModel):
@@ -202,6 +226,6 @@ def __init__(
def _print_model_size(self):
"""Print the number of trainable parameters in the initialized NN model."""
num_params = sum(p.numel() for p in self.model.parameters() if p.requires_grad)
print(
logger.info(
f"Model initialized successfully. Number of the trainable parameters: {num_params}"
)
11 changes: 6 additions & 5 deletions pypots/classification/base.py
@@ -12,6 +12,7 @@
import torch

from pypots.base import BaseModel, BaseNNModel
from pypots.logging import logger


class BaseClassifier(BaseModel):
@@ -116,12 +117,12 @@ def _train_model(self, training_loader, val_loader=None):

mean_val_loss = np.mean(epoch_val_loss_collector)
self.logger["validating_loss"].append(mean_val_loss)
print(
logger.info(
f"epoch {epoch}: training loss {mean_train_loss:.4f}, validating loss {mean_val_loss:.4f}"
)
mean_loss = mean_val_loss
else:
print(f"epoch {epoch}: training loss {mean_train_loss:.4f}")
logger.info(f"epoch {epoch}: training loss {mean_train_loss:.4f}")
mean_loss = mean_train_loss

if mean_loss < self.best_loss:
@@ -131,12 +132,12 @@
else:
self.patience -= 1
if self.patience == 0:
print(
logger.info(
"Exceeded the training patience. Terminating the training procedure..."
)
break
except Exception as e:
print(f"Exception: {e}")
logger.info(f"Exception: {e}")
if self.best_model_dict is None:
raise RuntimeError(
"Training got interrupted. The model was not trained. Please try fit() again."
@@ -151,4 +152,4 @@ def _train_model(self, training_loader, val_loader=None):
if np.equal(self.best_loss, float("inf")):
raise ValueError("Something is wrong. best_loss remains inf after training.")

print("Finished training.")
logger.info("Finished training.")
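The `_train_model` loop above implements patience-based early stopping: remember the best mean loss seen so far, reset the patience budget on every improvement, decrement it otherwise, and terminate when it reaches zero. Stripped of the model-specific parts (the function and variable names below are illustrative, not the PyPOTS API), the control flow reduces to:

```python
def early_stopping(losses, patience):
    """Return the index of the epoch where training stops, following
    the best-loss / patience scheme used in _train_model."""
    best_loss = float("inf")
    remaining = patience
    for epoch, mean_loss in enumerate(losses):
        if mean_loss < best_loss:
            best_loss = mean_loss  # improvement: remember it
            remaining = patience   # and reset the patience budget
        else:
            remaining -= 1         # no improvement this epoch
            if remaining == 0:
                return epoch       # "Exceeded the training patience."
    return len(losses) - 1         # ran all epochs without stopping

# With patience=2, the loss stops improving after epoch 2,
# so training terminates two non-improving epochs later, at epoch 4.
stop_epoch = early_stopping([1.0, 0.8, 0.7, 0.9, 0.75], patience=2)
```

Note that in this scheme any epoch that fails to beat the best loss consumes patience, even if it improves on the previous epoch, which matches the `mean_loss < self.best_loss` comparison in the diff.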
11 changes: 6 additions & 5 deletions pypots/classification/raindrop.py
@@ -27,6 +27,10 @@
from torch.nn.parameter import Parameter
from torch.utils.data import DataLoader

from pypots.classification.base import BaseNNClassifier
from pypots.data.dataset_for_grud import DatasetForGRUD
from pypots.logging import logger

try:
from torch_geometric.nn.conv import MessagePassing
from torch_geometric.nn.inits import glorot
@@ -35,15 +39,12 @@
from torch_scatter import scatter
from torch_sparse import SparseTensor
except ImportError as e:
print(
logger.error(
f"{e}\n"
"torch_geometric is missing, "
"please install it with 'pip install torch_geometric' or 'conda install -c pyg pyg'"
)

from pypots.classification.base import BaseNNClassifier
from pypots.data.dataset_for_grud import DatasetForGRUD


class PositionalEncodingTF(nn.Module):
"""Generate positional encoding according to time information."""
@@ -96,7 +97,7 @@ def __init__(
edge_dim: Optional[int] = None,
bias: bool = True,
root_weight: bool = True,
**kwargs
**kwargs,
):
kwargs.setdefault("aggr", "add")
super().__init__(node_dim=0, **kwargs)
11 changes: 6 additions & 5 deletions pypots/clustering/base.py
@@ -12,6 +12,7 @@
import torch

from pypots.base import BaseModel, BaseNNModel
from pypots.logging import logger


class BaseClusterer(BaseModel):
@@ -110,12 +111,12 @@ def _train_model(self, training_loader, val_loader=None):

mean_val_loss = np.mean(epoch_val_loss_collector)
self.logger["validating_loss"].append(mean_val_loss)
print(
logger.info(
f"epoch {epoch}: training loss {mean_train_loss:.4f}, validating loss {mean_val_loss:.4f}"
)
mean_loss = mean_val_loss
else:
print(f"epoch {epoch}: training loss {mean_train_loss:.4f}")
logger.info(f"epoch {epoch}: training loss {mean_train_loss:.4f}")
mean_loss = mean_train_loss

if mean_loss < self.best_loss:
@@ -125,12 +126,12 @@
else:
self.patience -= 1
if self.patience == 0:
print(
logger.info(
"Exceeded the training patience. Terminating the training procedure..."
)
break
except Exception as e:
print(f"Exception: {e}")
logger.info(f"Exception: {e}")
if self.best_model_dict is None:
raise RuntimeError(
"Training got interrupted. The model was not trained. Please try fit() again."
@@ -145,4 +146,4 @@
if np.equal(self.best_loss, float("inf")):
raise ValueError("Something is wrong. best_loss remains inf after training.")

print("Finished training.")
logger.info("Finished training.")
9 changes: 5 additions & 4 deletions pypots/clustering/crli.py
@@ -16,6 +16,7 @@

from pypots.clustering.base import BaseNNClusterer
from pypots.data.dataset_for_grud import DatasetForGRUD
from pypots.logging import logger
from pypots.utils.metrics import cal_mse

RNN_CELL = {
@@ -437,7 +438,7 @@ def _train_model(self, training_loader, val_loader=None):
) # mean training loss of the current epoch
self.logger["training_loss_generator"].append(mean_train_G_loss)
self.logger["training_loss_discriminator"].append(mean_train_D_loss)
print(
logger.info(
f"epoch {epoch}: "
f"training loss_generator {mean_train_G_loss:.4f}, "
f"train loss_discriminator {mean_train_D_loss:.4f}"
@@ -451,12 +452,12 @@
else:
self.patience -= 1
if self.patience == 0:
print(
logger.info(
"Exceeded the training patience. Terminating the training procedure..."
)
break
except Exception as e:
print(f"Exception: {e}")
logger.info(f"Exception: {e}")
if self.best_model_dict is None:
raise RuntimeError(
"Training got interrupted. The model was not trained. Please try fit() again."
@@ -471,7 +472,7 @@
if np.equal(self.best_loss, float("inf")):
raise ValueError("Something is wrong. best_loss remains inf after training.")

print("Finished training.")
logger.info("Finished training.")

def cluster(self, X):
X = self.check_input(self.n_steps, self.n_features, X)
11 changes: 6 additions & 5 deletions pypots/clustering/vader.py
@@ -21,6 +21,7 @@

from pypots.clustering.base import BaseNNClusterer
from pypots.data.dataset_for_grud import DatasetForGRUD
from pypots.logging import logger
from pypots.utils.metrics import cal_mse


@@ -478,12 +479,12 @@ def _train_model(self, training_loader, val_loader=None):

mean_val_loss = np.mean(epoch_val_loss_collector)
self.logger["validating_loss"].append(mean_val_loss)
print(
logger.info(
f"epoch {epoch}: training loss {mean_train_loss:.4f}, validating loss {mean_val_loss:.4f}"
)
mean_loss = mean_val_loss
else:
print(f"epoch {epoch}: training loss {mean_train_loss:.4f}")
logger.info(f"epoch {epoch}: training loss {mean_train_loss:.4f}")
mean_loss = mean_train_loss

if mean_loss < self.best_loss:
@@ -493,12 +494,12 @@
else:
self.patience -= 1
if self.patience == 0:
print(
logger.info(
"Exceeded the training patience. Terminating the training procedure..."
)
break
except Exception as e:
print(f"Exception: {e}")
logger.info(f"Exception: {e}")
if self.best_model_dict is None:
raise RuntimeError(
"Training got interrupted. The model was not trained. Please try fit() again."
@@ -513,7 +514,7 @@
if np.equal(self.best_loss, float("inf")):
raise ValueError("Something is wrong. best_loss remains inf after training.")

print("Finished training.")
logger.info("Finished training.")

def cluster(self, X):
X = self.check_input(self.n_steps, self.n_features, X)
5 changes: 3 additions & 2 deletions pypots/data/load_specific_datasets.py
@@ -7,6 +7,7 @@

import pandas as pd
import tsdb
from pypots.logging import logger

SUPPORTED_DATASETS = [
"physionet_2012",
@@ -80,7 +81,7 @@ def load_specific_dataset(dataset_name, use_cache=True):
e.g. standardizing and splitting.

"""
print(
logger.info(
f"Loading the dataset {dataset_name} with TSDB (https://github.com/WenjieDu/Time_Series_Database)..."
)
assert dataset_name in SUPPORTED_DATASETS, (
@@ -89,7 +90,7 @@
f"please create an issue on GitHub "
f"https://github.com/WenjieDu/PyPOTS/issues"
)
print(f"Starting preprocessing {dataset_name}...")
logger.info(f"Starting preprocessing {dataset_name}...")
data = tsdb.load_dataset(dataset_name, use_cache)
data = PREPROCESSING[dataset_name](data)
return data
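`load_specific_dataset` follows a small whitelist-plus-dispatch pattern: assert the requested name is supported, load the raw data, then route it through a per-dataset preprocessing function. A self-contained sketch of that pattern (the dataset entry, loader, and preprocessor below are placeholders, not the real TSDB data or PyPOTS preprocessing):

```python
SUPPORTED_DATASETS = ["physionet_2012"]

# One preprocessing callable per supported dataset name; the real
# functions standardize and split, this placeholder just drops Nones.
PREPROCESSING = {
    "physionet_2012": lambda data: {k: v for k, v in data.items() if v is not None},
}

def load_specific_dataset(dataset_name, raw_loader):
    """Validate the name, load raw data via `raw_loader`, then preprocess.

    `raw_loader` stands in for tsdb.load_dataset in this sketch.
    """
    assert dataset_name in SUPPORTED_DATASETS, (
        f"Dataset {dataset_name} is not supported. "
        f"If you believe it should be, please create an issue on GitHub."
    )
    data = raw_loader(dataset_name)           # fetch the raw dataset
    return PREPROCESSING[dataset_name](data)  # dataset-specific cleaning
```

The dispatch table keeps `load_specific_dataset` itself dataset-agnostic: supporting a new dataset means adding one name to the whitelist and one entry to `PREPROCESSING`, with no change to the loading function.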
12 changes: 6 additions & 6 deletions pypots/forecasting/base.py
@@ -12,7 +12,7 @@
import torch

from pypots.base import BaseModel, BaseNNModel

from pypots.logging import logger

class BaseForecaster(BaseModel):
"""Abstract class for all forecasting models."""
@@ -102,12 +102,12 @@ def _train_model(self, training_loader, val_loader=None):

mean_val_loss = np.mean(epoch_val_loss_collector)
self.logger["validating_loss"].append(mean_val_loss)
print(
logger.info(
f"epoch {epoch}: training loss {mean_train_loss:.4f}, validating loss {mean_val_loss:.4f}"
)
mean_loss = mean_val_loss
else:
print(f"epoch {epoch}: training loss {mean_train_loss:.4f}")
logger.info(f"epoch {epoch}: training loss {mean_train_loss:.4f}")
mean_loss = mean_train_loss

if mean_loss < self.best_loss:
@@ -117,12 +117,12 @@
else:
self.patience -= 1
if self.patience == 0:
print(
logger.info(
"Exceeded the training patience. Terminating the training procedure..."
)
break
except Exception as e:
print(f"Exception: {e}")
logger.info(f"Exception: {e}")
if self.best_model_dict is None:
raise RuntimeError(
"Training got interrupted. The model was not trained. Please try fit() again."
@@ -137,4 +137,4 @@
if np.equal(self.best_loss, float("inf")):
raise ValueError("Something is wrong. best_loss remains inf after training.")

print("Finished training.")
logger.info("Finished training.")