Commit dc4b120

bump package versions
1 parent 4f16978 commit dc4b120

4 files changed: 30 additions and 13 deletions

README.md

Lines changed: 25 additions & 8 deletions
@@ -37,23 +37,42 @@ deactivate
 source .venv/bin/activate
 ```

-
-### How to list existing Datasets (in Jupyter)
+### How to load a dataset (in Jupyter)

 ```
 sys.path.insert(0, '<project_directory>/example')
 from hypergol import HypergolProject
-from data_models.example_datamodel_class import ExampleDatamodelClass
+from data_models.data_type import DataType
 project = HypergolProject(
     projectDirectory='<project_directory>/example',
-    dataDirectory='<data_directory>'
+    dataDirectory='<data_directory>',
+    force=True
 )
-ds = project.datasetFactory.get(dataType=ExampleDatamodelClass, name='sentences')
-# project.list_datasets(pattern='.*', asCode=True);
+
+dataTypeDataset = project.datasetFactory.get(dataType=DataType, name='data_types')
+with dataTypeDataset.open('r') as dsr:
+    dataTypes = [value.to_data() for value in islice(dsr, 10)]
+
+# Or convert straight into pandas
+import pandas as pd
+dataTypeDataframe = pd.DataFrame([value.to_data() for value in islice(dataTypeDataset.open('r'), 10)])
 ```

+`<project_directory>` is the repo's directory.
+`<data_directory>` is the *parent* data directory.
+
+If the project is called `my_project` and the code is located in `~/my_project` and the project data is in `~/data/my_project`, `<data_directory>` is `~/data`.
+Set `branch` argument in `datasetFactory.get()` if you need anything else other than the current branch.
+
+The `force` argument allows you to load the data even if your repo has uncommitted code, this is usually not a problem unless you plan to write into dataframes from Jupyter.
+
+### How to list existing Datasets
+
 This will list all existing datasets that matches `pattern` as self contained executable code.

+```
+project.list_datasets(pattern='.*', asCode=True);
+```

 ### How to start Tensorboard

@@ -65,7 +84,6 @@ source .venv/bin/activate
 tensorboard --logdir=<data_directory>/example/tensorboard/
 ```

-
 ### How to train your model

 After implementing all components and required functions:
@@ -90,7 +108,6 @@ then start serving with (port and host can be set in the shell script):
 ./serve_example.sh
 ```

-
 ### How to call your model from python with requests

 ```
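
The README snippet added in this commit calls `sys.path.insert` and `islice` without showing their imports; in a fresh Jupyter kernel it would presumably need `import sys` and `from itertools import islice` first. The sketch below illustrates the "read the first N records" pattern on its own. `StubRecord` and `StubDataset` are hypothetical stand-ins invented for this example, not Hypergol classes; they only mimic the `open('r')` context manager and the `to_data()` call that the README uses.

```python
from contextlib import contextmanager
from itertools import islice  # the README snippet relies on this import


class StubRecord:
    """Hypothetical stand-in for a Hypergol data model instance."""

    def __init__(self, name):
        self.name = name

    def to_data(self):
        # Mirrors the to_data() call in the README: return a plain dict.
        return {'name': self.name}


class StubDataset:
    """Hypothetical stand-in for a dataset; not the Hypergol class."""

    def __init__(self, records):
        self._records = records

    @contextmanager
    def open(self, mode):
        # Only reading is sketched here.
        if mode != 'r':
            raise ValueError('this sketch only supports mode="r"')
        yield iter(self._records)


dataset = StubDataset([StubRecord(f'type_{k}') for k in range(25)])

# Take at most 10 records without consuming the rest of the dataset.
with dataset.open('r') as dsr:
    dataTypes = [value.to_data() for value in islice(dsr, 10)]

print(len(dataTypes), dataTypes[0])
```

Because `dataTypes` is a list of plain dictionaries, passing it to `pd.DataFrame(...)` yields one row per record, which is what the pandas variant in the README does.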

make_venv.sh

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 python3 -m venv .venv
 source .venv/bin/activate
 pip3 install --upgrade pip
-pip3 install setuptools==47.1.1
+pip3 install setuptools==57.1.0
 pip3 install wheel
 pip3 install -r requirements.txt
 # setup here

pipelines/process_blogposts.py

Lines changed: 2 additions & 2 deletions
@@ -12,7 +12,7 @@
 from data_models.sentence import Sentence


-def process_blogposts(threads=1, force=False):
+def process_blogposts(threads=1, force=False, onlyTasks=None):
     project = HypergolProject(dataDirectory=f'{os.environ["BASE_DIR"]}/tempdata', force=force)
     SOURCE_PATTERN = f'{os.environ["BASE_DIR"]}/data/blogposts/pages_*.pkl'
     articles = project.datasetFactory.get(dataType=Article, name='articles')
@@ -49,7 +49,7 @@ def process_blogposts(threads=1, force=False):
             createSentencesTask,
         ]
     )
-    pipeline.run(threads=threads)
+    pipeline.run(threads=threads, onlyTasks=onlyTasks)


 if __name__ == '__main__':
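
The change above threads a new `onlyTasks` argument from `process_blogposts` through to `pipeline.run`. The diff does not show what `pipeline.run` does with it, so the following is only a sketch under an assumption: that `onlyTasks` is a list of task names restricting which tasks execute, and that `None` means all tasks run. `StubTask`, `StubPipeline`, and the task names are hypothetical and are not Hypergol's implementation.

```python
class StubTask:
    """Hypothetical task; real Hypergol tasks do far more than this."""

    def __init__(self, name):
        self.name = name

    def execute(self, threads):
        return f'{self.name} ran with threads={threads}'


class StubPipeline:
    """Hypothetical stand-in used only to illustrate the onlyTasks idea."""

    def __init__(self, tasks):
        self.tasks = tasks

    def run(self, threads=1, onlyTasks=None):
        results = []
        for task in self.tasks:
            # Assumption: onlyTasks=None means "run every task".
            if onlyTasks is not None and task.name not in onlyTasks:
                continue
            results.append(task.execute(threads))
        return results


def process(threads=1, onlyTasks=None):
    pipeline = StubPipeline(
        tasks=[StubTask('CreateArticlesTask'), StubTask('CreateSentencesTask')]
    )
    # Forward the argument unchanged, as the diff does.
    return pipeline.run(threads=threads, onlyTasks=onlyTasks)


print(process())
print(process(threads=2, onlyTasks=['CreateSentencesTask']))
```

Defaulting the new parameter to `None` keeps existing callers working, since a call without `onlyTasks` behaves as it did before the signature changed.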

requirements.txt

Lines changed: 2 additions & 2 deletions
@@ -4,8 +4,8 @@ spacy
 GitPython==3.1.3
 nose2==0.9.2
 pylint==2.5.3
-hypergol
+hypergol==0.1.20
 tensorflow==2.5.0
 pydantic==1.6.2
-fastapi==0.61.0
+fastapi==0.65.2
 uvicorn==0.11.8