Commit 000b814

Initial Commit
0 parents  commit 000b814

57 files changed, 5053 insertions(+), 0 deletions(-)

.gitignore

Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
##########################
# KEDRO PROJECT

# ignore all local configuration
conf/local/**
!conf/local/.gitkeep
.telemetry

# ignore potentially sensitive credentials files
conf/**/*credentials*

# ignore everything in the following folders
data/**
logs/**

# except their sub-folders
!data/**/
!logs/**/

# also keep all .gitkeep files
!.gitkeep

# also keep the example dataset
!data/01_raw/iris.csv


##########################
# Common files

# IntelliJ
.idea/
*.iml
out/
.idea_modules/

### macOS
*.DS_Store
.AppleDouble
.LSOverride
.Trashes

# Vim
*~
.*.swo
.*.swp

# emacs
*~
\#*\#
/.emacs.desktop
/.emacs.desktop.lock
*.elc

# JIRA plugin
atlassian-ide-plugin.xml

# C extensions
*.so

### Python template
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
.static_storage/
.media/
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
.ipython/profile_default/history.sqlite
.ipython/profile_default/startup/README

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.envrc
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# mkdocs documentation
/site

# mypy
.mypy_cache/

README.md

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
# Decoding Pipeline

## Overview

This is your new Kedro project, which was generated using `Kedro 0.18.3`.

Take a look at the [Kedro documentation](https://kedro.readthedocs.io) to get started.

## Rules and guidelines

In order to get the best out of the template:

* Don't remove any lines from the `.gitignore` file we provide
* Make sure your results can be reproduced by following a [data engineering convention](https://kedro.readthedocs.io/en/stable/faq/faq.html#what-is-data-engineering-convention)
* Don't commit data to your repository
* Don't commit any credentials or your local configuration to your repository. Keep all your credentials and local configuration in `conf/local/`

## How to install dependencies

Declare any dependencies in `src/requirements.txt` for `pip` installation and `src/environment.yml` for `conda` installation.

To install them, run:

```
pip install -r src/requirements.txt
```

## How to run your Kedro pipeline

You can run your Kedro project with:

```
kedro run
```

## How to test your Kedro project

Have a look at the file `src/tests/test_run.py` for instructions on how to write your tests. You can run your tests as follows:

```
kedro test
```

To configure the coverage threshold, go to the `.coveragerc` file.

## Project dependencies

To generate or update the dependency requirements for your project:

```
kedro build-reqs
```

This will `pip-compile` the contents of `src/requirements.txt` into a new file `src/requirements.lock`. You can see the output of the resolution by opening `src/requirements.lock`.

After this, if you'd like to update your project requirements, please update `src/requirements.txt` and re-run `kedro build-reqs`.

[Further information about project dependencies](https://kedro.readthedocs.io/en/stable/kedro_project_setup/dependencies.html#project-specific-dependencies)

## How to work with Kedro and notebooks

> Note: Using `kedro jupyter` or `kedro ipython` to run your notebook provides these variables in scope: `context`, `catalog`, and `startup_error`.
>
> Jupyter, JupyterLab, and IPython are already included in the project requirements by default, so once you have run `pip install -r src/requirements.txt` you will not need to take any extra steps before you use them.

### Jupyter
To use Jupyter notebooks in your Kedro project, you need to install Jupyter:

```
pip install jupyter
```

After installing Jupyter, you can start a local notebook server:

```
kedro jupyter notebook
```

### JupyterLab
To use JupyterLab, you need to install it:

```
pip install jupyterlab
```

You can also start JupyterLab:

```
kedro jupyter lab
```

### IPython
And if you want to run an IPython session:

```
kedro ipython
```

### How to convert notebook cells to nodes in a Kedro project
You can move notebook code over into a Kedro project structure using a mixture of [cell tagging](https://jupyter-notebook.readthedocs.io/en/stable/changelog.html#release-5-0-0) and Kedro CLI commands.

By adding the `node` tag to a cell and running the command below, the cell's source code will be copied over to a Python file within `src/<package_name>/nodes/`:

```
kedro jupyter convert <filepath_to_my_notebook>
```
> *Note:* The name of the Python file matches the name of the original notebook.

Alternatively, you may want to transform all your notebooks in one go. Run the following command to convert all notebook files found in the project root directory and under any of its sub-folders:

```
kedro jupyter convert --all
```

### How to ignore notebook output cells in `git`
To automatically strip out all output cell contents before committing to `git`, you can run `kedro activate-nbstripout`. This will add a hook in `.git/config` which will run `nbstripout` before anything is committed to `git`.

> *Note:* Your output cells will be retained locally.

## Package your Kedro project

[Further information about building project documentation and packaging your project](https://kedro.readthedocs.io/en/stable/tutorial/package_a_project.html)
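
The README's section on converting notebook cells relies on cells carrying a `node` tag. As a rough illustration of what that tag looks like on disk, the sketch below reads a notebook's JSON and collects the source of every code cell tagged `node`. It is not how `kedro jupyter convert` is implemented; the function name `node_tagged_sources` is hypothetical, and the only assumption about the file is the standard notebook layout in which tags live under each cell's `metadata.tags`.

```python
import json
from typing import List


def node_tagged_sources(notebook_json: str, tag: str = "node") -> List[str]:
    """Return the source of every code cell that carries `tag` in its metadata."""
    notebook = json.loads(notebook_json)
    sources: List[str] = []
    for cell in notebook.get("cells", []):
        # Markdown and raw cells are skipped; only code can become a node.
        if cell.get("cell_type") != "code":
            continue
        if tag not in cell.get("metadata", {}).get("tags", []):
            continue
        source = cell.get("source", "")
        # Notebook files store source either as one string or as a list of lines.
        sources.append("".join(source) if isinstance(source, list) else source)
    return sources
```

Passing the text of an `.ipynb` file to this function gives a quick way to check which cells are tagged before running the conversion command.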

conf/README.md

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
# What is this for?

This folder should be used to store configuration files used by Kedro or by separate tools.

This file can be used to provide users with instructions for how to reproduce local configuration with their own credentials. You can edit the file however you like, but you may wish to retain the information below and add your own section in the [Instructions](#Instructions) section.

## Local configuration

The `local` folder should be used for configuration that is either user-specific (e.g. IDE configuration) or protected (e.g. security keys).

> *Note:* Please do not check in any local configuration to version control.

## Base configuration

The `base` folder is for shared configuration, such as non-sensitive and project-related configuration that may be shared across team members.

WARNING: Please do not put access credentials in the base configuration folder.

## Instructions





## Find out more
You can find out more about configuration from the [user guide documentation](https://kedro.readthedocs.io/en/stable/04_user_guide/03_configuration.html).

conf/base/catalog.yml

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
# Here you can define all your data sets by using simple YAML syntax.
#
# Documentation for this file format can be found in "The Data Catalog"
# Link: https://kedro.readthedocs.io/en/stable/05_data/01_data_catalog.html

center_out_dat:
  type: PartitionedDataSet
  dataset: decoding_pipeline.io.dat_dataset.DATDataset
  path: data/01_raw/CC01/ccCenterOut
  filename_suffix: ".dat"
  layer: raw

center_out_hdf5:
  type: PartitionedDataSet
  dataset: decoding_pipeline.io.hdf5_dataset.HDF5Dataset
  path: data/01_raw/CC01/ccCenterOut
  filename_suffix: ".hdf5"
  layer: raw

prefixed_channels:
  type: json.JSONDataSet
  filepath: data/03_primary/CC01/ccCenterOut/prefixed_channels.json
  layer: primary

# center_out_dat:
#   type: decoding_pipeline.io.dat_dataset.DATDataset
#   filepath: data/01_raw/ccCenterOut/ccCenterOut_962022_S01.dat
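
The two `PartitionedDataSet` entries in this catalog point at the same folder and differ only in `filename_suffix` and the underlying dataset class. As a rough illustration of what that selection amounts to, the sketch below scans a folder for files with a given suffix and returns a dictionary mapping each partition ID (the relative path with the suffix removed) to a zero-argument loader, which is the shape Kedro documents for partitioned loads. This is a plain-Python emulation, not Kedro's implementation: `list_partitions` and `read_bytes` are hypothetical names, and `read_bytes` merely stands in for a real loader such as `DATDataset`.

```python
from pathlib import Path
from typing import Callable, Dict


def read_bytes(path: Path) -> bytes:
    """Hypothetical stand-in for a real per-file loader."""
    return path.read_bytes()


def list_partitions(
    folder: Path,
    suffix: str,
    loader: Callable[[Path], bytes] = read_bytes,
) -> Dict[str, Callable[[], bytes]]:
    """Map partition IDs to lazy loaders for every file ending in `suffix`."""
    partitions: Dict[str, Callable[[], bytes]] = {}
    for path in sorted(folder.rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        relative = path.relative_to(folder).as_posix()
        # Partition ID: relative path with the suffix stripped.
        partition_id = relative[: len(relative) - len(suffix)]
        # Bind `path` as a default argument so each loader keeps its own file.
        partitions[partition_id] = lambda p=path: loader(p)
    return partitions
```

Nothing is read until a loader is called, so listing a folder stays cheap however large the recordings are. Calling `list_partitions` once with `".dat"` and once with `".hdf5"` mirrors how the two catalog entries split one folder into two datasets.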

conf/base/logging.yml

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
version: 1

disable_existing_loggers: False

formatters:
  simple:
    format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"

handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: simple
    stream: ext://sys.stdout

  info_file_handler:
    class: logging.handlers.RotatingFileHandler
    level: INFO
    formatter: simple
    filename: logs/info.log
    maxBytes: 10485760 # 10MB
    backupCount: 20
    encoding: utf8
    delay: True

  error_file_handler:
    class: logging.handlers.RotatingFileHandler
    level: ERROR
    formatter: simple
    filename: logs/errors.log
    maxBytes: 10485760 # 10MB
    backupCount: 20
    encoding: utf8
    delay: True

  rich:
    class: rich.logging.RichHandler

loggers:
  kedro:
    level: INFO

  decoding_pipeline:
    level: INFO

root:
  handlers: [rich, info_file_handler, error_file_handler]
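
This file follows the dictionary schema of Python's `logging.config`. The sketch below expresses a subset of it as a Python dict and applies it with `logging.config.dictConfig`, which can help when checking how handler and logger levels interact. It is only an approximation of the YAML above: the `rich` and `error_file_handler` entries are left out so that nothing beyond the standard library is needed, the root logger uses `console` in place of `rich`, and `build_logging_config` is a hypothetical helper name.

```python
import logging
import logging.config
from pathlib import Path


def build_logging_config(log_dir: Path) -> dict:
    """Dict form of the console and info-file handlers from logging.yml."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "simple": {
                "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
            }
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "level": "INFO",
                "formatter": "simple",
                "stream": "ext://sys.stdout",
            },
            "info_file_handler": {
                "class": "logging.handlers.RotatingFileHandler",
                "level": "INFO",
                "formatter": "simple",
                "filename": str(log_dir / "info.log"),
                "maxBytes": 10 * 1024 * 1024,  # 10MB, as in the YAML
                "backupCount": 20,
                "encoding": "utf8",
                "delay": True,  # the file is opened on the first record
            },
        },
        "loggers": {"decoding_pipeline": {"level": "INFO"}},
        # Assumption: `console` stands in for the `rich` handler here.
        "root": {"handlers": ["console", "info_file_handler"]},
    }
```

After `logging.config.dictConfig(build_logging_config(Path("logs")))`, records from `logging.getLogger("decoding_pipeline")` at `INFO` and above propagate to the root handlers and end up in `logs/info.log`, while `DEBUG` records are dropped by the logger's own level. The `logs` directory must already exist by the time the first record is written.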
