Commit c4ce92a

Merge pull request #8 from arokem/kpolimis-documentation
Kpolimis documentation
2 parents 8429939 + a359ad2 commit c4ce92a

File tree

13 files changed

+120
-276
lines changed

README.md

Lines changed: 19 additions & 210 deletions
Original file line number | Diff line number | Diff line change
@@ -4,226 +4,35 @@
44
[![Coveralls Status](https://coveralls.io/repos/uwescience/sklearn-forest-ci/badge.svg?branch=master&service=github)](https://coveralls.io/r/uwescience/sklearn-forest-ci)
55
[![CircleCI Status](https://circleci.com/gh/uwescience/sklearn-forest-ci.svg?style=shield&circle-token=:circle-token)](https://circleci.com/gh/uwescience/sklearn-forest-ci/tree/master)
66

7-
**project-template** is a template project for
8-
[scikit-learn](http://scikit-learn.org/)
9-
compatible extensions.
7+
`sklearn-forest-ci` is a Python module for calculating variance and adding
8+
confidence intervals to scikit-learn random forest regression or classification
9+
objects. The core functions calculate an in-bag matrix and error bars for
10+
random forest objects
1011

11-
It aids development of estimators that can be used in scikit-learn pipelines
12-
and (hyper)parameter search, while facilitating testing (including some API
13-
compliance), documentation, open source development, packaging, and continuous
14-
integration.
12+
Compatible with Python 2.7 and Python 3.5.
1513

16-
## Important Links
17-
HTML Documentation - http://contrib.scikit-learn.org/project-template/
18-
19-
## Installation and Usage
20-
The package by itself comes with a single module and an estimator. Before
21-
installing the module you will need `numpy` and `scipy`.
22-
To install the module execute:
23-
```shell
24-
$ python setup.py install
25-
```
26-
or
27-
```
28-
pip install sklearn-template
29-
```
30-
31-
If the installation is successful, and `scikit-learn` is correctly installed,
32-
you should be able to execute the following in Python:
33-
```python
34-
>>> from skltemplate import TemplateEstimator
35-
>>> estimator = TemplateEstimator()
36-
>>> estimator.fit(np.arange(10), np.arange(10))
37-
```
38-
39-
`TemplateEstimator` by itself does nothing useful, but it serves as an example
40-
of how other Estimators should be written. It also comes with its own unit
41-
tests under `template/tests` which can be run using `nosetests`.
42-
43-
## Creating your own library
14+
This module is based on R code from Stefan Wager (see important links below)
15+
and is licensed under the MIT open source license (see [LICENSE](LICENSE))
4416

45-
### 1. Cloning
46-
Clone the project into your computer by executing
47-
```shell
48-
$ git clone https://github.com/scikit-learn-contrib/project-template.git
49-
```
50-
You should rename the `project-template` folder to the name of your project.
51-
To host the project on Github, visit https://github.com/new and create a new
52-
repository. To upload your project on Github execute
53-
```shell
54-
$ git remote set-url origin https://github.com/username/project-name.git
55-
$ git push origin master
56-
```
57-
58-
### 2. Modifying the Source
59-
You are free to modify the source as you want, but at the very least, all your
60-
estimators should pass the [`check_estimator`](http://scikit-learn.org/stable/modules/generated/sklearn.utils.estimator_checks.check_estimator.html#sklearn.utils.estimator_checks.check_estimator)
61-
test to be scikit-learn compatible.
62-
(If there are valid reasons your estimator cannot pass `check_estimator`, please
63-
[raise an issue](https://github.com/scikit-learn/scikit-learn/issues/new) at
64-
scikit-learn so we can make `check_estimator` more flexible.)
65-
66-
This template is particularly useful for publishing open-source versions of
67-
algorithms that do not meet the criteria for inclusion in the core scikit-learn
68-
package (see [FAQ](http://scikit-learn.org/stable/faq.html)), such as recent
69-
and unpopular developments in machine learning.
70-
However, developing using this template may also be a stepping stone to
71-
eventual inclusion in the core package.
72-
73-
In any case, developers should endeavor to adhere to scikit-learn's
74-
[Contributor's Guide](http://scikit-learn.org/stable/developers/) which promotes
75-
the use of:
76-
* algorithm-specific unit tests, in addition to `check_estimator`'s common tests
77-
* [PEP8](https://www.python.org/dev/peps/pep-0008/)-compliant code
78-
* a clearly documented API using [NumpyDoc](https://github.com/numpy/numpydoc)
79-
and [PEP257](https://www.python.org/dev/peps/pep-0257/)-compliant docstrings
80-
* references to relevant scientific literature in standard citation formats
81-
* [doctests](https://docs.python.org/3/library/doctest.html) to provide
82-
succinct usage examples
83-
* standalone examples to illustrate the usage, model visualisation, and
84-
benefits/benchmarks of particular algorithms
85-
* efficient code when the need for optimization is supported by benchmarks
86-
87-
### 3. Modifying the Documentation
17+
## Important Links
18+
scikit-learn - http://scikit-learn.org/
8819

89-
The documentation is built using [sphinx](http://www.sphinx-doc.org/en/stable/).
90-
It incorporates narrative documentation from the `doc/` directory, standalone
91-
examples from the `examples/` directory, and API reference compiled from
92-
estimator docstrings.
20+
Stefan Wager's `randomForestCI` - https://github.com/swager/randomForestCI
9321

94-
To build the documentation locally, ensure that you have `sphinx`,
95-
`sphinx-gallery` and `matplotlib` by executing:
96-
```shell
97-
$ pip install sphinx matplotlib sphinx-gallery
22+
## Installation and Usage
23+
Before installing the module you will need `numpy`, `scipy` and `scikit-learn`.
9824
```
99-
The documentation contains a home page (`doc/index.rst`), an API
100-
documentation page (`doc/api.rst`) and a page documenting the `template` module
101-
(`doc/template.rst`). Sphinx allows you to automatically document your modules
102-
and classes by using the `autodoc` directive (see `template.rst`). To change the
103-
asthetics of the docs and other paramteres, edit the `doc/conf.py` file. For
104-
more information visit the [Sphinx Documentation](http://www.sphinx-doc.org/en/stable/contents.html).
105-
106-
You can also add code examples in the `examples` folder. All files inside
107-
the folder of the form `plot_*.py` will be executed and their generated
108-
plots will be available for viewing in the `/auto_examples` URL.
109-
110-
To build the documentation locally execute
111-
```shell
112-
$ cd doc
113-
$ make html
25+
pip install numpy scipy scikit-learn
11426
```
11527

116-
### 4. Setting up Travis CI
117-
[TravisCI](https://travis-ci.org/) allows you to continuously build and test
118-
your code from Github to ensure that no code-breaking changes are pushed. After
119-
you sign up and authourize TravisCI, add your new repository to TravisCI so that
120-
it can start building it. The `travis.yml` contains the configuration required
121-
for Travis to build the project. You will have to update the variable `MODULE`
122-
with the name of your module for Travis to test it. Once you add the project on
123-
TravisCI, all subsequent pushes on the master branch will trigger a Travis
124-
build. By default, the project is tested on Python 2.7 and Python 3.5.
125-
126-
### 5. Setting up Coveralls
127-
[Coveralls](https://coveralls.io/) reports code coverage statistics of your
128-
tests on each push. Sign up on Coveralls and add your repository so that
129-
Coveralls can start monitoring it. The project already contains the required
130-
configuration for Coveralls to work. All subsequent builds after adding your
131-
project will generate a coverage report.
132-
133-
### 6. Setting up Circle CI
134-
The project uses [CircleCI](https://circleci.com/) to build its documentation
135-
from the `master` branch and host it using [Github Pages](https://pages.github.com/).
136-
Again, you will need to Sign Up and authorize CircleCI. The configuration
137-
of CircleCI is governed by the `circle.yml` file, which needs to be mofified
138-
if you want to setup the docs on your own website. The values to be changed
139-
are
140-
141-
| Variable | Value|
142-
|----------|------|
143-
| `USERNAME` | The name of the user or organization of the repository where the project and documentation is hosted |
144-
| `DOC_REPO` | The repository where the documentation will be hosted. This can be the same as the project repository |
145-
| `DOC_URL` | The relative URL where the documentation will be hosted |
146-
| `EMAIL` | The email id to use while pushing the documentation, this can be any valid email address |
147-
148-
In addition to this, you will need to grant access to the CircleCI computers
149-
to push to your documentation repository. To do this, visit the Project Settings
150-
page of your project in CircleCI. Select `Checkout SSH keys` option and then
151-
choose `Create and add user key` option. This should grant CircleCI privileges
152-
to push to the repository `https://github.com/USERNAME/DOC_REPO/`.
153-
154-
If all goes well, you should be able to visit the documentation of your project
155-
on
28+
To install the module execute:
15629
```
157-
https://github.com/USERNAME/DOC_REPO/DOC_URL
30+
pip install sklforestci
15831
```
159-
160-
### 7. Adding Badges
161-
162-
Follow the instructions to add a [Travis Badge](https://docs.travis-ci.com/user/status-images/),
163-
[Coveralls Badge](https://coveralls.io) and
164-
[CircleCI Badge](https://circleci.com/docs/status-badges) to your repository's
165-
`README`.
166-
167-
### 8. Advertising your package
168-
169-
Once your work is mature enough for the general public to use it, you should
170-
submit a Pull Request to modify scikit-learn's
171-
[related projects listing](https://github.com/scikit-learn/scikit-learn/edit/master/doc/related_projects.rst).
172-
Please insert brief description of your project and a link to its code
173-
repository or PyPI page.
174-
You may also wish to announce your work on the
175-
[`scikit-learn-general` mailing list](https://lists.sourceforge.net/lists/listinfo/scikit-learn-general).
176-
177-
### 9. Uploading your package to PyPI
178-
179-
Uploading your package to [PyPI](https://pypi.python.org/pypi) allows users to
180-
install your package through `pip`. Python provides two repositories to upload
181-
your packages. The [PyPI Test](https://testpypi.python.org/pypi) repository,
182-
which is to be used for testing packages before their release, and the
183-
[PyPI](https://pypi.python.org/pypi) repository, where you can make your
184-
releases. You need to register a username and password with both these sites.
185-
The username and passwords for both these sites need not be the same. To upload
186-
your package through the command line, you need to store your username and
187-
password in a file called `.pypirc` in your `$HOME` directory with the
188-
following format.
189-
32+
or
19033
```shell
191-
[distutils]
192-
index-servers =
193-
pypi
194-
pypitest
195-
196-
[pypi]
197-
repository=https://pypi.python.org/pypi
198-
username=<your-pypi-username>
199-
password=<your-pypi-passowrd>
200-
201-
[pypitest]
202-
repository=https://testpypi.python.org/pypi
203-
username=<your-pypitest-username>
204-
password=<your-pypitest-passowrd>
205-
```
206-
Make sure that all details in `setup.py` are up to date. To upload your package
207-
to the Test server, execute:
208-
```
209-
python setup.py register -r pypitest
210-
python setup.py sdist upload -r pypitest
211-
```
212-
Your package should now be visible on: https://testpypi.python.org/pypi
213-
214-
To install a package from the test server, execute:
215-
```
216-
pip install -i https://testpypi.python.org/pypi <package-name>
217-
```
218-
219-
Similary, to upload your package to the PyPI server execute
220-
```
221-
python setup.py register -r pypi
222-
python setup.py sdist upload -r pypi
223-
```
224-
To install your package, execute:
225-
```
226-
pip install <package-name>
34+
$ python setup.py install
22735
```
22836

229-
*Thank you for cleanly contributing to the scikit-learn ecosystem!*
37+
## Example
38+
See examples gallery

circle.yml

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ machine:
88
DOC_REPO: "sklearn-forest-ci"
99

1010
# The base URL for the Github page where the documentation will be hosted
11-
DOC_URL: "http://uwescience.github.io/sklearn-forest-ci"
11+
DOC_URL: ""
1212

1313
# The email is to be used for commits in the Github Page
1414
EMAIL: "arokem+ci@uw.edu"

doc/_static/eScience_Logo_HR.png

54.7 KB

doc/index.rst

Lines changed: 14 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,28 +1,36 @@
11

2-
Confidence Intervals for Scikit Learn Random Forests
2+
Confidence Intervals for Scikit Learn Random Forests
33
=====================================================
44

5-
This is bla bla bla explanation
5+
`sklforestci` calculates confidence intervals from scikit-learn
6+
RandomForest regressor or classifier objects. The unbiased variance for a
7+
RandomForest object is returned in an array for plotting error bars.
68

79

810
.. toctree::
911
:maxdepth: 2
1012

1113
api
14+
introduction
15+
installation_guide
1216
auto_examples/index
1317

1418

1519
See the `README <https://github.com/uwescience/sklearn-forest-ci/blob/master/README.md>`_
1620
for more information.
1721

18-
Acknoweldgements: this work was supported by a grant from the Gordon & Betty Moore Foundation
19-
and from the Alfred P. Sloan Foundation to the University of Washington eScience Institute,
20-
and through a grant from the Bill & Melinda Gates Foundation.
21-
2222

2323
Indices and tables
2424
==================
2525

2626
* :ref:`genindex`
2727
* :ref:`modindex`
2828
* :ref:`search`
29+
30+
.. figure:: _static/eScience_Logo_HR.png
31+
:align: center
32+
:figclass: align-center
33+
34+
Acknowledgements: this work was supported by a grant from the Gordon & Betty Moore Foundation
35+
and from the Alfred P. Sloan Foundation to the University of Washington eScience Institute,
36+
and through a grant from the Bill & Melinda Gates Foundation.

doc/installation_guide.rst

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
.. _installation_guide:
2+
3+
Installation Guide
4+
==================
5+
6+
Before installing the `sklforestci` module, you will need `numpy`, `scipy`
7+
and `scikit-learn`
8+
9+
.. code-block:: bash
10+
11+
pip install numpy scipy scikit-learn
12+
13+
To install `sklforestci`:
14+
15+
.. code-block:: bash
16+
17+
pip install sklforestci
18+
19+
or
20+
21+
.. code-block:: bash
22+
23+
python setup.py install

doc/introduction.rst

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
.. _introduction:
2+
3+
Introduction
4+
============
5+
`sklforestci` calculates the variance and error bars from scikit-learn
6+
RandomForest regressor or classifier objects. The unbiased variance for a
7+
RandomForest object is returned in an array for plotting.
8+
9+
The calculation of error is based on the infinitesimal jackknife variance, as
10+
described in [Wager2014]_ and is a Python implementation of the R code
11+
provided at: https://github.com/swager/randomForestCI
12+
13+
.. [Wager2014] S. Wager, T. Hastie, B. Efron. "Confidence Intervals for
14+
Random Forests: The Jackknife and the Infinitesimal Jackknife", Journal
15+
of Machine Learning Research vol. 15, pp. 1625-1651, 2014.
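
As a reading aid for the introduction added above, the infinitesimal jackknife estimate can be sketched in a few lines of NumPy. This is a minimal sketch that follows the formulas in Wager et al. (2014) as commonly stated, not the code in `sklforestci`: the function name `ij_variance` and its argument layout are hypothetical. It assumes an in-bag count matrix of shape `(n_samples, n_trees)` and per-tree predictions of shape `(n_test, n_trees)`.

```python
import numpy as np


def ij_variance(inbag, tree_preds):
    """Hypothetical helper: infinitesimal jackknife variance per test point.

    inbag      : (n_samples, n_trees) counts of how often each training
                 sample was drawn for each tree.
    tree_preds : (n_test, n_trees) prediction of each tree at each test point.
    Returns (v_ij, v_ij_u): the raw estimate and a bias-corrected estimate.
    """
    inbag = np.asarray(inbag, dtype=float)
    tree_preds = np.asarray(tree_preds, dtype=float)
    n_samples, n_trees = inbag.shape

    # Center both quantities across trees (the bootstrap replicates).
    inbag_centered = inbag - inbag.mean(axis=1, keepdims=True)
    preds_centered = tree_preds - tree_preds.mean(axis=1, keepdims=True)

    # cov[j, i] is the covariance, over trees, between the in-bag count of
    # training sample i and the prediction at test point j.
    cov = np.dot(preds_centered, inbag_centered.T) / n_trees

    # Raw IJ estimate: sum of squared covariances over training samples.
    v_ij = np.sum(cov ** 2, axis=1)

    # Monte Carlo bias correction: subtract (n / B^2) * sum_b (t_b - mean)^2.
    boot_var = np.sum(preds_centered ** 2, axis=1) / n_trees
    v_ij_u = v_ij - n_samples * boot_var / n_trees
    return v_ij, v_ij_u


# Toy usage with made-up numbers (5 training samples, 8 trees, 3 test points).
rng = np.random.RandomState(0)
toy_inbag = rng.multinomial(5, [0.2] * 5, size=8).T
toy_preds = rng.normal(size=(3, 8))
raw, corrected = ij_variance(toy_inbag, toy_preds)
```

With few trees the corrected estimate can come out negative, so a real implementation would need to decide how to handle that case.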

doc/sklforestci.rst

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,16 @@
11
Random Forest Confidence Intervals
22
==================================
33

4-
This will explain what this module is all about.
4+
This module creates an in-bag and calculates the variance for a scikit-learn
5+
RandomForest regressor or classifier object. The variance can be used to plot
6+
error bars for RandomForest objects
57

8+
The inbag is a matrix documenting which sample entered which decision tree
9+
10+
The error bar calculation is based on Wager's (2014) infinitesimal jackknife variance
11+
12+
See the `README <https://github.com/uwescience/sklearn-forest-ci/blob/master/README.md>`_
13+
for more information.
614

715

816
.. automodule:: sklforestci
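
To make the in-bag description above concrete, here is a minimal sketch of such a matrix built from plain bootstrap draws. It is an illustration under stated assumptions, not the module's implementation: `bootstrap_inbag` is a hypothetical name, and the draws come from a fresh NumPy random state rather than from the random state of a fitted scikit-learn forest, which a real implementation would need to reproduce.

```python
import numpy as np


def bootstrap_inbag(n_samples, n_trees, seed=0):
    """Hypothetical helper: (n_samples, n_trees) matrix of in-bag counts.

    Entry [i, b] is the number of times training sample i was drawn
    (with replacement) into the bootstrap sample used for tree b.
    """
    rng = np.random.RandomState(seed)
    inbag = np.zeros((n_samples, n_trees), dtype=int)
    for b in range(n_trees):
        # Draw n_samples indices with replacement for tree b.
        drawn = rng.randint(0, n_samples, n_samples)
        # Count how often each training sample index was drawn.
        inbag[:, b] = np.bincount(drawn, minlength=n_samples)
    return inbag


inbag = bootstrap_inbag(n_samples=6, n_trees=4)
# Each column sums to n_samples; zeros mark out-of-bag samples for that tree.
print(inbag)
```

A zero in column `b` means the corresponding sample never entered tree `b`.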

examples/README.txt

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -5,5 +5,7 @@ General examples
55

66
Introductory examples.
77

8-
9-
Something something somthing
8+
These examples use data from standard machine learning libraries to show
9+
how `sklforestci` calculates error bars on RandomForest regression and
10+
classification objects. The regression forest example predicts car MPG while the
11+
classification forest example attempts to distinguish spam from non-spam emails
