evaluation_selection_hw_9

Capstone project for RSschool ml-course

This project uses the Forest CoverType dataset.

Usage

This package allows you to train a model for forest cover type prediction.

Clone this repository to your machine.
Download the Forest CoverType dataset and extract it to data/raw/ in the repository root.
Use Python 3.9 and Poetry 1.1.11.
Install the project dependencies:

poetry install --no-dev

Run train with the following command:

poetry run train -d <path to csv with data> -s <path to save trained model>
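For orientation, here is a minimal sketch of how a command with the `-d` and `-s` options shown above could be wired up. It uses `argparse` from the standard library purely for illustration; the long option names and the `parse_train_args` helper are hypothetical and not taken from this package's code, which may use a different CLI library.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


def parse_train_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the two options used in the train command (illustrative only)."""
    parser = argparse.ArgumentParser(prog="train")
    # The long option names below are assumptions made for this sketch.
    parser.add_argument(
        "-d", "--dataset-path", type=Path, required=True,
        help="path to the CSV file with the data",
    )
    parser.add_argument(
        "-s", "--save-model-path", type=Path, required=True,
        help="path where the trained model is saved",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Hypothetical paths, passed explicitly so the sketch runs on its own.
    args = parse_train_args(["-d", "data/raw/train.csv", "-s", "models/model.joblib"])
    print(args.dataset_path, args.save_model_path)
```

Making both options required keeps the sketch free of guessed default paths; the real command may well define defaults.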

You can pass many other options in the CLI, for example to select the model and choose hyperparameters. To get the full list, run:

poetry run train --help

Run MLflow to see the tracked experiments (models, parameters and metrics):

poetry run mlflow ui

Here are the results of running two models with different parameters and two feature engineering techniques.

Because my machine was slow, I had to swap logistic regression for KNN and use the simplest approaches to feature selection; even then, as the MLflow screenshot shows, the evaluations took a very long time. I also ran some experiments in Colab, e.g. LogisticRegression with L1-regularized feature elimination, but it did not give good results on this dataset, so further research and experiments are TBD.

You can also pass --find-best-params=True to automatically find the best model parameters (using randomized search).
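To illustrate what a randomized search over KNN parameters involves, here is a minimal sketch using scikit-learn's `RandomizedSearchCV`. It is not this package's implementation: the `find_best_knn` helper, the pipeline, the search ranges and the synthetic data are all assumptions chosen to keep the example small and self-contained.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def find_best_knn(X, y, n_iter=5, random_state=42):
    """Sample n_iter parameter combinations and keep the best by CV accuracy."""
    pipeline = Pipeline(
        [("scaler", StandardScaler()), ("knn", KNeighborsClassifier())]
    )
    # Hypothetical search space: neighbours in [1, 14] and two weighting schemes.
    param_distributions = {
        "knn__n_neighbors": randint(1, 15),
        "knn__weights": ["uniform", "distance"],
    }
    search = RandomizedSearchCV(
        pipeline,
        param_distributions,
        n_iter=n_iter,
        cv=3,
        scoring="accuracy",
        random_state=random_state,
    )
    search.fit(X, y)
    return search


if __name__ == "__main__":
    # Synthetic stand-in data, not the Forest CoverType dataset.
    X, y = make_classification(
        n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
    )
    search = find_best_knn(X, y)
    print(search.best_params_, round(search.best_score_, 3))
```

Randomized search tries only `n_iter` sampled combinations instead of the full grid, which helps when each evaluation is slow.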

Development

Install all requirements (including dev requirements) into the Poetry environment:

poetry install

Now you can use the developer tools. I added pandas-profiling to the dev dependencies because it takes too long to install. You can run generate-eda-report.py to get a profiling report of the dataset; it will be stored in the report/ folder.

Format your code with the black formatter:

poetry run black src tests

Check that your code is PEP 8 compliant with Flake8:

poetry run flake8 src/
