Skip to content

Commit

Permalink
Adding to machine learning
Browse files Browse the repository at this point in the history
  • Loading branch information
vprusso committed Feb 22, 2019
1 parent cadbcc5 commit 00e86ce
Show file tree
Hide file tree
Showing 2 changed files with 65 additions and 2 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -20,9 +20,31 @@

# Value of DESCR is a description of the dataset.
# Here are the first few values of the description
print(iris['DESCR'][:200] + "\n...")
# print(iris['DESCR'][:200] + "\n...")

# The value with key "target_names" consists of an
# array of strings with species that we intent to predict.
iris['target_names']
# print(iris['target_names'])

# We can also print out the feature names of each item.
# Things like petal length, width, etc.
# print(iris['feature_names'])

# The data for each flower is contained in the data
# field of the iris dataset.
# print(iris['data'])

# We can see that there are 150 different entries
# with 4 features per each entry where the features
# correspond to sepal length, sepal width, petal`
# length, and petal width, respectively.
# print(iris['data'].shape)

# The target field contains what species each entry
# corresponds to. There are three possible species:
# 0 -> Setosa
# 1 -> Versicolor
# 2 -> Viginica
# print(iris['target'])


41 changes: 41 additions & 0 deletions machine_learning/iris_classification/part_2.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# Iris Classification Model: Machine learning model that will
# allow us to classify species of iris flowers. This application
# will introduce many rudimentary features and concepts of machine
# learning and is a good use case for these types of models.

# Use case: Botanist wants to determine the species of an
# iris flower based on characteristics of that flower. For
# instance attributes including petal length, width, etc.
# are the "features" that determine the classification
# of a given iris flower.

# Import the iris dataset as provided by the sklearn
# Python module:
from sklearn.datasets import load_iris
iris = load_iris()

# Goal: Built machine learning model from the iris
# data set that can predict the species of a new
# set of measurements.

# In order to determine how well our model performs,
# we need to run it on data it has not seen before, `
# that is, we need to run it on a new set of measurements
# and see where our model categorizes this new item.

# To do this, we can split our data up into two sets;
# a training and testing set. The training set will be
# what our model uses to learn, and the test set will be
# the remaining set that assesses whether the model is
# able to accurately predict the outcome of the measurements
# from this set.

# We will be using a 75/25 split for train/test respectively.
# That is, we will be training our model on 75% of our data,
# and then testing on the remaining 25%. What split percentage
# you use is up to you, but a 75/25 split is a reasonable rule
# to use as a starting point.

# Split our dataset into training and testing sets.


0 comments on commit 00e86ce

Please sign in to comment.