-
-
Notifications
You must be signed in to change notification settings - Fork 608
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add module to make iris dataset available.
- Loading branch information
1 parent
bc12a4d
commit 930ebaf
Showing
3 changed files
with
92 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -39,4 +39,7 @@ include("tree.jl") | |
include("sentiment.jl") | ||
using .Sentiment | ||
|
||
include("iris.jl") | ||
export Iris | ||
|
||
end |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
|
||
""" | ||
Iris | ||
Fisher's classic iris dataset. | ||
Measurements from 3 different species of iris: setosa, versicolor and | ||
virginica. There are 50 examples of each species. | ||
There are 4 measurements for each example: sepal length, sepal width, petal | ||
length and petal width. The measurements are in centimeters. | ||
The module retrieves the data from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/iris). | ||
""" | ||
module Iris | ||
|
||
using DelimitedFiles | ||
using ..Data: deps, download_and_verify | ||
|
||
const cache_prefix = "" | ||
|
||
# Uncomment if the iris.data file is cached to cache.julialang.org. | ||
# const cache_prefix = "https://cache.julialang.org/" | ||
|
||
function load() | ||
isfile(deps("iris.data")) && return | ||
|
||
@info "Downloading iris dataset." | ||
download_and_verify("$(cache_prefix)https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data", | ||
deps("iris.data"), | ||
"6f608b71a7317216319b4d27b4d9bc84e6abd734eda7872b71a458569e2656c0") | ||
end | ||
|
||
""" | ||
labels() | ||
Get the labels of the iris dataset, a 150 element array of strings listing the | ||
species of each example. | ||
```jldoctest | ||
julia> labels = Flux.Data.Iris.labels(); | ||
julia> summary(labels) | ||
"150-element Array{String,1}" | ||
julia> labels[1] | ||
"Iris-setosa" | ||
``` | ||
""" | ||
function labels() | ||
load() | ||
iris = readdlm(deps("iris.data"), ',') | ||
Vector{String}(iris[1:end, end]) | ||
end | ||
|
||
""" | ||
features() | ||
Get the features of the iris dataset. This is a 4x150 matrix of Float64 | ||
elements. It has a row for each feature (sepal length, sepal width, | ||
petal length, petal width) and a column for each example. | ||
```jldoctest | ||
julia> features = Flux.Data.Iris.features(); | ||
julia> summary(features) | ||
"4×150 Array{Float64,2}" | ||
julia> features[:, 1] | ||
4-element Array{Float64,1}: | ||
5.1 | ||
3.5 | ||
1.4 | ||
0.2 | ||
``` | ||
""" | ||
function features() | ||
load() | ||
iris = readdlm(deps("iris.data"), ',') | ||
Matrix{Float64}(iris[1:end, 1:4]') | ||
end | ||
end | ||
|
||
|