Commit 4bcde4b: Initial commit (0 parents)

File tree: 20 files changed (+4280 -0 lines)

.gitignore

Lines changed: 95 additions & 0 deletions
# Temporary files
.DS_Store
.Trashes
.Spotlight-V100
*.swp
*.lock

# Xcode
build/
DerivedData/

*.pbxuser
*.mode1v3
*.mode2v3
*.perspectivev3
*.xccheckout
*.moved-aside
*.xcuserstate

xcuserdata

!default.pbxuser
!default.mode1v3
!default.mode2v3
!default.perspectivev3

profile
*.hmap
*.ipa

# CocoaPods
Pods/
!Podfile.lock

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/

.ipynb_checkpoints/

README.markdown

Lines changed: 40 additions & 0 deletions
# TensorFlow on iOS demo

This is the code that accompanies my blog post [Getting started with TensorFlow on iOS](http://machinethink.net/blog/tensorflow-on-ios/).

It uses TensorFlow to train a basic binary classifier on the [Gender Recognition by Voice and Speech Analysis](https://www.kaggle.com/primaryobjects/voicegender) dataset.

This project includes the following:

- The dataset in the file **voice.csv**.
- Python scripts to train the model with TensorFlow on your Mac.
- An iOS app that uses the TensorFlow C++ API to do inference.
- An iOS app that uses Metal to do inference using the trained model.

## Training the model

To train the model, do the following:

1. Make sure these are installed: `python3`, `numpy`, `pandas`, `scikit-learn`, `tensorflow`.
2. Run the **split_data.py** script to divide the dataset into a training set and a test set. This creates four new files: `X_train.npy`, `y_train.npy`, `X_test.npy`, and `y_test.npy` (the sketch below shows a quick way to verify them).
3. Run the **train.py** script. This trains the logistic classifier and saves the model to `/tmp/voice` every 10,000 training steps. Training runs in an infinite loop, so press Ctrl+C once you're happy with the training set accuracy and the loss no longer goes down.
4. Run the **test.py** script to compute the accuracy on the test set. This also prints a report with precision / recall / f1-score and a confusion matrix.
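If you want to double-check the files written by **split_data.py**, here is a minimal sketch (the 20 feature columns come from the dataset; exact row counts depend on the 70/30 split):

```python
import numpy as np

# Load the four files created by split_data.py and confirm the shapes line up:
# X_* should have 20 feature columns, y_* a single label column.
X_train = np.load("X_train.npy")
y_train = np.load("y_train.npy")
X_test = np.load("X_test.npy")
y_test = np.load("y_test.npy")
print(X_train.shape, y_train.shape)  # (n_train, 20) (n_train, 1)
print(X_test.shape, y_test.shape)    # (n_test, 20)  (n_test, 1)
```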
## Using the model with the iOS TensorFlow app

To run the model on the iOS TensorFlow app, do the following:

1. Clone [TensorFlow](https://github.com/tensorflow) and [build the iOS library](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/makefile).
2. Open the **VoiceTensorFlow** Xcode project. In **Build Settings**, under **Other Linker Flags** and **Header Search Paths**, change the paths to your local installation of TensorFlow.

The model is already included in the app as **inference.pb**. If you train the model with different settings, you need to run the `freeze_graph` and `optimize_for_inference` tools to create a new inference.pb.
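For reference, a sketch of what the `freeze_graph` step could look like when called from Python instead of the command-line wrapper. The graph and checkpoint paths come from the training script, the output node names are taken from the scopes in **train.py**, and the `save/...` names are the tool's usual defaults; treat all of these as assumptions to adapt:

```python
from tensorflow.python.tools import freeze_graph

# Combine the graph definition and the checkpointed variables into one file.
freeze_graph.freeze_graph(
    input_graph="/tmp/voice/graph.pb",
    input_saver="",
    input_binary=True,                    # graph.pb was written as a binary protobuf
    input_checkpoint="/tmp/voice/model",
    output_node_names="model/y_pred,inference/inference",
    restore_op_name="save/restore_all",   # assumed default
    filename_tensor_name="save/Const:0",  # assumed default
    output_graph="/tmp/voice/frozen.pb",
    clear_devices=True,
    initializer_nodes="")
```

`optimize_for_inference` can then be run on `frozen.pb` to produce the final inference.pb.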
## Using the model with the iOS Metal app

To run the model on the iOS Metal app, do the following:

1. Run the **export_weights.py** script. This creates two new files that contain the model's learned parameters: `W.bin` for the weights and `b.bin` for the bias.
2. Copy `W.bin` and `b.bin` into the **VoiceMetal** Xcode project and build the app.

You need to run the Metal app on a device; it won't work in the simulator.

Scripts/clean.sh

Lines changed: 4 additions & 0 deletions
#!/bin/sh
# Run this script to start afresh.
# -f keeps rm quiet if the files don't exist yet.
rm -f *.npy
rm -f *.bin

Scripts/export_weights.py

Lines changed: 38 additions & 0 deletions
# This script exports the learned parameters so that we can use them from Metal.

# Note: For this simple demo project the weight matrix is only 20 values and the bias
# is a single number. With such a simple model you might as well stick the parameters
# inside a static array in the iOS app source code. In practice, however, most models
# will have millions of parameters.

import os
import numpy as np
import tensorflow as tf

checkpoint_dir = "/tmp/voice/"

with tf.Session() as sess:
    # Load the graph.
    graph_file = os.path.join(checkpoint_dir, "graph.pb")
    with tf.gfile.FastGFile(graph_file, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name="")

    # Get the model's variables.
    W = sess.graph.get_tensor_by_name("model/W:0")
    b = sess.graph.get_tensor_by_name("model/b:0")

    # Load the saved variables from the checkpoint back into the session.
    checkpoint_file = os.path.join(checkpoint_dir, "model")
    saver = tf.train.Saver([W, b])
    saver.restore(sess, checkpoint_file)

    # Just for debugging, print out the learned parameters.
    print("W:", W.eval())
    print("b:", b.eval())

    # Export the contents of W and b as binary files.
    W.eval().tofile("W.bin")
    b.eval().tofile("b.bin")
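A quick way to sanity-check the exported files is to read them back with NumPy and redo the logistic computation by hand. This is only a sketch: it assumes TensorFlow's default float32 dtype and the [20, 1] weight shape defined in train.py.

```python
import numpy as np

# tofile() writes raw values with no header, so supply dtype and shape ourselves.
W = np.fromfile("W.bin", dtype=np.float32).reshape(20, 1)
b = np.fromfile("b.bin", dtype=np.float32)

# Recompute the model's output for one test example: sigmoid(x.W + b).
x = np.load("X_test.npy")[:1].astype(np.float32)
prob = 1.0 / (1.0 + np.exp(-(x @ W + b)))
print("P(male):", prob.ravel(), "prediction:", (prob > 0.5).astype(np.float32))
```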

Scripts/split_data.py

Lines changed: 40 additions & 0 deletions
# This script loads the original dataset and splits it into a training set and test set.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Read the CSV file.
df = pd.read_csv("voice.csv", header=0)

# Extract the labels into a numpy array. The original labels are text but we convert
# them to numbers: 1 = male, 0 = female.
labels = (df["label"] == "male").values * 1

# labels is a row vector but TensorFlow expects a column vector, so reshape it.
labels = labels.reshape(-1, 1)

# Remove the column with the labels.
del df["label"]

# OPTIONAL: Do additional preprocessing, such as scaling the features.
# for column in df.columns:
#     mean = df[column].mean()
#     std = df[column].std()
#     df[column] = (df[column] - mean) / std

# Convert the training data to a numpy array.
data = df.values
print("Full dataset size:", data.shape)

# Split into a random training set and a test set.
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.3, random_state=123456)

print("Training set size:", X_train.shape)
print("Test set size:", X_test.shape)

# Save the matrices using numpy's native format.
np.save("X_train.npy", X_train)
np.save("X_test.npy", X_test)
np.save("y_train.npy", y_train)
np.save("y_test.npy", y_test)

Scripts/test.py

Lines changed: 50 additions & 0 deletions
# This script tests how well the trained model performs on the portion of the
# data that was not used for training.

import os
import numpy as np
import tensorflow as tf
from sklearn import metrics

checkpoint_dir = "/tmp/voice/"

# Load the test data.
X_test = np.load("X_test.npy")
y_test = np.load("y_test.npy")

print("Test set size:", X_test.shape)

with tf.Session() as sess:
    # Load the graph.
    graph_file = os.path.join(checkpoint_dir, "graph.pb")
    with tf.gfile.FastGFile(graph_file, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name="")

    # Uncomment the next line in case you're curious what the graph looks like.
    #print(graph_def.ListFields())

    # Get the model's variables.
    W = sess.graph.get_tensor_by_name("model/W:0")
    b = sess.graph.get_tensor_by_name("model/b:0")

    # Load the saved variables from the checkpoint back into the session.
    checkpoint_file = os.path.join(checkpoint_dir, "model")
    saver = tf.train.Saver([W, b])
    saver.restore(sess, checkpoint_file)

    # Get the placeholders and the accuracy operation, so that we can compute
    # the accuracy (% correct) of the test set.
    x = sess.graph.get_tensor_by_name("inputs/x-input:0")
    y = sess.graph.get_tensor_by_name("inputs/y-input:0")
    accuracy = sess.graph.get_tensor_by_name("score/accuracy:0")
    print("Test set accuracy:", sess.run(accuracy, feed_dict={x: X_test, y: y_test}))

    # Also show some other reports.
    inference = sess.graph.get_tensor_by_name("inference/inference:0")
    predictions = sess.run(inference, feed_dict={x: X_test})
    print("\nClassification report:")
    print(metrics.classification_report(y_test.ravel(), predictions))
    print("Confusion matrix:")
    print(metrics.confusion_matrix(y_test.ravel(), predictions))
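A note on reading that last printout: scikit-learn's `confusion_matrix` puts the true labels on the rows and the predictions on the columns, so with the 0 = female, 1 = male encoding used here the top-left cell counts correctly classified female samples. A tiny self-contained illustration:

```python
from sklearn import metrics

# Rows are true labels, columns are predictions; with 0 = female, 1 = male:
# [[female classified as female, female misclassified as male],
#  [male misclassified as female, male classified as male]]
print(metrics.confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1]))
# [[1 1]
#  [0 2]]
```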

Scripts/train.py

Lines changed: 125 additions & 0 deletions
# This script is used to train the model. It repeats indefinitely and saves the
# model every so often to a checkpoint.
#
# Press Ctrl+C when you feel that training has gone on long enough (since this is
# only a simple model it takes less than a minute to train, but training a deep
# learning model could take days).

import os
import numpy as np
import tensorflow as tf

checkpoint_dir = "/tmp/voice/"
print_every = 1000
save_every = 10000
num_inputs = 20
num_classes = 1

# Load the training data.
X_train = np.load("X_train.npy")
y_train = np.load("y_train.npy")

print("Training set size:", X_train.shape)

# Below we'll define the computational graph using TensorFlow. The different parts
# of the model are grouped into different "scopes", making it easier to understand
# what each part is doing.

# Hyperparameters let you configure the model and how it is trained. They're
# called "hyper" parameters because unlike the regular parameters they are not
# learned by the model -- you have to set them to appropriate values yourself.
#
# The learning_rate tells the optimizer how big a step it should take.
# Regularization is used to prevent overfitting on the training set.
with tf.name_scope("hyperparameters"):
    regularization = tf.placeholder(tf.float32, name="regularization")
    learning_rate = tf.placeholder(tf.float32, name="learning-rate")

# This is where we feed the training data (and later the test data) into the model.
# In this dataset there are 20 features, so x is a matrix with 20 columns. Its number
# of rows is None because it depends on how many examples at a time we put into this
# matrix. This is a binary classifier, so for every training example, y gives a single
# output: 1 = male, 0 = female.
with tf.name_scope("inputs"):
    x = tf.placeholder(tf.float32, [None, num_inputs], name="x-input")
    y = tf.placeholder(tf.float32, [None, num_classes], name="y-input")

# The parameters that we'll learn consist of W, a weight matrix, and b, a vector
# of bias values. (Actually, b is just a single value since the classifier has only
# one output. For a classifier that can recognize multiple classes, b would have as
# many elements as there are classes.)
with tf.name_scope("model"):
    W = tf.Variable(tf.zeros([num_inputs, num_classes]), name="W")
    b = tf.Variable(tf.zeros([num_classes]), name="b")

    # The output is the probability the speaker is male. If this is greater than
    # 0.5, we consider the speaker to be male, otherwise female.
    y_pred = tf.sigmoid(tf.matmul(x, W) + b, name="y_pred")

# This is a logistic classifier, so the loss function is the logistic loss.
with tf.name_scope("loss-function"):
    loss = tf.losses.log_loss(labels=y, predictions=y_pred)

    # Add L2 regularization to the loss.
    loss += regularization * tf.nn.l2_loss(W)

# Use the Adam optimizer to minimize the loss.
with tf.name_scope("train"):
    optimizer = tf.train.AdamOptimizer(learning_rate)
    train_op = optimizer.minimize(loss)

# For doing inference on new data for which we don't have labels.
with tf.name_scope("inference"):
    inference = tf.to_float(y_pred > 0.5, name="inference")

# The accuracy operation computes the % correct on a dataset with known labels.
with tf.name_scope("score"):
    correct_prediction = tf.equal(inference, y)
    accuracy = tf.reduce_mean(tf.to_float(correct_prediction), name="accuracy")

init = tf.global_variables_initializer()

# For writing training checkpoints and reading them back in.
saver = tf.train.Saver()
tf.gfile.MakeDirs(checkpoint_dir)

with tf.Session() as sess:
    # Write the graph definition to a file. We'll load this in the test.py script.
    tf.train.write_graph(sess.graph_def, checkpoint_dir, "graph.pb", False)

    # Reset W and b to zero.
    sess.run(init)

    # Sanity check: the initial loss should be 0.693147, which is -ln(0.5).
    loss_value = sess.run(loss, feed_dict={x: X_train, y: y_train, regularization: 0})
    print("Initial loss:", loss_value)

    # Loop forever:
    step = 0
    while True:
        # We randomly shuffle the examples every time we train.
        perm = np.arange(len(X_train))
        np.random.shuffle(perm)
        X_train = X_train[perm]
        y_train = y_train[perm]

        # Run the optimizer over the entire training set at once. For larger datasets
        # you would train in batches of 100-1000 examples instead of the entire thing.
        feed = {x: X_train, y: y_train, learning_rate: 1e-2, regularization: 1e-5}
        sess.run(train_op, feed_dict=feed)

        # Print the loss once every so many steps. Because of the regularization,
        # at some point the loss won't become smaller anymore. At that point, it's
        # safe to press Ctrl+C to stop the training.
        if step % print_every == 0:
            train_accuracy, loss_value = sess.run([accuracy, loss], feed_dict=feed)
            print("step: %4d, loss: %.4f, training accuracy: %.4f" %
                  (step, loss_value, train_accuracy))

        step += 1

        # Save the model. You should only press Ctrl+C after you see this message.
        if step % save_every == 0:
            checkpoint_file = os.path.join(checkpoint_dir, "model")
            saver.save(sess, checkpoint_file)
            print("*** SAVED MODEL ***")
