
Commit af6dc60

committed
softmax regression model, minor changes to other models
1 parent 34de448 commit af6dc60

4 files changed: +542 −29 lines changed

logistic_regression.ipynb

+47 −6
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Logistic Regression in plain Python"
+"## Logistic Regression in plain Python"
 ]
 },
 {
@@ -24,10 +24,11 @@
 "- it has a real-valued bias $b$\n",
 "- it uses a sigmoid function as its activation function\n",
 "\n",
-"Different to linear regression, logistic regression has no closed form solution. But the cost function is convex, so we can train the model using gradient descent. In fact, **gradient descent** (or any other optimization algorithm) is guaranteed to find the global minimum (if the learning rate is small enough and enough training iterations are used). \n",
-"* * *\n",
-"Training a logistic regression model has different steps.\n",
+"Unlike linear regression, logistic regression has no closed-form solution. But the cost function is convex, so we can train the model using gradient descent. In fact, **gradient descent** (or any other optimization algorithm) is guaranteed to find the global minimum (if the learning rate is small enough and enough training iterations are used).\n",
+"\n",
+"Training a logistic regression model involves several steps. In the beginning (step 0) the parameters are initialized. The other steps are repeated for a specified number of training iterations or until the parameters converge.\n",
 "\n",
+"* * * \n",
 "**Step 0: ** Initialize the weight vector and bias with zeros (or small random values).\n",
 "* * *\n",
 "\n",
@@ -53,7 +54,7 @@
 "\n",
 "$ \\frac{\\partial J}{\\partial w_j} = \\frac{1}{m}\\sum_{i=1}^m\\left[\\hat{y}^{(i)}-y^{(i)}\\right]\\,x_j^{(i)}$\n",
 "\n",
-"For the bias, the input $x_j^{(i)}$ will be given 1.\n",
+"For the bias, the input $x_j^{(i)}$ is set to 1.\n",
 "* * *\n",
 "\n",
 "** Step 5: ** Update the weights and bias\n",
@@ -83,6 +84,13 @@
 "% matplotlib inline"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Dataset"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 3,
@@ -139,6 +147,13 @@
 "print('Shape y_test: ', y_test.shape)"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Logistic regression model"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 50,
@@ -165,7 +180,7 @@
 " \"\"\"\n",
 " Computes the sigmoid function\n",
 " \"\"\"\n",
-" return 1/ (1 + np.exp(-a))\n",
+" return 1 / (1 + np.exp(-a))\n",
 "\n",
 "def train(X, y_true, w, b, n_iter, learning_rate):\n",
 " \"\"\"\n",
@@ -204,6 +219,13 @@
 " return np.array(y_predict_labels)[:, np.newaxis]"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Training"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 52,
@@ -242,6 +264,13 @@
 "plt.show()"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Testing the model"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 53,
@@ -283,6 +312,18 @@
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
 "version": "3.6.4"
+},
+"toc": {
+"nav_menu": {},
+"number_sections": true,
+"sideBar": true,
+"skip_h1_title": false,
+"title_cell": "Table of Contents",
+"title_sidebar": "Contents",
+"toc_cell": false,
+"toc_position": {},
+"toc_section_display": true,
+"toc_window_display": false
 }
 },
 "nbformat": 4,

perceptron.ipynb

+55 −5
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Perceptron algorithm in plain Python\n",
+"## Perceptron algorithm in plain Python\n",
 "\n",
 "The perceptron is a simple supervised machine learning algorithm and one of the earliest **neural network** architectures. It was introduced by Rosenblatt in the late 1950s. A perceptron represents a **binary linear classifier** that maps a set of training examples (of $d$ dimensional input vectors) onto binary output values using a $d-1$ dimensional hyperplane.\n",
 "\n",
@@ -73,6 +73,13 @@
 "% matplotlib inline"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Dataset"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 49,
@@ -118,10 +125,17 @@
 "y_true = y[:, np.newaxis]\n",
 "\n",
 "X_train, X_test, y_train, y_test = train_test_split(X, y_true)\n",
-"print('Shape X_train: ', X_train.shape)\n",
-"print('Shape y_train: ', y_train.shape)\n",
-"print('Shape X_test: ', X_test.shape)\n",
-"print('Shape y_test: ', y_test.shape)"
+"print(f'Shape X_train: {X_train.shape}')\n",
+"print(f'Shape y_train: {y_train.shape}')\n",
+"print(f'Shape X_test: {X_test.shape}')\n",
+"print(f'Shape y_test: {y_test.shape}')"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Perceptron model class"
 ]
 },
 {
@@ -169,6 +183,18 @@
 " return self.step_function(a)"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {
+"ExecuteTime": {
+"end_time": "2018-03-09T14:34:26.829303Z",
+"start_time": "2018-03-09T14:34:26.824652Z"
+}
+},
+"source": [
+"## Initialization, training and testing"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 52,
@@ -194,6 +220,18 @@
 "print(\"test accuracy: {} %\".format(100 - np.mean(np.abs(y_p_test - y_test)) * 100))"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {
+"ExecuteTime": {
+"end_time": "2018-03-09T14:35:14.630757Z",
+"start_time": "2018-03-09T14:35:14.626460Z"
+}
+},
+"source": [
+"## Visualize decision boundary"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 53,
@@ -266,6 +304,18 @@
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
 "version": "3.6.4"
+},
+"toc": {
+"nav_menu": {},
+"number_sections": true,
+"sideBar": true,
+"skip_h1_title": false,
+"title_cell": "Table of Contents",
+"title_sidebar": "Contents",
+"toc_cell": false,
+"toc_position": {},
+"toc_section_display": true,
+"toc_window_display": false
 }
 },
 "nbformat": 4,

simple_neural_net.py.ipynb

+26 −18
@@ -191,10 +191,10 @@
 "# Split the data into a training and test set\n",
 "X_train, X_test, y_train, y_test = train_test_split(X, y)\n",
 "\n",
-"print('Shape X_train: ', X_train.shape)\n",
-"print('Shape y_train: ', y_train.shape)\n",
-"print('Shape X_test: ', X_test.shape)\n",
-"print('Shape y_test: ', y_test.shape)"
+"print(f'Shape X_train: {X_train.shape}')\n",
+"print(f'Shape y_train: {y_train.shape}')\n",
+"print(f'Shape X_test: {X_test.shape}')\n",
+"print(f'Shape y_test: {y_test.shape}')"
 ]
 },
 {
@@ -219,21 +219,18 @@
 " self.n_outputs = n_outputs\n",
 " self.hidden = n_hidden\n",
 "\n",
-" # Initialie weight matrices and bias vectors\n",
+" # Initialize weight matrices and bias vectors\n",
 " self.W_h = np.random.randn(self.n_inputs, self.hidden)\n",
 " self.b_h = np.zeros((1, self.hidden))\n",
 " self.W_o = np.random.randn(self.hidden, self.n_outputs)\n",
 " self.b_o = np.zeros((1, self.n_outputs))\n",
 "\n",
 " def sigmoid(self, a):\n",
-" \"\"\"\n",
-" Applies the sigmoid function to the given input and returns the result\n",
-" \"\"\"\n",
 " return 1 / (1 + np.exp(-a))\n",
 "\n",
 " def forward_pass(self, X):\n",
 " \"\"\"\n",
-" Propagates the given input forward through the net.\n",
+" Propagates the given input X forward through the net.\n",
 "\n",
 " Returns:\n",
 " A_h: matrix with activations of all hidden neurons for all input examples\n",
@@ -307,12 +304,10 @@
 " self.b_o = self.b_o - eta * gradients[\"db_o\"]\n",
 " self.b_h = self.b_h - eta * gradients[\"db_h\"]\n",
 "\n",
-" def train(self, X, y, n_iters, eta):\n",
+" def train(self, X, y, n_iters=500, eta=0.3):\n",
 " \"\"\"\n",
 " Trains the neural net on the given input data\n",
 " \"\"\"\n",
-" assert eta is not None, 'learning rate must be provided'\n",
-" assert n_iters is not None, 'number of iterations must be provided'\n",
 " assert X is not None, 'dataset must be provided'\n",
 " assert y is not None, 'target values must be provided'\n",
 "\n",
@@ -324,7 +319,7 @@
 " gradients = self.backward_pass(X, y, n_samples, outputs)\n",
 "\n",
 " if i % 100 == 0:\n",
-" print('Cost at iteration {}: {}'.format(i, np.round(cost, 4)))\n",
+" print(f'Cost at iteration {i}: {np.round(cost, 4)}')\n",
 "\n",
 " self.update_weights(gradients, eta)\n",
 "\n",
@@ -388,10 +383,10 @@
 "source": [
 "nn = NeuralNet(n_inputs=2, n_hidden=6, n_outputs=1)\n",
 "print(\"Shape of weight matrices and bias vectors:\")\n",
-"print('W_h shape: ', nn.W_h.shape)\n",
-"print('b_h shape: ', nn.b_h.shape)\n",
-"print('W_o shape: ', nn.W_o.shape)\n",
-"print('b_o shape: ', nn.b_o.shape)\n",
+"print(f'W_h shape: {nn.W_h.shape}')\n",
+"print(f'b_h shape: {nn.b_h.shape}')\n",
+"print(f'W_o shape: {nn.W_o.shape}')\n",
+"print(f'b_o shape: {nn.b_o.shape}')\n",
 "print()\n",
 "\n",
 "print(\"Training:\")\n",
@@ -421,7 +416,7 @@
 "source": [
 "n_test_samples, _ = X_test.shape\n",
 "y_predict = nn.predict(X_test)\n",
-"print(\"Classification accuracy on test set: {}\".format(np.sum(y_predict == y_test)/n_test_samples))"
+"print(f\"Classification accuracy on test set: {np.sum(y_predict == y_test)/n_test_samples}\")"
 ]
 },
 {
@@ -492,6 +487,7 @@
 }
 ],
 "metadata": {
+"anaconda-cloud": {},
 "kernelspec": {
 "display_name": "Python [conda root]",
 "language": "python",
@@ -508,6 +504,18 @@
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
 "version": "3.6.4"
+},
+"toc": {
+"nav_menu": {},
+"number_sections": true,
+"sideBar": true,
+"skip_h1_title": false,
+"title_cell": "Table of Contents",
+"title_sidebar": "Contents",
+"toc_cell": false,
+"toc_position": {},
+"toc_section_display": true,
+"toc_window_display": false
 }
 },
 "nbformat": 4,
