Skip to content

Assignment 6. Test and Diagnostics #9

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion diagnosticTests/learningCurve.m
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,11 @@




for i = 1:m
[theta] = trainLinearReg(X(1:i, :), y(1:i), lambda);
error_train(i) = linearRegCostFunction(X(1:i, :), y(1:i), theta, 0);
error_val(i) = linearRegCostFunction(Xval, yval, theta, 0);
end

% -------------------------------------------------------------

Expand Down
11 changes: 11 additions & 0 deletions diagnosticTests/linearRegCostFunction.m
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,18 @@
%


J = sum((X*theta - y).^2)/(2*m);

costRegularizationTerm = lambda/(2*m) * sum( theta(2:end).^2 );

J = J + costRegularizationTerm;


grad = X' * (X*theta - y);


gradientRegularizationTerm = lambda/m * [0; theta(2:end)];
grad = (1/m) * grad + gradientRegularizationTerm;



Expand Down
6 changes: 6 additions & 0 deletions diagnosticTests/polyFeatures.m
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,13 @@
%


m = size(X, 1);

for i = 1:m
for j = 1:p
X_poly(i, j) = X(i, 1)^j;
end
end



Expand Down
6 changes: 6 additions & 0 deletions diagnosticTests/validationCurve.m
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,13 @@
%
%

for i = 1:length(lambda_vec)
lambda = lambda_vec(i);
[theta] = trainLinearReg(X, y, lambda);
error_train(i) = linearRegCostFunction(X, y, theta, 0);
error_val(i) = linearRegCostFunction(Xval, yval, theta, 0);

end



Expand Down
Binary file added supportVectorMachines/ex6.pdf
Binary file not shown.
49 changes: 49 additions & 0 deletions supportVectorMachines/ex6/dataset3Params.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
function [C, sigma] = dataset3Params(X, y, Xval, yval)
%DATASET3PARAMS returns your choice of C and sigma for Part 3 of the exercise
%where you select the optimal (C, sigma) learning parameters to use for SVM
%with RBF kernel
% [C, sigma] = DATASET3PARAMS(X, y, Xval, yval) returns your choice of C and
% sigma. You should complete this function to return the optimal C and
% sigma based on a cross-validation set.
%

% You need to return the following variables correctly.
C = 1;
sigma = 0.3;

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return the optimal C and sigma
% learning parameters found using the cross validation set.
% You can use svmPredict to predict the labels on the cross
% validation set. For example,
% predictions = svmPredict(model, Xval);
% will return the predictions on the cross validation set.
%
% Note: You can compute the prediction error using
% mean(double(predictions ~= yval))
%


meanVal = intmax
for i = [0.01 0.03 0.1 0.3 1 3 10 30]
for j = [0.01 0.03 0.1 0.3 1 3 10 30]
model = svmTrain(X, y, i, @(x1, x2) gaussianKernel(x1, x2, j));
predictions = svmPredict(model, Xval);
tempMean = mean(double(predictions ~= yval))
if(tempMean < meanVal)
meanVal = tempMean;
C = i;
sigma = j;
end
end
end







% =========================================================================

end
63 changes: 63 additions & 0 deletions supportVectorMachines/ex6/emailFeatures.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
function x = emailFeatures(word_indices)
%EMAILFEATURES takes in a word_indices vector and produces a feature vector
%from the word indices
% x = EMAILFEATURES(word_indices) takes in a word_indices vector and
% produces a feature vector from the word indices.

% Total number of words in the dictionary
n = 1899;

% You need to return the following variables correctly.
x = zeros(n, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return a feature vector for the
% given email (word_indices). To help make it easier to
% process the emails, we have have already pre-processed each
% email and converted each word in the email into an index in
% a fixed dictionary (of 1899 words). The variable
% word_indices contains the list of indices of the words
% which occur in one email.
%
% Concretely, if an email has the text:
%
% The quick brown fox jumped over the lazy dog.
%
% Then, the word_indices vector for this text might look
% like:
%
% 60 100 33 44 10 53 60 58 5
%
% where, we have mapped each word onto a number, for example:
%
% the -- 60
% quick -- 100
% ...
%
% (note: the above numbers are just an example and are not the
% actual mappings).
%
% Your task is take one such word_indices vector and construct
% a binary feature vector that indicates whether a particular
% word occurs in the email. That is, x(i) = 1 when word i
% is present in the email. Concretely, if the word 'the' (say,
% index 60) appears in the email, then x(60) = 1. The feature
% vector should look like:
%
% x = [ 0 0 0 0 1 0 0 0 ... 0 0 0 0 1 ... 0 0 0 1 0 ..];
%
%
m = size(word_indices, 1)
for i = 1:m
x(word_indices(i)) = 1;
end






% =========================================================================


end
10 changes: 10 additions & 0 deletions supportVectorMachines/ex6/emailSample1.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
> Anyone knows how much it costs to host a web portal ?
>
Well, it depends on how many visitors you're expecting.
This can be anywhere from less than 10 bucks a month to a couple of $100.
You should checkout http://www.rackspace.com/ or perhaps Amazon EC2
if youre running something big..

To unsubscribe yourself from this mailing list, send an email to:
groupname-unsubscribe@egroups.com

34 changes: 34 additions & 0 deletions supportVectorMachines/ex6/emailSample2.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
Folks,

my first time posting - have a bit of Unix experience, but am new to Linux.


Just got a new PC at home - Dell box with Windows XP. Added a second hard disk
for Linux. Partitioned the disk and have installed Suse 7.2 from CD, which went
fine except it didn't pick up my monitor.

I have a Dell branded E151FPp 15" LCD flat panel monitor and a nVidia GeForce4
Ti4200 video card, both of which are probably too new to feature in Suse's default
set. I downloaded a driver from the nVidia website and installed it using RPM.
Then I ran Sax2 (as was recommended in some postings I found on the net), but
it still doesn't feature my video card in the available list. What next?

Another problem. I have a Dell branded keyboard and if I hit Caps-Lock twice,
the whole machine crashes (in Linux, not Windows) - even the on/off switch is
inactive, leaving me to reach for the power cable instead.

If anyone can help me in any way with these probs., I'd be really grateful -
I've searched the 'net but have run out of ideas.

Or should I be going for a different version of Linux such as RedHat? Opinions
welcome.

Thanks a lot,
Peter

--
Irish Linux Users' Group: ilug@linux.ie
http://www.linux.ie/mailman/listinfo/ilug for (un)subscription information.
List maintainer: listmaster@linux.ie


150 changes: 150 additions & 0 deletions supportVectorMachines/ex6/ex6.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,150 @@
%% Machine Learning Online Class
% Exercise 6 | Support Vector Machines
%
% Instructions
% ------------
%
% This file contains code that helps you get started on the
% exercise. You will need to complete the following functions:
%
% gaussianKernel.m
% dataset3Params.m
% processEmail.m
% emailFeatures.m
%
% For this exercise, you will not need to change any code in this file,
% or any other files other than those mentioned above.
%

%% Initialization
clear ; close all; clc

%% =============== Part 1: Loading and Visualizing Data ================
% We start the exercise by first loading and visualizing the dataset.
% The following code will load the dataset into your environment and plot
% the data.
%

fprintf('Loading and Visualizing Data ...\n')

% Load from ex6data1:
% You will have X, y in your environment
load('ex6data1.mat');

% Plot training data
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ==================== Part 2: Training Linear SVM ====================
% The following code will train a linear SVM on the dataset and plot the
% decision boundary learned.
%

% Load from ex6data1:
% You will have X, y in your environment
load('ex6data1.mat');

fprintf('\nTraining Linear SVM ...\n')

% You should try to change the C value below and see how the decision
% boundary varies (e.g., try C = 1000)
C = 1;
model = svmTrain(X, y, C, @linearKernel, 1e-3, 20);
visualizeBoundaryLinear(X, y, model);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =============== Part 3: Implementing Gaussian Kernel ===============
% You will now implement the Gaussian kernel to use
% with the SVM. You should complete the code in gaussianKernel.m
%
fprintf('\nEvaluating the Gaussian Kernel ...\n')

x1 = [1 2 1]; x2 = [0 4 -1]; sigma = 2;
sim = gaussianKernel(x1, x2, sigma);

fprintf(['Gaussian Kernel between x1 = [1; 2; 1], x2 = [0; 4; -1], sigma = %f :' ...
'\n\t%f\n(for sigma = 2, this value should be about 0.324652)\n'], sigma, sim);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =============== Part 4: Visualizing Dataset 2 ================
% The following code will load the next dataset into your environment and
% plot the data.
%

fprintf('Loading and Visualizing Data ...\n')

% Load from ex6data2:
% You will have X, y in your environment
load('ex6data2.mat');

% Plot training data
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ========== Part 5: Training SVM with RBF Kernel (Dataset 2) ==========
% After you have implemented the kernel, we can now use it to train the
% SVM classifier.
%
fprintf('\nTraining SVM with RBF Kernel (this may take 1 to 2 minutes) ...\n');

% Load from ex6data2:
% You will have X, y in your environment
load('ex6data2.mat');

% SVM Parameters
C = 1; sigma = 0.1;

% We set the tolerance and max_passes lower here so that the code will run
% faster. However, in practice, you will want to run the training to
% convergence.
model= svmTrain(X, y, C, @(x1, x2) gaussianKernel(x1, x2, sigma));
visualizeBoundary(X, y, model);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =============== Part 6: Visualizing Dataset 3 ================
% The following code will load the next dataset into your environment and
% plot the data.
%

fprintf('Loading and Visualizing Data ...\n')

% Load from ex6data3:
% You will have X, y in your environment
load('ex6data3.mat');

% Plot training data
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ========== Part 7: Training SVM with RBF Kernel (Dataset 3) ==========

% This is a different dataset that you can use to experiment with. Try
% different values of C and sigma here.
%

% Load from ex6data3:
% You will have X, y in your environment
load('ex6data3.mat');

% Try different SVM Parameters here
[C, sigma] = dataset3Params(X, y, Xval, yval);

% Train the SVM
model= svmTrain(X, y, C, @(x1, x2) gaussianKernel(x1, x2, sigma));
visualizeBoundary(X, y, model);

fprintf('Program paused. Press enter to continue.\n');
pause;

Loading