
Commit 94264ed

ex8 "Anomaly Detection and Recommender Systems" is done and submitted.

1 parent 255db0a

File tree

3 files changed: +45 -38 lines changed

ex8/mlclass-ex8/cofiCostFunc.m

Lines changed: 27 additions & 18 deletions
@@ -11,7 +11,7 @@
 Theta = reshape(params(num_movies*num_features+1:end), ...
                 num_users, num_features);

-
+
 % You need to return the following values correctly
 J = 0;
 X_grad = zeros(size(X));
@@ -21,39 +21,48 @@
 % Instructions: Compute the cost function and gradient for collaborative
 %               filtering. Concretely, you should first implement the cost
 %               function (without regularization) and make sure it is
-%               matches our costs. After that, you should implement the
+%               matches our costs. After that, you should implement the
 %               gradient and use the checkCostFunction routine to check
 %               that the gradient is correct. Finally, you should implement
 %               regularization.
 %
 % Notes: X - num_movies x num_features matrix of movie features
 %        Theta - num_users x num_features matrix of user features
 %        Y - num_movies x num_users matrix of user ratings of movies
-%        R - num_movies x num_users matrix, where R(i, j) = 1 if the
+%        R - num_movies x num_users matrix, where R(i, j) = 1 if the
 %            i-th movie was rated by the j-th user
 %
 % You should set the following variables correctly:
 %
-%        X_grad - num_movies x num_features matrix, containing the
+%        X_grad - num_movies x num_features matrix, containing the
 %                 partial derivatives w.r.t. to each element of X
-%        Theta_grad - num_users x num_features matrix, containing the
+%        Theta_grad - num_users x num_features matrix, containing the
 %                     partial derivatives w.r.t. to each element of Theta
 %

+% calculate the cost function.
+diff = (X*Theta'-Y);
+J = sum((diff.^2)(R==1))/2;
+J = J + lambda*sum(sum(Theta.^2))/2; % regularization term for Theta.
+J = J + lambda*sum(sum(X.^2))/2;     % regularization term for X.
+
+% calculate the gradient of X.
+for i=1:num_movies
+    idx = find(R(i, :)==1);    % users that have rated movie i.
+    Theta_tmp = Theta(idx, :); % features of the users that rated movie i.
+    Y_tmp = Y(i, idx);         % those users' ratings of movie i.
+    X_grad(i, :) = (X(i, :)*Theta_tmp' - Y_tmp)*Theta_tmp;
+    X_grad(i, :) = X_grad(i, :) + lambda*X(i, :); % regularization term for X.
+end

-
-
-
-
-
-
-
-
-
-
-
-
-
+% calculate the gradient of Theta.
+for j=1:num_users
+    idx = find(R(:, j)==1)'; % movies that have been rated by user j.
+    X_tmp = X(idx, :);       % features of the movies rated by user j.
+    Y_tmp = Y(idx, j);       % ratings given by user j.
+    Theta_grad(j, :) = (X_tmp*Theta(j, :)'-Y_tmp)'*X_tmp;
+    Theta_grad(j, :) = Theta_grad(j, :) + lambda*Theta(j, :); % regularization term for Theta.
+end

 % =============================================================
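The per-movie and per-user loops above can also be written fully vectorized: multiplying the error matrix elementwise by R zeroes out unrated entries, which is equivalent to the `(R==1)` indexing in the committed Octave code. A minimal NumPy sketch (not part of the commit; the function name and shapes are illustrative):

```python
import numpy as np

def cofi_cost(X, Theta, Y, R, lam):
    """Regularized collaborative-filtering cost and gradients.

    X     : (num_movies, num_features) movie features
    Theta : (num_users,  num_features) user features
    Y     : (num_movies, num_users)    ratings
    R     : (num_movies, num_users)    1 where a rating exists
    """
    diff = (X @ Theta.T - Y) * R           # zero out unrated entries
    J = np.sum(diff ** 2) / 2
    J += lam * (np.sum(Theta ** 2) + np.sum(X ** 2)) / 2
    X_grad = diff @ Theta + lam * X        # matches the per-movie loop update
    Theta_grad = diff.T @ X + lam * Theta  # matches the per-user loop update
    return J, X_grad, Theta_grad
```

Because `diff` is already masked by R, each row of `X_grad` equals the loop body's `(X(i,:)*Theta_tmp' - Y_tmp)*Theta_tmp` restricted to the users who rated movie i, and likewise for `Theta_grad`.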

ex8/mlclass-ex8/estimateGaussian.m

Lines changed: 5 additions & 11 deletions
@@ -1,11 +1,11 @@
 function [mu sigma2] = estimateGaussian(X)
-%ESTIMATEGAUSSIAN This function estimates the parameters of a
+%ESTIMATEGAUSSIAN This function estimates the parameters of a
 %Gaussian distribution using the data in X
-%   [mu sigma2] = estimateGaussian(X),
+%   [mu sigma2] = estimateGaussian(X),
 %   The input X is the dataset with each n-dimensional data point in one row
 %   The output is an n-dimensional vector mu, the mean of the data set
 %   and the variances sigma^2, an n x 1 vector
-%
+%

 % Useful variables
 [m, n] = size(X);
@@ -21,14 +21,8 @@
 %               should contain variance of the i-th feature.
 %

-
-
-
-
-
-
-
-
+mu = mean(X)';
+sigma2 = var(X, 1)';

 % =============================================================
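`var(X, 1)` in Octave normalizes by m rather than m-1, i.e. it computes the maximum-likelihood (biased) variance estimate. The same two lines in NumPy (an illustrative sketch, not part of the commit):

```python
import numpy as np

def estimate_gaussian(X):
    """Per-feature mean and variance of an (m, n) data matrix.

    np.var defaults to ddof=0, dividing by m, which matches
    Octave's var(X, 1) normalization.
    """
    mu = X.mean(axis=0)
    sigma2 = X.var(axis=0)
    return mu, sigma2
```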

ex8/mlclass-ex8/selectThreshold.m

Lines changed: 13 additions & 9 deletions
@@ -12,28 +12,32 @@

 stepsize = (max(pval) - min(pval)) / 1000;
 for epsilon = min(pval):stepsize:max(pval)
-
+
     % ====================== YOUR CODE HERE ======================
     % Instructions: Compute the F1 score of choosing epsilon as the
     %               threshold and place the value in F1. The code at the
     %               end of the loop will compare the F1 score for this
     %               choice of epsilon and set it to be the best epsilon if
     %               it is better than the current choice of epsilon.
-    %
+    %
     % Note: You can use predictions = (pval < epsilon) to get a binary vector
     %       of 0's and 1's of the outlier predictions

+    % yval says it's an anomaly, and so does the algorithm.
+    tp = sum((yval==1) & (pval<epsilon));

+    % yval says it's not an anomaly, but the algorithm flags one.
+    fp = sum((yval==0) & (pval<epsilon));

+    % yval says it's an anomaly, but the algorithm misses it.
+    fn = sum((yval==1) & (pval>=epsilon));

+    % precision and recall.
+    prec = tp/(tp+fp);
+    rec = tp/(tp+fn);

-
-
-
-
-
-
-
+    % F1 score.
+    F1 = (2*prec*rec)/(prec+rec);

 % =============================================================
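The Octave version above tolerates epsilons that flag nothing: `0/0` yields NaN, and `NaN > bestF1` is false, so those candidates are simply never selected. A NumPy sketch of the same threshold scan (illustrative names, not part of the commit), skipping the degenerate case explicitly instead:

```python
import numpy as np

def select_threshold(yval, pval):
    """Scan 1000 thresholds over pval and keep the epsilon with the best F1.

    yval : (m,) ground-truth labels, 1 = anomaly
    pval : (m,) estimated probabilities; low probability => anomaly
    """
    best_eps, best_f1 = 0.0, 0.0
    step = (pval.max() - pval.min()) / 1000
    for eps in np.arange(pval.min(), pval.max(), step):
        pred = pval < eps                 # flagged as anomalies
        tp = np.sum((yval == 1) & pred)   # true positives
        fp = np.sum((yval == 0) & pred)   # false positives
        fn = np.sum((yval == 1) & ~pred)  # false negatives
        if tp == 0:                       # avoid 0/0 when nothing is caught
            continue
        prec = tp / (tp + fp)
        rec = tp / (tp + fn)
        f1 = 2 * prec * rec / (prec + rec)
        if f1 > best_f1:
            best_f1, best_eps = f1, eps
    return best_eps, best_f1
```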
