Added doctest, docstring and typehint for sigmoid_function & cost_function #10828

Merged
merged 8 commits into from
Oct 26, 2023
60 changes: 58 additions & 2 deletions machine_learning/logistic_regression.py
@@ -27,7 +27,7 @@
# classification problems


-def sigmoid_function(z):
+def sigmoid_function(z: float | np.ndarray) -> float | np.ndarray:
"""
Also known as Logistic Function.

@@ -42,11 +42,63 @@ def sigmoid_function(z):

@param z: input to the function
@returns: returns value in the range 0 to 1

Examples:
>>> sigmoid_function(4)
0.9820137900379085
>>> sigmoid_function(np.array([-3, 3]))
array([0.04742587, 0.95257413])
>>> sigmoid_function(np.array([-3, 3, 1]))
array([0.04742587, 0.95257413, 0.73105858])
>>> sigmoid_function(np.array([-0.01, -2, -1.9]))
array([0.49750002, 0.11920292, 0.13010847])
>>> sigmoid_function(np.array([-1.3, 5.3, 12]))
array([0.21416502, 0.9950332 , 0.99999386])
>>> sigmoid_function(np.array([0.01, 0.02, 4.1]))
array([0.50249998, 0.50499983, 0.9836975 ])
>>> sigmoid_function(np.array([0.8]))
array([0.68997448])
"""
return 1 / (1 + np.exp(-z))


-def cost_function(h, y):
+def cost_function(h: np.ndarray, y: np.ndarray) -> float:
"""
Cost function quantifies the error between predicted and expected values.
The cost function used in Logistic Regression is called Log Loss
or Cross Entropy Function.

J(θ) = (1/m) * Σ [ -y * log(hθ(x)) - (1 - y) * log(1 - hθ(x)) ]

Where:
- J(θ) is the cost that we want to minimize during training
- m is the number of training examples
- Σ represents the summation over all training examples
- y is the actual binary label (0 or 1) for a given example
- hθ(x) is the predicted probability that x belongs to the positive class

@param h: the output of sigmoid function. It is the estimated probability
that the input example 'x' belongs to the positive class

@param y: the actual binary label associated with input example 'x'

Examples:
>>> estimations = sigmoid_function(np.array([0.3, -4.3, 8.1]))
>>> cost_function(h=estimations,y=np.array([1, 0, 1]))
0.18937868932131605
>>> estimations = sigmoid_function(np.array([4, 3, 1]))
>>> cost_function(h=estimations,y=np.array([1, 0, 0]))
1.459999655669926
>>> estimations = sigmoid_function(np.array([4, -3, -1]))
>>> cost_function(h=estimations,y=np.array([1,0,0]))
0.1266663223365915
>>> estimations = sigmoid_function(0)
>>> cost_function(h=estimations,y=np.array([1]))
0.6931471805599453

References:
- https://en.wikipedia.org/wiki/Logistic_regression
"""
return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()


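As a cross-check of the log-loss formula in the `cost_function` docstring, here is a minimal sketch that evaluates J(θ) one training example at a time and compares it with the vectorised expression from the diff. Both functions are re-declared so the block runs on its own; `cost_function_loop` is a hypothetical helper written for this illustration and is not part of the PR.

```python
import math

import numpy as np


def sigmoid_function(z):
    # Same expression as in the diff above.
    return 1 / (1 + np.exp(-z))


def cost_function(h, y):
    # Same vectorised expression as in the diff above.
    return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()


def cost_function_loop(h, y):
    # Hypothetical helper (not part of the PR): the docstring formula
    # J = (1/m) * sum(-y * log(h) - (1 - y) * log(1 - h)) as an explicit loop.
    total = 0.0
    for h_i, y_i in zip(h, y):
        total += -y_i * math.log(h_i) - (1 - y_i) * math.log(1 - h_i)
    return total / len(y)


# Inputs taken from the first cost_function doctest in the diff.
estimations = sigmoid_function(np.array([0.3, -4.3, 8.1]))
labels = np.array([1, 0, 1])

vectorised = cost_function(h=estimations, y=labels)
looped = cost_function_loop(estimations, labels)

# The two evaluations should agree up to floating-point rounding.
print(math.isclose(vectorised, looped))
```

Note that both versions take `log(h)` and `log(1 - h)` directly, so a prediction of exactly 0 or 1 would produce an infinite or undefined cost. The doctest inputs stay well inside that range.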
@@ -75,6 +127,10 @@ def logistic_reg(alpha, x, y, max_iterations=70000):
# In[68]:

if __name__ == "__main__":
import doctest

doctest.testmod()

iris = datasets.load_iris()
x = iris.data[:, :2]
y = (iris.target != 0) * 1
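The `doctest.testmod()` call added in this hunk runs every `>>>` example in the module each time the file is executed as a script. The sketch below shows the same mechanism applied to a single function, using the standard-library `DocTestFinder` and `DocTestRunner` so the pass/fail counts can be inspected. The function body and the two examples are copied from the diff; the surrounding harness is an illustration only and is not how the PR runs its tests.

```python
import doctest

import numpy as np


def sigmoid_function(z):
    """
    >>> sigmoid_function(np.array([-3, 3]))
    array([0.04742587, 0.95257413])
    >>> sigmoid_function(np.array([0.8]))
    array([0.68997448])
    """
    return 1 / (1 + np.exp(-z))


# Collect the examples from the docstring and run them one by one.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
namespace = {"np": np, "sigmoid_function": sigmoid_function}
for test in finder.find(sigmoid_function, "sigmoid_function", globs=namespace):
    runner.run(test)

# summarize() returns a result tuple with .failed and .attempted counts.
results = runner.summarize(verbose=False)
print(results.failed, results.attempted)
```

Only array-valued examples are used here on purpose. Doctest compares the printed representation character by character, and NumPy 2 prints scalars as `np.float64(...)`, so a scalar example such as `sigmoid_function(4)` may need its expected output adjusted depending on the installed NumPy version.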