Gradient Descent Algorithm
Gradient descent is an optimization technique used in machine learning and deep learning to minimize a loss function and thereby refine a model's parameters. The algorithm is iterative: it adjusts the parameters in small steps in the direction of the negative gradient of the loss function, with the goal of finding parameter values that minimize the error between the model's predicted outputs and the actual outputs. Gradient descent is widely used in machine learning problems such as linear regression, logistic regression, and neural networks.
The algorithm starts from an initial set of parameters and computes the gradient (the slope) of the loss function with respect to each parameter. Because the gradient points in the direction of steepest increase of the loss, the algorithm updates the parameters by stepping in the opposite direction, the direction of steepest decrease. The size of each step is controlled by the learning rate, a hyperparameter that determines how quickly the algorithm converges toward the optimal solution. The process repeats until the algorithm converges, which is typically detected when the change in the loss between iterations becomes negligible or when a predefined number of iterations is reached.
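In its generic form, each step applies the update rule theta := theta - Alpha * gradient. The following minimal MATLAB/Octave sketch illustrates this loop on the one-dimensional function f(w) = (w - 3)^2; the function and the variable names here are illustrative and are not part of the linear-regression code below.

% Minimal sketch of the generic gradient descent loop (illustrative example).
w = 0;                    % Initial parameter value.
alpha = 0.1;              % Learning rate.
for iter = 1:100
    grad = 2 * (w - 3);   % Derivative of f(w) = (w - 3)^2 at the current w.
    w = w - alpha * grad; % Step in the direction of steepest decrease.
end
% After the loop, w is close to 3, the minimizer of f.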
% This function demonstrates gradient descent for linear regression with one variable.
% Inputs: the n-by-2 design matrix x, whose first column is all ones (for the intercept
% term theta_0) and whose second column holds the feature values; the column vector Y of
% actual target values; the two-element column vector Theta containing initial values of
% theta_0 and theta_1; the learning rate Alpha; and the number of iterations noi.
% Returns the updated Theta.
function Theta = gradientdescent(x, Y, Theta, Alpha, noi)
    n = length(Y); % Number of training examples.
    for i = 1:noi
        % Compute both updates into temporaries first so that theta_0 and theta_1 are
        % updated simultaneously. (The temporaries are named theta_1 and theta_2 because
        % indexing in MATLAB/Octave starts from 1.)
        theta_1 = Theta(1) - Alpha * (1 / n) * sum(((x * Theta) - Y) .* x(:, 1));
        theta_2 = Theta(2) - Alpha * (1 / n) * sum(((x * Theta) - Y) .* x(:, 2));
        Theta(1) = theta_1; % Commit both values only after both gradients are computed.
        Theta(2) = theta_2;
    end
end
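A minimal usage sketch follows; the sample data and the chosen Alpha and noi values are illustrative. Note that the first column of x must be all ones, since x * Theta treats Theta(1) as the intercept term.

% Example usage with targets that lie exactly on the line Y = 2 * feature.
x = [ones(5, 1), (1:5)'];   % Design matrix: intercept column plus one feature column.
Y = [2; 4; 6; 8; 10];       % Actual target values.
Theta = zeros(2, 1);        % Start from theta_0 = 0, theta_1 = 0.
Theta = gradientdescent(x, Y, Theta, 0.05, 1000);
% Theta now approaches [0; 2], i.e., intercept 0 and slope 2.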