If the goal is to explain variation in the response variable that can be ascribed to variation in the explanatory variables, linear regression analysis can be apply to quantify the strength of the relationship between the response and the explanatory variables, and in particular to determine whether some explanatory variables may have no linear relationship with the response at all, or to identify which subsets of explanatory variables may contain redundant information about the response. like all forms of regression analysis, linear regression focuses on the conditional probability distribution of the response given the values of the predictors, rather than on the joint probability distribution of all of these variables, which is the domain of multivariate analysis. Least squares linear regression, as a means of finding a good rough linear fit to a set of points was performed by Legendre (1805) and gauss (1809) for the prediction of planetary movement. Quetelet was responsible for make the procedure well-known and for use it extensively in the social sciences.
% This file runs univariate linear regression to predict profits of food trucks based on previous % actual values of profits in $10,000s in various cities with populations in 10,000s respectively. clear ; close all; clc ; fprintf('Plotting data\n'); data = load('data_.txt'); x = data(:, 1); Y = data(:, 2); n = length(Y); % Number of training examples. plotdata(x, Y); fprintf('Program paused, press enter to continue\n'); pause; x = [ones(n, 1), data(:,1)]; Theta = zeros(2, 1); noi = 1500; % Number of iterations in gradient descent. Alpha = 0.01; % Learning rate. fprintf('Testing the cost function\n') j = computecost(x, Y, Theta); fprintf('With Theta = [0 ; 0]\nCost computed = %f\n', j); fprintf('Expected cost value (approx) 32.07\n'); j = computecost(x, Y, [-1 ; 2]); fprintf('With theta = [-1 ; 2]\nCost computed = %f\n', j); fprintf('Expected cost value (approx) 54.24\n'); fprintf('Program paused, press enter to continue\n'); pause; fprintf('Running gradient descent\n'); Theta = gradientdescent(x, Y, Theta, Alpha, noi); fprintf('Theta found by gradient descent\n'); fprintf('%f\n', Theta); fprintf('Expected Theta vector (approx)\n'); fprintf(' -3.6303\n 1.1664\n\n'); hold on; % To plot hypothesis on data. plot(x(:, 2), x * Theta, '-'); legend('Training data', 'Linear regression'); predict1 = [1, 3.5] * Theta; fprintf('For population = 35,000, we predict a profit of %f\n',... predict1*10000); predict2 = [1, 7] * Theta; fprintf('For population = 70,000, we predict a profit of %f\n',... predict2*10000);