Linear regression with multiple variables (multi-feature linear regression), a worked example: gradient descent (gradientDescentMulti) and the normal equation (normalEqn)

Posted by 豆子


This post walks through a worked example of linear regression with multiple variables (multi-feature linear regression), solving it first with gradient descent (gradientDescentMulti) and then with the normal equation (normalEqn).

 

% Data file ex1data2.txt: column 1 is the size of the house (feet^2), column 2 is the number of bedrooms, column 3 is the price of the house
2104,3,399900
1600,3,329900
2400,3,369000
1416,2,232000
3000,4,539900
1985,4,299900
1534,3,314900
1427,3,198999
1380,3,212000
1494,3,242500
1940,4,239999
2000,3,347000
1890,3,329999
4478,5,699900
1268,3,259900
2300,4,449900
1320,2,299900
1236,3,199900
2609,4,499998
3031,4,599000
1767,3,252900
1888,2,255000
1604,3,242900
1962,4,259900
3890,3,573900
1100,3,249900
1458,3,464500
2526,3,469000
2200,3,475000
2637,3,299900
1839,2,349900
1000,1,169900
2040,4,314900
3137,3,579900
1811,4,285900
1437,3,249900
1239,3,229900
2132,4,345000
4215,4,549000
2162,4,287000
1664,2,368500
2238,3,329900
2567,4,314000
1200,3,299000
852,2,179900
1852,4,299900
1203,3,239500
%  Exercise 1: Linear regression with multiple variables

%% Initialization

%% ================ Part 1: Feature Normalization ================

%% Clear and Close Figures
clear ; close all; clc

fprintf('Loading data ...\n');

%% Load Data
data = load('ex1data2.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);

% Print out some data points
fprintf('First 10 examples from the dataset: \n');
fprintf(' x = [%.0f %.0f], y = %.0f \n', [X(1:10,:) y(1:10,:)]');

fprintf('Program paused. Press enter to continue.\n');
pause;

% Scale features and set them to zero mean
fprintf('Normalizing Features ...\n');

[X, mu, sigma] = featureNormalize(X);

% Implementation of featureNormalize(X)
function [X_norm, mu, sigma] = featureNormalize(X)
X_norm = X;                      % X is the matrix to normalize
mu = zeros(1, size(X, 2));       % 1 x (number of features) row of zeros
sigma = zeros(1, size(X, 2));    % same shape as mu

% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the mean and std functions useful.

% std computes the standard deviation; std(X,0,1) works column-wise,
% std(X,0,2) row-wise.

mu = mean(X, 1);                 % mean of each column, i.e. of each feature over all examples
sigma = std(X);                  % same as std(X,0,1): column-wise standard deviation
%fprintf('Debug....\n'); disp(sigma);
i = 1;
len = size(X,2);                 % number of columns (features)
while i <= len
    % Normalize every example in column i: (value - column mean) / (column standard deviation)
    X_norm(:,i) = (X(:,i) - mu(1,i)) / sigma(1,i);
    i = i + 1;
end

end
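As a side note, the while loop can also be replaced by a vectorized version. A minimal sketch (using bsxfun, which is available in both Octave and MATLAB) looks like this:

% Vectorized alternative to the while loop above (same result)
mu = mean(X, 1);                                            % per-feature means (1 x n row vector)
sigma = std(X, 0, 1);                                       % per-feature standard deviations
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);    % (X - mu) ./ sigma, feature by feature

In newer MATLAB and Octave releases the same thing can be written directly as X_norm = (X - mu) ./ sigma thanks to implicit broadcasting.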

 

% Add intercept term to X
X = [ones(m, 1) X];


%% ================ Part 2: Gradient Descent ================

% ====================== YOUR CODE HERE ======================
% Instructions: We have provided you with the following starter
%               code that runs gradient descent with a particular
%               learning rate (alpha).
%
%               Your task is to first make sure that your functions -
%               computeCost and gradientDescent already work with
%               this starter code and support multiple variables.
%
%               After that, try running gradient descent with
%               different values of alpha and see which one gives
%               you the best result.
%
%               Finally, you should complete the code at the end
%               to predict the price of a 1650 sq-ft, 3 br house.
%
% Hint: By using the 'hold on' command, you can plot multiple
%       graphs on the same figure.
%
% Hint: At prediction, make sure you do the same feature normalization.
%

fprintf('Running gradient descent ...\n');

% Choose some alpha value
alpha = 0.03;                         % learning rate - try e.g. 0.01, 0.03, 0.1, 0.3, ...
num_iters = 400;                      % number of iterations

% Init Theta and Run Gradient Descent
theta = zeros(3, 1);                  % 3x1 vector of zeros
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);
% Implementation of gradientDescentMulti()
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
% theta = GRADIENTDESCENTMULTI(X, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y);                  % number of training examples
feature_number = size(X,2);     % number of features

J_history = zeros(num_iters, 1);
temp = zeros(feature_number, 1);

for iter = 1 : num_iters
    predictions = X * theta;
    sqrError = (predictions - y);
    for i = 1 : feature_number          % simultaneously update theta(i)
        temp(i) = theta(i) - (alpha / m) * sum(sqrError .* X(:,i));
    end

    for j = 1 : feature_number
        theta(j) = temp(j);
    end

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.
    %
    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);
    % disp(J_history(iter));

end

end
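The loop above calls computeCostMulti, which is not listed in this post. A minimal sketch of it, assuming the standard squared-error cost J = 1/(2m) * sum((X*theta - y).^2) from the exercise, would be:

% Minimal sketch of computeCostMulti (not listed in this post);
% it assumes the usual squared-error cost for linear regression.
function J = computeCostMulti(X, y, theta)
m = length(y);                          % number of training examples
predictions = X * theta;                % hypothesis h_theta(x) for every example
J = (1 / (2 * m)) * sum((predictions - y) .^ 2);
end

Note also that the two inner for-loops in gradientDescentMulti compute the same step as the single vectorized update theta = theta - (alpha / m) * X' * (X * theta - y).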

 

% Plot the convergence graph
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);  % '-b': draw a blue line with width 2
xlabel('Number of iterations');
ylabel('Cost J');

% Display gradient descent's result
fprintf('Theta computed from gradient descent: \n');
fprintf(' %f \n', theta);
fprintf('\n');
Tip:
To compare how different learning rates affect convergence, it's helpful to plot J for several learning rates on the same figure. In Octave/MATLAB, this can be done by performing gradient descent multiple times with a `hold on` command between plots. Concretely, if you've tried three different values of alpha (you should probably try more values than this) and stored the costs in J1, J2 and J3, you can use the following commands to plot them on the same figure: plot(1:50, J1(1:50), 'b'); hold on; plot(1:50, J2(1:50), 'r'); plot(1:50, J3(1:50), 'k'); The final arguments 'b', 'r', and 'k' specify different colors for the plots.
% For example, this block can be added to compare different learning rates
figure;
plot(1:100, J_history(1:100), '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');

% Compare learning rates
hold on;
alpha = 0.01;                  % use a rate different from the 0.03 base run plotted above
theta = zeros(3, 1);
[theta, J_history1] = gradientDescentMulti(X, y, theta, alpha, num_iters);
plot(1:100, J_history1(1:100), 'r', 'LineWidth', 2);

hold on;
alpha = 0.1;
theta = zeros(3, 1);
[theta, J_history2] = gradientDescentMulti(X, y, theta, alpha, num_iters);
plot(1:100, J_history2(1:100), 'g', 'LineWidth', 2);
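A legend makes it easier to tell the curves apart; assuming the three runs above, something along these lines works:

% Label the three convergence curves (blue = 0.03, red = 0.01, green = 0.1)
legend('alpha = 0.03', 'alpha = 0.01', 'alpha = 0.1');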

 

% Predict a new value with the gradient descent result
% As hinted above, the query point must get the same feature normalization (mu, sigma)
price = [1, (1650 - mu(1)) / sigma(1), (3 - mu(2)) / sigma(2)] * theta;

% ============================================================

fprintf(['Predicted price of a 1650 sq-ft, 3 br house ' ...
         '(using gradient descent):\n $%f\n'], price);

fprintf('Program paused. Press enter to continue.\n');
pause;
%% ================ Part 3: Normal Equations ================
% Predict the new value with the normal equation instead
fprintf('Solving with normal equations...\n');

%% Load Data
data = csvread('ex1data2.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);

% Add intercept term to X
X = [ones(m, 1) X];

% Calculate the parameters from the normal equation
theta = normalEqn(X, y);
% Implementation of normalEqn
function [theta] = normalEqn(X, y)

theta = zeros(size(X, 2), 1);

% Instructions: Complete the code to compute the closed form solution
%               to linear regression and put the result in theta.

theta = pinv(X' * X) * X' * y;

end
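For reference, the line theta = pinv(X' * X) * X' * y is the closed-form least-squares solution theta = (X'X)^(-1) X'y. Using pinv (the pseudo-inverse) rather than inv means the expression still evaluates when X'X is singular, e.g. when features are linearly dependent.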
% Display the normal equation's result
fprintf('Theta computed from the normal equations: \n');
fprintf(' %f \n', theta);
fprintf('\n');


% Estimate the price of a 1650 sq-ft, 3 br house
price = [1, 1650, 3] * theta;    % no feature normalization is needed with the normal equation


fprintf(['Predicted price of a 1650 sq-ft, 3 br house ' ...
         '(using normal equations):\n $%f\n'], price);

 
