Machine Learning (Andrew Ng) Notes - Week 1 to Week 5 Summary
Linear Regression
- Cost Function
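For a training set of $m$ examples, the squared-error cost is

$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$

and the goal is to choose $\theta$ that minimizes $J(\theta)$.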
- Linear Regression
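The hypothesis is linear in the parameters: $h_\theta(x) = \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n$, with the convention $x_0 = 1$.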
- Gradient descent algorithm
repeat until convergence {
    $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$   (simultaneously update all $\theta_j$)
}
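A minimal vectorized Octave sketch of these updates for linear regression; the names gradientDescent, X, y, theta, alpha, and num_iters are illustrative:

```
% Batch gradient descent for linear regression (sketch).
% X: m-by-(n+1) design matrix with a leading column of ones
% y: m-by-1 targets, theta: (n+1)-by-1 parameters, alpha: learning rate
function theta = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);
  for iter = 1:num_iters
    % simultaneous update of every theta_j via the vectorized gradient
    theta = theta - (alpha / m) * (X' * (X * theta - y));
  end
end
```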
- Feature scaling and mean normalization
$x_i := \dfrac{x_i - \mu_i}{s_i}$, where $\mu_i$ is the average of all the values for feature $i$ and $s_i$ is the standard deviation (the range $\max - \min$ also works).
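A sketch of this normalization in Octave, assuming X is an m-by-n matrix of raw feature values:

```
mu = mean(X);                 % 1-by-n vector of per-feature means
sigma = std(X);               % 1-by-n vector of per-feature standard deviations
X_norm = (X - mu) ./ sigma;   % broadcasting normalizes each column
```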
- Learning rate
If α is too small: slow convergence. If α is too large: J(θ) may not decrease on every iteration and thus may not converge (it can even diverge).
- Polynomial Regression
We can change the behavior or curve of our hypothesis function by making it a quadratic, cubic, or square root function (or any other form).
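For example, with a single input feature $x_1$ we can create new features $x_2 = x_1^2$ and $x_3 = x_1^3$ and fit

$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_1^2 + \theta_3 x_1^3$$

Feature scaling becomes important here, since the new features differ greatly in range.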
- Normal Equation
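The closed-form alternative to gradient descent solves for $\theta$ directly:

$$\theta = \left(X^T X\right)^{-1} X^T y$$

In Octave this is typically written as `pinv(X' * X) * X' * y`. It needs no feature scaling and no learning rate, but computing $(X^T X)^{-1}$ becomes slow when the number of features is very large.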
Logistic Regression
- Logistic Function or Sigmoid Function
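The hypothesis passes $\theta^T x$ through the sigmoid:

$$h_\theta(x) = g(\theta^T x), \qquad g(z) = \frac{1}{1 + e^{-z}}$$

The output satisfies $0 \le h_\theta(x) \le 1$ and is read as the probability $P(y = 1 \mid x; \theta)$.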
- Decision Boundary
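We predict $y = 1$ whenever $h_\theta(x) \ge 0.5$, which happens exactly when $\theta^T x \ge 0$. The set of points where $\theta^T x = 0$ is the decision boundary; it is a property of the hypothesis, not of the training set.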
- Cost Function
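The squared-error cost is non-convex when combined with the sigmoid, so logistic regression uses the log loss instead:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right]$$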
- Gradient Descent
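The update has the same form as for linear regression, only with the sigmoid hypothesis inside:

$$\theta_j := \theta_j - \frac{\alpha}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$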
- Advanced Optimization
```
function [jVal, gradient] = costFunction(theta)
  jVal = [...code to compute J(theta)...];
  gradient = [...code to compute derivative of J(theta)...];
end

options = optimset('GradObj', 'on', 'MaxIter', 100);
initialTheta = zeros(2,1);
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);
```
- Multiclass Classification: One-vs-all
Train a logistic regression classifier $h_\theta^{(i)}(x)$ for each class $i$ to predict the probability that $y = i$. To make a prediction on a new $x$, pick the class $i$ that maximizes $h_\theta^{(i)}(x)$.
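A small Octave sketch of that prediction step, assuming all_theta is a K-by-(n+1) matrix whose k-th row holds the parameters of the classifier for class k, and X is m-by-(n+1) with a leading column of ones:

```
probs = 1 ./ (1 + exp(-(X * all_theta')));  % m-by-K matrix of per-class probabilities
[~, p] = max(probs, [], 2);                 % p(i) is the class with the highest probability
```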
- Overfitting
- Reduce the number of features (manually select which features to keep, or use a model selection algorithm)
- Regularization (keep all the features, but reduce the magnitude of the parameters $\theta_j$)
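For regularized linear regression the cost adds a penalty on the parameters (the bias term $\theta_0$ is not penalized):

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$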
- Regularized Logistic Regression
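The regularized cost is the log loss plus the same penalty:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$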
Neural Networks
- Model Representation
- Forward propagation: vectorized implementation
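A vectorized Octave sketch for a three-layer network, assuming X holds one example per row and Theta1, Theta2 are the weight matrices of sizes s2-by-(n+1) and K-by-(s2+1):

```
m  = size(X, 1);
g  = @(z) 1 ./ (1 + exp(-z));    % sigmoid activation
a1 = [ones(m, 1) X];             % input activations plus bias unit
z2 = a1 * Theta1';
a2 = [ones(m, 1) g(z2)];         % hidden-layer activations plus bias unit
z3 = a2 * Theta2';
h  = g(z3);                      % m-by-K matrix: h(i,k) = (h_Theta(x^(i)))_k
```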
- Multiclass Classification: one-vs-all
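With $K$ classes the label is recoded as a $K$-dimensional vector, e.g. $y = [0\;0\;1\;0]^T$ for class 3 when $K = 4$, and the network has one output unit per class.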
- Neural Network (Classification)
$L$ = total number of layers in the network; $s_l$ = number of units (not counting the bias unit) in layer $l$; $K$ = number of output units/classes.
- Cost Function
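The cost generalizes the regularized logistic loss by summing over the $K$ output units and penalizing every non-bias weight:

$$J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[y_k^{(i)}\log\left(h_\Theta(x^{(i)})\right)_k + \left(1 - y_k^{(i)}\right)\log\left(1 - \left(h_\Theta(x^{(i)})\right)_k\right)\right] + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\left(\Theta_{j,i}^{(l)}\right)^2$$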
- Backpropagation Algorithm
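For each training example, run forward propagation to get the activations $a^{(l)}$, then propagate the error backwards from the output layer:

$$\delta^{(L)} = a^{(L)} - y, \qquad \delta^{(l)} = \left(\Theta^{(l)}\right)^T \delta^{(l+1)} \ast a^{(l)} \ast \left(1 - a^{(l)}\right) \quad (l = L-1, \dots, 2)$$

where $\ast$ denotes element-wise multiplication. The errors are accumulated as $\Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)}\left(a^{(l)}\right)^T$ over all examples, and the partial derivatives are $D^{(l)}_{ij} = \frac{1}{m}\Delta^{(l)}_{ij}$, with an extra $\frac{\lambda}{m}\Theta^{(l)}_{ij}$ added when $j \neq 0$.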
- Gradient Checking
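Each partial derivative is approximated numerically with a two-sided difference and compared against the backpropagation gradient:

$$\frac{\partial}{\partial\theta_i} J(\theta) \approx \frac{J(\theta + \epsilon\, e_i) - J(\theta - \epsilon\, e_i)}{2\epsilon}, \qquad \epsilon \approx 10^{-4}$$

An Octave sketch, assuming J is a function handle that returns the cost for an unrolled parameter vector theta:

```
epsilon = 1e-4;
gradApprox = zeros(size(theta));
for i = 1:numel(theta)
  thetaPlus = theta;   thetaPlus(i)  = thetaPlus(i)  + epsilon;
  thetaMinus = theta;  thetaMinus(i) = thetaMinus(i) - epsilon;
  gradApprox(i) = (J(thetaPlus) - J(thetaMinus)) / (2 * epsilon);
end
% gradApprox should match the backpropagation gradient to several decimal places;
% turn gradient checking off before training, since it is very slow.
```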
- Random Initialization
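Initializing all the weights to zero makes every unit in a layer compute the same function, so each $\Theta^{(l)}$ is instead initialized to random values in $[-\epsilon_{init}, \epsilon_{init}]$. An Octave sketch (the 10-by-11 shape is just an example for one weight matrix):

```
INIT_EPSILON = 0.12;                                        % small range around zero
Theta1 = rand(10, 11) * (2 * INIT_EPSILON) - INIT_EPSILON;  % uniform values in [-eps, eps]
```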