Neural networks are about mimicking the brain. The brain can seemingly learn anything with a single algorithm (the “one learning algorithm” hypothesis, supported by neural re-wiring experiments).
How our brain works
A neuron takes inputs from other neurons or sensors, processes them, and provides outputs to other neurons. The way we learn is by changing the weight each input carries toward the output.
Neuron Model: Logistic Unit
For each neuron, with input $x$, the output is $h_\theta(x) = g(\theta^T x)$. Here, $g(z) = \frac{1}{1 + e^{-z}}$ is the activation function (it determines whether this neuron outputs a large value).
Artificial Neural Network
Activation Function (Hypothesis Function)
To be precise, an activation function is not the hypothesis function, since there should be only one hypothesis function per network: the activation function of the output layer is the hypothesis function, $h_\Theta(x) = a^{(L)}$.
Vectorized: $z^{(j+1)} = \Theta^{(j)} a^{(j)}$ and $a^{(j+1)} = g(z^{(j+1)})$.
Remember to add the bias unit $a_0^{(j)} = 1$ to each layer before propagating forward.
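A minimal numpy sketch of one vectorized layer step (the function names are illustrative, not from the course):

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def layer_step(Theta, a):
    # Add the bias unit a_0 = 1, then compute a^(j+1) = g(Theta^(j) a^(j))
    a = np.concatenate(([1.0], a))
    return sigmoid(Theta @ a)
```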
Cost Function
The cost function of a neural network simply sums the logistic-regression cost over each of the $K$ output units:

$J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log \left(h_\Theta(x^{(i)})\right)_k + (1 - y_k^{(i)}) \log \left(1 - (h_\Theta(x^{(i)}))_k\right) \right] + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} \left(\Theta_{ji}^{(l)}\right)^2$

For the regularization term, sum the squares of all the parameters except those attached to the bias units.
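A sketch of this cost in numpy, assuming the network outputs `H` have already been computed by forward propagation:

```python
import numpy as np

def nn_cost(H, Y, Thetas, lam):
    # H: (m, K) outputs h_Theta(x^(i)); Y: (m, K) one-hot labels;
    # Thetas: list of weight matrices; lam: regularization strength lambda
    m = Y.shape[0]
    # Sum the logistic-regression cost over all m examples and K output units
    data_cost = -np.sum(Y * np.log(H) + (1 - Y) * np.log(1 - H)) / m
    # Regularize all weights except the bias column (first column of each Theta)
    reg = sum(np.sum(Theta[:, 1:] ** 2) for Theta in Thetas)
    return data_cost + lam / (2.0 * m) * reg
```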
Optimization
Forward-propagation
Apply the activation function (after adding the bias unit) layer by layer: starting from $a^{(1)} = x$, compute $a^{(j+1)} = g(\Theta^{(j)} a^{(j)})$ until the output layer $a^{(L)} = h_\Theta(x)$.
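A minimal sketch of the full forward pass, assuming `Thetas` is the list of weight matrices $\Theta^{(1)}, \dots, \Theta^{(L-1)}$:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagate(Thetas, x):
    # Returns all activations a^(1)..a^(L); hidden layers keep their
    # bias unit so the list can be reused by back-propagation.
    activations = []
    a = x
    for Theta in Thetas:
        a = np.concatenate(([1.0], a))   # add bias unit a_0 = 1
        activations.append(a)
        a = sigmoid(Theta @ a)           # a^(j+1) = g(Theta^(j) a^(j))
    activations.append(a)                # output layer a^(L) = h_Theta(x)
    return activations
```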
Back-propagation
Back-propagation calculates how much each weight should change, i.e. the gradients $\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta)$, by propagating error terms $\delta$ backwards from the output layer.
For output layer units: $\delta^{(L)} = a^{(L)} - y$
For hidden layer units: $\delta^{(l)} = (\Theta^{(l)})^T \delta^{(l+1)} \mathbin{.*} g'(z^{(l)})$
where $g'(z^{(l)}) = a^{(l)} \mathbin{.*} (1 - a^{(l)})$ and $.*$ denotes element-wise multiplication. The gradients are then accumulated as $\Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)} (a^{(l)})^T$, giving $\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = \frac{1}{m} \Delta_{ij}^{(l)} + \frac{\lambda}{m} \Theta_{ij}^{(l)}$ (dropping the regularization term when $j = 0$).
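A sketch of the backward pass for a single example, reusing the activations returned by the `forward_propagate` sketch above; over a training set the returned gradients would be averaged and regularized:

```python
import numpy as np

def backprop(Thetas, activations, y):
    # activations: output of forward_propagate above; y: one-hot label vector
    grads = [np.zeros_like(Theta) for Theta in Thetas]
    delta = activations[-1] - y                 # delta^(L) = a^(L) - y
    for l in range(len(Thetas) - 1, -1, -1):
        a = activations[l]                      # a^(l), including its bias unit
        grads[l] += np.outer(delta, a)          # Delta^(l) += delta^(l+1) (a^(l))^T
        if l > 0:
            # delta^(l) = (Theta^(l))^T delta^(l+1) .* a^(l) .* (1 - a^(l))
            delta = (Thetas[l].T @ delta) * a * (1.0 - a)
            delta = delta[1:]                   # drop the bias unit's error term
    return grads  # over a batch: divide by m, add (lambda/m) * Theta for j >= 1
```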
Multiclass classification
- One-vs-all: the output layer has $K$ units, each representing one class; labels $y$ are re-coded as one-hot vectors (see the sketch below).
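A small helper (hypothetical, for illustration) that re-codes integer labels as one-hot vectors:

```python
import numpy as np

def one_hot(labels, K):
    # Re-code integer labels 0..K-1 as one-hot rows,
    # e.g. with K = 4, label 2 becomes [0, 0, 1, 0]
    Y = np.zeros((len(labels), K))
    Y[np.arange(len(labels)), labels] = 1.0
    return Y
```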
Random Initialization
To make each neuron unit learn different features (symmetry breaking), we should initialize each $\Theta_{ij}^{(l)}$ randomly in $[-\epsilon, \epsilon]$.
A good choice of $\epsilon$ is $\epsilon = \frac{\sqrt{6}}{\sqrt{s_l + s_{l+1}}}$.
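A sketch of this initialization in numpy, assuming `s_in` and `s_out` are the unit counts of the two layers the matrix connects:

```python
import numpy as np

def random_init(s_in, s_out):
    # Break symmetry: draw each weight uniformly from [-eps, eps],
    # with eps = sqrt(6) / sqrt(s_in + s_out); +1 column for the bias unit
    eps = np.sqrt(6.0) / np.sqrt(s_in + s_out)
    return np.random.uniform(-eps, eps, size=(s_out, s_in + 1))
```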
Notations
- $a_i^{(j)}$: “activation” (output) of unit $i$ in layer $j$
- $\Theta^{(j)}$: matrix storing all the parameters (weights) that the neurons use to map layer $j$ to layer $j+1$; its dimension is $s_{j+1} \times (s_j + 1)$
- $L$: No. of layers
- $s_l$: No. of units (not including the bias unit) in layer $l$
- $K$: No. of units in the output layer