Cost Function
$
J_\Theta=C\sum{[y * cost_1(\Theta^TX)+(1-y) * cost_0(\Theta^TX)]}+\frac{1}{2} * \sum{\Theta^2}
$
Differences with logistic cost function:
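- Replace $-\log(h_\Theta(x))$ and $-\log(1-h_\Theta(x))$ with $cost_1(\Theta^TX)$ and $cost_0(\Theta^TX)$
- No $\frac{1}{m}$ factor
- Instead of $\lambda$ on the regularization term, $C$ on the cost term is used ($C$ can be treated like $\frac{1}{\lambda}$)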
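The cost function above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the common hinge form for `cost_1` and `cost_0` (the notes do not fix their exact shape), and the name `svm_cost` is hypothetical. As in the formula, every component of `theta` is regularized; many formulations exclude the bias term.

```python
def cost_1(z):
    """Assumed hinge cost for positive examples (y = 1): zero once z >= 1."""
    return max(0.0, 1.0 - z)


def cost_0(z):
    """Assumed hinge cost for negative examples (y = 0): zero once z <= -1."""
    return max(0.0, 1.0 + z)


def svm_cost(theta, X, y, C):
    """Return C * sum(per-example cost) + 0.5 * sum(theta_j ** 2)."""
    data_term = 0.0
    for x_i, y_i in zip(X, y):
        # z is the inner product theta^T x for one example
        z = sum(t * x for t, x in zip(theta, x_i))
        data_term += y_i * cost_1(z) + (1 - y_i) * cost_0(z)
    # Regularization over all parameters, following the formula as written
    reg_term = 0.5 * sum(t * t for t in theta)
    return C * data_term + reg_term
```

With a large `C` the data term dominates, so the optimizer works hard to push every example past its threshold; with a small `C` the regularization term dominates and a few misclassified points are tolerated.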
Large Margin Classifier
SVM maximizes the margin between the decision boundary and the nearest training examples, which leaves room for variance in test data. It also ignores outliers when $C$ is not too large.
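Math behind: since $y=1$ requires $\Theta^TX \ge 1$ and $y=0$ requires $\Theta^TX \le -1$ for zero cost, $|\Theta^TX|$ has to be significant. Regularization requires $||\Theta||$ to be small, so $p$, the projection of $X$ onto $\Theta$, has to be large, and $p$ is the margin.
($\Theta^TX = p \cdot ||\Theta||$)
($||x||$ denotes the length of vector $x$; $p$ is the signed length of the projection, and it can be negative)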
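A small numeric check of the projection identity may help. The sketch below is illustrative only; the helper names `norm` and `signed_projection` are hypothetical, and the vectors are arbitrary values chosen for easy arithmetic.

```python
import math


def norm(v):
    """Euclidean length of a vector (always non-negative)."""
    return math.sqrt(sum(a * a for a in v))


def signed_projection(x, theta):
    """Signed length p of the projection of x onto theta."""
    dot = sum(t * a for t, a in zip(theta, x))
    return dot / norm(theta)


theta = [3.0, 4.0]             # ||theta|| = 5
x_pos = [2.0, 1.0]             # points roughly along theta
x_neg = [-2.0, -1.0]           # points roughly against theta

p_pos = signed_projection(x_pos, theta)   # positive projection
p_neg = signed_projection(x_neg, theta)   # negative projection
```

Multiplying either projection by `norm(theta)` recovers the inner product of `theta` and `x`. For a fixed target such as an inner product of at least 1, a smaller `norm(theta)` forces a larger `p`, which is why minimizing the regularization term widens the margin.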
Kernel
A kernel is used to perform non-linear classification by using features computed by a kernel function instead of the raw features.
The kernel function calculates the similarity between a data point and another point (a landmark), ranging from 0 (dissimilar) to 1 (identical).
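As a rough sketch of how raw features are replaced by similarity features, the snippet below uses the Gaussian kernel as the similarity function. The function names, the `sigma` parameter default, and the landmark values are assumptions made for illustration only.

```python
import math


def gaussian_similarity(x, landmark, sigma=1.0):
    """Similarity in (0, 1]: 1 when x equals the landmark, near 0 when far away."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_features(x, landmarks, sigma=1.0):
    """Map a raw example to one similarity feature per landmark."""
    return [gaussian_similarity(x, l, sigma) for l in landmarks]


# Hypothetical landmarks; in practice the training examples are often used.
landmarks = [[1.0, 2.0], [10.0, 10.0]]
features = kernel_features([1.0, 2.0], landmarks)
```

Here the first feature equals 1 because the example coincides with the first landmark, while the second is close to 0 because the second landmark is far away. A larger `sigma` makes the similarity fall off more slowly with distance.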
- Gaussian Kernel