Support Vector Machine

Cost Function

$
J(\Theta)=C\sum_{i}{[y^{(i)} \cdot cost_1(\Theta^TX^{(i)})+(1-y^{(i)}) \cdot cost_0(\Theta^TX^{(i)})]}+\frac{1}{2}\sum_{j}{\Theta_j^2}
$
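A minimal sketch of how this cost could be evaluated. The notes do not define cost_1 and cost_0, so the hinge-style forms max(0, 1 - z) and max(0, 1 + z) used here are an assumption, and the function names are illustrative only. The regularization term sums over every component of Theta, as the formula above is written.

```python
def cost_1(z):
    """Assumed cost for a positive example (y = 1): zero once z >= 1."""
    return max(0.0, 1.0 - z)


def cost_0(z):
    """Assumed cost for a negative example (y = 0): zero once z <= -1."""
    return max(0.0, 1.0 + z)


def svm_cost(theta, X, y, C):
    """J(Theta) = C * sum(per-example cost) + 0.5 * sum(Theta_j ** 2)."""
    data_cost = 0.0
    for x_i, y_i in zip(X, y):
        z = sum(t * v for t, v in zip(theta, x_i))  # Theta^T x for one example
        data_cost += y_i * cost_1(z) + (1 - y_i) * cost_0(z)
    regularization = 0.5 * sum(t * t for t in theta)
    return C * data_cost + regularization


# Two toy examples, one per class, lying on the first axis.
X = [[1.0, 0.0], [-1.0, 0.0]]
y = [1, 0]
print(svm_cost([2.0, 0.0], X, y, C=1.0))   # both examples outside the margin
print(svm_cost([0.5, 0.0], X, y, C=10.0))  # both examples inside the margin
```

With a large C the first term dominates, so the optimizer is pushed to classify every training example with a margin; with a small C the regularization term has more influence.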

Differences from the logistic regression cost function:

An SVM maximizes the margin between the classes, which leaves room for variance in test data. It also ignores outliers when C is not too large.

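Math behind: Since $cost_1$ and $cost_0$ require $|\Theta^TX|$ to be significant (at least 1 for the cost to reach zero), and regularization requires $||\Theta||$ to be small, $|p|$ would be large, which is the margin.

($\Theta^TX = p \cdot ||\Theta||$)

($||x||$ denotes the length of a vector $x$, which is never negative; $p$ is the signed length of the projection of $X$ onto $\Theta$, and it can be negative)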
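The margin argument rests on rewriting the inner product Theta^T X as p * ||Theta||, where p is the signed length of the projection of X onto Theta. The sketch below checks that identity numerically on made-up vectors; the symbol p and the helper names are assumptions introduced for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(u * v for u, v in zip(a, b))


def norm(v):
    """Euclidean length of a vector; never negative."""
    return math.sqrt(dot(v, v))


def signed_projection(x, theta):
    """Signed length p of the projection of x onto theta."""
    return dot(x, theta) / norm(theta)


theta = [3.0, 4.0]
for x in ([1.0, 2.0], [-1.0, -2.0]):
    p = signed_projection(x, theta)
    # p * ||theta|| reproduces theta^T x; p is negative when x points away from theta.
    print(x, p, p * norm(theta), dot(theta, x))
```

Because the product p * ||Theta|| has to reach a fixed magnitude, shrinking ||Theta|| forces |p| to grow, which is the geometric reading of a large margin.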

Kernels are used to perform non-linear classification: the classifier is trained on features computed by a kernel function instead of on the raw attributes.

A kernel function calculates the similarity of a data point to another point (a landmark), from 0 (dissimilar) to 1 (similar).
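As a sketch of the idea, the example below uses a Gaussian (RBF) kernel. This is an assumed choice, since the notes do not name a specific kernel, but it matches the 0 to 1 similarity range described. The parameter sigma and the function names are illustrative.

```python
import math


def gaussian_kernel(x, landmark, sigma=1.0):
    """Similarity in (0, 1]: 1 when x equals the landmark, near 0 when far away."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def kernel_features(x, landmarks, sigma=1.0):
    """Replace the raw attributes of x with its similarity to each landmark."""
    return [gaussian_kernel(x, landmark, sigma) for landmark in landmarks]


landmarks = [[1.0, 2.0], [5.0, 5.0]]
# The point coincides with the first landmark and is far from the second.
print(kernel_features([1.0, 2.0], landmarks))
```

A larger sigma makes the similarity fall off more slowly with distance, so each landmark influences a wider region.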