Learning a binary classifier which outputs probability
When, in general, the objective is to build a binary classifier that outputs the probability that an instance is positive, which machine learning method is the most appropriate, and in which situation?
In particular, it seems that support vector machines with Platt's scaling could be a candidate, but I have read around the web that some people use kernel logistic regression or Gaussian processes for this task. Is there any evident advantage or disadvantage of one approach against the others?
Thank you.
Listing all potential algorithms you could use for this general task is close to impossible. Since you mentioned support vector machines (SVMs), I will try to elaborate a little on those.
SVM classifiers never output an actual probability. The output of an SVM classifier is the distance of the test instance to the separating hyperplane in feature space (this is called the decision value). By default, the predicted label is selected based on the sign of the decision value.
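To make the decision value concrete, here is a minimal sketch for the linear case. The weights `w`, bias `b`, and test points are made-up numbers for illustration, not the output of a trained model, and the helper names are hypothetical.

```python
import math
from typing import Sequence


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Raw SVM score w.x + b for a linear model.

    Dividing this score by the norm of w gives the signed geometric
    distance from x to the separating hyperplane.
    """
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def signed_distance(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return decision_value(w, b, x) / norm


def predict_label(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Default SVM prediction: the sign of the decision value."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical model parameters and test instances (assumptions).
w, b = [2.0, -1.0], 0.5
for x in ([1.0, 1.0], [-1.0, 1.0]):
    print(x, decision_value(w, b, x), predict_label(w, b, x))
```

Note that the score is unbounded, so it says how far an instance lies from the boundary but is not a probability by itself.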
Platt scaling fits a sigmoid on top of the SVM decision values to scale them to the range [0, 1], which can then be interpreted as a probability. Similar techniques can be applied to any type of classifier that produces a real-valued output.
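A rough sketch of the idea, in plain Python: fit a two-parameter sigmoid to (decision value, label) pairs by gradient descent on the log loss. This is a simplified illustration, not Platt's exact procedure (which uses smoothed targets and a more careful optimizer). The scores and labels are invented, the function names are hypothetical, and in practice the sigmoid should be fitted on held-out or cross-validated decision values rather than on the training scores.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the mean log loss.

    scores: real-valued decision values; labels: 0 or 1.
    (Platt's paper writes the sigmoid as 1 / (1 + exp(A*f + B)),
    which corresponds to a = -A and b = -B here.)
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_probability(score: float, a: float, b: float) -> float:
    """Map a decision value to a value in (0, 1)."""
    return sigmoid(a * score + b)


# Invented decision values and labels (assumptions, not real SVM output).
scores = [-2.1, -1.3, -0.4, 0.2, -0.1, 0.3, 1.2, 2.5]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
a, b = fit_platt(scores, labels)
for s in scores:
    print(s, round(platt_probability(s, a, b), 3))
```

The fit is just a one-dimensional logistic regression on the decision values, which is why the same trick works for any classifier with a real-valued output.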
Some evident advantages of SVMs include:
- computationally efficient nonlinear classifiers (quadratic in the number of training instances),
- can deal with high-dimensional data,
- have shown strong performance in countless domains.
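One place where the quadratic dependence on the number of training instances shows up is the kernel (Gram) matrix, which holds one entry per pair of instances. The sketch below uses an RBF kernel as an example; the kernel choice, the `gamma` value, and the data points are assumptions for illustration only.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel exp(-gamma * ||x - z||^2) between two feature vectors."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


def gram_matrix(X, gamma=0.5):
    """n x n matrix of pairwise kernel values: n^2 kernel evaluations."""
    return [[rbf_kernel(xi, xj, gamma) for xj in X] for xi in X]


# Three made-up training instances with two features each.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gram_matrix(X)
for row in K:
    print([round(v, 3) for v in row])
```

The kernel only ever sees pairs of instances, so the number of features affects the cost of each evaluation but not the size of the matrix.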
Downsides of SVMs include:
- data must be vectorized,
- models are relatively hard to interpret (compared to decision trees or logistic regression),
- dealing with nominal features can be clunky,
- missing values can be hard to deal with.
When you are looking for proper probabilistic outputs (including confidence intervals), you may want to consider statistical methods such as logistic regression (kernelized versions exist too, but I suggest you start with the basic stuff).