Is KNN a discriminative learning algorithm?

17

It seems that KNN is a discriminative learning algorithm, but I can't find any online sources confirming this.

machine-learning classification k-nearest-neighbour

— jpmuc

19

KNN is a discriminative algorithm since it models the conditional probability of a sample belonging to a given class. To see this, consider how one arrives at the decision rule of kNN.

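A class label corresponds to a set of points that belong to some region $R$ of the feature space. If you draw sample points independently from the true probability distribution $p(x)$, then the probability of drawing a sample from that class is

$P = \int_{R} p(x) dx$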

What if you have $N$ points? The probability that $K$ of those $N$ points fall in the region $R$ follows the binomial distribution:

$Prob(K) = {{N} \choose {K}}P^{K}(1-P)^{N-K}$

As $N \to \infty$ this distribution becomes sharply peaked, so the probability can be approximated by its mean value, $P \approx \frac{K}{N}$. A further approximation is that the probability density over $R$ remains approximately constant, so the integral can be approximated by

$P = \int_{R} p(x) dx \approx p(x)V$

where $V$ is the total volume of the region. Under these approximations,

$p(x) \approx \frac{K}{NV}$

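Now, if we had several classes, we could repeat the same analysis for each one, which would give

$p(x|C_{k}) = \frac{K_{k}}{N_{k}V}$

where $K_{k}$ is the number of points from class $k$ that fall within that region and $N_{k}$ is the total number of points belonging to class $C_{k}$. Note that $\sum_{k}N_{k}=N$.

The class priors can be estimated as

$P(C_{k}) = \frac{N_{k}}{N}$

Using Bayes' rule, with $p(x) \approx \frac{K}{NV}$, the factors $N_{k}$, $N$ and $V$ cancel:

$P(C_{k}|x) = \frac{p(x|C_{k})P(C_{k})}{p(x)} = \frac{K_{k}}{N_{k}V} \cdot \frac{N_{k}}{N} \cdot \frac{NV}{K} = \frac{K_{k}}{K}$

which is the rule for kNNs.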
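As a rough numerical check of the derivation, the sketch below computes the posterior two ways: directly as $K_k/K$, and through the density estimates $p(x|C_k)$, $P(C_k)$ and $p(x)$ combined with Bayes' rule. The function names, the synthetic Gaussian data and the choice of `k=7` are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def knn_posterior(X_train, y_train, x, k):
    """Estimate P(C_k | x) directly as K_k / K from the k nearest points."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    classes = np.unique(y_train)
    counts = np.array([(y_train[nearest] == c).sum() for c in classes])
    return classes, counts / k

def bayes_route(X_train, y_train, x, k):
    """Same posterior via p(x|C_k) * P(C_k) / p(x) with kNN density estimates."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    radius = dists[nearest[-1]]          # distance to the k-th neighbour
    V = radius ** X_train.shape[1]       # volume up to a constant that cancels
    N = len(y_train)
    p_x = k / (N * V)                    # p(x) ~ K / (N V)
    classes = np.unique(y_train)
    post = []
    for c in classes:
        N_c = (y_train == c).sum()
        K_c = (y_train[nearest] == c).sum()
        likelihood = K_c / (N_c * V)     # p(x | C_k) ~ K_k / (N_k V)
        prior = N_c / N                  # P(C_k) ~ N_k / N
        post.append(likelihood * prior / p_x)
    return classes, np.array(post)

# Hypothetical two-class data, for illustration only.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
x0 = np.array([1.0, 1.0])

classes, direct = knn_posterior(X, y, x0, k=7)
_, via_bayes = bayes_route(X, y, x0, k=7)
```

Both routes give the same numbers because $N_k$, $N$ and $V$ cancel, which is why a kNN implementation never needs to compute the densities explicitly.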

— jpmuc

2

The reference does not include any information on KNN. Is it the right one?

— bayerj

1

I included it to emphasize what is meant by a discriminative algorithm as opposed to a generative one.

— jpmuc

5

The answer by @jpmuc doesn't seem to be accurate. Generative models model the underlying distribution $P(x|C_i)$ and then use Bayes' theorem to find the posterior probabilities. That is exactly what that answer shows, yet it concludes the exact opposite. :O

For KNN to be a generative model, we should be able to generate synthetic data. It seems that this is possible once we have some initial training data. But starting from no training data and generating synthetic data is not possible. So KNN doesn't fit nicely with generative models.
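To make the point about needing initial training data concrete, here is a minimal sketch of one way synthetic points could be produced from a stored training set: pick a stored point at random and add small Gaussian noise. This is kernel-density-style sampling, swapped in purely for illustration. kNN itself defines no sampling procedure, and the function name and the `bandwidth` parameter are hypothetical.

```python
import numpy as np

def sample_from_stored_set(X_train, y_train, n_samples, bandwidth, rng):
    """Draw (x, label) pairs by jittering randomly chosen stored points.

    Assumption: Gaussian jitter with a hand-picked bandwidth. This is not
    part of kNN; it only shows that sampling needs the stored data.
    """
    # Picking stored points uniformly reproduces the class proportions N_k / N.
    idx = rng.integers(0, len(X_train), size=n_samples)
    noise = rng.normal(0.0, bandwidth, size=(n_samples, X_train.shape[1]))
    return X_train[idx] + noise, y_train[idx]

# Hypothetical training set, for illustration only.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(4, 1, (30, 2))])
y_train = np.array([0] * 30 + [1] * 30)

X_new, y_new = sample_from_stored_set(X_train, y_train, 100, 0.3, rng)
```

Without `X_train` there is nothing to sample from, which is the sense in which the stored set plays the role of the model here.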

One may argue that KNN is a discriminative model because we can draw a discriminant boundary for classification, or because we can compute the posterior $P(C_i|x)$. But all of this is true of generative models as well. A true discriminative model doesn't tell us anything about the underlying distribution. In the case of KNN, however, we know a lot about the underlying distribution; in fact, we are storing the entire training set.

So it seems KNN sits midway between generative and discriminative models. That is probably why reputable articles do not categorize KNN as either generative or discriminative. Let's just call it a non-parametric model.

— Binu Jasim

I do not agree. "Generative classifiers learn a model of the joint probability, p(x, y), of the inputs x and the label y, and make their predictions by using Bayes rules to calculate p(y|x), and then picking the most likely label y. Discriminative classifiers model the posterior p(y|x) directly, or learn a direct map from inputs x to the class labels". See "On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes".

— jpmuc

3

I have come across a book that says the opposite, i.e. that KNN is a generative nonparametric classification model.

The book is available online: Machine Learning: A Probabilistic Perspective by Kevin P. Murphy (2012).

Here is the excerpt from the book:

— Gürol Canbek

Must be a mistake..

1

I agree that kNN is discriminative. The reason is that it does not explicitly store or try to learn a (probabilistic) model that explains the data (as opposed to, e.g., Naive Bayes).

The answer by juampa confuses me since, to my understanding, a generative classifier is one that attempts to explain how the data is generated (e.g. using a model), and that answer says that it is discriminative because of this reason...

— Amir

1

A generative model learns P(Ck,X), so you can generate more data using that joint distribution. In contrast, a discriminative model would learn P(Ck|X). This is what @juampa is pointing at with KNN.

— Zhubarb

1

At classification time, both generative and discriminative classifiers end up using conditional probabilities to make predictions. However, a generative classifier learns the joint probability and computes the conditional from it via Bayes' rule, while a discriminative classifier either computes the conditional directly or provides the best approximation to it that it can.
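The two routes to the conditional can be sketched side by side. In the example below, the generative route fits one Gaussian per class plus the class priors and applies Bayes' rule, while the discriminative route fits $P(C=1|x)$ directly with logistic regression trained by plain gradient descent. The 1-D synthetic data, the learning rate and the iteration count are assumptions chosen only to keep the sketch short.

```python
import numpy as np

# Hypothetical 1-D data: class 0 centred at -2, class 1 centred at +2.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])
y = np.concatenate([np.zeros(200), np.ones(200)])

def gaussian_pdf(v, mean, std):
    return np.exp(-0.5 * ((v - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

def generative_posterior(v):
    """Generative route: model p(x, C) = p(x | C) P(C), then use Bayes' rule."""
    joint = []
    for c in (0, 1):
        xc = x[y == c]
        prior = len(xc) / len(x)
        joint.append(gaussian_pdf(v, xc.mean(), xc.std()) * prior)
    return joint[1] / (joint[0] + joint[1])

# Discriminative route: fit P(C=1 | x) = sigmoid(w * x + b) directly.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    w -= 0.1 * np.mean((p - y) * x)   # gradient of the mean log-loss w.r.t. w
    b -= 0.1 * np.mean(p - y)         # gradient of the mean log-loss w.r.t. b

def discriminative_posterior(v):
    return 1.0 / (1.0 + np.exp(-(w * v + b)))

gen = generative_posterior(1.5)
disc = discriminative_posterior(1.5)
```

Both functions return an estimate of the same conditional probability; only the generative one has learned anything that could also describe how $x$ is distributed within each class.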

— rapaio