Is KNN a discriminative learning algorithm?


Answers:


19

KNN is a discriminative algorithm because it models the conditional probability of a sample belonging to a given class. To see this, consider how one arrives at the decision rule of kNN.

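A class label corresponds to a set of points that belong to some region $R$ of the feature space. If you draw sample points independently from the true probability distribution $p(x)$, the probability of drawing a sample from that class is

$$P = \int_R p(x)\,dx$$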

What if you have $N$ points? The probability that $K$ of those $N$ points fall in the region $R$ follows the binomial distribution

$$\mathrm{Prob}(K) = \binom{N}{K} P^K (1-P)^{N-K}$$

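As $N \to \infty$ this distribution becomes sharply peaked, so the probability can be approximated by its mean value, $P \approx \frac{K}{N}$. A further approximation is that the probability density stays roughly constant over $R$, so the integral can be approximated by

$$P = \int_R p(x)\,dx \approx p(x)\,V$$

where $V$ is the total volume of the region. Under these approximations, $p(x) \approx \frac{K}{NV}$.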
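The estimate $p(x) \approx \frac{K}{NV}$ can be checked numerically. The following is a minimal sketch, not part of the original answer: it assumes one-dimensional data drawn from a standard normal, and it uses the kNN variant of the estimate, in which $K$ is fixed and the region grows until it contains $K$ points. The function name `knn_density_1d` and all parameter values are illustrative.

```python
import numpy as np

def knn_density_1d(x, samples, k):
    """Estimate p(x) as K / (N * V), where V is the length of the smallest
    interval centred on x that contains the k nearest samples."""
    n = len(samples)
    dist = np.sort(np.abs(samples - x))
    radius = dist[k - 1]      # distance to the k-th nearest neighbour
    volume = 2.0 * radius     # length of the interval [x - r, x + r]
    return k / (n * volume)

# Assumption for illustration: the true density is a standard normal.
rng = np.random.default_rng(0)
samples = rng.standard_normal(20000)

estimate = knn_density_1d(0.0, samples, k=200)
true_density = 1.0 / np.sqrt(2.0 * np.pi)  # standard normal pdf at x = 0

print(f"kNN estimate: {estimate:.3f}, true density: {true_density:.3f}")
```

With a large $N$ and a moderate $K$ the two numbers should be close, which is the regime in which both approximations in the derivation are reasonable.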

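Now, if there are several classes, we can repeat the same analysis for each one, which gives

$$p(x \mid C_k) = \frac{K_k}{N_k V}$$

where $K_k$ is the number of points from class $k$ that fall within the region and $N_k$ is the total number of points belonging to class $C_k$. Note that $\sum_k N_k = N$.

Repeating the binomial argument for the class labels, the prior can be estimated as

$$P(C_k) = \frac{N_k}{N}$$

Using Bayes' rule,

$$P(C_k \mid x) = \frac{p(x \mid C_k)\,P(C_k)}{p(x)} = \frac{K_k}{K}$$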
which is the rule for kNNs.
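The cancellation in the last step can be verified numerically. This is a minimal sketch under stated assumptions, not the answerer's code: the two-class Gaussian data, the query point, and the helper name `knn_posteriors` are all made up for illustration. It computes the posterior once directly as $K_k / K$ and once through Bayes' rule from the three estimates above.

```python
import numpy as np

def knn_posteriors(x, X, y, k):
    """Return (classes, direct, via_bayes): the class posteriors at x computed
    directly as K_k / K and via Bayes' rule from the density estimates."""
    classes = np.unique(y)
    n = len(X)
    dist = np.linalg.norm(X - x, axis=1)
    nearest = np.argsort(dist)[:k]          # indices of the k nearest points

    # V cancels in Bayes' rule, so any positive placeholder value works here.
    volume = 1.0

    k_per_class = np.array([(y[nearest] == c).sum() for c in classes])
    n_per_class = np.array([(y == c).sum() for c in classes])

    direct = k_per_class / k                            # K_k / K
    likelihood = k_per_class / (n_per_class * volume)   # p(x | C_k)
    prior = n_per_class / n                             # P(C_k)
    evidence = k / (n * volume)                         # p(x)
    via_bayes = likelihood * prior / evidence
    return classes, direct, via_bayes

# Illustrative data: two Gaussian blobs with unequal class sizes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(3.0, 1.0, (50, 2))])
y = np.array([0] * 100 + [1] * 50)

classes, direct, via_bayes = knn_posteriors(np.array([0.0, 0.0]), X, y, k=15)
print(classes, direct, via_bayes)
```

Both routes give the same vector, which shows that the kNN vote fractions are the posterior $P(C_k \mid x)$ implied by these density estimates.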

2
The reference does not include any information on KNN. Is it the right one?
bayerj

1
I meant it to emphasize what is understood by a discriminative algorithm versus a generative one.
jpmuc

5

The answer by @jpmuc doesn't seem to be accurate. Generative models model the underlying distribution P(x|Ci) and then use Bayes' theorem to find the posterior probabilities. That is exactly what that answer shows, yet it concludes the exact opposite. :O

For KNN to be a generative model, we should be able to generate synthetic data. It seems that this is possible once we have some initial training data. But starting from no training data and generating synthetic data is not possible. So KNN doesn't fit nicely with generative models.

One may argue that KNN is a discriminative model because we can draw a discriminant boundary for classification, or because we can compute the posterior P(Ci|x). But both of these are true of generative models as well. A true discriminative model doesn't tell us anything about the underlying distribution. In the case of KNN, however, we know a lot about the underlying distribution; in fact, we are storing the entire training set.

So it seems KNN is midway between generative and discriminative models. That is probably why KNN is not categorized as either generative or discriminative in reputable articles. Let's just call it a non-parametric model.


I do not agree. "Generative classifiers learn a model of the joint probability, p(x, y), of the inputs x and the label y, and make their predictions by using Bayes rules to calculate p(y|x), and then picking the most likely label y. Discriminative classifiers model the posterior p(y|x) directly, or learn a direct map from inputs x to the class labels". See "On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes".
jpmuc


1

I agree that kNN is discriminative. The reason is that it does not explicitly store or try to learn a (probabilistic) model that explains the data (as opposed to, e.g., Naive Bayes).

The answer by juampa confuses me since, to my understanding, a generative classifier is one that attempts to explain how the data is generated (e.g. using a model), and that answer says that it is discriminative because of this reason...


1
A generative model learns P(Ck,X), so you can generate more data using that joint distribution. In contrast, a discriminative model would learn P(Ck|X). This is what @juampa is pointing at with KNN.
Zhubarb

1
At classification time, both generative and discriminative classifiers end up using conditional probabilities to make predictions. However, a generative classifier learns the joint probability and computes the conditional from it by Bayes' rule, while a discriminative classifier either computes the conditional directly or provides as good an approximation of it as it can.
rapaio
Licensed under cc by-sa 3.0 with attribution required.