Exercise 2.2 in The Elements of Statistical Learning

10

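The textbook first generates two-class data as follows:

[image: the textbook's description of the data-generating process]

This gives:

[image: the resulting figure]

It then asks:

[image: the exercise statement]

I first tried to solve this by modeling the problem with the following graphical model:

[image: the graphical model]

where $c$ is the label, $h\,(1\le h \le 10)$ is the index of the selected mean $m_h^c$, and $x$ is the data point. This gives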

$$\begin{aligned} \Pr(x\mid m_h^c) &= \mathcal{N}(m_h^c,\mathbf{I}/5)\\ \Pr(m_h^c\mid h,c=\mathrm{blue}) &= \mathcal{N}((1,0)^T,\mathbf{I})\\ \Pr(m_h^c\mid h,c=\mathrm{orange}) &= \mathcal{N}((0,1)^T,\mathbf{I})\\ \Pr(h) &= \frac{1}{10}\\ \Pr(c) &= \frac{1}{2} \end{aligned}$$

The boundary, in turn, is $\{x:\Pr(c=\mathrm{blue}\mid x)=\Pr(c=\mathrm{orange}\mid x)\}$. By Bayes' rule,

$$\begin{aligned} \Pr(c\mid x) &= \frac{\Pr(x\mid c)\Pr(c)}{\sum_c\Pr(x\mid c)\Pr(c)}\\ \Pr(x\mid c) &= \sum_h\int_{m_h^c}\Pr(h)\Pr(m_h^c\mid h,c)\Pr(x\mid m_h^c)\,dm_h^c \end{aligned}$$

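However, I later realized that the problem setup is symmetric, so this would simply yield $x=y$ as the boundary. If the problem is instead asking for the boundary conditioned on the $m_h^c$, the equation would involve $40$ parameters.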
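The symmetry argument can be made explicit. As a sketch under the model above, write $\mu_{\mathrm{blue}}=(1,0)^T$ and $\mu_{\mathrm{orange}}=(0,1)^T$ (the symbol $\mu_c$ is introduced here only for illustration). Integrating out the center gives a single Gaussian per class, and with equal priors the boundary follows:

$$\begin{aligned} \Pr(x\mid c) &= \sum_h \Pr(h)\int \mathcal{N}(m_h^c;\mu_c,\mathbf{I})\,\mathcal{N}(x;m_h^c,\mathbf{I}/5)\,dm_h^c = \mathcal{N}\left(x;\mu_c,\tfrac{6}{5}\mathbf{I}\right)\\ \Pr(c=\mathrm{blue}\mid x)=\Pr(c=\mathrm{orange}\mid x) &\iff \|x-\mu_{\mathrm{blue}}\|^2=\|x-\mu_{\mathrm{orange}}\|^2\\ &\iff (x_1-1)^2+x_2^2 = x_1^2+(x_2-1)^2\\ &\iff x_1=x_2 \end{aligned}$$

Here $x=(x_1,x_2)^T$, so $x_1=x_2$ is the line written as $x=y$ in this post.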

So am I misunderstanding something? Thanks.

self-study bayesian

— 지위 안

8

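I don't think you are supposed to find an analytic expression for the Bayes decision boundary for a given realisation of the $m_k$'s. Similarly, I doubt you are supposed to get the boundary over the distribution of the $m_k$, since, as you noted, that is just $x=y$ by symmetry.

I think what you need to show is a program that can compute the decision boundary for a given realisation of the $m_k$'s. This can be done by setting up a grid of $x$ and $y$ values, computing the class-conditional densities, and finding the points where they are equal.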
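To make the "find the points where they are equal" step concrete, here is a sketch of the equation such a grid search solves. It assumes the model in the question: ten centers per class with equal weights $1/10$, component covariance $\mathbf{I}/5$, and equal class priors. The notation $m_k^{c}$ for the realised centers of class $c$ follows the question's $m_h^c$ and is an assumption of this sketch.

$$\begin{aligned} \Pr(x\mid c, m_1^c,\dots,m_{10}^c) &= \frac{1}{10}\sum_{k=1}^{10}\frac{5}{2\pi}\exp\left(-\frac{5}{2}\|x-m_k^c\|^2\right)\\ \text{boundary:}\quad \sum_{k=1}^{10}\exp\left(-\frac{5}{2}\|x-m_k^{\mathrm{blue}}\|^2\right) &= \sum_{k=1}^{10}\exp\left(-\frac{5}{2}\|x-m_k^{\mathrm{orange}}\|^2\right) \end{aligned}$$

The common factors $\frac{1}{10}$, $\frac{5}{2\pi}$ and the priors $\frac{1}{2}$ cancel on both sides. The resulting condition is a sum of exponentials in $x$ that does not, in general, reduce to a simple curve, which is why evaluating both sides on a grid is a practical way to locate it.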

This code is a stab at it. IIRC there is actually code in Modern Applied Statistics with S that computes a decision boundary, but I don't have it handy right now.

# for dmvnorm/rmvnorm: multivariate normal distribution
library(mvtnorm)

# class-conditional density given mixture centers
f <- function(x, m)
{
    out <- numeric(nrow(x))
    for(i in seq_len(nrow(m)))
        out <- out + dmvnorm(x, m[i, ], diag(0.2, 2))
    out
}

# generate the class mixture centers
m1 <- rmvnorm(10, c(1,0), diag(2))
m2 <- rmvnorm(10, c(0,1), diag(2))
# and plot them
plot(m1, xlim=c(-2, 3), ylim=c(-2, 3), col="blue")
points(m2, col="red")

# display contours of the class-conditional densities
dens <- local({
    x <- y <- seq(-3, 4, len=701)
    f1 <- outer(x, y, function(x, y) f(cbind(x, y), m1))
    f2 <- outer(x, y, function(x, y) f(cbind(x, y), m2))
    list(x=x, y=y, f1=f1, f2=f2)
})

contour(dens$x, dens$y, dens$f1, col="lightblue", lty=2, levels=seq(.3, 3, len=10),
        labels="", add=TRUE)

contour(dens$x, dens$y, dens$f2, col="pink", lty=2, levels=seq(.3, 3, len=10),
        labels="", add=TRUE)

# find which points are on the Bayes decision boundary
eq <- local({
    f1 <- dens$f1
    f2 <- dens$f2
    pts <- seq(-3, 4, len=701)
    eq <- which(abs((dens$f1 - dens$f2)/(dens$f1 + dens$f2)) < 5e-3, arr.ind=TRUE)
    eq[,1] <- pts[eq[,1]]
    eq[,2] <- pts[eq[,2]]
    eq
})
points(eq, pch=16, cex=0.5, col="grey")

Result:

[image: the resulting plot]
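As a possible variant of the thresholding step, the boundary can also be traced as the zero contour of the difference between the two class-conditional densities. The following is a minimal, self-contained sketch and not part of the original answer: the seed, the grid resolution, and the names `mix_density`, `dens_diff` and `boundary` are assumptions made for illustration.

```r
# Sketch: trace the Bayes decision boundary as the zero contour of f1 - f2.
library(mvtnorm)  # dmvnorm/rmvnorm

set.seed(42)  # hypothetical seed, only so the sketch is reproducible

# class-conditional density: equal-weight mixture over the given centers
mix_density <- function(x, m) {
    out <- numeric(nrow(x))
    for (i in seq_len(nrow(m)))
        out <- out + dmvnorm(x, m[i, ], diag(0.2, 2))
    out / nrow(m)
}

# one realisation of the class mixture centers
m1 <- rmvnorm(10, c(1, 0), diag(2))
m2 <- rmvnorm(10, c(0, 1), diag(2))

# evaluate the density difference on a grid
gx <- gy <- seq(-3, 4, length.out = 141)
grid_pts <- as.matrix(expand.grid(x = gx, y = gy))
# expand.grid varies x fastest, so this fills a length(gx) x length(gy) matrix
dens_diff <- matrix(mix_density(grid_pts, m1) - mix_density(grid_pts, m2),
                    nrow = length(gx))

# contourLines() interpolates between grid points, so no tolerance is needed
boundary <- contourLines(gx, gy, dens_diff, levels = 0)

plot(m1, xlim = c(-3, 4), ylim = c(-3, 4), col = "blue", xlab = "x", ylab = "y")
points(m2, col = "red")
for (seg in boundary) lines(seg$x, seg$y, col = "grey40")
```

Because `contourLines()` returns connected line segments, the boundary is drawn as curves rather than as a cloud of near-equal grid points, and the result does not depend on a hand-picked tolerance.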

— Hong Ooi

3

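Actually, the book does ask for an analytic solution to this problem. And yes, you have to condition the boundary, but not on those 40 parameters: you never get to know them exactly. Instead, you have to condition on the 200 data points that you can see. So you need 200 parameters, but because the solution is written with sums, the answer does not look too complicated.

I would never have been able to derive this formula myself, so I only take credit for realizing that the analytic solution does not have to be ugly and then searching for it on Google. Luckily, it is provided by ~~the authors~~ some nice people, on pages 6-7.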

— Max

2

I wish I had stumbled upon the code above earlier. For what it is worth, here is the alternative code I wrote.

set.seed(1)
library(MASS)

#create original 10 center points/means for each class 
I.mat=diag(2)
mu1=c(1,0);mu2=c(0,1)
mv.dist1=mvrnorm(n = 10, mu1, I.mat)
mv.dist2=mvrnorm(n = 10, mu2, I.mat)

values1=NULL;values2=NULL

#create 100 observations for each class: for each observation, pick one of the
#10 centers at random (probability 1/10 each), then draw from N(center, I/5)
for(i in 1:100){
  mv.values1=mv.dist1[sample(nrow(mv.dist1),size=1),]
  sub.mv.dist1=mvrnorm(n = 1, mv.values1, I.mat/5)
  values1=rbind(sub.mv.dist1,values1)
}
values1

#same as above, for the second class
for(i in 1:100){
  mv.values2=mv.dist2[sample(nrow(mv.dist2),size=1),]
  sub.mv.dist2=mvrnorm(n = 1, mv.values2, I.mat/5)
  values2=rbind(sub.mv.dist2,values2)
}
values2

#did not find probability function in MASS, so used mnormt
library(mnormt)

#create grid of points
grid.vector1=seq(-2,2,0.1)
grid.vector2=seq(-2,2,0.1)
length(grid.vector1)*length(grid.vector2)
grid=expand.grid(grid.vector1,grid.vector2)



#calculate the density at each grid point under each of the 10 component distributions of class 1
grid.mat=as.matrix(grid)
n.grid=nrow(grid.mat)
prob.1=matrix(0,nrow=n.grid,ncol=10) #one column per center
for (j in 1:10){
  prob.1[,j]=dmnorm(grid.mat, mv.dist1[j,], I.mat/5)
}
#class-conditional density = equal-weight mixture of the 10 components
#(taking the max over components instead would give a nearest-center rule, not the Bayes rule)
prob1.mix=rowMeans(prob.1)

#second class - as per above
prob.2=matrix(0,nrow=n.grid,ncol=10) #one column per center
for (j in 1:10){
  prob.2[,j]=dmnorm(grid.mat, mv.dist2[j,], I.mat/5)
}
prob2.mix=rowMeans(prob.2)

#bind, and assign each grid point to the class with the larger density (equal priors)
prob.total=cbind(prob1.mix,prob2.mix)
class=rep(1,n.grid)
class[prob1.mix<prob2.mix]=2
head(cbind(prob.total,class))

#plot points
plot(grid[,1], grid[,2],pch=".", cex=3,col=ifelse(class==1, "coral", "cornflowerblue"))

points(values1,col="coral")
points(values2,col="cornflowerblue")

#check - original centers
# points(mv.dist1,col="coral")
# points(mv.dist2,col="cornflowerblue")

— user1885116