nsasaki's Study Notes

PRML Reading Group, Session 1: Summary, Part 3

PRML

Chapter 1: Introduction

Probability

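Frequentist (objective) vs. Bayesian (subjective)

Frequentist: frequentism
- Objective probability
- A probability that exists in the world independently of any observer, such as a frequency or a propensity.
Bayesian: Bayesianism
- Subjective probability
- A degree of belief or confidence held by a person (it cannot be determined objectively).

The two interpret probability differently, so their approaches differ, but they are not separate theories of probability.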

P(A) := (#A)/(#Ω)

A: an uncertain event
w: parameters
D: { $t_{1}$ , ... , $t_{n}$ }, the observed data
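As a small illustration of the counting definition P(A) := (#A)/(#Ω), here is a sketch that assumes all outcomes in Ω are equally likely. The die example and the function name `counting_probability` are my own, not from the book.

```python
from fractions import Fraction

def counting_probability(event, sample_space):
    """P(A) = #A / #Omega, assuming every outcome is equally likely."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))

omega = set(range(1, 7))   # Omega: outcomes of one roll of a fair die
even = {2, 4, 6}           # event A: the roll is even

p_even = counting_probability(even, omega)
print(p_even)  # 1/2
```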

Frequentist: maximum likelihood

maximize P(D | w): the likelihood function

or, equivalently,

minimize -log P(D | w): the error function
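To see that the two formulations pick the same w, here is a minimal sketch using a coin-flip model, which is my own choice of example. The data are hypothetical, and the grid search is only for illustration (a closed-form answer exists for this model).

```python
import math

def likelihood(w, data):
    """P(D | w) for independent coin flips with P(heads) = w."""
    heads = sum(data)
    tails = len(data) - heads
    return w ** heads * (1 - w) ** tails

def error_function(w, data):
    """-log P(D | w)."""
    return -math.log(likelihood(w, data))

data = [1, 0, 1, 1, 0, 1, 1, 1]            # hypothetical flips, 1 = heads
grid = [i / 100 for i in range(1, 100)]    # candidate values of w in (0, 1)

w_max_likelihood = max(grid, key=lambda w: likelihood(w, data))
w_min_error = min(grid, key=lambda w: error_function(w, data))
print(w_max_likelihood, w_min_error)  # 0.75 0.75
```

Both searches land on the fraction of heads in the data (6 out of 8), because taking -log turns the maximum into a minimum without moving it.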

Bayesian:

Assume a prior distribution P(w) before observing the data.
posterior distribution P(w | D) = P(D | w) P(w) / P(D)
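Bayes' theorem can be tried out on a discrete grid of candidate values for w. This is only a sketch: the coin-flip likelihood, the uniform prior, and the data are all assumptions made for illustration.

```python
def posterior_on_grid(grid, prior, data):
    """Return P(w | D) for each w in grid, using Bayes' theorem."""
    heads = sum(data)
    tails = len(data) - heads
    likelihood = [w ** heads * (1 - w) ** tails for w in grid]   # P(D | w)
    joint = [lik * p for lik, p in zip(likelihood, prior)]       # P(D | w) P(w)
    evidence = sum(joint)                                        # P(D)
    return [j / evidence for j in joint]

grid = [i / 10 for i in range(11)]     # candidate values of w: 0.0, 0.1, ..., 1.0
prior = [1 / len(grid)] * len(grid)    # uniform prior P(w) (an assumption)
data = [1, 1, 0, 1, 0]                 # hypothetical flips, 1 = heads

posterior = posterior_on_grid(grid, prior, data)
w_most_probable = grid[posterior.index(max(posterior))]
print(w_most_probable)
```

Dividing by P(D) is what makes the posterior sum to one over the grid.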

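Each approach has strengths and weaknesses, and it is important to understand both clearly.
To see how they are applied in practice, we return to the earlier example.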

Strategy for representing noise

Note: there is a lot more math from here on, and I find it tough going too.
It is fine to read it as "so that is how it works" and move on.

p( t | x, w, $\beta$ ) = N (t | y (x, w) , $\beta^{-1}$ )

Training set (the data points are assumed to be drawn independently)
- $\mathbf{x} = (x_{1}, \ldots, x_{N})^{\mathrm{T}}$
- $\mathbf{t} = (t_{1}, \ldots, t_{N})^{\mathrm{T}}$

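$p( \mathbf{t} | \mathbf{x}, w, \beta ) = \prod_{n=1}^{N} N (t_{n} | y (x_{n}, w) , \beta^{-1})$
Here $\mathbf{t}$ and $\mathbf{x}$ are the vectors of training targets and inputs, and w is the parameter vector.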

Density of the normal distribution

$N(x | \mu , \sigma^{2}) = \frac{1}{\sqrt{2\pi\sigma^{2}}}e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$
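The density is short enough to write out directly. The sketch below also runs a crude Riemann sum as a sanity check that it integrates to about 1. The function name `normal_pdf` and the integration range are my own choices.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Density N(x | mu, sigma^2) of the univariate normal distribution."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# Crude Riemann sum over [-8, 8] for the standard normal (mu = 0, sigma^2 = 1).
step = 0.001
area = sum(normal_pdf(-8 + i * step, 0.0, 1.0) * step for i in range(16000))

peak = normal_pdf(0.0, 0.0, 1.0)   # equals 1 / sqrt(2 * pi)
print(peak, area)
```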

Since the normal density has this form, the error function (the negative log likelihood) is

$-\log p(\mathbf{t} | \mathbf{x}, w, \beta ) = \frac{\beta}{2}\sum_{n=1}^{N} \{ y(x_{n},w)-t_{n} \}^{2} - \frac{N}{2} \log \beta + \frac{N}{2}\log 2\pi$
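A quick numerical check helps me trust this formula. The sketch below compares the closed form, with the sum running over n = 1, ..., N, against the negative log of the product of Gaussian densities. It assumes y(x, w) is a polynomial in x; the data, w, and beta are hypothetical values.

```python
import math

def normal_pdf(x, mu, sigma2):
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def y(x, w):
    """Polynomial y(x, w) = w[0] + w[1] x + ... + w[M] x^M (assumed model)."""
    return sum(w_j * x ** j for j, w_j in enumerate(w))

def nll_from_density(xs, ts, w, beta):
    """-log of the product of N(t_n | y(x_n, w), 1/beta) over all data points."""
    return -sum(math.log(normal_pdf(t, y(x, w), 1.0 / beta)) for x, t in zip(xs, ts))

def nll_closed_form(xs, ts, w, beta):
    """beta/2 * sum of squared errors - N/2 log(beta) + N/2 log(2 pi)."""
    n = len(xs)
    sse = sum((y(x, w) - t) ** 2 for x, t in zip(xs, ts))
    return beta / 2 * sse - n / 2 * math.log(beta) + n / 2 * math.log(2 * math.pi)

xs = [0.0, 0.5, 1.0, 1.5]          # hypothetical inputs
ts = [0.2, 0.4, 1.1, 1.3]          # hypothetical targets
w, beta = [0.1, 0.8], 4.0          # hypothetical parameters

direct = nll_from_density(xs, ts, w, beta)
closed = nll_closed_form(xs, ts, w, beta)
print(direct, closed)
```

Only the first term depends on w, so minimizing the error function over w is the same as minimizing the sum of squared errors.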

From this we obtain the

predictive distribution

$p(t | x, w_{ML},\beta_{ML} ) = N \left( t | y\left(x, w_{ML}\right),\beta^{-1}_{ML} \right)$

where $w_{ML}$ and $\beta_{ML}$ are the values that minimize the error function.
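For a concrete feel, here is a sketch of the maximum likelihood fit for the simplest case, a straight line y(x, w) = w0 + w1 x. It uses the standard results that w_ML is the least-squares solution and that 1/β_ML is the mean squared residual at w_ML. The data and the function name `fit_line_ml` are hypothetical.

```python
def fit_line_ml(xs, ts):
    """Maximum likelihood fit of y(x, w) = w0 + w1 * x under Gaussian noise.

    Returns ((w0, w1), beta_ml). Assumes the residuals are not all zero,
    otherwise beta_ml would be infinite.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    t_mean = sum(ts) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxt = sum((x - x_mean) * (t - t_mean) for x, t in zip(xs, ts))
    w1 = sxt / sxx                      # least-squares slope
    w0 = t_mean - w1 * x_mean           # least-squares intercept
    mean_sq_residual = sum((w0 + w1 * x - t) ** 2 for x, t in zip(xs, ts)) / n
    return (w0, w1), 1.0 / mean_sq_residual   # 1 / beta_ML = mean squared residual

xs = [0.0, 1.0, 2.0, 3.0]    # hypothetical inputs
ts = [0.1, 0.9, 2.1, 2.9]    # hypothetical noisy targets

(w0, w1), beta_ml = fit_line_ml(xs, ts)
x_new = 4.0
print("predictive mean:", w0 + w1 * x_new, "predictive variance:", 1.0 / beta_ml)
```

Note that the predictive variance 1/β_ML is the same for every x here; the uncertainty in w itself is not accounted for.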
Next, the Bayesian approach.

Bayesian approach

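As the prior distribution, take a Gaussian over w:

$p(w | \alpha) = N(w | 0, \alpha^{-1}I) = \left(\frac{\alpha}{2\pi}\right)^{(M+1)/2} \exp\left\{-\frac{\alpha}{2}w^{\mathrm{T}}w\right\}$

- $\alpha$ : hyperparameter (the precision of the prior)
- M : the order of the polynomial, so w has M+1 elements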

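By Bayes' theorem,

$p(w | \mathbf{x}, \mathbf{t}, \alpha, \beta) \propto p(\mathbf{t} | \mathbf{x}, w, \beta)\, p(w | \alpha)$

- Maximizing this posterior gives a point estimate of w (maximum posterior, MAP).
- The separately specified parameters $\alpha$ and $\beta$ combine into a single regularization parameter, $\lambda = \alpha / \beta$.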
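To see why the two parameters collapse into one, it helps to write out the negative log posterior. This is a sketch that assumes a zero-mean Gaussian prior $N(w | 0, \alpha^{-1}I)$ and the Gaussian noise model from above; "const" collects every term that does not depend on w.

```latex
\begin{align*}
-\ln p(w \mid \mathbf{x}, \mathbf{t}, \alpha, \beta)
  &= -\ln p(\mathbf{t} \mid \mathbf{x}, w, \beta) - \ln p(w \mid \alpha) + \text{const} \\
  &= \frac{\beta}{2}\sum_{n=1}^{N}\{y(x_{n}, w) - t_{n}\}^{2}
     + \frac{\alpha}{2}\, w^{\mathrm{T}} w + \text{const}
\end{align*}
```

Dividing by $\beta$ does not move the minimum, so the point estimate minimizes $\frac{1}{2}\sum_{n=1}^{N}\{y(x_{n}, w) - t_{n}\}^{2} + \frac{\lambda}{2} w^{\mathrm{T}} w$ with $\lambda = \alpha/\beta$, which is the regularized sum-of-squares error.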

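Predictive distribution

$p(t | x, \mathbf{x}, \mathbf{t}) = \int p(t | x, w)\, p(w | \mathbf{x}, \mathbf{t})\, dw = N(t | m(x), s^{2}(x))$

- Integrating over w marginalizes it out ($\mathbf{x}$ and $\mathbf{t}$ are the training inputs and targets, x is the new input).
where...
- $m(x) = \beta \phi(x)^{\mathrm{T}}S \sum_{n=1}^{N} \phi(x_{n}) t_{n}$
- $s^{2}(x) = \beta^{-1} + \phi(x)^{\mathrm{T}}S\phi(x)$
- - the second term expresses the uncertainty in w
- $S^{-1} = \alpha I + \beta \sum_{n=1}^{N} \phi(x_{n}) \phi(x_{n})^{\mathrm{T}}$
- - I is the identity matrix
- $\phi(x) = (x^{0}, \ldots, x^{M})^{\mathrm{T}}$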
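The predictive mean and variance can be computed directly once S is known. The sketch below assumes the standard PRML forms $S^{-1} = \alpha I + \beta \sum_{n} \phi(x_{n})\phi(x_{n})^{\mathrm{T}}$ and $s^{2}(x) = \beta^{-1} + \phi(x)^{\mathrm{T}}S\phi(x)$, with m(x) as given above. It is plain Python with a small hand-written matrix inverse; the data and the values of alpha and beta are hypothetical.

```python
def phi(x, M):
    """Basis vector (x^0, ..., x^M)."""
    return [x ** j for j in range(M + 1)]

def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def predictive(x, xs, ts, M, alpha, beta):
    """Return (m(x), s^2(x)) of the Bayesian predictive distribution."""
    dim = M + 1
    # S^{-1} = alpha * I + beta * sum_n phi(x_n) phi(x_n)^T
    s_inv = [[alpha if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for xn in xs:
        p = phi(xn, M)
        for i in range(dim):
            for j in range(dim):
                s_inv[i][j] += beta * p[i] * p[j]
    S = invert(s_inv)
    px = phi(x, M)
    # v = sum_n phi(x_n) t_n
    v = [sum(phi(xn, M)[i] * tn for xn, tn in zip(xs, ts)) for i in range(dim)]
    S_v = [sum(S[i][j] * v[j] for j in range(dim)) for i in range(dim)]
    S_px = [sum(S[i][j] * px[j] for j in range(dim)) for i in range(dim)]
    mean = beta * sum(px[i] * S_v[i] for i in range(dim))
    variance = 1.0 / beta + sum(px[i] * S_px[i] for i in range(dim))
    return mean, variance

xs = [0.0, 1.0, 2.0, 3.0]       # hypothetical training inputs
ts = [0.1, 0.9, 2.1, 2.9]       # hypothetical training targets
M, alpha, beta = 1, 0.005, 11.1  # hypothetical settings

mean_near, var_near = predictive(1.5, xs, ts, M, alpha, beta)
mean_far, var_far = predictive(10.0, xs, ts, M, alpha, beta)
print(var_near, var_far)
```

Unlike the maximum likelihood case, the variance now depends on x: it is always above 1/β and grows as x moves away from the training data.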

Decision theory

Probability (quantifying uncertainty) + Decision (making the optimal decision)

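Decision criteria

Minimizing the misclassification rate
Minimizing the expected loss: the loss function weights each kind of error (minimizing the misclassification rate is a special case)
Reject option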
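The three criteria fit into one small function. This is a sketch under my own conventions: `posteriors[k]` is p(C_k | x), `loss[k][j]` is the loss for choosing class j when the truth is class k, and an input is rejected when the largest posterior is at or below a threshold. All numbers are hypothetical.

```python
REJECT = "reject"

def decide(posteriors, loss, reject_threshold=None):
    """Choose the class with the smallest expected loss, or reject."""
    if reject_threshold is not None and max(posteriors) <= reject_threshold:
        return REJECT
    n_classes = len(posteriors)
    # Expected loss of choosing class j: sum_k loss[k][j] * p(C_k | x)
    expected = [sum(loss[k][j] * posteriors[k] for k in range(n_classes))
                for j in range(n_classes)]
    return expected.index(min(expected))

zero_one = [[0, 1], [1, 0]]      # every mistake costs the same
asymmetric = [[0, 10], [1, 0]]   # missing class 0 costs ten times more

posteriors = [0.3, 0.7]
choice_zero_one = decide(posteriors, zero_one)       # most probable class
choice_asymmetric = decide(posteriors, asymmetric)   # shifts toward class 0
choice_reject = decide([0.55, 0.45], zero_one, reject_threshold=0.6)
print(choice_zero_one, choice_asymmetric, choice_reject)  # 1 0 reject
```

With the 0-1 loss the rule reduces to picking the most probable class, which is why minimizing the misclassification rate counts as a special case of minimizing the expected loss.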

Examples follow in Session 2.