Bayesian inference: Student's t-distribution
In Bayesian statistics, a (scaled, shifted) t-distribution arises as the marginal distribution of the unknown mean of a normal distribution, when the dependence on an unknown variance has been marginalised out:
$$
\begin{aligned}
p(\mu \mid d, i) &= \int p(\mu, \sigma^2 \mid d, i)\,d\sigma^2 \\
&= \int p(\mu \mid d, \sigma^2, i)\,p(\sigma^2 \mid d, i)\,d\sigma^2,
\end{aligned}
$$
where d stands for the data {x_i}, and i represents any other information that may have been used to create the model. The distribution is thus the compounding of the conditional distribution of μ given the data and σ² with the marginal distribution of σ² given the data.
With n data points, if uninformative (flat) location and scale priors $p(\mu \mid \sigma^2, i) = \text{const}$ and $p(\sigma^2 \mid i) \propto 1/\sigma^2$ can be taken for μ and σ², then Bayes' theorem gives
$$
\begin{aligned}
p(\mu \mid d, \sigma^2, i) &\sim \mathcal{N}(\bar{x}, \sigma^2/n), \\
p(\sigma^2 \mid d, i) &\sim \operatorname{Scale-inv-}\chi^2(\nu, s^2),
\end{aligned}
$$
a normal distribution and a scaled inverse chi-squared distribution respectively, where $\nu = n - 1$ and
$$
s^2 = \sum_i \frac{(x_i - \bar{x})^2}{n - 1}.
$$
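The compounding described above can be illustrated by simulation. The following is a minimal sketch, not a reference implementation: the data values, the seed, and the function name `posterior_mu_draws` are made up for illustration. It draws σ² from the scaled inverse chi-squared distribution (as ν s² divided by a chi-squared variate), then draws μ from the normal distribution with mean x̄ and variance σ²/n, and checks that the standardised draws behave like a t-distribution with ν degrees of freedom. The value 2.571 is the standard table value for the 97.5% quantile of the t-distribution with 5 degrees of freedom.

```python
import math
import random
import statistics


def posterior_mu_draws(data, n_draws, rng):
    """Draw mu from its marginal posterior by compounding two distributions."""
    n = len(data)
    x_bar = statistics.mean(data)
    s2 = statistics.variance(data)  # sample variance, divides by n - 1
    nu = n - 1
    draws = []
    for _ in range(n_draws):
        # A chi-squared variate with nu degrees of freedom is Gamma(nu/2, scale=2).
        chi2 = rng.gammavariate(nu / 2.0, 2.0)
        # Scaled inverse chi-squared draw for sigma^2.
        sigma2 = nu * s2 / chi2
        # Conditional normal draw for mu given sigma^2.
        draws.append(rng.gauss(x_bar, math.sqrt(sigma2 / n)))
    return x_bar, s2, nu, draws


data = [4.8, 5.1, 5.6, 4.9, 5.3, 5.0]  # hypothetical measurements
rng = random.Random(0)                 # fixed seed, chosen arbitrarily
x_bar, s2, nu, draws = posterior_mu_draws(data, 100_000, rng)

# Standardise the draws: t = (mu - x_bar) / (s / sqrt(n)).
scale = math.sqrt(s2 / len(data))
t_vals = [(m - x_bar) / scale for m in draws]

# Fraction of standardised draws inside the central 95% region of t with nu = 5.
inside = sum(abs(t) < 2.571 for t in t_vals) / len(t_vals)
print(f"nu = {nu}, fraction with |t| < 2.571: {inside:.4f}")
```

If the compounding argument holds, the printed fraction should be close to 0.95, up to Monte Carlo error.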
The marginalisation integral thus becomes
$$
\begin{aligned}
p(\mu \mid d, i) &\propto \int_0^\infty \frac{1}{\sqrt{\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}\,n(\mu - \bar{x})^2\right) \cdot \sigma^{-\nu-2} \exp\left(-\frac{\nu s^2}{2\sigma^2}\right) d\sigma^2 \\
&\propto \int_0^\infty \sigma^{-\nu-3} \exp\left(-\frac{1}{2\sigma^2}\left(n(\mu - \bar{x})^2 + \nu s^2\right)\right) d\sigma^2.
\end{aligned}
$$
This can be evaluated by substituting $z = a/(2\sigma^2)$, where $a = n(\mu - \bar{x})^2 + \nu s^2$, giving
$$
dz = -\frac{a}{2\sigma^4}\,d\sigma^2,
$$
so
$$
p(\mu \mid d, i) \propto a^{-\frac{\nu+1}{2}} \int_0^\infty z^{(\nu-1)/2} \exp(-z)\,dz.
$$
But the z integral is now a standard gamma integral, which evaluates to a constant, leaving
$$
\begin{aligned}
p(\mu \mid d, i) &\propto a^{-\frac{\nu+1}{2}} \\
&\propto \left(1 + \frac{n(\mu - \bar{x})^2}{\nu s^2}\right)^{-\frac{\nu+1}{2}}.
\end{aligned}
$$
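The substitution step can be checked numerically. The sketch below assumes hypothetical summary values (n = 6, x̄ = 5.0, s² = 0.09) and a hand-rolled trapezoid rule in the variable w = log σ²; the function name `marginal_integral` is invented for this example. It compares the integral over σ² with the closed form Γ((ν+1)/2) (a/2)^(-(ν+1)/2) that the gamma integral implies, and confirms that the dependence on μ enters only through a^(-(ν+1)/2).

```python
import math


def marginal_integral(a, nu, w_lo=-12.0, w_hi=25.0, step=0.01):
    """Trapezoid estimate of the integral over sigma^2 in (0, inf) of
    sigma^(-nu-3) * exp(-a / (2 sigma^2)), using sigma^2 = exp(w)."""
    def integrand(w):
        v = math.exp(w)  # v = sigma^2
        # sigma^(-nu-3) d(sigma^2) = v^(-(nu+3)/2) * v dw = v^(-(nu+1)/2) dw
        return v ** (-(nu + 1) / 2.0) * math.exp(-a / (2.0 * v))

    steps = int(round((w_hi - w_lo) / step))
    total = 0.5 * (integrand(w_lo) + integrand(w_hi))
    for k in range(1, steps):
        total += integrand(w_lo + k * step)
    return total * step


# Hypothetical summary statistics, chosen only for illustration.
n, x_bar, s2 = 6, 5.0, 0.09
nu = n - 1


def a_of(mu):
    """a = n (mu - x_bar)^2 + nu s^2, as defined in the text."""
    return n * (mu - x_bar) ** 2 + nu * s2


a1, a2 = a_of(5.1), a_of(5.4)
numeric1 = marginal_integral(a1, nu)
numeric2 = marginal_integral(a2, nu)

# Closed form from the gamma integral: Gamma((nu+1)/2) * (a/2)^(-(nu+1)/2).
closed1 = math.gamma((nu + 1) / 2.0) * (a1 / 2.0) ** (-(nu + 1) / 2.0)

# The ratio at two values of mu depends only on a^(-(nu+1)/2).
ratio_numeric = numeric1 / numeric2
ratio_closed = (a1 / a2) ** (-(nu + 1) / 2.0)
print(numeric1, closed1)
print(ratio_numeric, ratio_closed)
```

The integration limits in w are adequate for values of a of order one, as here; very small or very large a would need a different range.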
This is a form of the t-distribution with an explicit scaling and shifting that will be explored in more detail in a further section below. It can be related to the standardised t-distribution by the substitution
$$
t = \frac{\mu - \bar{x}}{s/\sqrt{n}}.
$$
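As a rough illustration of how this substitution is used, the sketch below evaluates the normalised posterior density of μ as a standard t density in t, divided by the scale s/√n. The summary values (n = 6, x̄ = 5.0, s² = 0.09) and the helper names are assumptions made for the example. The quantile 2.571 is the standard table value for the 97.5% point of the t-distribution with 5 degrees of freedom, used here to form a central 95% interval for μ.

```python
import math


def posterior_mu_pdf(mu, x_bar, s2, n):
    """Density of mu: standard t density with nu = n - 1 degrees of freedom,
    evaluated at t = (mu - x_bar) / (s / sqrt(n)) and divided by the scale."""
    nu = n - 1
    scale = math.sqrt(s2 / n)
    t = (mu - x_bar) / scale
    log_norm = (math.lgamma((nu + 1) / 2.0) - math.lgamma(nu / 2.0)
                - 0.5 * math.log(nu * math.pi))
    log_kernel = -(nu + 1) / 2.0 * math.log1p(t * t / nu)
    return math.exp(log_norm + log_kernel) / scale


def trapezoid(f, lo, hi, steps):
    """Composite trapezoid rule for f on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h


# Hypothetical summary statistics, chosen only for illustration.
n, x_bar, s2 = 6, 5.0, 0.09
scale = math.sqrt(s2 / n)


def pdf(mu):
    return posterior_mu_pdf(mu, x_bar, s2, n)


# The density should integrate to one (tails beyond 200 scales are negligible).
total_mass = trapezoid(pdf, x_bar - 200 * scale, x_bar + 200 * scale, 40_000)

# Central 95% interval for mu from the table quantile for nu = 5.
t_975 = 2.571
lower, upper = x_bar - t_975 * scale, x_bar + t_975 * scale
interval_mass = trapezoid(pdf, lower, upper, 2_000)
print(f"interval: ({lower:.3f}, {upper:.3f})")
print(f"total mass: {total_mass:.6f}, interval mass: {interval_mass:.4f}")
```

With these priors the interval coincides numerically with the classical t confidence interval for the mean, although its interpretation is as a posterior interval for μ.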
The derivation above has been presented for the case of uninformative priors for μ and σ²; but it will be apparent that any priors that lead to a normal distribution being compounded with a scaled inverse chi-squared distribution will lead to a t-distribution with scaling and shifting for p(μ | d, i), although the scaling parameter corresponding to s²/n above will then be influenced both by the prior information and the data, rather than just by the data as above.
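One common family of informative priors with this property is the conjugate normal and scaled inverse chi-squared prior, which is not spelled out in the text above and is used here only as an assumed example. Under that assumption the prior is μ | σ² ~ N(μ0, σ²/κ0) and σ² ~ Scale-inv-χ²(ν0, σ0²), and the marginal posterior of μ is a t-distribution with ν_n degrees of freedom, location μ_n, and squared scale σ_n²/κ_n. The parameter names and all numerical values below are hypothetical.

```python
def conjugate_update(n, x_bar, s2, mu0, kappa0, nu0, sigma0_sq):
    """Posterior parameters under the assumed conjugate prior
    mu | sigma^2 ~ N(mu0, sigma^2 / kappa0), sigma^2 ~ Scale-inv-chi2(nu0, sigma0_sq)."""
    kappa_n = kappa0 + n
    nu_n = nu0 + n
    # Posterior location: precision-weighted average of prior mean and sample mean.
    mu_n = (kappa0 * mu0 + n * x_bar) / kappa_n
    # Posterior sum of squares combines prior, data, and prior-data disagreement.
    nu_sigma_sq = (nu0 * sigma0_sq
                   + (n - 1) * s2
                   + (kappa0 * n / kappa_n) * (x_bar - mu0) ** 2)
    sigma_n_sq = nu_sigma_sq / nu_n
    return mu_n, kappa_n, nu_n, sigma_n_sq


# Hypothetical data summary and prior settings.
mu_n, kappa_n, nu_n, sigma_n_sq = conjugate_update(
    n=6, x_bar=5.0, s2=0.09, mu0=4.0, kappa0=2.0, nu0=3.0, sigma0_sq=0.2
)

# Squared scale of the marginal t-distribution for mu; this plays the role
# of s^2 / n in the uninformative case.
scale_sq = sigma_n_sq / kappa_n
print(mu_n, nu_n, sigma_n_sq, scale_sq)
```

Here the squared scale depends on the prior settings as well as on the data, which is the point made in the paragraph above.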