BOOK: Applied Bayesian Statistics With R and OpenBUGS Examples
The content below is a question from Chapter 2.
Background: the law of total probability
The following example needs two ingredients. The first is the prior probability P(Mi), whose value may change at every step. The second is the conditional probability P(A|Mi), whose value stays the same across steps because it comes from the real-world background or from historical information. Our goal is to calculate P(Mj|A), which is also a conditional probability: the probability that event Mj occurs given that event A has occurred.
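Tips: the events M1, M2, ... are mutually exclusive and exhaustive. Mutually exclusive means that no two of them can occur together; this is not the same as independence, and independence is not required. What is required is that the Mi together make up the whole sample space.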
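The update described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the function name `bayes_posterior` and the example numbers are assumptions made up for this sketch. The denominator is the law of total probability, P(A) = sum over i of P(A|Mi) P(Mi).

```python
from fractions import Fraction


def bayes_posterior(priors, likelihoods):
    """Return the list of P(M_j | A) for every j.

    priors[j]      = P(M_j); the M_j are mutually exclusive and exhaustive
    likelihoods[j] = P(A | M_j)
    """
    # Numerators of Bayes' theorem: P(A | M_j) * P(M_j)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Law of total probability: P(A) = sum_i P(A | M_i) * P(M_i)
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("P(A) is zero, so the posterior is undefined")
    return [j / evidence for j in joint]


# Hypothetical numbers, chosen only to exercise the function
example = bayes_posterior(
    [Fraction(1, 4), Fraction(3, 4)],   # P(M1), P(M2)
    [Fraction(1, 2), Fraction(1, 10)],  # P(A|M1), P(A|M2)
)
print([str(p) for p in example])
```

Exact fractions are used instead of floats so that the posterior probabilities can be compared with hand calculations without rounding noise.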
Then it's time to introduce our example.
Hemophilia is a rare hereditary bleeding disorder caused by a defect in genes that control the body’s production of blood-clotting factors. It occurs almost exclusively in males. However, women may be carriers of the hemophilia gene. Female carriers of the hemophilia gene usually show no physical symptoms of hemophilia. A son born of a woman who is a hemophilia carrier and a man who does not have hemophilia has a 0.5 probability of inheriting hemophilia from his mother. A son born of a woman who is not a carrier and a man who does not have hemophilia has zero probability of inheriting hemophilia.
Danielle is a young married woman. Her husband does not have hemophilia. Because Danielle’s mother is known to be a carrier of hemophilia, there is a 0.5 probability that Danielle inherited a hemophilia gene from her mother and is also a carrier. We may consider two possible “models”: Danielle is a carrier, and Danielle is not a carrier. Danielle gives birth to three sons. None of them are identical twins, and we will consider their hemophilia outcomes to be independent conditional on her carrier status. For each of the sons, we will define a random variable Yi that takes on the value 1 if the son has hemophilia and 0 if he does not.
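In short: hemophilia occurs almost exclusively in males, and a woman who carries the disease-causing gene usually shows no symptoms. According to the background, if the mother carries the gene and the father does not have hemophilia, a son has hemophilia with probability 0.5; if the mother does not carry the gene and the father does not have hemophilia, that probability is 0. (This information comes from research results, so the conditional probabilities derived from it stay fixed, or "locked", at every step.) The mother of our main character, Danielle (D), is a carrier, so D has a 0.5 probability of having inherited the gene. (This is the prior probability. It reflects what is known before any data are observed, so it is only a starting belief, and it is exactly the quantity that gets updated in the steps that follow.)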
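Now introduce two models: D is a carrier of the gene (one model), and D is not a carrier (the other model). Years later D marries and has three sons; her husband does not have hemophilia. We use the notation y1 = 1 if the first son has hemophilia and y1 = 0 if he does not, and likewise y2 and y3 for the other sons. To be precise, we assume the sons are born one after another and none of them are identical twins, so their outcomes can be treated as independent of one another given D's carrier status.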
(1) Write down the prior probabilities of the models (to repeat: D is a carrier; D is not a carrier). From the information above, P(D is a carrier) = 0.5 and P(D is not a carrier) = 0.5.
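(2) Write down the conditional probabilities P(y=0|model) and P(y=1|model) under each model, that is, the probability that a son does or does not have hemophilia given that D is or is not a carrier. From the background: P(y=1|carrier) = 0.5, P(y=0|carrier) = 0.5, P(y=1|not a carrier) = 0, P(y=0|not a carrier) = 1.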
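The conditional probabilities asked for in (2) follow directly from the background text, and they can be written as a small lookup table. The sketch below is only one possible encoding; the name `LIKELIHOOD` and the keys `"carrier"` and `"not_carrier"` are labels invented here, not notation from the book.

```python
from fractions import Fraction

# P(y | model), read off the background text:
# a carrier's son has hemophilia with probability 0.5,
# a non-carrier's son with probability 0.
LIKELIHOOD = {
    "carrier":     {1: Fraction(1, 2), 0: Fraction(1, 2)},
    "not_carrier": {1: Fraction(0),    0: Fraction(1)},
}

# Sanity check: under each model the two outcomes must sum to 1
for model, table in LIKELIHOOD.items():
    assert sum(table.values()) == 1
    print(model, {y: str(p) for y, p in table.items()})
```

These values never change from step to step; only the probabilities attached to the two models do.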
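Pic.1 CP (conditional probabilities)
Question (b): Using the results from Question (a), and assuming the three sons are born one after another (3 steps), find the posterior probability P(model|y), that is, the conditional probability that D is a carrier of the hemophilia gene given whether each son has the disease. The observations are: the first son does not have hemophilia, the second son does, and the third son does not. The calculation process goes like this: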
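Pic.2 Calculation Process
Step 1 (y1 = 0), with prior P(carrier) = 0.5: P(carrier|y1=0) = 0.5 × 0.5 / (0.5 × 0.5 + 1 × 0.5) = 1/3.
Step 2 (y2 = 1), with the Step 1 posterior 1/3 as the new prior: P(carrier|y2=1) = 0.5 × (1/3) / (0.5 × (1/3) + 0 × (2/3)) = 1.
Step 3 (y3 = 0) follows the same idea, and the answer is still 1: once a son has hemophilia, the non-carrier model is ruled out, and no later observation can bring it back.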
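The three steps can also be checked numerically. The following is a minimal sketch under the assumptions stated in the example (prior 0.5, sons conditionally independent given carrier status, observations 0, 1, 0); the helper `update` and the dictionary keys are names invented for this illustration.

```python
from fractions import Fraction

# P(y | model) from the background text; key names are illustrative
LIKELIHOOD = {
    "carrier":     {1: Fraction(1, 2), 0: Fraction(1, 2)},
    "not_carrier": {1: Fraction(0),    0: Fraction(1)},
}


def update(prior, y):
    """One Bayes step: prior maps model -> P(model); y is 0 or 1."""
    # Numerators: P(y | model) * P(model)
    joint = {m: LIKELIHOOD[m][y] * p for m, p in prior.items()}
    # Law of total probability gives the denominator P(y)
    evidence = sum(joint.values())
    return {m: j / evidence for m, j in joint.items()}


# Prior from Question (a), then one update per son
belief = {"carrier": Fraction(1, 2), "not_carrier": Fraction(1, 2)}
history = []
for y in (0, 1, 0):  # son 1 unaffected, son 2 affected, son 3 unaffected
    belief = update(belief, y)  # the posterior becomes the next prior
    history.append(belief["carrier"])

print([str(p) for p in history])  # ['1/3', '1', '1']
```

The posterior from each step is fed back in as the prior for the next one, while the `LIKELIHOOD` table stays fixed throughout.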
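That is how Bayes' theorem, together with the law of total probability, applies to a simplified real-world example. The journey into Bayesian statistics has officially begun.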