Biostatistics (9) R Examples: The Negative Binomial and Geometric Distributions

Author: jlyq617 | Published: 2018-01-29 14:34

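    This post uses two similar examples to distinguish the negative binomial distribution from the geometric distribution, and introduces the corresponding R functions, dnbinom() and dgeom(). The plotting exercises also show how to draw a bar plot, overlay different data sets in one chart, and add a legend.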
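
As a minimal sketch of how the two functions relate: in base R, both dnbinom() and dgeom() give the probability of a number of failures before a success, and the geometric distribution is the negative binomial with size = 1. The values of `x` and `p` below are illustrative assumptions, not taken from the exercises.

```r
# dnbinom(x, size, prob): probability of exactly x failures before the size-th success.
# dgeom(x, prob): probability of exactly x failures before the first success.
x <- 0:5
p <- 0.25  # illustrative success probability (assumption)

geom_pmf   <- dgeom(x, prob = p)
nbinom_pmf <- dnbinom(x, size = 1, prob = p)

# The geometric distribution is the negative binomial with size = 1.
print(all.equal(geom_pmf, nbinom_pmf))

# Closed form of the geometric probability mass function: (1 - p)^x * p
print(all.equal(geom_pmf, (1 - p)^x * p))
```

Both functions count failures, not total trials, so the number of households visited is the failure count plus the number of successes required.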

    Negative binomial distribution

    You are assigned to survey people with the family name “Cao”. Today you visit a small village in central Henan Province. According to the previous population census, 1/4 of the households here have the family name “Cao”. Suppose these households are randomly distributed across the village.
    (1) How many households do you expect to visit before you have found 3 households named “Cao”?
    You then find that people with the family name “Song” are also useful to your research. The census report records that 10 percent of households are named “Song”, so you conduct another investigation.
    (2) How many households do you expect to visit before you have found 3 households named “Song”?
    (3) Compare the variances of the two investigations and state whether the expected value for “Song” is reliable.
    (4) Finally, overlay two bar plots in one chart, each showing the probability of every possible number of required visits. Choose suitable colors and axis limits, and add a legend and a title. (Hint: some of the following functions are useful in your homework: dgeom(), dnbinom(), barplot(,add=T), rgb(,alpha=))

    A:

    Q1-3.png

    The variance of the number of visits in the “Song” survey is much higher than in the “Cao” survey. We therefore suggest that the expected value for the “Song” survey is NOT reliable.
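
The worked answers are only available as an image, so the following is a hedged sketch of how parts (1) to (3) could be computed, assuming the standard results for the number of trials N needed to reach r successes: E[N] = r / p and Var(N) = r(1 - p) / p^2. The helper name `visits_summary` is hypothetical and used only for illustration.

```r
# Total visits N needed to collect r successes with success probability p:
#   E[N] = r / p,  Var(N) = r * (1 - p) / p^2
# visits_summary is a hypothetical helper, not part of the original post.
visits_summary <- function(r, p) {
  c(mean = r / p, variance = r * (1 - p) / p^2)
}

cao  <- visits_summary(r = 3, p = 0.25)
song <- visits_summary(r = 3, p = 0.10)

print(rbind(Cao = cao, Song = song))
```

Under these formulas the “Cao” survey has mean 12 and variance 36, while the “Song” survey has mean 30 and variance 270.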

    (4)

    # x is the number of non-matching households (failures) before the 3rd match: 0 to 60
    x <- 0:60
    # Bar plot for "Cao" (p = 0.25) with a main title and axis labels, drawn in grey.
    # dnbinom() counts failures, so each bar is labeled with the total number of visits, x + 3.
    barplot(dnbinom(x, 3, 0.25), ylim = c(0, 0.1), main = "The Probability of Required Visit Times", ylab = "Probability", xlab = "Visit times", names.arg = x + 3, col = "grey")
    # Overlay "Song" (p = 0.1) in rgb(0, 0.5, 0.1) with alpha 0.7; add = TRUE draws on the existing plot instead of starting a new one
    barplot(dnbinom(x, 3, 0.1), col = rgb(0, 0.5, 0.1, alpha = 0.7), add = TRUE)
    # Add a legend at x = 55, y = 0.09; pch = 15 draws filled squares in the two bar colors
    legend(55, 0.09, c("Cao", "Song"), pch = 15, col = c("grey", rgb(0, 0.5, 0.1, alpha = 0.7)))
    
    Figure1.png
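
Because dnbinom() is parameterized by failures, the total number of visits for a bar at x is x plus the number of required successes. A small numerical check of that offset, assuming the “Cao” parameters (r = 3, p = 0.25) and a truncation point of 2000 failures chosen only to make the tail negligible:

```r
r <- 3
p <- 0.25
x <- 0:2000                      # failures; truncated far into the tail (assumption)
pmf <- dnbinom(x, size = r, prob = p)
visits <- x + r                  # total households visited

# Mean and variance of the total visit count, computed directly from the pmf
mean_visits <- sum(visits * pmf)
var_visits  <- sum((visits - mean_visits)^2 * pmf)
print(c(mean = mean_visits, variance = var_visits))
```

The mean computed this way agrees with r / p = 12 only when the visit count is taken as x + 3, which is why the bar labels should use that offset.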

    Geometric distribution

    You are assigned to survey people with the family name “Cao”. Today you visit a small village in central Henan Province. According to the previous population census, 2/5 of the households here have the family name “Cao”. Suppose these households are randomly distributed across the village.
    (1) How many households do you expect to visit before you find one household named “Cao”?
    You then find that people with the family name “Song” are also useful to your research. The census report records that 15 percent of households are named “Song”, so you conduct another investigation.
    (2) How many households do you expect to visit before you find one household named “Song”?
    (3) Compare the variances of the two investigations and state whether the expected value for “Song” is reliable.
    (4) Finally, overlay two bar plots in one chart, each showing the probability of every possible number of required visits. Choose suitable colors and axis limits, and add a legend and a title. (Hint: some of the following functions are useful in your homework: dgeom(), dnbinom(), barplot(,add=T), rgb(,alpha=))
    A:

    Q1-3.png

    The variance of the number of visits in the “Song” survey is much higher than in the “Cao” survey. We therefore suggest that the expected value for the “Song” survey is NOT reliable.
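
As with the first example, the worked answers are only in the image, so here is a hedged sketch of parts (1) to (3) under the standard geometric results for the number of trials N until the first success: E[N] = 1 / p and Var(N) = (1 - p) / p^2. The helper `first_hit_summary`, the seed, and the simulation size are assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed so the simulation is repeatable (assumption)

# Visits N until the first success: E[N] = 1 / p, Var(N) = (1 - p) / p^2
# first_hit_summary is a hypothetical helper, not part of the original post.
first_hit_summary <- function(p) {
  c(mean = 1 / p, variance = (1 - p) / p^2)
}

cao  <- first_hit_summary(0.40)
song <- first_hit_summary(0.15)
print(rbind(Cao = cao, Song = song))

# rgeom() returns failures before the first success, so add 1 for total visits.
n_sim <- 100000
sim_cao  <- rgeom(n_sim, prob = 0.40) + 1
sim_song <- rgeom(n_sim, prob = 0.15) + 1
print(c(Cao = mean(sim_cao), Song = mean(sim_song)))
```

The closed forms give a mean of 2.5 and variance of 3.75 for “Cao”, and a mean of about 6.67 and variance of about 37.78 for “Song”; the simulated means should land close to the closed-form means.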
    (4)

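    # x is the number of non-matching households (failures) before the first match: 0 to 20
    x <- 0:20
    # Bar plot for "Cao" (p = 0.4); dgeom() counts failures, so bars are labeled with total visits x + 1
    barplot(dgeom(x, 0.4), main = "The Probability of Required Visit Times", ylab = "Probability", xlab = "Visit times", names.arg = x + 1, col = "grey")
    # Overlay "Song" (p = 0.15) on the existing plot
    barplot(dgeom(x, 0.15), col = rgb(0, 0.5, 0.1, alpha = 0.7), add = TRUE)
    legend(15, 0.4, c("Cao", "Song"), pch = 15, col = c("grey", rgb(0, 0.5, 0.1, alpha = 0.7)))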
    
    Figure2.png
