[R] Statistics and R - Week 1

Author: 半为花间酒 | Published: 2020-04-05 23:04

Week 1: R


Reference book: Data Analysis for the Life Sciences

Reference videos:

  1. Data Analysis for the Life Sciences Series - rafalib
  2. Professional Certificate in Data Analysis for Life Sciences (Harvard University) - edX

Starting a new series of notes on statistics : )

Getting Started with R

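- If you are not familiar with basic R syntax, install swirl, an interactive tutorial that runs inside the R console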

install.packages("swirl")
library(swirl)
swirl()

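- Important packages

install.packages(c("rafalib", "downloader", "devtools")) # run once
library(rafalib)
library(downloader) # helper for downloading files
library(devtools) # installs packages from GitHub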

GitHub

https://github.com/genomicsclass

labs: stores the source code for the course
dagdata: contains the raw data the course needs

Download from within R

- downloader

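Files are downloaded into the current working directory, that is, the RStudio project directory or whatever was set with setwd().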

library(downloader)

url <- "https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/femaleMiceWeights.csv"
filename <- "femaleMiceWeights.csv" 
download(url, destfile=filename)

dir <- "https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/"
filename <- "femaleMiceWeights.csv"
url <- paste0(dir, filename)
if (!file.exists(filename)) download(url,destfile=filename)
dat <- read.csv(filename) # read the local copy instead of fetching the URL again
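
The download-if-missing pattern above can be wrapped in a small function. This is only a sketch: `fetch_csv`, its arguments, and the toy file are hypothetical names made up for illustration, and base R's `download.file()` is used here in place of `downloader::download()`. The demonstration seeds the cache with a made-up file, so it runs without a network connection.

```r
# Hypothetical helper: download a CSV only if no local copy exists, then read it.
fetch_csv <- function(base_url, filename, dest_dir = getwd()) {
  dest <- file.path(dest_dir, filename)
  if (!file.exists(dest)) {
    # Base R download.file() stands in for downloader::download() here
    download.file(paste0(base_url, filename), destfile = dest, mode = "wb")
  }
  read.csv(dest)
}

# Offline demonstration: create a tiny made-up file first, so the
# file.exists() check succeeds and no download is attempted.
cache_dir <- tempfile("cache_")
dir.create(cache_dir)
write.csv(
  data.frame(Diet = c("chow", "hf"), Bodyweight = c(21.5, 25.7)),
  file.path(cache_dir, "toy.csv"),
  row.names = FALSE
)
toy <- fetch_csv("https://example.invalid/", "toy.csv", dest_dir = cache_dir)
print(toy)
```

With the course data, the call would presumably look like `fetch_csv(dir, "femaleMiceWeights.csv")`, where `dir` is the base URL defined above.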

- devtools

library(devtools)

install_github("genomicsclass/dagdata")
#extracts the location of package
dir <- system.file(package="dagdata") 
list.files(dir)
list.files(file.path(dir,"extdata"))
# [1] "admissions.csv"               "astronomicalunit.csv"         "babies.txt"                  
# [4] "femaleControlsPopulation.csv" "femaleMiceWeights.csv"        "mice_pheno.csv"              
# [7] "msleep_ggplot2.csv"           "README"                       "spider_wolff_gorb_2013.csv" 

# The file is not in the working directory, so its full path must be given
filename <- file.path(dir,"extdata/femaleMiceWeights.csv")
dat <- read.csv(filename)
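
One detail worth knowing about `system.file()`: when the package or file cannot be found it returns an empty string instead of raising an error, so a failed `install_github()` only shows up later in `read.csv()`. The sketch below illustrates this with a deliberately nonexistent package name (`noSuchPackageXYZ` is made up) and with the `stats` package, which ships with R.

```r
# system.file() silently returns "" when the package or file is missing
missing_path <- system.file("extdata", "femaleMiceWeights.csv",
                            package = "noSuchPackageXYZ")
print(missing_path == "")

# mustWork = TRUE turns the silent "" into an error that can be caught
result <- tryCatch(
  system.file("extdata", "femaleMiceWeights.csv",
              package = "noSuchPackageXYZ", mustWork = TRUE),
  error = function(e) "not found"
)
print(result)

# For an installed package, the returned path points at a real file
desc_path <- system.file("DESCRIPTION", package = "stats")
print(file.exists(desc_path))
```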

- Exercises

> 1
# The exercises below can all be simplified with dplyr

> 2
dat[12,2] # I really did not understand how "and" was supposed to be used here
# [1] 26.25

> 3
dat$Bodyweight[11]
# [1] 26.91

> 4
length(dat$Bodyweight)
# [1] 24

> 5
mean(dat[seq(13,24),2])
# [1] 26.83417

> 6
set.seed(1)
sample(dat[seq(13,24),2],1)
# [1] 34.02
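
The indexing used in these exercises can be tried on a small made-up data frame (the values below are hypothetical, not the course data). The sketch also selects rows by the `Diet` column rather than by position, which does not depend on the row order of the file, and shows that `set.seed()` makes a random draw repeatable within one R session. The value actually drawn can differ between R versions, because the default sampling algorithm changed in R 3.6.0.

```r
# Toy data frame with the same two columns as femaleMiceWeights.csv (made-up values)
toy <- data.frame(
  Diet = rep(c("chow", "hf"), each = 3),
  Bodyweight = c(21.5, 28.1, 24.0, 25.7, 26.4, 22.8)
)

# Three equivalent ways to read row 2 of the Bodyweight column
by_position <- toy[2, 2]
by_dollar   <- toy$Bodyweight[2]
by_name     <- toy[2, "Bodyweight"]

# Selecting the hf mice by their Diet value instead of by row numbers
hf <- toy$Bodyweight[toy$Diet == "hf"]
hf_mean <- mean(hf)

# Resetting the seed before each draw gives the same draw twice
set.seed(1)
first_draw <- sample(hf, 1)
set.seed(1)
second_draw <- sample(hf, 1)
print(identical(first_draw, second_draw))
```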

Brief Introduction to dplyr

- dplyr + unlist

unlist strips the data.frame structure and returns a plain vector

If dplyr receives a data.frame it will return a data.frame.
To obtain a numeric vector with dplyr, we can apply the unlist function which turns lists, such as data.frames, into numeric vectors.

library(dplyr)
chowVals <- filter(dat, Diet=="chow") %>% 
  select(Bodyweight) 
class(chowVals)
# [1] "data.frame"

chowVals <- filter(dat, Diet=="chow") %>% 
  select(Bodyweight) %>% 
  unlist()
class(chowVals)
# [1] "numeric"
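
As a side note, `unlist()` keeps element names derived from the column name, while dplyr's `pull()` returns the column as an unnamed vector. The comparison below is a sketch on a made-up data frame, not the course data.

```r
library(dplyr)

# Toy data (hypothetical values)
toy <- data.frame(
  Diet = rep(c("chow", "hf"), each = 3),
  Bodyweight = c(21.5, 28.1, 24.0, 25.7, 26.4, 22.8)
)

# select() + unlist(): a numeric vector whose elements carry names
v_unlist <- filter(toy, Diet == "chow") %>%
  select(Bodyweight) %>%
  unlist()

# pull(): extracts the column directly as an unnamed numeric vector
v_pull <- filter(toy, Diet == "chow") %>%
  pull(Bodyweight)

print(names(v_unlist))
print(names(v_pull))
print(all.equal(unname(v_unlist), v_pull))
```

Either form works with `mean()`; the names only matter if later code inspects or prints them.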

- Exercises

> 1
dir <- system.file(package="dagdata")
filename <- file.path(dir,"extdata/msleep_ggplot2.csv")
dat <- read.csv(filename)
class(dat)
# [1] "data.frame"

> 2
primates <- dat %>% 
  filter(order == 'Primates')

nrow(primates)
# [1] 12

> 3
class(primates)
# [1] "data.frame"

> 4
primates_st <- primates %>% 
  select(sleep_total)

class(primates_st)
# [1] "data.frame"

> 5
primates %>% 
  select(sleep_total) %>% 
  unlist() %>% 
  mean()
#  [1] 10.5

> 6
primates %>% 
  select(sleep_total) %>% 
  summarise(mean(sleep_total))
#  mean(sleep_total)
#  1              10.5
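
The last two exercises compute the same number in two shapes: `unlist() %>% mean()` gives a bare number, while `summarise()` gives a one-row data.frame. A sketch on made-up data (the animals and sleep values are hypothetical) shows both next to a base R equivalent. Naming the summary column, here `mean_sleep`, makes the result easier to refer to afterwards.

```r
library(dplyr)

# Toy stand-in for the msleep data (hypothetical values)
toy_sleep <- data.frame(
  name = c("A", "B", "C", "D"),
  order = c("Primates", "Primates", "Rodentia", "Primates"),
  sleep_total = c(10, 9.5, 12.5, 12),
  stringsAsFactors = FALSE
)

# Bare number via unlist()
by_unlist <- toy_sleep %>%
  filter(order == "Primates") %>%
  select(sleep_total) %>%
  unlist() %>%
  mean()

# One-row data.frame via summarise(), with a named summary column
by_summarise <- toy_sleep %>%
  filter(order == "Primates") %>%
  summarise(mean_sleep = mean(sleep_total))

# Base R equivalent
by_base <- mean(toy_sleep$sleep_total[toy_sleep$order == "Primates"])

print(by_unlist)
print(by_summarise)
print(by_base)
```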
