DAY 8 R

Author: Peng_001 | Published 2020-05-14 19:24

Reference: data_camp_intermediate_data_visualization

The same diagram as before


The remaining four layers

This post mainly covers statistics, coordinates, and facets.

statistics

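A stat can be invoked in two ways:
1) from within a geom; 2) on its own.
Each stat_*() is paired with a default geom_*(), so calling a stat directly also draws its default geom.

geom_bar()
stat_count()
# Both calls produce the same plot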
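One way to check this equivalence is to compare the values each layer computes. The sketch below assumes ggplot2 is installed and uses the built-in mtcars data; the object names are illustrative only.

```r
library(ggplot2)

# Same data and mapping, one layer added via the geom and one via the stat
p_geom <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()
p_stat <- ggplot(mtcars, aes(x = factor(cyl))) + stat_count()

# layer_data() returns the values computed for a layer;
# both layers hold the same per-category counts
counts_geom <- layer_data(p_geom)$count
counts_stat <- layer_data(p_stat)$count
```

Printing p_geom and p_stat should draw the same bar chart.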

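Parameters of geom_smooth() / stat_smooth()

  1. se
    Controls whether the confidence band is drawn. The default is se = TRUE.

  2. span
    Controls the degree of smoothing for LOESS fits; smaller values give a wigglier curve.

  3. method
    Selects the model used for the fit, e.g. glm, rlm, gam, lm.
    The default is LOESS for small datasets (fewer than 1,000 observations) and gam for larger ones.
    It can be changed to a linear model with method = "lm".


Example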

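# fcyl is not a column of the built-in mtcars; it is assumed here to be cyl as a factor, as in the course data
mtcars$fcyl <- factor(mtcars$cyl)

# Amend the plot
ggplot(mtcars, aes(x = wt, y = mpg, color = fcyl)) +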
  geom_point() +
  # Map color to dummy variable "All"
  stat_smooth(aes(color = "All"), se = FALSE) +
  stat_smooth(method = "lm", se = FALSE)
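  4. fullrange
    Extends the fitted line across the full x range of the plot instead of only the range of each group's data.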
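The smoothing parameters listed above can be compared side by side. This is a minimal sketch using the built-in mtcars data with factor(cyl) as the grouping variable; the object names are illustrative, and formula = y ~ x is spelled out only to avoid the default-formula message.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()

# Linear fit per group, no confidence band, drawn only over each group's data
p_default <- base +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Same fits, extended across the full x range of the plot
p_full <- base +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE, fullrange = TRUE)

# A LOESS fit on all points; a smaller span follows the data more closely
p_loess <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "loess", formula = y ~ x, span = 0.5, se = TRUE)

# Fitted values of the smooth layer (layer 2) for the two linear versions
fit_default <- layer_data(p_default, 2)
fit_full <- layer_data(p_full, 2)
```

With fullrange = TRUE every group's line starts at the same x value, whereas by default each line starts at its own group's minimum.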

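Stats useful for scatter plots

  1. sum
    stat_sum() counts the overlapping observations at each location and maps the count to point size.

  2. quantile
    Draws quantile regression lines.
    Pass a vector to quantiles to draw one line per chosen quantile.
    For background on quantiles, see
    https://www.cnblogs.com/jiangkejie/p/10482636.html
    Example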

# Amend the plot to color by year_group
ggplot(Vocab, aes(x = education, y = vocabulary, color = year_group)) +
  geom_jitter(alpha = 0.25) +
  stat_quantile(quantiles = c(0.05, 0.5, 0.95))
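  3. size
    stat_sum() computes a special variable, ..prop.., which can be mapped to size
    to convert point sizes from counts to proportions.
    In ggplot2 3.3.0 and later this is written after_stat(prop); the ..prop.. form is deprecated.
# Default: point size shows the count of overlapping observations
ggplot(Vocab, aes(x = education, y = vocabulary)) +
  stat_sum()
# Amend the stat to use proportion sizes
ggplot(Vocab, aes(x = education, y = vocabulary)) +
  stat_sum(aes(size = after_stat(prop)))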
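To see what stat_sum() computes without the Vocab data, the sketch below uses the built-in mtcars data, where cyl and gear take only a few distinct values. It assumes ggplot2 3.3.0 or later for after_stat(); the object names are illustrative.

```r
library(ggplot2)

# cyl and gear have few distinct values, so many points overlap
p_count <- ggplot(mtcars, aes(x = cyl, y = gear)) +
  stat_sum()

# Map the computed proportion, rather than the count, to point size
p_prop <- ggplot(mtcars, aes(x = cyl, y = gear)) +
  stat_sum(aes(size = after_stat(prop)))

# One row per distinct (cyl, gear) location, with columns n and prop
sum_data <- layer_data(p_prop)
```

With no grouping variable the proportions are taken over the whole dataset; adding a group aesthetic would make them relative to each group.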

stats outside geom

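Stats can also be called on their own, outside of any geom, to draw summaries rather than the raw data.
For example, the rather amusing-looking Q-Q plot.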
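A Q-Q plot of this kind can be sketched with stat_qq() and stat_qq_line(). The example assumes the built-in mtcars data and compares mpg against a normal distribution, which is the default; the object names are illustrative.

```r
library(ggplot2)

# stat_qq() needs the sample aesthetic instead of x and y
p_qq <- ggplot(mtcars, aes(sample = mpg)) +
  stat_qq() +       # sample quantiles against theoretical normal quantiles
  stat_qq_line()    # reference line through the first and third quartiles

# Computed values: one row per observation, ordered by theoretical quantile
qq_points <- layer_data(p_qq, 1)
```

Points that stay close to the reference line suggest the sample is roughly normally distributed.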


stat_summary

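# p_wt_vs_fcyl_by_fam_jit and posn_d are defined in earlier course steps (not shown here);
# mean_sdl requires the Hmisc package to be installed
p_wt_vs_fcyl_by_fam_jit +
  # Add a summary stat of std deviation limits
  stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1), position = posn_d)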
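Since the snippet above relies on objects from the course, here is a self-contained sketch of the same idea. It swaps mean_sdl for a small hand-written helper, mean_sd (a hypothetical name), so that Hmisc is not needed, and uses the built-in mtcars data.

```r
library(ggplot2)

# Hypothetical helper: returns the columns stat_summary() expects from fun.data
mean_sd <- function(x) {
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - s, ymax = m + s)
}

p_summary <- ggplot(mtcars, aes(x = factor(cyl), y = wt)) +
  geom_jitter(width = 0.1, alpha = 0.5) +
  # Mean plus/minus one standard deviation per group, drawn as a pointrange
  stat_summary(fun.data = mean_sd, color = "red")

# One summary row per cyl group
summary_data <- layer_data(p_summary, 2)
```

Any function that returns y, ymin and ymax can be passed to fun.data in the same way.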

Coordinate

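As the name suggests, this layer deals with coordinates.
Like the scale functions and xlim() mentioned earlier, it can change the visible range of the plot.
coord_fixed() sets the aspect ratio between the x and y axes.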

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_jitter() +
  geom_smooth(method = "lm", se = FALSE) +
  # Fix the coordinate ratio
  coord_fixed(1)

Before


After


coord function

coord_*()


Setting expand = FALSE removes the padding between the data and the axes
ggplot(mtcars, aes(wt, mpg)) +
  geom_point(size = 2) +
  # Add Cartesian coordinates with zero expansion
  coord_cartesian(expand = FALSE) +
  theme_classic()
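coord_cartesian() can also zoom in, and it does so differently from the scale limits mentioned earlier. The sketch below, using the built-in mtcars data and an arbitrary range of 3 to 5, compares how many x values survive in each case; the object names are illustrative.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point()

# Zoom with the coordinate system: all 32 rows stay in the layer data
p_coord <- base + coord_cartesian(xlim = c(3, 5))

# Limit with the scale: out-of-range values are discarded (treated as missing)
p_scale <- base + scale_x_continuous(limits = c(3, 5))

# Number of usable x values left in each version
kept_coord <- sum(!is.na(layer_data(p_coord)$x))
kept_scale <- sum(!is.na(layer_data(p_scale)$x))
```

The distinction matters once a stat such as a smooth is added, because the scale version fits only the remaining points.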



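Comparing coord and scale

The two plots below place the points identically. The difference is that scale_*_log10() transforms the data before any statistics are computed, whereas coord_trans() transforms the coordinate system afterwards, so the two approaches diverge once a stat such as a smooth is added.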

# Perform a log10 coordinate system transformation
ggplot(msleep, aes(bodywt, brainwt)) +
  geom_point() +
  coord_trans(x = "log10", y = "log10")

# second one
# Add scale_*_*() functions
ggplot(msleep, aes(bodywt, brainwt)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10() +
  ggtitle("Scale_ functions")
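The order of operations can be seen by inspecting the layer data of both versions. This sketch uses msleep from ggplot2 and drops rows with a missing brainwt first; the object names are illustrative.

```r
library(ggplot2)

# Keep only rows where brain weight is known
sleep_df <- msleep[!is.na(msleep$brainwt), ]

p_scale <- ggplot(sleep_df, aes(bodywt, brainwt)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10()

p_coord <- ggplot(sleep_df, aes(bodywt, brainwt)) +
  geom_point() +
  coord_trans(x = "log10", y = "log10")

# With the scale, the layer already holds log10 values;
# with the coord, it still holds the original values
x_scale <- layer_data(p_scale)$x
x_coord <- layer_data(p_coord)$x
```

Any stat added to p_scale therefore works on log10 values, while a stat added to p_coord works on the untransformed data.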

Other coordinate system changes

Swap the x and y axes with coord_flip()

# Flip the axes to set car to the y axis
ggplot(mtcars, aes(car, wt)) +
  geom_point() +
  labs(x = "car", y = "weight") +
  coord_flip()

Adding a secondary axis

# From previous step
y_breaks <- c(59, 68, 77, 86, 95, 104)
y_labels <- (y_breaks - 32) * 5 / 9
secondary_y_axis <- sec_axis(
  trans = identity,
  name = "Celsius",
  breaks = y_breaks,
  labels = y_labels
)

# Update the plot
ggplot(airquality, aes(Date, Temp)) +
  geom_line() +
  # Add the secondary y-axis 
  scale_y_continuous(sec.axis = secondary_y_axis) +
  labs(x = "Date (1973)", y = "Fahrenheit")
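The built-in airquality data has Month and Day columns but no Date column, so the snippet above depends on course preparation. A self-contained variant is sketched below; it builds the date itself and lets sec_axis() convert the values with a formula instead of relabelling fixed breaks. The names aq and p_temp are illustrative.

```r
library(ggplot2)

# Build a Date column from Month and Day (the measurements are from 1973)
aq <- airquality
aq$Date <- as.Date(paste(1973, aq$Month, aq$Day, sep = "-"))

p_temp <- ggplot(aq, aes(Date, Temp)) +
  geom_line() +
  # The formula converts each primary-axis value (Fahrenheit) to Celsius
  scale_y_continuous(
    sec.axis = sec_axis(~ (. - 32) * 5 / 9, name = "Celsius")
  ) +
  labs(x = "Date (1973)", y = "Fahrenheit")
```

With a formula, the secondary axis picks its own breaks on the Celsius scale, so the labels stay round numbers.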

Polar coordinates

So far we have only used the Cartesian coordinate system; how about trying polar coordinates?



Example

ggplot(mtcars, aes(x = 1, fill = fcyl)) +
  # Reduce the bar width to 0.1
  geom_bar(width = 0.1) +
  coord_polar(theta = "y") +
  # Add a continuous x scale from 0.5 to 1.5
  scale_x_continuous(limits = c(0.5, 1.5))
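The step from a stacked bar to a pie chart can be sketched without the course's fcyl column, using factor(cyl) from the built-in mtcars data instead. The object names are illustrative.

```r
library(ggplot2)

# A single stacked bar: one segment per cyl level
p_bar <- ggplot(mtcars, aes(x = 1, fill = factor(cyl))) +
  geom_bar()

# Mapping y to the angle bends the stacked bar into a pie
p_pie <- p_bar + coord_polar(theta = "y")

# The slices are the same counts as the bar segments
slice_counts <- layer_data(p_pie)$count
```

Narrowing the bar and widening the x limits, as in the example above, turns the pie into a ring.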

facet function

Faceting splits the data into groups by a categorical variable and draws each group in its own panel.
facet_grid() is the most convenient option.

plot +
  facet_grid(rows = vars(A), cols = vars(B))

Example

ggplot(mtcars, aes(wt, mpg)) + 
  geom_point() +
  # Facet rows by am
  facet_grid(rows = vars(am))

facet_grid() accepts two notations: the vars() form used above and the formula form

ggplot(mtcars, aes(wt, mpg)) + 
  geom_point() +
  # Facet rows by am and columns by cyl using formula notation
  facet_grid(am ~ cyl)
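That the two notations are interchangeable can be checked by comparing the panel assignment of every point. A minimal sketch with the built-in mtcars data; the object names are illustrative.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point()

# vars() notation: rows and columns are named explicitly
p_vars <- base + facet_grid(rows = vars(am), cols = vars(cyl))

# Formula notation: rows ~ columns
p_formula <- base + facet_grid(am ~ cyl)

# Panel each point is assigned to under the two notations
panels_vars <- layer_data(p_vars)$PANEL
panels_formula <- layer_data(p_formula)$PANEL
```

In the formula form a dot stands for "no variable", so facet_grid(am ~ .) facets by rows only.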

facet labels and orders

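Use fct_relevel() and fct_recode() (from the forcats package) to reorder and rename factor levels, and the labeller argument
to control what the facet strips display.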

# Plot wt by mpg
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # Two variables
  facet_grid(cols = vars(vs, cyl), labeller = label_context)
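Renaming and reordering facet levels can also be sketched in base R, with factor() standing in for fct_recode() and fct_relevel(). The column name transmission and the object names are made up for this example; in mtcars, am = 0 means automatic and am = 1 means manual.

```r
library(ggplot2)

cars_df <- mtcars

# levels sets the panel order, labels sets the names shown in the strips
cars_df$transmission <- factor(
  cars_df$am,
  levels = c(1, 0),
  labels = c("manual", "automatic")
)

p_labelled <- ggplot(cars_df, aes(wt, mpg)) +
  geom_point() +
  # label_both shows the variable name together with its value
  facet_grid(cols = vars(transmission), labeller = label_both)
```

The manual panel comes first because it is listed first in levels.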

Setting the scales argument lets each panel drop axis values for which it has no data.



Note:

When faceting by columns, "free_y" has no effect, but we can adjust the x-axis. In contrast, when faceting by rows, "free_x" has no effect, but we can adjust the y-axis.


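Setting the space argument lets panel sizes follow the amount of data in each, which removes the large empty areas in sparse facets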

ggplot(mtcars, aes(x = mpg, y = car, color = fam)) +
  geom_point() +
  # Free the y scales and space
  facet_grid(rows = vars(gear), scales = "free_y", space = "free_y")
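The car column in the snippet above comes from the course data. A self-contained version can take the car names from the row names of the built-in mtcars data, as sketched below; cars_df and p_free are illustrative names.

```r
library(ggplot2)

cars_df <- mtcars
# The car names are stored as row names in mtcars
cars_df$car <- rownames(mtcars)

p_free <- ggplot(cars_df, aes(x = mpg, y = car)) +
  geom_point() +
  # Each row panel lists only its own cars and is sized to fit them
  facet_grid(rows = vars(gear), scales = "free_y", space = "free_y")

free_data <- layer_data(p_free)
```

Without scales = "free_y" every panel would list all 32 cars, leaving most rows empty.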

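facet_wrap()

With a single faceting variable, facet_grid() lays all panels out in one row or one column (depending on how it is specified), which can be awkward to read. facet_wrap() also splits by the given variable but wraps the panels into a grid, which usually reads better, and the number of rows and columns can be set freely.
Example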

ggplot(Vocab, aes(x = education, y = vocabulary)) +
  stat_smooth(method = "lm", se = FALSE) +
  # Update the facet layout, using 11 columns
  facet_wrap(~ year, ncol = 11)
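The same wrapping behaviour can be tried on a dataset that ships with ggplot2. This sketch uses mpg, whose class column has seven values, and an arbitrary choice of three columns; the object names are illustrative.

```r
library(ggplot2)

# Seven vehicle classes wrapped into a grid three panels wide
p_wrap <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~ class, ncol = 3)

wrap_panels <- unique(layer_data(p_wrap)$PANEL)
```

Setting nrow instead of ncol fixes the number of rows and lets the columns follow.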

The margins argument of facet_grid() adds extra panels showing the unsplit data, which helps compare each group against the whole.
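A minimal sketch of margins = TRUE with the built-in mtcars data follows. The faceting variables are converted to factors inside vars(); the object names are illustrative.

```r
library(ggplot2)

p_margins <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # margins = TRUE adds an "(all)" row and an "(all)" column
  facet_grid(rows = vars(factor(am)), cols = vars(factor(cyl)), margins = TRUE)

margin_data <- layer_data(p_margins)
```

Each car is drawn four times: in its own panel, in the row margin, in the column margin, and in the overall panel. That gives 3 x 4 = 12 panels for 2 am levels and 3 cyl levels.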


Let's draw a heat map

# Using barley, plot variety vs. year, filled by yield
ggplot(barley, aes(year, variety, fill = yield)) +
  # Add a tile geom
  geom_tile()

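Heat maps tend to work only for small datasets, though. When the data are larger or more complex, consider an alternative.

For example, to judge the overall trend between two conditions, consider a line plot.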

# The heat map we want to replace
# Don't remove, it's here to help you!
ggplot(barley, aes(x = year, y = variety, fill = yield)) +
  geom_tile() +
  facet_wrap( ~ site, ncol = 1) +
  scale_fill_gradientn(colors = brewer.pal(9, "Reds"))

# Using barley, plot yield vs. year, colored and grouped by variety
ggplot(barley, aes(x = year, y = yield, color = variety, group = variety)) +
  # Add a line layer
  geom_line() +
  # Facet, wrapping by site, with 1 row
  facet_wrap(~ site, nrow = 1)

Good data can still make a bad plot


How to make a bad plot

  • One last problem to fix
# Change type
TG$dose <- as.numeric(as.character(TG$dose))

# Plot
growth_by_dose <- ggplot(TG, aes(dose, len, color = supp)) +
  stat_summary(fun.data = mean_sdl,
               fun.args = list(mult = 1),
               position = position_dodge(0.2)) +
  stat_summary(fun = mean,  # fun replaces fun.y, deprecated since ggplot2 3.3.0
               geom = "line",
               position = position_dodge(0.1)) +
  theme_classic() +
  # Adjust labels and colors:
  labs(x = "Dose (mg/day)", y = "Odontoblasts length (mean, standard deviation)", color = "Supplement") +
  scale_color_brewer(palette = "Set1", labels = c("Orange juice", "Ascorbic acid")) +
  scale_y_continuous(limits = c(0,35), breaks = seq(0, 35, 5), expand = c(0,0))

# View plot
growth_by_dose

Bar chart with error bars

# Plot wt vs. fcyl
ggplot(mtcars, aes(x = fcyl, y = wt)) +
  # Add a bar summary stat of means, colored skyblue
  stat_summary(fun = mean, geom = "bar", fill = "skyblue") +
  # Add an errorbar summary stat std deviation limits
  stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1), geom = "errorbar", width = 0.1)
