DAY 8 R

Author: Peng_001 | Published 2020-05-14 19:24

    Reference: data_camp_intermediate_data_visualization

    The same figure as before


    The remaining four components

    What follows mainly covers statistics, coordinates, and facets

    statistics

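    This component can be invoked in two ways:
    1) from inside a geom; 2) called on its own.
    Each stat_* is generally paired with a related geom_*, so calling a stat also draws its default geom.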

    geom_bar()
    stat_count()
    # Both return the same result
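    As a quick check of the pairing described above, the sketch below builds the same bar chart both ways and compares the counts each layer computes. The use of mtcars and factor(cyl) is an illustrative choice, not something taken from the course.

```r
library(ggplot2)

# Both layers count the rows in each cyl group and draw the counts as bars
p_geom <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()
p_stat <- ggplot(mtcars, aes(x = factor(cyl))) + stat_count()

# layer_data() returns the data frame that a layer computed
counts_geom <- layer_data(p_geom)$count
counts_stat <- layer_data(p_stat)$count

print(counts_geom)
print(identical(counts_geom, counts_stat))
```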
    

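    Parameters of smooth

    1. se
      Controls whether the confidence interval is drawn; the default is se = TRUE.

    2. span
      Controls the amount of smoothing. It applies to the LOESS method; smaller values give a wigglier curve.

    3. method
      Sets the model used for the fit, e.g. glm, rlm, gam, lm.
      The default is LOESS (for small data sets; ggplot2 switches to gam at 1,000 or more observations).



      It can be changed to a linear model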


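    Example

    # Assumption: fcyl is cyl as a factor, set up earlier in the course
    mtcars$fcyl <- factor(mtcars$cyl)

    # Amend the plot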
    ggplot(mtcars, aes(x = wt, y = mpg, color = fcyl)) +
      geom_point() +
      # Map color to dummy variable "All"
      stat_smooth(aes(color = "All"), se = FALSE) +
      stat_smooth(method = "lm", se = FALSE)
    
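    4. fullrange
      Whether the fitted line spans the full range of the plot (TRUE) or only the range of the data (FALSE, the default).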
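    The smooth parameters listed above can be tried out with a small sketch. The data set, the span value, and the grouping by factor(cyl) are assumptions chosen for illustration; formula = y ~ x is spelled out only to avoid the informational message.

```r
library(ggplot2)

# LOESS with a wide span: a smoother curve, drawn without a confidence band
p_loess <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  stat_smooth(method = "loess", formula = y ~ x, span = 0.9, se = FALSE)

# One linear model per cyl group, each extended over the full x range
p_lm <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE, fullrange = TRUE)

# With fullrange = TRUE every group's fitted line starts at the same x value
d_lm <- layer_data(p_lm, 2)
x_min_by_group <- tapply(d_lm$x, d_lm$group, min)
print(x_min_by_group)
```

    Setting fullrange back to FALSE makes each line stop at the edge of its own group's data.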

    Stats for scatter plots

    1. sum

    2. quantile
      Draws quantile lines.
      Pass a vector to quantiles to choose which quantile lines are drawn.
      For background on quantiles, see
      https://www.cnblogs.com/jiangkejie/p/10482636.html
      Example

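    # Vocab comes from the carData package; year_group is assumed to be a
    # factor derived from year earlier in the course
    # Amend the plot to color by year_group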
    ggplot(Vocab, aes(x = education, y = vocabulary, color = year_group)) +
      geom_jitter(alpha = 0.25) +
      stat_quantile(quantiles = c(0.05, 0.5, 0.95))
    
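    3. size
      stat_sum() computes a special variable, ..prop.., which can be mapped to size
      to turn counts into proportions (since ggplot2 3.3.0 the same variable is written after_stat(prop)).
    # Default: point size shows the count of overlapping observations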
    ggplot(Vocab, aes(x = education, y = vocabulary)) +
      stat_sum()
    
    # Amend the stat to use proportion sizes
    ggplot(Vocab, aes(x = education, y = vocabulary)) +
      stat_sum(aes(size = ..prop..))
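    For newer ggplot2 versions, the same proportion mapping can be sketched with after_stat(). The mtcars columns used here are an illustrative substitute for Vocab so that the snippet runs without extra packages.

```r
library(ggplot2)

# Map point size to the proportion computed by stat_sum()
p_prop <- ggplot(mtcars, aes(x = cyl, y = gear)) +
  stat_sum(aes(size = after_stat(prop)))

# With a single group, the proportions over all x/y combinations add up to 1
props <- layer_data(p_prop)$prop
print(sum(props))
```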
    

    stats outside geom

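    Stats can also be called on their own, independently of a geom, to draw summaries of the data.
    For example,
    the rather amusing QQ plot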
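    A QQ plot of this kind can be sketched with stat_qq() and stat_qq_line(). The choice of mtcars$mpg as the sample is an assumption made for illustration.

```r
library(ggplot2)

# Compare the sample quantiles of mpg with theoretical normal quantiles
p_qq <- ggplot(mtcars, aes(sample = mpg)) +
  stat_qq() +
  stat_qq_line()

# One point per observation, ordered by theoretical quantile
qq <- layer_data(p_qq, 1)
print(head(qq[, c("theoretical", "sample")]))
```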


    stat_summary

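    # p_wt_vs_fcyl_by_fam_jit (a jittered plot) and posn_d (a position_dodge()
    # object) are assumed to come from an earlier exercise; mean_sdl needs Hmisc
    p_wt_vs_fcyl_by_fam_jit +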
      # Add a summary stat of std deviation limits
      stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1), position = posn_d)
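    Since the snippet above depends on objects from an earlier exercise, here is a self-contained sketch of the same idea. mean_sd is a hypothetical helper written for this example (it stands in for mean_sdl so that Hmisc is not needed), and factor(cyl) replaces fcyl.

```r
library(ggplot2)

# Hypothetical helper: mean plus/minus one standard deviation, returned in
# the y / ymin / ymax form that fun.data expects
mean_sd <- function(x) {
  data.frame(y = mean(x), ymin = mean(x) - sd(x), ymax = mean(x) + sd(x))
}

# The default geom of stat_summary() is a pointrange
p_sum <- ggplot(mtcars, aes(x = factor(cyl), y = wt)) +
  stat_summary(fun.data = mean_sd)

summ <- layer_data(p_sum)
print(summ[, c("x", "y", "ymin", "ymax")])
```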
    

    Coordinate

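    As the name suggests, this part is about coordinates.
    Like scale and xlim, mentioned earlier, it can change the displayed extent of the plot.
    coord_fixed() adjusts the ratio between the x and y axes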

    ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
      geom_jitter() +
      geom_smooth(method = "lm", se = FALSE) +
      # Fix the coordinate ratio
      coord_fixed(1)
    

    Before the change


    After the change


    coord function

    coord_*()


    expand: removes the buffer area around the data
    ggplot(mtcars, aes(wt, mpg)) +
      geom_point(size = 2) +
      # Add Cartesian coordinates with zero expansion
      coord_cartesian(expand = FALSE) +
      theme_classic()
    



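    Comparing coord and scale

    The two plots below show the same result (the points land in the same places; only the axis breaks and labels differ)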

    # Perform a log10 coordinate system transformation
    ggplot(msleep, aes(bodywt, brainwt)) +
      geom_point() +
      coord_trans(x = "log10", y = "log10")
    
    # second one
    # Add scale_*_*() functions
    ggplot(msleep, aes(bodywt, brainwt)) +
      geom_point() +
      scale_x_log10() +
      scale_y_log10() +
      ggtitle("Scale_ functions")
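    Although the two plots look alike, the transformation happens at different stages: scale_*_log10() transforms the data before any stat is computed, while coord_trans() transforms only at drawing time. A small sketch that inspects the layer data shows this; the variable names are chosen for illustration.

```r
library(ggplot2)

p_scale <- ggplot(msleep, aes(bodywt, brainwt)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10()

p_coord <- ggplot(msleep, aes(bodywt, brainwt)) +
  geom_point() +
  coord_trans(x = "log10", y = "log10")

# The scale version stores log10 values; the coord version keeps raw values
x_scale <- layer_data(p_scale)$x
x_coord <- layer_data(p_coord)$x
print(range(x_scale))
print(range(x_coord))
```

    This matters once a stat such as a smooth is added: with the scale version the model is fitted on the log-transformed values, with the coord version on the original values.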
    

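    Other kinds of coordinate changes

    Swapping the axes with flipped axes

    # Assumption: car holds the model names, taken here from the row names
    mtcars$car <- rownames(mtcars)

    # Flip the axes to set car to the y axis
    ggplot(mtcars, aes(car, wt)) +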
      geom_point() +
      labs(x = "car", y = "weight") +
      coord_flip()
    

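    Adding a secondary axis

    # Assumption: Date is built from Month and Day (the airquality data are from 1973)
    airquality$Date <- as.Date(paste(1973, airquality$Month, airquality$Day, sep = "-"))

    # From previous step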
    y_breaks <- c(59, 68, 77, 86, 95, 104)
    y_labels <- (y_breaks - 32) * 5 / 9
    secondary_y_axis <- sec_axis(
      trans = identity,
      name = "Celsius",
      breaks = y_breaks,
      labels = y_labels
    )
    
    # Update the plot
    ggplot(airquality, aes(Date, Temp)) +
      geom_line() +
      # Add the secondary y-axis 
      scale_y_continuous(sec.axis = secondary_y_axis) +
      labs(x = "Date (1973)", y = "Fahrenheit")
    

    Polar coordinates

    So far we have been using the Cartesian coordinate system; how about trying polar coordinates?



    Example

    ggplot(mtcars, aes(x = 1, fill = fcyl)) +
      # Reduce the bar width to 0.1
      geom_bar(width = 0.1) +
      coord_polar(theta = "y") +
      # Add a continuous x scale from 0.5 to 1.5
      scale_x_continuous(limits = c(0.5, 1.5))
    

    facet function

    Faceting splits the data into groups by a categorical variable and draws each group in its own panel.
    facet_grid() is the handiest of the facet functions.

    plot +
      facet_grid(rows = vars(A), cols = vars(B))
    

    Example

    ggplot(mtcars, aes(wt, mpg)) + 
      geom_point() +
      # Facet rows by am
      facet_grid(rows = vars(am))
    

    facet can be called in two forms: the vars() notation above and the formula notation below

    ggplot(mtcars, aes(wt, mpg)) + 
      geom_point() +
      # Facet rows by am and columns by cyl using formula notation
      facet_grid(am ~ cyl)
    

    facet labels and orders

    fct_relevel, fct_recode, labeller
    The labeller argument controls how the panel labels are annotated
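    The two forcats functions named above can be sketched as follows: fct_recode() renames factor levels and fct_relevel() changes their order, and both changes carry over to the facet labels and panel order. The cars copy and the fam column are assumptions made for this example.

```r
library(ggplot2)
library(forcats)

cars <- mtcars
cars$fam <- factor(cars$am)                 # levels "0" and "1"

# Rename the levels (new name = "old name"); in mtcars, am = 0 is automatic
cars$fam <- fct_recode(cars$fam, automatic = "0", manual = "1")

# Move "manual" to the front so its panel is drawn first
cars$fam <- fct_relevel(cars$fam, "manual")
print(levels(cars$fam))

p_fam <- ggplot(cars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(cols = vars(fam))
```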

    # Plot wt by mpg
    ggplot(mtcars, aes(wt, mpg)) +
      geom_point() +
      # Two variables
      facet_grid(cols = vars(vs, cyl), labeller = label_context)
    

    Setting the scales argument removes axis entries that have no data in a given panel.



    Note:

    When faceting by columns, "free_y" has no effect, but we can adjust the x-axis. In contrast, when faceting by rows, "free_x" has no effect, but we can adjust the y-axis.
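    The note above can be illustrated with a minimal sketch: when faceting by columns, freeing the x scale lets every panel use its own x range. The choice of cyl as the faceting variable is an assumption for illustration.

```r
library(ggplot2)

# Fixed scales: all three panels share the same x range
p_fixed <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(cols = vars(cyl))

# Free x scales: each cyl panel covers only its own range of wt
p_free <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(cols = vars(cyl), scales = "free_x")
```

    Printing p_fixed and p_free side by side shows the difference on the x axis, while the y axis stays shared in both.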


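    Adjusting space removes the large empty gaps in the facets

    # car (model names) and fam (am as a factor) are assumed to have been added to mtcars earlier
    ggplot(mtcars, aes(x = mpg, y = car, color = fam)) +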
      geom_point() +
      # Free the y scales and space
      facet_grid(rows = vars(gear), scales = "free_y", space = "free_y")
    

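    facet_wrap()

    Because facet_grid() lays each faceting variable out along a single row or a single column (depending on how it is specified), the result can be awkward to read. facet_wrap() instead splits by the given variable and wraps the panels, which usually displays better, and the number of rows and columns can be set freely.
    Example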

    ggplot(Vocab, aes(x = education, y = vocabulary)) +
      stat_smooth(method = "lm", se = FALSE) +
      # Update the facet layout, using 11 columns
      facet_wrap(~ year, ncol = 11)
    

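    With margins, the faceted plot gains extra panels showing the data before splitting, which helps to compare the whole with the individual groups.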
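    A minimal sketch of margins with facet_grid(), assuming am and cyl as the faceting variables: margins = TRUE adds an "(all)" row and an "(all)" column that repeat the unsplit data.

```r
library(ggplot2)

p_margins <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(rows = vars(am), cols = vars(cyl), margins = TRUE)

# The 2 x 3 grid grows to 3 x 4 panels, and points are repeated in the margins
d <- layer_data(p_margins)
print(length(unique(d$PANEL)))
print(nrow(d))
```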


    Let's draw a heat map

    # barley comes from the lattice package
    # Using barley, plot variety vs. year, filled by yield
    ggplot(barley, aes(year, variety, fill = yield)) +
      # Add a tile geom
      geom_tile()
    

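    Heat maps, however, tend to work only when there is little data; when the data set is larger or more complex, it may be worth considering an alternative.

    For example, to judge the overall trend of data under two different conditions, a line plot is worth considering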

    # The heat map we want to replace
    library(RColorBrewer)  # provides brewer.pal()
    ggplot(barley, aes(x = year, y = variety, fill = yield)) +
      geom_tile() +
      facet_wrap( ~ site, ncol = 1) +
      scale_fill_gradientn(colors = brewer.pal(9, "Reds"))
    
    # Using barley, plot yield vs. year, colored and grouped by variety
    ggplot(barley, aes(x = year, y = yield, color = variety, group = variety)) +
      # Add a line layer
      geom_line() +
      # Facet, wrapping by site, with 1 row
      facet_wrap(~ site, nrow = 1)
    

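    Good data can still end up as a bad plot


    How to make a bad plot

    • One last problem to solve
    # Assumption: TG is a copy of the built-in ToothGrowth data
    TG <- ToothGrowth

    # Change type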
    TG$dose <- as.numeric(as.character(TG$dose))
    
    # Plot
    growth_by_dose <- ggplot(TG, aes(dose, len, color = supp)) +
      stat_summary(fun.data = mean_sdl,
                   fun.args = list(mult = 1),
                   position = position_dodge(0.2)) +
      stat_summary(fun = mean,
                   geom = "line",
                   position = position_dodge(0.1)) +
      theme_classic() +
      # Adjust labels and colors:
      labs(x = "Dose (mg/day)", y = "Odontoblasts length (mean, standard deviation)", color = "Supplement") +
      scale_color_brewer(palette = "Set1", labels = c("Orange juice", "Ascorbic acid")) +
      scale_y_continuous(limits = c(0,35), breaks = seq(0, 35, 5), expand = c(0,0))
    
    # View plot
    growth_by_dose
    

    Bar plot + error bars

    # Plot wt vs. fcyl
    ggplot(mtcars, aes(x = fcyl, y = wt)) +
      # Add a bar summary stat of means, colored skyblue
      stat_summary(fun = mean, geom = "bar", fill = "skyblue") +
      # Add an errorbar summary stat std deviation limits
      stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1), geom = "errorbar", width = 0.1)
    
