Reference: data_camp_intermediate_data_visualization
The same diagram again
The remaining four components
What follows mainly covers statistics, coordinates, and facets.
statistics
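This layer can be invoked in two ways:
1) Inside a geom. 2) Called on its own.
Each stat_* function is usually paired with a related geom_*, which means that calling a stat also draws its default geom.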
geom_bar()
stat_count()
# Both return the same result
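To check the claim that geom_bar() and stat_count() give the same result, here is a minimal sketch. It assumes ggplot2 is installed and uses the built-in mtcars data; the variable names are illustrative only. It builds both plots and compares the counts each layer computes.

```r
library(ggplot2)

# geom_bar() uses stat_count() by default; stat_count() draws bars by default
p_geom <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()
p_stat <- ggplot(mtcars, aes(x = factor(cyl))) + stat_count()

# layer_data() returns the data frame computed for a layer
counts_geom <- layer_data(p_geom)$count
counts_stat <- layer_data(p_stat)$count

same_counts <- identical(counts_geom, counts_stat)
```

If same_counts is TRUE, the two calls computed identical bar heights.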
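Parameters of smooth
- se: controls whether the confidence band is drawn; the default is se = TRUE.
- span: controls the degree of smoothing (used by LOESS).
- method: selects the model to fit, e.g. glm, rlm, gam, lm. The default is LOESS for small datasets; it can be changed to a linear model with method = "lm".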
Example
# Amend the plot
ggplot(mtcars, aes(x = wt, y = mpg, color = fcyl)) +
geom_point() +
# Map color to dummy variable "All"
stat_smooth(aes(color = "All"), se = FALSE) +
stat_smooth(method = "lm", se = FALSE)
- fullrange: whether the fit spans the full range of the plot instead of just the range of the data.
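The parameters listed above can be tried side by side. The following is only a sketch on the built-in mtcars data; the span values and colors are arbitrary choices for illustration, not values from the course.

```r
library(ggplot2)

p_smooth <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # LOESS with a smaller span: the curve follows the data more closely
  stat_smooth(method = "loess", formula = y ~ x, span = 0.5, se = FALSE, color = "red") +
  # LOESS with a larger span: a smoother curve
  stat_smooth(method = "loess", formula = y ~ x, span = 0.9, se = FALSE, color = "blue") +
  # Linear model with its confidence band, extended over the full x scale
  stat_smooth(method = "lm", formula = y ~ x, se = TRUE, fullrange = TRUE, color = "black")

# Fitted values of the linear model layer (the fourth layer)
lm_fit <- layer_data(p_smooth, 4)
```

With se = TRUE the computed layer data contains ymin and ymax columns, which are the limits of the confidence band.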
Stats for scatter plots
- sum
- quantile: draws quantile regression lines. Pass a vector as quantiles to choose which quantile lines are drawn.
For background on quantiles, see
https://www.cnblogs.com/jiangkejie/p/10482636.html
Example
# Amend the plot to color by year_group
ggplot(Vocab, aes(x = education, y = vocabulary, color = year_group)) +
geom_jitter(alpha = 0.25) +
stat_quantile(quantiles = c(0.05, 0.5, 0.95))
- size
stat_sum() provides a special computed variable, ..prop.., which can be mapped to size so that point size shows proportions instead of counts.
# Default stat_sum(): point size shows counts
ggplot(Vocab, aes(x = education, y = vocabulary)) +
stat_sum()
# Amend the stat to use proportion sizes
ggplot(Vocab, aes(x = education, y = vocabulary)) +
stat_sum(aes(size = ..prop..))
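As a self-contained sketch of the same idea, the example below uses mtcars instead of Vocab (an assumption made so that no extra package is needed) and the newer after_stat(prop) notation, which, as far as I know, replaces ..prop.. from ggplot2 3.3.0 onward.

```r
library(ggplot2)

# cyl and gear take only a few distinct values, so many points overlap
p_prop <- ggplot(mtcars, aes(x = cyl, y = gear)) +
  stat_sum(aes(size = after_stat(prop)))

sum_data <- layer_data(p_prop)
total_n <- sum(sum_data$n)        # counts add up to the number of rows
total_prop <- sum(sum_data$prop)  # proportions add up to 1 within the single group
```

Because no grouping aesthetic is set here, all points fall into one group and the proportions are relative to the whole dataset.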
stats outside geom
Stats can also be called on their own, outside a geom, to draw summaries of the data.
One example is the rather amusing QQ plot.
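A minimal QQ plot sketch, assuming the built-in mtcars data and a comparison against the normal distribution (the default of stat_qq()):

```r
library(ggplot2)

# stat_qq() needs only a `sample` aesthetic; it computes the theoretical quantiles itself
p_qq <- ggplot(mtcars, aes(sample = mpg)) +
  stat_qq() +
  stat_qq_line()

qq_data <- layer_data(p_qq, 1)
```

Points that follow the reference line closely suggest that the sample is approximately normal.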
stat_summary
p_wt_vs_fcyl_by_fam_jit +
# Add a summary stat of std deviation limits
stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1), position = posn_d)
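The snippet above depends on objects defined earlier in the course (p_wt_vs_fcyl_by_fam_jit, posn_d), and mean_sdl needs the Hmisc package. As a dependency-free sketch, the hypothetical helper mean_sd below returns the mean and mean plus or minus one standard deviation, which is what mean_sdl with mult = 1 reports.

```r
library(ggplot2)

# Hypothetical helper: fun.data expects a data frame with y, ymin, ymax
mean_sd <- function(x) {
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - s, ymax = m + s)
}

p_summary <- ggplot(mtcars, aes(x = factor(cyl), y = wt)) +
  geom_jitter(width = 0.2, alpha = 0.5) +
  # Default geom of stat_summary() is a pointrange: mean with SD limits
  stat_summary(fun.data = mean_sd, color = "red")

summary_data <- layer_data(p_summary, 2)
```

summary_data has one row per cylinder group, holding the values drawn by the pointrange.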
Coordinate
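As the name suggests, this part is about coordinates.
Like scale, mentioned earlier, and xlim, the coordinate layer can change the region of the plot that is displayed.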
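One difference is worth a sketch: limits set through a scale (or xlim()) discard out-of-range values, while coord_cartesian() only zooms the view. The example assumes the built-in mtcars data and an arbitrary window of 3 to 6 on the x axis.

```r
library(ggplot2)

p_base <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Scale limits: values outside 3..6 become NA before anything is drawn
p_scale <- p_base + scale_x_continuous(limits = c(3, 6))
# Coordinate limits: only the view changes, every point is kept
p_coord <- p_base + coord_cartesian(xlim = c(3, 6))

# Count the x values that survive in each built plot
n_scale <- sum(!is.na(suppressWarnings(layer_data(p_scale))$x))
n_coord <- sum(!is.na(layer_data(p_coord)$x))
```

This matters when a statistic such as a smooth is added, because a fit on the scale-limited plot sees only the remaining points.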
coord_fixed()
sets the ratio between the y-axis and x-axis units.
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
geom_jitter() +
geom_smooth(method = "lm", se = FALSE) +
# Fix the coordinate ratio
coord_fixed(1)
Before
After
coord function
coord_*()
expand: set it to FALSE to remove the buffer around the data
ggplot(mtcars, aes(wt, mpg)) +
geom_point(size = 2) +
# Add Cartesian coordinates with zero expansion
coord_cartesian(expand = FALSE) +
theme_classic()
Before
After
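coord vs. scale
The two plots below place the points in the same positions. The difference is that scale_*_log10() transforms the data before any statistics are computed, while coord_trans() transforms the coordinate system afterwards.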
# Perform a log10 coordinate system transformation
ggplot(msleep, aes(bodywt, brainwt)) +
geom_point() +
coord_trans(x = "log10", y = "log10")
# second one
# Add scale_*_*() functions
ggplot(msleep, aes(bodywt, brainwt)) +
geom_point() +
scale_x_log10() +
scale_y_log10() +
ggtitle("Scale_ functions")
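To see where each transformation happens, one can inspect the built layer data. This is a sketch under the assumption that rows with missing brainwt are dropped first, so both plots use exactly the same points.

```r
library(ggplot2)

# Keep complete rows only
sleep <- msleep[!is.na(msleep$brainwt), ]

p_scale <- ggplot(sleep, aes(bodywt, brainwt)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10()

p_coord <- ggplot(sleep, aes(bodywt, brainwt)) +
  geom_point() +
  coord_trans(x = "log10", y = "log10")

x_scale <- layer_data(p_scale)$x  # already log10-transformed by the scale
x_coord <- layer_data(p_coord)$x  # still in the original units
```

The scale version hands transformed values to the layer, whereas the coord version leaves the data untouched and only bends the axes when drawing.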
Other changes to the coordinate system
Swap the axes with coord_flip()
# Flip the axes to set car to the y axis
ggplot(mtcars, aes(car, wt)) +
geom_point() +
labs(x = "car", y = "weight") +
coord_flip()
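The built-in mtcars has no car column; the course data presumably adds one. A sketch that creates it from the row names (an assumption about how the column was built) so the flipped plot can be reproduced:

```r
library(ggplot2)

# Assumption: the car names come from the row names of mtcars
mtcars_named <- mtcars
mtcars_named$car <- rownames(mtcars)

p_flip <- ggplot(mtcars_named, aes(car, wt)) +
  geom_point() +
  labs(x = "car", y = "weight") +
  # After flipping, the long car names sit on the vertical axis and stay readable
  coord_flip()
```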
Adding a secondary axis
# From previous step
y_breaks <- c(59, 68, 77, 86, 95, 104)
y_labels <- (y_breaks - 32) * 5 / 9
secondary_y_axis <- sec_axis(
trans = identity,
name = "Celsius",
breaks = y_breaks,
labels = y_labels
)
# Update the plot
ggplot(airquality, aes(Date, Temp)) +
geom_line() +
# Add the secondary y-axis
scale_y_continuous(sec.axis = secondary_y_axis) +
labs(x = "Date (1973)", y = "Fahrenheit")
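An alternative sketch lets sec_axis() compute the Celsius values from a formula instead of supplying breaks and labels by hand. The Date column is not part of the built-in airquality data, so it is rebuilt here from Month and Day under the assumption that the year is 1973, as the axis label suggests.

```r
library(ggplot2)

# Assumption: rebuild Date from Month and Day, using 1973 as the year
aq <- airquality
aq$Date <- as.Date(paste(1973, aq$Month, aq$Day, sep = "-"))

p_temp <- ggplot(aq, aes(Date, Temp)) +
  geom_line() +
  # The formula converts each Fahrenheit value on the primary axis to Celsius
  scale_y_continuous(sec.axis = sec_axis(~ (. - 32) * 5 / 9, name = "Celsius")) +
  labs(x = "Date (1973)", y = "Fahrenheit")
```

With the formula, the secondary axis picks its own round Celsius breaks rather than showing converted Fahrenheit breaks.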
Polar coordinates
So far we have only used Cartesian coordinates; how about trying polar coordinates?
Example
ggplot(mtcars, aes(x = 1, fill = fcyl)) +
# Reduce the bar width to 0.1
geom_bar(width = 0.1) +
coord_polar(theta = "y") +
# Add a continuous x scale from 0.5 to 1.5
scale_x_continuous(limits = c(0.5, 1.5))
facet function
Facets split the data into groups according to a categorical variable and draw each group in its own panel.
facet_grid() is the most straightforward one to use.
plot +
facet_grid(rows = vars(A), cols = vars(B))
Example
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Facet rows by am
facet_grid(rows = vars(am))
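facet_grid() can be called in two ways: with rows = vars() and cols = vars(), as above, or with formula notation: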
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Facet rows by am and columns by cyl using formula notation
facet_grid(am ~ cyl)
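A short sketch to confirm that the two notations produce the same layout, by comparing the panel each point is assigned to (names are illustrative):

```r
library(ggplot2)

p_vars <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(rows = vars(am), cols = vars(cyl))

p_formula <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # Formula notation: rows on the left of ~, columns on the right
  facet_grid(am ~ cyl)

panels_vars <- layer_data(p_vars)$PANEL
panels_formula <- layer_data(p_formula)$PANEL
same_layout <- identical(panels_vars, panels_formula)
```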
facet labels and orders
fct_relevel, fct_recode, labeller
The labeller argument controls what is written in the facet labels.
# Plot wt by mpg
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Two variables
facet_grid(cols = vars(vs, cyl), labeller = label_context)
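fct_recode and fct_relevel come from the forcats package. As a dependency-free sketch, base R's factor() can do both jobs; this is a swapped-in technique, not the course's code, and the column name transmission is hypothetical.

```r
library(ggplot2)

mtcars_lab <- mtcars
# levels sets the panel order, labels sets the text shown;
# in mtcars, am is coded 0 = automatic, 1 = manual
mtcars_lab$transmission <- factor(mtcars$am,
                                  levels = c(1, 0),
                                  labels = c("manual", "automatic"))

p_lab <- ggplot(mtcars_lab, aes(wt, mpg)) +
  geom_point() +
  # label_both writes "variable: value" in each facet strip
  facet_grid(cols = vars(transmission), labeller = label_both)
```

The manual panel is drawn first because it is the first factor level.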
Setting scales lets each facet drop the axis values for which it has no data.
Note:
When faceting by columns, "free_y" has no effect, but we can adjust the x-axis. In contrast, when faceting by rows, "free_x" has no effect, but we can adjust the y-axis.
Setting space sizes each panel in proportion to its axis range, which removes the large blank areas in sparse facets.
ggplot(mtcars, aes(x = mpg, y = car, color = fam)) +
geom_point() +
# Free the y scales and space
facet_grid(rows = vars(gear), scales = "free_y", space = "free_y")
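facet_wrap()
With a single faceting variable, facet_grid() lays the panels out in one row or one column (depending on how it is specified), which is sometimes hard to read. facet_wrap() also splits the data by the given variable, but wraps the panels into a grid, which usually displays better, and lets you choose the number of rows and columns freely.
Example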
ggplot(Vocab, aes(x = education, y = vocabulary)) +
stat_smooth(method = "lm", se = FALSE) +
# Update the facet layout, using 11 columns
facet_wrap(~ year, ncol = 11)
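Vocab comes from an extra package, so here is a self-contained sketch with the mpg data that ships with ggplot2 (a substitution made for convenience):

```r
library(ggplot2)

p_wrap <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  # Wrap the panels, one per vehicle class, into a grid with 4 columns
  facet_wrap(~ class, ncol = 4)

n_panels <- length(unique(layer_data(p_wrap)$PANEL))
```

Changing ncol, or using nrow instead, rearranges the same panels without changing their content.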
With margins, the faceted plot gets an extra panel that shows the data without splitting, so each group can be compared with the whole.
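A minimal sketch of margins with facet_grid(), assuming the built-in mtcars data:

```r
library(ggplot2)

p_margins <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # margins = TRUE adds an "(all)" panel containing every observation
  facet_grid(cols = vars(cyl), margins = TRUE)

margin_data <- layer_data(p_margins)
```

Every observation is drawn twice here: once in the panel for its own cyl value and once in the "(all)" panel.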
Let's draw a heat map
# Using barley, plot variety vs. year, filled by yield
ggplot(barley, aes(year, variety, fill = yield)) +
# Add a tile geom
geom_tile()
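Heat maps tend to work well only when there is little data; when the data is larger or more complex, it may be worth considering alternatives.
For example, to judge the overall trend of data under two different conditions, consider a line plot.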
# The heat map we want to replace
ggplot(barley, aes(x = year, y = variety, fill = yield)) +
geom_tile() +
facet_wrap( ~ site, ncol = 1) +
scale_fill_gradientn(colors = brewer.pal(9, "Reds"))
# Using barley, plot yield vs. year, colored and grouped by variety
ggplot(barley, aes(x = year, y = yield, color = variety, group = variety)) +
# Add a line layer
geom_line() +
# Facet, wrapping by site, with 1 row
facet_wrap(~ site, nrow = 1)
Good data can still end up as a bad plot
How to make a bad plot
- One last problem to solve
# Change type
TG$dose <- as.numeric(as.character(TG$dose))
# Plot
growth_by_dose <- ggplot(TG, aes(dose, len, color = supp)) +
stat_summary(fun.data = mean_sdl,
fun.args = list(mult = 1),
position = position_dodge(0.2)) +
stat_summary(fun = mean,
geom = "line",
position = position_dodge(0.1)) +
theme_classic() +
# Adjust labels and colors:
labs(x = "Dose (mg/day)", y = "Odontoblasts length (mean, standard deviation)", color = "Supplement") +
scale_color_brewer(palette = "Set1", labels = c("Orange juice", "Ascorbic acid")) +
scale_y_continuous(limits = c(0,35), breaks = seq(0, 35, 5), expand = c(0,0))
# View plot
growth_by_dose
Bar plot + error bars
# Plot wt vs. fcyl
ggplot(mtcars, aes(x = fcyl, y = wt)) +
# Add a bar summary stat of means, colored skyblue
stat_summary(fun = mean, geom = "bar", fill = "skyblue") +
# Add an errorbar summary stat std deviation limits
stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1), geom = "errorbar", width = 0.1)