ggplot2 part_1
Visualization helps data science practitioners analyze and explain data more effectively.
A common ggplot template
ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
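To make the template concrete, here is a minimal sketch that fills in each placeholder. The choice of the built-in mtcars data set, geom_point, and the wt/mpg columns is only an assumption for illustration.

```r
library(ggplot2)

# Fill in the template with the built-in mtcars data set:
# data = mtcars, geom function = geom_point, mapping = wt on x and mpg on y.
p <- ggplot(data = mtcars) +
  geom_point(mapping = aes(x = wt, y = mpg))

p  # printing the object draws the scatter plot
```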
Based on what has been covered so far, a small set of geom functions can be summarized.
factor()
Converts a vector into a factor.
For details, see
https://www.jianshu.com/p/5561c995a621
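A short sketch of what factor() does, using the cyl column of the built-in mtcars data set (chosen here purely as an example):

```r
# cyl in mtcars is stored as a number; factor() turns it into a categorical variable.
cyl_factor <- factor(mtcars$cyl)

class(mtcars$cyl)   # "numeric"
class(cyl_factor)   # "factor"
levels(cyl_factor)  # "4" "6" "8"
```

This matters in ggplot2 because a factor mapped to color or fill gets a discrete palette, while a numeric column gets a continuous gradient.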
Important plot parameters
!!! Important
The four parameters introduced first
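Criteria for a good plot
First criterion:
Present the data accurately and effectively.
Second criterion:
Make the plot look good.
Let's get started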
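Aesthetics (aes parameters)
1. alpha: transparency
ranges from 0 (fully transparent) to 1 (fully opaque)
2. color: the color of points, or the outline color of filled shapes
3. fill: the fill color
4. size: size
5. linetype
applies to line-type plots
6. label
text used to label elements of the plot
7. shape
8. stroke
For some of these, see
https://www.yuque.com/mugpeng/rr/004
9. width
binwidth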
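The sketch below shows several of these aesthetics on one layer. It assumes the mtcars data set, and the particular values (0.6, shape 21, and so on) are arbitrary. Aesthetics inside aes() are mapped to variables; those outside are fixed for the whole layer.

```r
library(ggplot2)

p_aes <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(
    aes(color = factor(cyl), size = hp),  # mapped to variables
    alpha = 0.6,     # fixed transparency: 0 is invisible, 1 is opaque
    shape = 21,      # fixed shape: a circle with separate outline and fill
    fill = "white",  # fixed fill color inside the circle
    stroke = 1       # fixed outline thickness
  )

p_aes
```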
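Adjusting positions
1. position
Changes where the data are drawn within the plot
position = "jitter"
or
position = position_jitter()
Both forms work.
jitter spreads out the points of a scatter plot so that overlapping points become visible.
Compare the difference
Before:
After: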
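A sketch of the before/after comparison, assuming the mtcars data set. The width, height and seed values are arbitrary choices; seed only makes the random offsets reproducible.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = cyl, y = mpg))

# Before: every point sits exactly on cyl = 4, 6 or 8.
p_plain <- base + geom_point()

# After: each point is shifted horizontally by a random amount of at most 0.3.
p_jitter <- base +
  geom_point(position = position_jitter(width = 0.3, height = 0, seed = 1))

# layer_data() returns the positions actually drawn.
x_plain  <- layer_data(p_plain)$x
x_jitter <- layer_data(p_jitter)$x

sort(unique(x_plain))     # 4 6 8
length(unique(x_jitter))  # many more distinct x positions
```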
Position types
1) jitter
2) dodge
3) stack
For bar-type geoms (col, histogram, etc.):
fill
dodge
stack
identity
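The bar positions can be compared side by side with a sketch like the following. It assumes mtcars, with cyl and am converted to factors on the fly.

```r
library(ggplot2)

bars <- ggplot(mtcars, aes(x = factor(cyl), fill = factor(am)))

p_stack <- bars + geom_bar(position = "stack")  # default: segments on top of each other
p_dodge <- bars + geom_bar(position = "dodge")  # segments side by side
p_fill  <- bars + geom_bar(position = "fill")   # stacked, then rescaled to proportions

p_stack
p_dodge
p_fill
```

With "fill" every bar has the same height, so the plot shows shares rather than counts.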
2. scale
Controls how data values map onto an axis or aesthetic, including the axis range
scale_x/y_xxx("name of certain axis", limits = , breaks = , expand = )
Adjusting the size scale
scale_size()
Call scale_size() setting range as a numeric vector of the form c(low, high).
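As a sketch of these scale functions together (mtcars again, with limits, breaks and the size range picked arbitrarily to fit the data):

```r
library(ggplot2)

p_scale <- ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point(alpha = 0.6) +
  scale_x_continuous(
    "Weight (1000 lbs)",          # axis name
    limits = c(1, 6),             # axis range
    breaks = seq(1, 6, by = 1),   # tick positions
    expand = c(0, 0)              # no padding beyond the limits
  ) +
  scale_y_continuous("Miles per gallon", limits = c(10, 35)) +
  scale_size(range = c(1, 6))     # smallest and largest point size

p_scale
```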
3. label
Example
# Assumes mtcars has factor columns fcyl = factor(cyl) and
# fam = factor(am, labels = c("automatic", "manual"))
palette <- c(automatic = "#377EB8", manual = "#E41A1C")
## Set the position
ggplot(mtcars, aes(fcyl, fill = fam)) +
  geom_bar(position = "dodge") +
  labs(x = "Number of Cylinders", y = "Count") +
  scale_fill_manual("Transmission", values = palette)
labs can also set a title or a caption
labs(title = "Highest and lowest life expectancies, 2007", caption = "Source: gapminder")
mapping
x axis
xlim(), e.g. xlim(-2, 2), which restricts the x axis to the range -2 to 2
y axis
ylim()
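A small sketch of xlim() and ylim(), assuming mtcars and arbitrary limits. Note that values outside the limits are treated as missing and dropped (with a warning when the plot is drawn), not merely hidden.

```r
library(ggplot2)

p_lim <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  xlim(2, 4) +   # keep only points with 2 <= wt <= 4
  ylim(15, 30)   # keep only points with 15 <= mpg <= 30

p_lim
```

If the goal is to zoom without discarding data, coord_cartesian(xlim = , ylim = ) is the usual alternative.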
Different types of geometries
There are as many as 48 of them!
Theme
A theme changes the default look of a plot.
p + theme(legend.position = new_value)
This changes the position of the legend shown next to the plot.
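For instance, a minimal sketch with mtcars (the plot itself is only a stand-in for p):

```r
library(ggplot2)

p_legend <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()

p_legend + theme(legend.position = "bottom")  # also "top", "left", "right"
p_legend + theme(legend.position = "none")    # hide the legend entirely
```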
Theme elements are usually modified with one of four functions.
element_rect()
element_text()
element_line()
element_blank()
These cover three types of element
text
line
rectangle
Example
plt_prop_unemployed_over_time +
  theme(
    rect = element_rect(fill = "grey92"),
    legend.key = element_rect(color = NA),
    axis.ticks = element_blank(),
    panel.grid = element_blank(),
    panel.grid.major.y = element_line(
      color = "white",
      size = 0.5,
      linetype = "dotted"
    ),
    # Set the axis text color to grey25
    axis.text = element_text(color = "grey25"),
    # Set the plot title font face to italic and font size to 16
    plot.title = element_text(size = 16, face = "italic")
  )
Adjusting whitespace
unit(x, unit)
x is the amount and unit is the unit of measurement.
# View the original plot
plt_mpg_vs_wt_by_cyl
plt_mpg_vs_wt_by_cyl +
  theme(
    # Set the axis tick length to 2 lines
    axis.ticks.length = unit(2, "lines")
  )
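margin(top, right, bottom, left, unit)
Sets the top, right, bottom and left margins; unit is the unit of measurement. In ggplot2 these arguments are named t, r, b and l.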
plt_mpg_vs_wt_by_cyl +
  theme(
    # Set the legend margin to (20, 30, 40, 50) points
    legend.margin = margin(20, 30, 40, 50, "pt")
  )
Other things theme can do
Beyond points 1 and 2 (covered above)
Use ggplot2's built-in themes
theme_*(), e.g. theme_classic()
Others include
theme_gray() is the default.
theme_bw() is useful when you use transparency.
theme_classic() is more traditional.
theme_void() removes everything but the data.
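A quick way to see the differences is to add each built-in theme to the same plot. The sketch below assumes mtcars as sample data.

```r
library(ggplot2)

p_base <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

p_base + theme_gray()     # the default look
p_base + theme_bw()       # white background with a border
p_base + theme_classic()  # axis lines only, no grid
p_base + theme_void()     # nothing but the data
```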
Example
library(ggthemes)  # provides theme_tufte()
# Theme layer saved as an object, theme_recession
theme_recession <- theme(
  rect = element_rect(fill = "grey92"),
  legend.key = element_rect(color = NA),
  axis.ticks = element_blank(),
  panel.grid = element_blank(),
  panel.grid.major.y = element_line(color = "white", size = 0.5, linetype = "dotted"),
  axis.text = element_text(color = "grey25"),
  plot.title = element_text(face = "italic", size = 16),
  legend.position = c(0.6, 0.1)
)
# Combine the Tufte theme with theme_recession
theme_tufte_recession <- theme_tufte() + theme_recession
# Add the recession theme to the plot
plt_prop_unemployed_over_time + theme_tufte_recession
Using themes from other packages
library(ggthemes)
You can also use theme_set()
to change the default theme applied to every plot that follows
theme_tufte_recession <- theme_tufte() + theme_recession
# Set theme_tufte_recession as the default theme
theme_set(theme_tufte_recession)
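theme_set() returns the theme that was active before the call, so the earlier default can be put back afterwards. A minimal sketch, using theme_classic() as a stand-in for a custom theme:

```r
library(ggplot2)

# theme_set() invisibly returns the previous default theme.
old_theme <- theme_set(theme_classic())

# Any plot built now uses theme_classic() without adding it explicitly.
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Restore the earlier default.
theme_set(old_theme)
```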
Annotating a plot
Use annotate()
Example
# Add a curve
plt_country_vs_lifeExp +
  step_1_themes +
  geom_vline(xintercept = global_mean, color = "grey40", linetype = 3) +
  step_3_annotation +
  annotate(
    "curve",
    x = x_start, y = y_start,
    xend = x_end, yend = y_end,
    arrow = arrow(length = unit(0.2, "cm"), type = "closed"),
    color = "grey40"
  )
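The example above depends on objects defined elsewhere. As a self-contained sketch of the same idea, the following adds a text note and a curved arrow to an mtcars scatter plot; the coordinates and the label text are made up for illustration.

```r
library(ggplot2)

p_note <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # A text annotation placed at fixed data coordinates.
  annotate("text", x = 4.5, y = 30, label = "Heavier cars\nuse more fuel") +
  # A curved arrow from just below the text toward the heavy, low-mpg points.
  annotate(
    "curve",
    x = 4.5, y = 28, xend = 5.3, yend = 12,
    arrow = arrow(length = unit(0.2, "cm"), type = "closed"),
    color = "grey40"
  )

p_note
```

Unlike a geom, annotate() takes its positions as plain values, not as columns of the data.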
Data preparation procedure
When it applies
- large datasets
- aligned values on a single axis
- low-precision data