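tibble is an extended data frame type in R, designed as a replacement for data.frame. It inherits from data.frame, is loosely typed (inputs are stored as given), and shares largely the same syntax as data.frame while being more convenient to use. The tibble package is another R package developed by Hadley Wickham.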
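- A tibble does not change the type of its inputs and can store any type, including list columns
- A tibble does not set row names (row.names)
- A tibble supports arbitrary column names, including non-syntactic ones
- A tibble automatically names columns that are supplied without a name
- A tibble only recycles inputs of length 1
- A tibble evaluates its arguments lazily and in order
- A tibble has class tbl_df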
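Several of the points above can be checked directly. The following is a minimal sketch, assuming only that the tibble package is installed; the object names (`tb_list`, `recycled`, `bad`, `seq_tb`, `auto`) are illustrative and do not come from the original text.

```r
library(tibble)

# Input types are kept as given: strings stay character, and a list can be a column.
tb_list <- tibble(name = c("a", "b"), payload = list(1:3, letters))
class(tb_list$name)
class(tb_list$payload)

# Only length-1 inputs are recycled; any other length mismatch is an error.
recycled <- tibble(x = 1:4, y = 0)
bad <- tryCatch(tibble(x = 1:4, y = 1:2), error = function(e) "error")

# Arguments are evaluated in order, so a later column can refer to an earlier one.
seq_tb <- tibble(x = 1:3, y = x * 2)

# An unnamed argument gets a column name derived from its expression.
auto <- tibble(1:3)
names(auto)

# A tibble is still a data.frame, with the tbl_df class in front.
class(seq_tb)
```

By comparison, `data.frame(x = 1:4, y = 1:2)` would silently recycle `y`, which is the kind of behaviour the stricter recycling rule is meant to avoid.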
Creating a tibble
library(tidyverse)
> head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
> as_tibble(iris)
# A tibble: 150 x 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5 3.4 1.5 0.2 setosa
9 4.4 2.9 1.4 0.2 setosa
10 4.9 3.1 1.5 0.1 setosa
# ... with 140 more rows
tibble(
x = 1:5,
y = 1,
z = x ^ 2 + y
)
# A tibble: 5 x 3
x y z
<int> <dbl> <dbl>
1 1 1 2
2 2 1 5
3 3 1 10
4 4 1 17
5 5 1 26
Support for special characters in column names
tb <- tibble(
`:)` = "smile",
` ` = "space",
`2000` = "number"
)
tb
# A tibble: 1 x 3
`:)` ` ` `2000`
<chr> <chr> <chr>
1 smile space number
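To read such columns back, the names have to be quoted. The sketch below assumes the tibble package is installed; `base_df` is an illustrative name added here to contrast with base R, where data.frame() rewrites non-syntactic names by default.

```r
library(tibble)

tb <- tibble(`:)` = "smile", ` ` = "space", `2000` = "number")

# Non-syntactic names need backticks with `$`, or a plain string with `[[`.
tb$`:)`
tb[["2000"]]

# For comparison, base data.frame() rewrites such names unless check.names = FALSE.
base_df <- data.frame(`:)` = "smile", `2000` = "number")
names(base_df)
```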
Row-by-row construction with tribble()
tribble(
~x, ~y, ~z,
#--|--|----
"a", 2, 3.6,
"b", 1, 8.5
)
# A tibble: 2 x 3
x y z
<chr> <dbl> <dbl>
1 a 2 3.6
2 b 1 8.5
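tribble() is only a different way of writing the same data. As a small sketch (assuming the tibble package is installed; `by_row` and `by_col` are illustrative names), the row-wise layout above can be compared with the equivalent column-wise tibble() call:

```r
library(tibble)

# Row-wise layout: column headers start with ~, values are separated by commas.
by_row <- tribble(
  ~x,  ~y, ~z,
  "a",  2, 3.6,
  "b",  1, 8.5
)

# The same data written column by column.
by_col <- tibble(x = c("a", "b"), y = c(2, 1), z = c(3.6, 8.5))

# Compare the two as plain data frames.
all.equal(as.data.frame(by_row), as.data.frame(by_col))
```

The row-wise form tends to be easier to read for small hand-typed tables, while tibble() is the natural choice when the columns already exist as vectors.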
Comparing tibble with data.frame
Friendlier printing.
tibble(
a = lubridate::now() + runif(1e3) * 86400,
b = lubridate::today() + runif(1e3) * 30,
c = 1:1e3,
d = runif(1e3),
e = sample(letters, 1e3, replace = TRUE)
)
# A tibble: 1,000 x 5
a b c d e
<dttm> <date> <int> <dbl> <chr>
1 2019-07-16 19:27:28 2019-07-31 1 0.906 v
2 2019-07-17 03:46:36 2019-08-12 2 0.271 k
3 2019-07-17 06:38:54 2019-08-10 3 0.0282 x
4 2019-07-16 13:02:51 2019-08-07 4 0.938 v
5 2019-07-17 07:18:28 2019-08-14 5 0.759 t
6 2019-07-16 17:11:20 2019-08-09 6 0.275 f
7 2019-07-16 10:54:57 2019-08-11 7 0.0217 u
8 2019-07-17 03:35:19 2019-07-20 8 0.110 b
9 2019-07-16 14:57:29 2019-07-27 9 0.436 g
10 2019-07-17 05:01:51 2019-08-01 10 0.401 s
# ... with 990 more rows
nycflights13::flights %>%
print(n = 10, width = Inf)
# A tibble: 336,776 x 19
year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay carrier flight tailnum origin dest air_time distance
<int> <int> <int> <int> <int> <dbl> <int> <int> <dbl> <chr> <int> <chr> <chr> <chr> <dbl> <dbl>
1 2013 1 1 517 515 2 830 819 11 UA 1545 N14228 EWR IAH 227 1400
2 2013 1 1 533 529 4 850 830 20 UA 1714 N24211 LGA IAH 227 1416
3 2013 1 1 542 540 2 923 850 33 AA 1141 N619AA JFK MIA 160 1089
4 2013 1 1 544 545 -1 1004 1022 -18 B6 725 N804JB JFK BQN 183 1576
5 2013 1 1 554 600 -6 812 837 -25 DL 461 N668DN LGA ATL 116 762
6 2013 1 1 554 558 -4 740 728 12 UA 1696 N39463 EWR ORD 150 719
7 2013 1 1 555 600 -5 913 854 19 B6 507 N516JB EWR FLL 158 1065
8 2013 1 1 557 600 -3 709 723 -14 EV 5708 N829AS LGA IAD 53 229
9 2013 1 1 557 600 -3 838 846 -8 B6 79 N593JB JFK MCO 140 944
10 2013 1 1 558 600 -2 753 745 8 AA 301 N3ALAA LGA ORD 138 733
hour minute time_hour
<dbl> <dbl> <dttm>
1 5 15 2013-01-01 05:00:00
2 5 29 2013-01-01 05:00:00
3 5 40 2013-01-01 05:00:00
4 5 45 2013-01-01 05:00:00
5 6 0 2013-01-01 06:00:00
6 5 58 2013-01-01 05:00:00
7 6 0 2013-01-01 06:00:00
8 6 0 2013-01-01 06:00:00
9 6 0 2013-01-01 06:00:00
10 6 0 2013-01-01 06:00:00
# ... with 3.368e+05 more rows
You can also control the default print behaviour by setting options:
- options(tibble.print_max = n, tibble.print_min = m): if more than n rows, print only m rows.
- options(tibble.print_min = Inf) to always show all rows.
- options(tibble.width = Inf) to always print all columns, regardless of the width of the screen.
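One way to try these settings without leaving them switched on is to save the previous values and restore them afterwards. This is a sketch under the assumption that the installed tibble version still honours the `tibble.*` option names (newer releases also document `pillar.*` equivalents); `old` and `big` are illustrative names.

```r
library(tibble)

# options() returns the previous values, so they can be restored later.
old <- options(tibble.print_max = 20, tibble.print_min = 5, tibble.width = Inf)

big <- tibble(id = 1:100, value = id / 100)

# With the settings above, a tibble of more than 20 rows should print only 5.
print(big)

# Restore the earlier settings.
options(old)
```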
Subsetting.
df <- tibble(
x = runif(5),
y = rnorm(5)
)
> df
# A tibble: 5 x 2
x y
<dbl> <dbl>
1 0.140 0.492
2 0.0541 -0.307
3 0.366 -0.395
4 0.616 0.441
5 0.203 -2.16
# Extract by name
# (x is random, so the values below come from a different draw than the printout above)
df$x
#> [1] 0.434 0.395 0.548 0.762 0.254
df[["x"]]
#> [1] 0.434 0.395 0.548 0.762 0.254
# Extract by position
df[[1]]
#> [1] 0.434 0.395 0.548 0.762 0.254
# In a pipe, use the placeholder `.`
df %>% .$x
#> [1] 0.434 0.395 0.548 0.762 0.254
df %>% .[["x"]]
#> [1] 0.434 0.395 0.548 0.762 0.254
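Subsetting is where tibble and data.frame differ most in practice. The sketch below assumes the tibble package is installed and uses illustrative objects (`base_df`, `tb2`, `partial`) to show two differences: `$` does not partially match column names on a tibble, and `[` with a single column keeps returning a tibble.

```r
library(tibble)

base_df <- data.frame(xyz = 1:3, other = c("a", "b", "c"))
tb2 <- as_tibble(base_df)

# `$` on a data.frame partially matches names, so `xy` finds column `xyz`.
base_df$xy

# On a tibble the same call finds nothing: it returns NULL and warns.
partial <- suppressWarnings(tb2$xy)
partial

# Selecting one column with `[` drops to a vector for a data.frame,
# but stays a tibble for a tibble.
class(base_df[, "xyz"])
class(tb2[, "xyz"])
```

The stricter behaviour means a misspelled column name surfaces as a warning instead of silently matching some other column.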
Interacting with older code
class(as.data.frame(tb))
#> [1] "data.frame"
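The conversion works in both directions, which helps when only part of a code base expects a plain data.frame. A minimal round-trip sketch, assuming the tibble package is installed (`tb_small`, `legacy`, and `back` are illustrative names):

```r
library(tibble)

tb_small <- tibble(x = 1:3, y = c("a", "b", "c"))

# Hand a plain data.frame to functions that do not work with tibbles.
legacy <- as.data.frame(tb_small)
class(legacy)

# Convert back once the older code has finished.
back <- as_tibble(legacy)
class(back)
```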