Advanced R: Data structures

Author: MJades | Published: 2020-02-13 10:40

Vector

  1. Vectors come in two flavours: atomic vectors and lists. They have three common properties:

Type, typeof(), what it is.
Length, length(), how many elements it contains.
Attributes, attributes(), additional arbitrary metadata.
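
As a small illustration of the three properties, the sketch below inspects one atomic vector and one list. The variable names x and y are chosen here only for the example.

```r
# An atomic (double) vector with names, and a plain list
x <- c(a = 1.5, b = 2, c = 3)
y <- list(1L, "a", TRUE)

typeof(x)       # what it is: "double"
length(x)       # how many elements: 3
attributes(x)   # metadata: here only $names

typeof(y)       # "list"
length(y)       # 3
attributes(y)   # NULL, because no attributes were set
```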

  1. There are four common types of atomic vectors: logical, integer, double (often called numeric), and character.
  2. is.vector() does not test if an object is a vector. Instead it returns TRUE only if the object is a vector with no attributes apart from names. Use is.atomic(x) || is.list(x) to test if an object is actually a vector.
  3. is.numeric() is a general test for the “numberliness” of a vector and returns TRUE for both integer and double vectors. It is not a specific test for double vectors, which are often called numeric.
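
The difference between these tests is easiest to see on small inputs. In this sketch the attribute name "note" is made up for illustration.

```r
# A vector carrying an attribute other than names
x <- structure(1:3, note = "extra metadata")
is.vector(x)                  # FALSE: has a non-name attribute
is.atomic(x) || is.list(x)    # TRUE: it is still a vector

# Names alone do not change the result of is.vector()
named <- c(a = 1, b = 2)
is.vector(named)              # TRUE

# is.numeric() accepts both integer and double
is.numeric(1L)                # TRUE
is.numeric(1.5)               # TRUE
is.double(1L)                 # FALSE: use is.double() to test for double only
```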

List

  1. Lists are different from atomic vectors because their elements can be of any type, including lists.
  2. Lists are sometimes called recursive vectors, because a list can contain other lists. This makes them fundamentally different from atomic vectors.
  3. c() will combine several lists into one. If given a combination of atomic vectors and lists, c() will coerce the vectors to lists before combining them. Compare the results of list() and c():
x <- list(list(1, 2), c(3, 4))
y <- c(list(1, 2), c(3, 4))
str(x)
#> List of 2
#>  $ :List of 2
#>   ..$ : num 1
#>   ..$ : num 2
#>  $ : num [1:2] 3 4
str(y)
#> List of 4
#>  $ : num 1
#>  $ : num 2
#>  $ : num 3
#>  $ : num 4

Attributes

By default, most attributes are lost when a vector is modified. The only attributes not lost are the three most important:

  1. Names, a character vector giving each element a name, described in names.

  2. Dimensions, used to turn vectors into matrices and arrays, described in matrices and arrays.

  3. Class, used to implement the S3 object system, described in S3.
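
A minimal sketch of how attributes behave when a vector is modified: a custom attribute is dropped by subsetting or summarising, while names survive subsetting. The attribute name "my_attribute" is arbitrary.

```r
# A custom attribute is set with attr()
y <- 1:10
attr(y, "my_attribute") <- "This is a vector"
attr(y, "my_attribute")     # "This is a vector"

# ...but it does not survive subsetting or summarising
attributes(y[1:3])          # NULL
attributes(sum(y))          # NULL

# Names, by contrast, are kept when subsetting
z <- c(a = 1, b = 2, c = 3)
attr(z, "my_attribute") <- "also dropped"
attributes(z[1:2])          # only $names remains
```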

Factors

  1. Before R 4.0.0, most data loading functions in R automatically converted character vectors to factors (since R 4.0.0 the default is stringsAsFactors = FALSE). Automatic conversion is suboptimal, because there's no way for those functions to know the set of all possible levels or their optimal order. On older versions, use the argument stringsAsFactors = FALSE to suppress this behaviour, and then manually convert character vectors to factors using your knowledge of the data.
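
The manual conversion can be sketched as follows. The column name size and its levels are invented for the example, and stringsAsFactors = FALSE is passed explicitly so the code behaves the same regardless of the R version's default.

```r
# Keep the column as character when building the data frame
df <- data.frame(
  size = c("small", "large", "medium", "small"),
  stringsAsFactors = FALSE
)
typeof(df$size)        # "character"

# Convert by hand, supplying the full set of levels in a meaningful order
df$size <- factor(df$size, levels = c("small", "medium", "large"))
levels(df$size)        # "small" "medium" "large"
as.integer(df$size)    # 1 3 2 1
```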

Matrices and arrays

  1. Matrices and arrays are created with matrix() and array(), or by using the assignment form of dim():
# Use a name other than c to avoid shadowing the c() function
m <- 1:6
dim(m) <- c(3, 2)
m
#>      [,1] [,2]
#> [1,]    1    4
#> [2,]    2    5
#> [3,]    3    6
  1. c() generalises to cbind() and rbind() for matrices, and to abind() (provided by the abind package) for arrays. You can transpose a matrix with t(); the generalised equivalent for arrays is aperm().
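
A short sketch of these functions using base R only (abind() is left out because it needs the abind package). The object names a, b and arr are chosen for the example.

```r
a <- matrix(1:6, nrow = 2)     # 2 x 3 matrix
b <- matrix(7:12, nrow = 2)    # 2 x 3 matrix

dim(cbind(a, b))               # 2 6: columns placed side by side
dim(rbind(a, b))               # 4 3: rows stacked
dim(t(a))                      # 3 2: transpose

arr <- array(1:24, dim = c(2, 3, 4))
dim(aperm(arr))                # 4 3 2: default reverses the dimensions
dim(aperm(arr, c(2, 1, 3)))    # 3 2 4: swap only the first two dimensions
```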

Data frames

  1. Since a data frame is a list of vectors, it is possible for a data frame to have a column that is a list:
df <- data.frame(x = 1:3)
df$y <- list(1:2, 1:3, 1:4)
df
#>   x          y
#> 1 1       1, 2
#> 2 2    1, 2, 3
#> 3 3 1, 2, 3, 4
  1. However, when a list is given to data.frame(), it tries to put each item of the list into its own column, so a call such as data.frame(x = 1:3, y = list(1:2, 1:3, 1:4)) fails because the items have different lengths.
  2. A workaround is to use I(), which causes data.frame() to treat the list as one unit.
  3. I() adds the AsIs class to its input, but this can usually be safely ignored.
  4. Use list and array columns with caution: many functions that work with data frames assume that all columns are atomic vectors.
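
The failure and the I() workaround can be sketched like this. The names res and dfl are introduced only for the example, and tryCatch() is used so the failing call does not stop the script.

```r
# Passing a bare list: data.frame() tries to make one column per item and errors
res <- tryCatch(
  data.frame(x = 1:3, y = list(1:2, 1:3, 1:4)),
  error = function(e) e
)
inherits(res, "error")    # TRUE

# Wrapping the list in I() keeps it as a single list column
dfl <- data.frame(x = 1:3, y = I(list(1:2, 1:3, 1:4)))
dim(dfl)                  # 3 2
class(dfl$y)              # "AsIs"
dfl$y[[2]]                # 1 2 3
```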

Note: "character" refers to a single character, while "string" refers to a sequence of characters.
