[R]-Data structures

Author: 43daf5f8181f | Source: published 2016-08-04 22:29, read 14 times

    Cite: http://adv-r.had.co.nz/Data-structures.html

    R's base data structures can be organised by their dimensionality (1d, 2d, or nd) and whether they're homogeneous (all contents must be of the same type) or heterogeneous (the contents can be of different types). This gives rise to the five data types most often used in data analysis:

                   Homogeneous      Heterogeneous
    1d (vector)    Atomic vector    List
    2d             Matrix           Data frame
    nd             Array            -

    Note that R has no 0-dimensional, or scalar types. Individual numbers or strings, which you might think would be scalars, are actually vectors of length one.
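    As a small illustration of the point above, the following sketch checks that single values behave as length-one vectors. The variable names x and s are chosen here for illustration only.

```r
# A single number is a double vector of length one.
x <- 42
length(x)            # 1
is.vector(x)         # TRUE
identical(x, c(42))  # TRUE: wrapping it in c() changes nothing

# A single string is a character vector of length one.
s <- "hello"
length(s)            # 1 (number of elements, not characters)
nchar(s)             # 5 (number of characters in that one element)
```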

    Given an object, the best way to understand what data structures it’s composed of is to use str(). str() is short for structure, and it gives a compact, human-readable description of any R object.
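    A minimal sketch of what str() reports for a few simple objects. The objects themselves are made up for illustration.

```r
# An integer vector: type, index range, then the values.
str(1:3)
#>  int [1:3] 1 2 3

# A character vector.
str(c("a", "b"))
#>  chr [1:2] "a" "b"

# A named list: one line per element, each with its own type.
str(list(n = 1, s = "a"))
#> List of 2
#>  $ n: num 1
#>  $ s: chr "a"
```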

    A matrix is simply a two-dimensional array. An atomic vector is one-dimensional, but it only counts as a 1d array once it is given a dim attribute.

    Vector

    The basic data structure in R is the vector. Vectors come in two flavours: atomic vectors and lists. They have three common properties:

    • Type, typeof(), what it is.
    • Length, length(), how many elements it contains.
    • Attributes, attributes(), additional arbitrary metadata.
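    The three properties listed above can be inspected on any vector. Here is a short sketch using a named double vector; the name v and its contents are hypothetical.

```r
# A double vector whose elements carry names (names are stored as an attribute).
v <- c(a = 1.5, b = 2.5, c = 3.5)

typeof(v)      # "double": what it is
length(v)      # 3: how many elements it contains
attributes(v)  # a list with one entry, names = c("a", "b", "c")
```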

    Atomic vector

    There are four common types of atomic vectors: logical, integer, double (often called numeric), and character. There are two rare types that I will not discuss further: complex and raw. Atomic vectors are usually created with c(), short for combine.
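    As a sketch, here is one vector of each of the four common types, built with c(). The variable names are illustrative; note that the L suffix is what makes a number an integer rather than a double.

```r
lgl <- c(TRUE, FALSE)                  # logical
int <- c(1L, 6L, 10L)                  # integer, thanks to the L suffix
dbl <- c(1, 2.5, 4.5)                  # double (often called numeric)
chr <- c("these are", "some strings")  # character

# Collect the type of each vector in one character vector.
types <- sapply(list(lgl, int, dbl, chr), typeof)
types
#> [1] "logical"   "integer"   "double"    "character"
```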

    Atomic vectors are always flat, even if you nest c()’s:

    c(1, c(2, c(3, 4)))
    #> [1] 1 2 3 4
    # the same as
    c(1, 2, 3, 4)
    #> [1] 1 2 3 4
    

    Given a vector, you can determine its type with typeof(), or check if it's a specific type with an "is" function:

    # Type-specific tests, each taking the vector x as its argument:
    #   is.character(x), is.double(x), is.integer(x), is.logical(x)
    # or, more generally:
    #   is.atomic(x)
    
    # examples
    int_var <- c(1L, 6L, 10L)
    typeof(int_var)
    #> [1] "integer"
    is.integer(int_var)
    #> [1] TRUE
    is.atomic(int_var)
    #> [1] TRUE
    
    dbl_var <- c(1, 2.5, 4.5)
    typeof(dbl_var)
    #> [1] "double"
    is.double(dbl_var)
    #> [1] TRUE
    is.atomic(dbl_var)
    #> [1] TRUE
    

    is.numeric() is a general test that returns TRUE for both integer and double vectors, much like is.integer(x) | is.double(x):

    is.numeric(int_var)
    #> [1] TRUE
    is.numeric(dbl_var)
    #> [1] TRUE
    
    List

    You construct lists by using list() instead of c():

    x <- list(1:3, "a", c(TRUE, FALSE, TRUE), c(2.3, 5.9))
    str(x)
    #> List of 4
    #>  $ : int [1:3] 1 2 3
    #>  $ : chr "a"
    #>  $ : logi [1:3] TRUE FALSE TRUE
    #>  $ : num [1:2] 2.3 5.9
    

    Lists are sometimes called recursive vectors, because a list can contain other lists:

    x <- list(list(list(list())))
    str(x)
    #> List of 1
    #>  $ :List of 1
    #>   ..$ :List of 1
    #>   .. ..$ : list()
    is.recursive(x)
    #> [1] TRUE
    

    c() will combine several lists into one. If given a combination of atomic vectors and lists, c() will coerce the vectors to lists before combining them. Compare the results of list() and c():

    x <- list(list(1, 2), c(3, 4))
    y <- c(list(1, 2), c(3, 4))
    str(x)
    #> List of 2
    #>  $ :List of 2
    #>   ..$ : num 1
    #>   ..$ : num 2
    #>  $ : num [1:2] 3 4
    str(y)
    #> List of 4
    #>  $ : num 1
    #>  $ : num 2
    #>  $ : num 3
    #>  $ : num 4
    

    You can turn a list into an atomic vector with unlist(). If the elements of a list have different types, unlist() uses the same coercion rules as c().
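    A brief sketch of unlist() on made-up lists. When the element types differ, the result takes the most flexible type present, in the order logical, integer, double, character.

```r
# All elements are doubles, so the result is a double vector.
same_type <- list(1, 2, 3)
flat_same <- unlist(same_type)
flat_same
#> [1] 1 2 3

# Mixed types: everything is coerced to character, the most flexible type here.
mixed <- list(1L, 2.5, "a")
flat_mixed <- unlist(mixed)
flat_mixed
#> [1] "1"   "2.5" "a"

# Logical combined with integer becomes integer (TRUE turns into 1L).
flat_lgl_int <- unlist(list(TRUE, 2L))
typeof(flat_lgl_int)
#> [1] "integer"
```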

    Lists are used to build up many of the more complicated data structures in R. For example, both data frames and linear model objects (as produced by lm()) are lists:

    is.list(mtcars)
    #> [1] TRUE
    
    mod <- lm(mpg ~ wt, data = mtcars)
    is.list(mod)
    #> [1] TRUE
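    To look a little further under the hood, this sketch compares typeof() with class() for the same two objects. It relies only on the built-in mtcars dataset; the name fit is chosen here for illustration.

```r
# A data frame is a list with one element per column.
typeof(mtcars)   # "list"
class(mtcars)    # "data.frame"
length(mtcars)   # 11, the same as ncol(mtcars)

# A fitted linear model is a list of named components with class "lm".
fit <- lm(mpg ~ wt, data = mtcars)
typeof(fit)                      # "list"
class(fit)                       # "lm"
"coefficients" %in% names(fit)   # TRUE
```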
    
