R Basics: Data Structures (list & factor)

Author: 小盐罐儿 | Published: 2020-02-06 11:58

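    The basic unit of data in R is the vector. By combining vectors we can build more advanced data structures: the flexible container, list; the vector with levels, factor; the data frame, data.frame; the two-dimensional vector, matrix; and the multidimensional array, array.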

    1. list

    If we store the data in a vector built with c(), every element is coerced to character, because some of the values are character strings.

    # Example: Zhu Yuanzhang
    name = "zhuyuanzhang"
    nickname = c("zhuchongba","zhuguorui","zhubaba")
    gender = "man"
    profession = "emperor"
    birthAndDead = "1328-1398"
    age = 70
    zyz = c(name,nickname,gender,profession,birthAndDead,age)
    zyz
    
    # [1] "zhuyuanzhang" "zhuchongba"   "zhuguorui"   
    # [4] "zhubaba"      "man"          "emperor"     
    # [7] "1328-1398"    "70"          
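
The coercion can be checked directly with class(). The following is a minimal sketch; the variable names `mixed` and `ageBack` are made up for illustration and do not appear in the original example.

```r
# Mixing character and numeric values in c() coerces everything to character.
mixed <- c("emperor", 70)
class(mixed)   # "character"
mixed[2]       # "70", a string rather than a number

# The value has to be converted back before it can be used in arithmetic.
ageBack <- as.numeric(mixed[2])
ageBack * 2    # 140
```

This is why the age printed as "70" in quotes above: once it sits in the same vector as the names, it is no longer numeric.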
    

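    If we store the same variables in a list instead, we can see how flexible a list is as a container. It places each vector, in order, under its own [[index]], and each vector inside the list keeps its own type.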

    zyzList = list(name,nickname,gender,profession,birthAndDead,age)
    zyzList
    # [[1]]
    # [1] "zhuyuanzhang"
    # [[2]]
    # [1] "zhuchongba" "zhuguorui"  "zhubaba"   
    # [[3]]
    # [1] "man"
    # [[4]]
    # [1] "emperor"
    # [[5]]
    # [1] "1328-1398"
    # [[6]]
    # [1] 70
    
    # Check the type of each element
    str(zyzList) 
    # List of 6
    #  $ : chr "zhuyuanzhang"
    #  $ : chr [1:3] "zhuchongba" "zhuguorui" "zhubaba"
    #  $ : chr "man"
    #  $ : chr "emperor"
    #  $ : chr "1328-1398"
    #  $ : num 70
    

    To select data from a list, you can use [[index]], [["name"]], or $name.

    # The list above can only be accessed with [[index]], because its elements were not named
    zyzList[[2]][2]
    # [1] "zhuguorui"
    
    # Once the elements are named, [["name"]] and $name can be used as well
    zyzListPro = list(Name = name,
                      Nickname = nickname,
                      Gender = gender,
                      Profession = profession,
                      BandD = birthAndDead,
                      Age = age)
    zyzListPro[["Nickname"]]
    zyzListPro[["Nickname"]][1]
    zyzListPro$Nickname
    zyzListPro$Nickname[1]
    # [1] "zhuchongba" "zhuguorui"  "zhubaba"   
    # [1] "zhuchongba"
    # [1] "zhuchongba" "zhuguorui"  "zhubaba"   
    # [1] "zhuchongba"
    
    # Extracted values can be combined freely
    sprintf("%s is a %s, maybe his age is %.2f",
            zyzListPro$Name,
            zyzListPro$Profession,
            zyzListPro$Age*2)
    # [1] "zhuyuanzhang is a emperor, maybe his age is 140.00"
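
One related point is the difference between [[ ]] and single brackets [ ]. The sketch below uses a smaller, hypothetical list named `person` (not part of the original example) to show that [[ ]] and $ return the element itself, while [ ] returns a list.

```r
person <- list(Name = "zhuyuanzhang",
               Nickname = c("zhuchongba", "zhuguorui", "zhubaba"),
               Age = 70)

# [[ ]] and $ return the element itself, with its original type.
class(person[["Age"]])            # "numeric"
class(person$Nickname)            # "character"

# [ ] returns a list that still wraps the selected element(s).
class(person["Age"])              # "list"
length(person[c("Name", "Age")])  # 2

# names() shows which names can be used with [[ ]] and $.
names(person)                     # "Name" "Nickname" "Age"
```

In practice, [[ ]] or $ is the usual choice when you want to work with one element's values, and [ ] when you want a smaller list.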
    
    2. factor

    The factor is a data structure that is quite distinctive to R. It is a vector that carries a set of levels, and it is usually just called a factor.

    fourSeasons = c("spring", "summer", "autumn", "winter")
    class(fourSeasons) # character vector
    # [1] "character"
    
    fourSeasonsFactor = factor(fourSeasons)
    fourSeasonsFactor
    class(fourSeasonsFactor) # factor
    # [1] spring summer autumn winter
    # Levels: autumn spring summer winter
    # [1] "factor"
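
To look at the levels separately from the values, base R provides levels() and nlevels(), and as.integer() shows the position of each element within the levels. A minimal sketch, using a variable named `seasons` that is introduced here only for illustration:

```r
seasons <- factor(c("spring", "summer", "autumn", "winter"))

levels(seasons)      # "autumn" "spring" "summer" "winter"
nlevels(seasons)     # 4

# Each element is stored as the index of its level in levels(seasons).
as.integer(seasons)  # 2 3 1 4
```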
    

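    By default the levels are sorted alphabetically (Levels: autumn spring summer winter). The factor() arguments levels = and ordered = TRUE let you choose the order yourself: levels = sets the order, and ordered = TRUE makes the factor an ordered one.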

    fourSeasonsFactor = factor(fourSeasons,ordered = TRUE,levels = c("summer", "winter", "spring", "autumn"))
    fourSeasonsFactor
    # [1] spring summer autumn winter
    # Levels: summer < winter < spring < autumn
    

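    The levels printed in the console are listed from smallest to largest, separated by <, so a factor is a good fit for character vectors whose values have an implied order.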

    temperatures = c("warm", "hot", "cold")
    temperaturesFactor = factor(temperatures, ordered = TRUE, levels = c("cold", "warm", "hot"))
    temperaturesFactor
    # [1] warm hot  cold
    # Levels: cold < warm < hot
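
Because the factor is ordered, its elements can be compared and sorted by level rather than alphabetically. The sketch below rebuilds the same temperature factor under the shorter, illustrative name `temps`.

```r
temps <- factor(c("warm", "hot", "cold"),
                ordered = TRUE,
                levels = c("cold", "warm", "hot"))

temps[1] < temps[2]        # TRUE, because warm < hot
temps > "cold"             # TRUE TRUE FALSE
as.character(sort(temps))  # "cold" "warm" "hot"
as.character(max(temps))   # "hot"
```

With an unordered factor, comparisons such as < are not meaningful, which is the practical reason to set ordered = TRUE here.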
    