R Basics, Part 3: Data Frames

Author: 多啦A梦的时光机_648d | Published: 2020-02-19 00:00

Data frames

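A data frame is a tabular data structure: a rectangular arrangement of data in which rows represent observations and columns represent variables.
Under the hood a data frame is a list whose elements are vectors of equal length, and each vector forms one column. That is why a data frame is rectangular, yet it is not a matrix: a matrix must hold a single data type throughout, whereas in a data frame each column must hold a single type but different columns may hold different types. Every column also has a name (data.frame() generates one if none is supplied).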
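
The list-of-columns structure described above can be checked directly. This is a minimal sketch; the data frame `df` and its column names are invented for illustration and are not part of the original post.

```r
# Hypothetical example data: three columns, each of a different type
df <- data.frame(
  name   = c("a", "b", "c"),
  height = c(1.5, 1.6, 1.7),
  ok     = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE   # keep `name` as character on older R versions
)

is.list(df)     # a data frame is a list of columns
is.matrix(df)   # but it is not a matrix

col_types <- sapply(df, class)   # one class per column
col_types

col_lengths <- lengths(df)       # every column has the same length
col_lengths
```

Each column keeps its own type, which is exactly what a matrix could not do.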

  • 1. Creating a data frame (data.frame())
> state <- data.frame(state.name, state.abb, state.region, state.x77)
> state
                   state.name state.abb  state.region Population Income Illiteracy
Alabama               Alabama        AL         South       3615   3624        2.1
Alaska                 Alaska        AK          West        365   6315        1.5
Arizona               Arizona        AZ          West       2212   4530        1.8
Arkansas             Arkansas        AR         South       2110   3378        1.9
California         California        CA          West      21198   5114        1.1
Colorado             Colorado        CO          West       2541   4884        0.7
Connecticut       Connecticut        CT     Northeast       3100   5348        1.1
Delaware             Delaware        DE         South        579   4809        0.9
Florida               Florida        FL         South       8277   4815        1.3
Georgia               Georgia        GA         South       4931   4091        2.0
Hawaii                 Hawaii        HI          West        868   4963        1.9
Idaho                   Idaho        ID          West        813   4119        0.6
Illinois             Illinois        IL North Central      11197   5107        0.9
Indiana               Indiana        IN North Central       5313   4458        0.7
Iowa                     Iowa        IA North Central       2861   4628        0.5
Kansas                 Kansas        KS North Central       2280   4669        0.6
Kentucky             Kentucky        KY         South       3387   3712        1.6
Louisiana           Louisiana        LA         South       3806   3545        2.8
Maine                   Maine        ME     Northeast       1058   3694        0.7
Maryland             Maryland        MD         South       4122   5299        0.9
Massachusetts   Massachusetts        MA     Northeast       5814   4755        1.1
Michigan             Michigan        MI North Central       9111   4751        0.9
Minnesota           Minnesota        MN North Central       3921   4675        0.6
Mississippi       Mississippi        MS         South       2341   3098        2.4
Missouri             Missouri        MO North Central       4767   4254        0.8
Montana               Montana        MT          West        746   4347        0.6
Nebraska             Nebraska        NE North Central       1544   4508        0.6
Nevada                 Nevada        NV          West        590   5149        0.5
New Hampshire   New Hampshire        NH     Northeast        812   4281        0.7
New Jersey         New Jersey        NJ     Northeast       7333   5237        1.1
New Mexico         New Mexico        NM          West       1144   3601        2.2
New York             New York        NY     Northeast      18076   4903        1.4
North Carolina North Carolina        NC         South       5441   3875        1.8
North Dakota     North Dakota        ND North Central        637   5087        0.8
Ohio                     Ohio        OH North Central      10735   4561        0.8
Oklahoma             Oklahoma        OK         South       2715   3983        1.1
Oregon                 Oregon        OR          West       2284   4660        0.6
Pennsylvania     Pennsylvania        PA     Northeast      11860   4449        1.0
Rhode Island     Rhode Island        RI     Northeast        931   4558        1.3
South Carolina South Carolina        SC         South       2816   3635        2.3
South Dakota     South Dakota        SD North Central        681   4167        0.5
Tennessee           Tennessee        TN         South       4173   3821        1.7
Texas                   Texas        TX         South      12237   4188        2.2
Utah                     Utah        UT          West       1203   4022        0.6
Vermont               Vermont        VT     Northeast        472   3907        0.6
Virginia             Virginia        VA         South       4981   4701        1.4
Washington         Washington        WA          West       3559   4864        0.6
West Virginia   West Virginia        WV         South       1799   3617        1.4
Wisconsin           Wisconsin        WI North Central       4589   4468        0.7
Wyoming               Wyoming        WY          West        376   4566        0.6
               Life.Exp Murder HS.Grad Frost   Area
Alabama           69.05   15.1    41.3    20  50708
Alaska            69.31   11.3    66.7   152 566432
Arizona           70.55    7.8    58.1    15 113417
Arkansas          70.66   10.1    39.9    65  51945
California        71.71   10.3    62.6    20 156361
Colorado          72.06    6.8    63.9   166 103766
Connecticut       72.48    3.1    56.0   139   4862
Delaware          70.06    6.2    54.6   103   1982
Florida           70.66   10.7    52.6    11  54090
Georgia           68.54   13.9    40.6    60  58073
Hawaii            73.60    6.2    61.9     0   6425
Idaho             71.87    5.3    59.5   126  82677
Illinois          70.14   10.3    52.6   127  55748
Indiana           70.88    7.1    52.9   122  36097
Iowa              72.56    2.3    59.0   140  55941
Kansas            72.58    4.5    59.9   114  81787
Kentucky          70.10   10.6    38.5    95  39650
Louisiana         68.76   13.2    42.2    12  44930
Maine             70.39    2.7    54.7   161  30920
Maryland          70.22    8.5    52.3   101   9891
Massachusetts     71.83    3.3    58.5   103   7826
Michigan          70.63   11.1    52.8   125  56817
Minnesota         72.96    2.3    57.6   160  79289
Mississippi       68.09   12.5    41.0    50  47296
Missouri          70.69    9.3    48.8   108  68995
Montana           70.56    5.0    59.2   155 145587
Nebraska          72.60    2.9    59.3   139  76483
Nevada            69.03   11.5    65.2   188 109889
New Hampshire     71.23    3.3    57.6   174   9027
New Jersey        70.93    5.2    52.5   115   7521
New Mexico        70.32    9.7    55.2   120 121412
New York          70.55   10.9    52.7    82  47831
North Carolina    69.21   11.1    38.5    80  48798
North Dakota      72.78    1.4    50.3   186  69273
Ohio              70.82    7.4    53.2   124  40975
Oklahoma          71.42    6.4    51.6    82  68782
Oregon            72.13    4.2    60.0    44  96184
Pennsylvania      70.43    6.1    50.2   126  44966
Rhode Island      71.90    2.4    46.4   127   1049
South Carolina    67.96   11.6    37.8    65  30225
South Dakota      72.08    1.7    53.3   172  75955
Tennessee         70.11   11.0    41.8    70  41328
Texas             70.90   12.2    47.4    35 262134
Utah              72.90    4.5    67.3   137  82096
Vermont           71.64    5.5    57.1   168   9267
Virginia          70.08    9.5    47.8    85  39780
Washington        71.72    4.3    63.5    32  66570
West Virginia     69.48    6.7    41.6   100  24070
Wisconsin         72.48    3.0    54.5   149  54464
Wyoming           70.29    6.9    62.9   173  97203

If you want to bring your own data into R for analysis, store each variable as a vector and then combine the vectors with data.frame().
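
As a rough illustration of that workflow, the sketch below builds a small data frame from three vectors. The variable names and values (`id`, `gender`, `weight`, `patients`) are made up for the example.

```r
# Hypothetical measurements, one vector per variable
id     <- c(1, 2, 3)
gender <- c("F", "M", "F")
weight <- c(52.3, 70.1, 61.8)

# Combine the equal-length vectors; the argument names become column names
patients <- data.frame(id = id, gender = gender, weight = weight)

str(patients)     # compact summary: 3 observations of 3 variables
dim(patients)     # number of rows and columns
names(patients)   # column names
```

The vectors must have compatible lengths; data.frame() stops with an error if, say, one vector has 3 elements and another has 2.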

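  • 2. Accessing a data frame
  1. Access by index
> state[1]  ## column 1 of the state data frame
> state[c(2,4)]  ## columns 2 and 4
> state[-1]  ## a negative index drops that column from the result (state itself is unchanged)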
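
To make the behaviour of single-bracket indexing concrete, here is a small sketch using the same built-in datasets. The names `first_col` and `without_first` are chosen only for this example.

```r
state <- data.frame(state.name, state.abb, state.region, state.x77)

first_col <- state[1]        # still a data frame, just with one column
class(first_col)

without_first <- state[-1]   # a copy that lacks column 1
ncol(state)                  # the original keeps all of its columns
ncol(without_first)          # one column fewer
```

To actually remove the column from `state`, the result would have to be assigned back, e.g. `state <- state[-1]`.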

2. Access by row and column names

> state[,"state.abb"]   ## select a column by name
 [1] AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN MS MO MT
[27] NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA WV WI WY
50 Levels: AK AL AR AZ CA CO CT DE FL GA HI IA ID IL IN KS KY LA MA MD ME ... WY
> state["Washington",]   ## select a row by name
           state.name state.abb state.region Population Income Illiteracy Life.Exp
Washington Washington        WA         West       3559   4864        0.6    71.72
           Murder HS.Grad Frost  Area
Washington    4.3    63.5    32 66570
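
Row and column names can also be combined in one call. The following sketch assumes the same `state` data frame as above; the result names (`wa_income`, `subset_df`, `abb_df`) are illustrative.

```r
state <- data.frame(state.name, state.abb, state.region, state.x77)

# A single cell: row name and column name together
wa_income <- state["Washington", "Income"]
wa_income

# Several rows and several columns by name
subset_df <- state[c("Ohio", "Utah"), c("Population", "Area")]
subset_df

# drop = FALSE keeps a single selected column as a data frame
# instead of simplifying it to a vector
abb_df <- state[, "state.abb", drop = FALSE]
class(abb_df)
```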

3. Access with $

> plot(women$height, women$weight)
Linear regression with lm(): pass the data frame via data= and use the column names directly in the formula
> lm(weight ~ height, data = women)
Call:
lm(formula = weight ~ height, data = women)

Coefficients:
(Intercept)       height  
     -87.52         3.45 
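
For comparison, the sketch below fits the same model both ways, once with `data =` and once with `$`, and pulls out the coefficients with coef(). The object names `fit`, `fit2` and `coefs` are assumptions made for this example.

```r
# Column names are looked up inside `women` because of data = women
fit <- lm(weight ~ height, data = women)
coefs <- coef(fit)
round(coefs, 2)

# The same model written with $; the estimates are the same,
# only the coefficient labels differ
fit2 <- lm(women$weight ~ women$height)
round(coef(fit2), 2)
```

The `data =` form is usually easier to read and produces cleaner labels in the output.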
4. R also provides the attach() and with() functions
  1. attach() adds a data frame to R's search path.
    When you are done, call detach() to remove it again.
> attach(mtcars)
> names(mtcars)
 [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
[11] "carb"
> mpg   ## the column is now reachable without writing mtcars$mpg
 [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
> detach(mtcars)
> mpg
Error: object 'mpg' not found
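
One thing worth knowing about attach() is that an existing variable with the same name takes precedence over the attached column. The sketch below shows the search path changing and a name clash; the global variable `mpg` is a deliberately contrived example.

```r
attach(mtcars)
on_path <- "mtcars" %in% search()    # mtcars is on the search path
detach(mtcars)
off_path <- "mtcars" %in% search()   # and gone again after detach()

mpg <- "a global variable"   # hypothetical name clash
attach(mtcars)               # at top level R reports that mpg is masked
seen <- mpg                  # the global value wins, not mtcars$mpg
detach(mtcars)
rm(mpg)

on_path
off_path
seen
```

This masking is a common reason to prefer with() or explicit `mtcars$mpg` over attach().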

  2. with(data frame, {expression})

> with(mtcars,{mpg})
 [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
> with(mtcars, sum(mpg))
[1] 642.9
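
The braces become useful when several statements are needed. In this sketch, `ratio` and `res` are names invented for the example; variables created inside with() stay local to it, and the value of the last expression is returned.

```r
res <- with(mtcars, {
  ratio <- mpg / wt   # mpg and wt are found inside mtcars
  mean(ratio)         # the last value is what with() returns
})
res

# ratio was created only inside with(), not in the calling environment
exists("ratio", inherits = FALSE)
```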
5. Double-bracket access (returns a vector rather than a list)
> mtcars[['mpg']]
 [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
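The difference between single and double brackets: single brackets return a data frame (a list) containing the selected column(s), whereas double brackets extract the column itself as a vector.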
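
A short sketch of that difference, using mtcars; the names `single` and `double` are illustrative only.

```r
single <- mtcars["mpg"]     # a data frame with one column
double <- mtcars[["mpg"]]   # the column itself, a numeric vector

class(single)
class(double)

# $ and [[ ]] extract the same vector
identical(double, mtcars$mpg)

# Functions that expect a vector work on the double-bracket result
mean(double)
```

Passing `single` to mean() would not give the column mean, because a data frame is not a numeric vector.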