A Summary of Solutions to a Few R Problems

Author: 浪尖儿 | Published: 2016-08-17 17:34

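As an R beginner, I ran into many problems while using the language. This post summarizes solutions to a few common ones, in the hope that it helps others who need them.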

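When you first learn R you run into many problems, all of them small details. The hardest part for me is that I never systematically learned Matlab, so I cannot quickly adjust to how this family of languages works, especially the way of thinking behind it. I can only learn by doing and build up experience gradually, so here I record some of the problems I have encountered. Each problem is independent of the others.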

sprintf is a wrapper for the C function sprintf and can be used to format strings

> sprintf("%04d", 1)
[1] "0001"
> sprintf("%04d", 104)
[1] "0104"
> sprintf("%010d", 104)
[1] "0000000104"
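
Because sprintf is vectorized, a single call can format a whole vector. The following is a minimal sketch of a few other common format specifiers; the values and the file-name pattern are made up for illustration.

```r
# Zero-pad a whole vector of integers in one call
ids <- sprintf("%04d", c(1L, 104L, 2016L))

# Fixed number of decimal places
rounded <- sprintf("%.2f", 3.14159)

# Mix text and numbers in one template (hypothetical file-name pattern)
fname <- sprintf("%s_%03d.csv", "sample", 7L)

print(ids)
print(rounded)
print(fname)
```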

Installing data.table

data.table is a handy package. For installation steps, see: https://class.coursera.org/getdata-008/forum/thread?thread_id=58

Here is how I installed the data.table package:

Used my browser to download data.table_1.9.4.zip from page http://cran.r-project.org/web/packages/data.table/index.html

Put the downloaded file in my R working directory.

> install.packages("data.table_1.9.4.zip", repos=NULL)
> install.packages("plyr")
> install.packages("Rcpp")
> install.packages("reshape2")
> install.packages("chron")

Once done with that, I could do:
> library(data.table)
and everything else worked.

Using order

To sort a vector or a data.frame you can use the order function. Take care with how order and rank are used and how their results are interpreted:

You can use the order() function directly without resorting to add-on tools -- see this simpler answer which uses a trick right from the top of the example(order) code:
R> dd[with(dd, order(-z, b)), ]
    b x y z
4 Low C 9 2
2 Med D 3 1
1  Hi A 8 1
3  Hi A 9 1
Edit some 2+ years later: It was just asked how to do this by column index. The answer is to simply pass the desired sorting column(s) to the order() function:
R> dd[ order(-dd[,4], dd[,1]), ]
    b x y z
4 Low C 9 2
2 Med D 3 1
1  Hi A 8 1
3  Hi A 9 1
R> 
rather than using the name of the column (and with() for easier/more direct access).
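
The data frame dd is never defined in this post. The sketch below builds one that is consistent with the printed output (the exact definition is an assumption, including the ordered factor levels Low < Med < Hi) and checks that sorting by column name and by column index give the same result.

```r
# Assumed definition of dd, chosen to match the output shown above
dd <- data.frame(
  b = factor(c("Hi", "Med", "Hi", "Low"),
             levels = c("Low", "Med", "Hi"), ordered = TRUE),
  x = c("A", "D", "A", "C"),
  y = c(8, 3, 9, 9),
  z = c(1, 1, 1, 2)
)

# Sort by z descending, then by b ascending, using column names
by_name <- dd[with(dd, order(-z, b)), ]

# The same sort using column positions
by_index <- dd[order(-dd[, 4], dd[, 1]), ]

print(by_name)
print(identical(by_name, by_index))
```

The minus sign only works for numeric columns; for other column types, order(..., decreasing = TRUE) is one alternative.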

Interpreting the result of order:

The definition of order is that a[order(a)] is in increasing order. This works with your example, where the correct order is the fourth, second, first, then third element.
You may have been looking for rank, which returns the rank of the elements
R> a <- c(4.1, 3.2, 6.1, 3.1)
R> order(a)
[1] 4 2 1 3
R> rank(a)
[1] 3 2 4 1
so rank tells you what order the numbers are in, order tells you how to get them in ascending order.
plot(a, rank(a)/length(a)) will give a graph of the CDF. To see why order is useful, though, try plot(a, rank(a)/length(a),type="S") which gives a mess, because the data are not in increasing order
If you did
oo<-order(a)
plot(a[oo],rank(a[oo])/length(a),type="S")
or simply
oo<-order(a)
plot(a[oo],(1:length(a))/length(a),type="S")
you get a line graph of the CDF.
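
The relationship between the two functions can be checked directly. This is a small sketch using the same vector a; the identity rank(a) == order(order(a)) holds here because the values contain no ties.

```r
a <- c(4.1, 3.2, 6.1, 3.1)

# Indexing with order() sorts the vector, just like sort()
sorted_via_order <- a[order(a)]

# Applying order() twice recovers the ranks (when there are no ties)
ranks_via_order <- order(order(a))

print(sorted_via_order)
print(ranks_via_order)
print(rank(a))
```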

Checking whether a vector contains a given element

v <- c('a','b','c','e')
'b' %in% v
## returns TRUE

match('b',v)
## returns the first location of 'b', in this case: 2

> x <- sample(1:10)
> x
 [1]  4  5  9  3  8  1  6 10  7  2
> match(c(4,8),x)
 [1] 1 5
match only returns the first encounter of a match, as you requested.
For multiple matching, %in% is the way to go :
> x <- sample(1:4,10,replace=T)
> x
 [1] 3 4 3 3 2 3 1 1 2 2
> which(x %in% c(2,4))
[1]  2  5  9 10
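
To make the differences reproducible, here is a short sketch on a fixed vector instead of a random sample (the vector reuses the sampled values printed above; the variable names are illustrative).

```r
x <- c(3, 4, 3, 3, 2, 3, 1, 1, 2, 2)

# Membership test: a single TRUE/FALSE
has_four <- 4 %in% x

# Position of the first occurrence only
first_two <- match(2, x)

# match() returns NA when the value is absent
not_found <- match(7, x)

# All positions of a value
all_threes <- which(x == 3)

print(has_four)
print(first_two)
print(not_found)
print(all_threes)
```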

Appending elements to a vector

Here are several ways to do it. All of them are discouraged. Appending to an object in a for loop causes the entire object to be copied on every iteration, which causes a lot of people to say "R is slow", or "R loops should be avoided".

# one way
for (i in 1:length(values))
  vector[i] <- values[i]

# another way
for (i in 1:length(values))
  vector <- c(vector, values[i])

# yet another way?!?
for (v in values)
  vector <- c(vector, v)
# ... more ways

help("append") would have answered your question and saved the time it took you to write this question (but would have caused you to develop bad habits). ;-)

Note that vector <- c() isn't an empty vector; it's NULL. If you want an empty character vector, use vector <- character().

Also note, as BrodieG pointed out in the comments: if you absolutely must use a for loop, then at least pre-allocate the entire vector before the loop. This will be much faster than appending for larger vectors.

set.seed(21)
values <- sample(letters, 1e4, TRUE)
vector <- character(0)
# slow
system.time( for (i in 1:length(values)) vector[i] <- values[i] )
#  user  system elapsed
#  0.340  0.000  0.343
vector <- character(length(values))
# fast(er)
system.time( for (i in 1:length(values)) vector[i] <- values[i] )
#  user  system elapsed
#  0.024  0.000  0.023

Note that appending inside a loop has a real performance cost, as the timings above show.
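
In many cases the loop can be avoided altogether. The following is a minimal sketch of some alternatives, with made-up values: appending a whole vector at once, inserting with append(), and pre-allocating when a loop really is needed.

```r
values <- c("a", "b", "c")

# Append a whole vector in one step instead of element by element
v <- character()
v <- c(v, values)

# append() can also insert at a given position
v <- append(v, "z", after = 1)

# If a loop is unavoidable, pre-allocate and iterate with seq_along(),
# which is also safe for empty input (1:length(x) would yield 1 0)
squares <- numeric(5)
for (i in seq_along(squares)) squares[i] <- i^2

# The same result without a loop
squares_vec <- (1:5)^2

print(v)
print(squares)
```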

Removing a column from a data.frame

> head(Data)
   chr       genome region
1 chr1 hg19_refGene    CDS
2 chr1 hg19_refGene   exon
3 chr1 hg19_refGene    CDS
4 chr1 hg19_refGene   exon
5 chr1 hg19_refGene    CDS
6 chr1 hg19_refGene   exon



You can set it to NULL.
> Data$genome <- NULL
> head(Data)
   chr region
1 chr1    CDS
2 chr1   exon
3 chr1    CDS
4 chr1   exon
5 chr1    CDS
6 chr1   exon

As pointed out in the comments, here are some other possibilities:
Data[2] <- NULL    # Wojciech Sobala
Data[[2]] <- NULL  # same as above
Data <- Data[,-2]  # Ian Fellows
Data <- Data[-2]   # same as above

You can remove multiple columns via:    
Data[1:2] <- list(NULL)  # Marek
Data[1:2] <- NULL        # does not work!

Be careful with matrix-subsetting though, as you can end up with a vector:    
Data <- Data[,-(2:3)]             # vector
Data <- Data[,-(2:3),drop=FALSE]  # still a data.frame
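
Dropping columns by name is often less error-prone than by position. The sketch below rebuilds a small data frame with the same columns as above (the rows are reconstructed from the printed head, so treat them as an assumption) and shows the effect of drop = FALSE.

```r
Data <- data.frame(
  chr    = rep("chr1", 6),
  genome = rep("hg19_refGene", 6),
  region = rep(c("CDS", "exon"), 3),
  stringsAsFactors = FALSE
)

# Keep every column except the named one
by_name <- Data[, setdiff(names(Data), "genome"), drop = FALSE]

# Selecting a single column: data.frame with drop = FALSE, vector without it
one_col <- Data[, "chr", drop = FALSE]
as_vec  <- Data[, "chr"]

print(names(by_name))
print(class(one_col))
print(class(as_vec))
```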

Removing parentheses from a string

string <- "log(M)"
gsub("log", "", string) # Works just fine
gsub("log(", "", string) #breaks
# Error in gsub("log(", "", string) : 
#   invalid regular expression 'log(', reason 'Missing ')''

Escape the parenthesis with a double backslash:
gsub("log\\(", "", string)
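
Escaping is not the only option. As a sketch of some alternatives on the same input: fixed = TRUE treats the pattern as a literal string, a character class matches either parenthesis, and a capture group extracts what is inside them.

```r
string <- "log(M)"

# Treat the pattern literally, so no escaping is needed
no_prefix <- gsub("log(", "", string, fixed = TRUE)

# Remove both parentheses with a character class
no_parens <- gsub("[()]", "", string)

# Keep only the text inside log(...)
inner <- sub("^log\\((.*)\\)$", "\\1", string)

print(no_prefix)
print(no_parens)
print(inner)
```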

Link to this post: https://www.haomeiwen.com/subject/cgwmsttx.html