> sprintf("%04d", 1)
[1] "0001"
> sprintf("%04d", 104)
[1] "0104"
> sprintf("%010d", 104)
[1] "0000000104"
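Since sprintf() is vectorized, the same format string can pad a whole vector at once. A minimal sketch; the ids vector is made up for illustration, and formatC() is shown only as a base-R alternative:

```r
# Hypothetical IDs to pad; not part of the original example
ids <- c(1L, 104L, 2048L)

# sprintf() recycles the format string over the whole vector
padded <- sprintf("%04d", ids)

# formatC() produces the same strings with an explicit width and the "0" flag
padded_alt <- formatC(ids, width = 4, flag = "0")

print(padded)
```

Values that are already four digits or longer are left as they are; the width is a minimum, not a truncation.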
Here is how I installed the data.table package:
Used my browser to download data.table_1.9.4.zip from page http://cran.r-project.org/web/packages/data.table/index.html
Put the downloaded file in my R working directory.
> install.packages("data.table_1.9.4.zip", repos=NULL)
> install.packages("plyr")
> install.packages("Rcpp")
> install.packages("reshape2")
> install.packages("chron")
Once done with that, I could do:
> library(data.table)
and everything else worked.
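Before calling library(data.table), it may help to check which of the packages are actually present. A rough sketch using requireNamespace(); the pkgs vector simply lists the package names from the steps above (with reshape2 as the CRAN spelling), and missing_pkgs is an illustrative name:

```r
# Packages named in the installation steps
pkgs <- c("data.table", "plyr", "Rcpp", "reshape2", "chron")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are available; library(data.table) should load.")
}
```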
You can use the order() function directly without resorting to add-on tools -- see this simpler answer which uses a trick right from the top of the example(order) code:
R> dd[with(dd, order(-z, b)), ]
    b x y z
4 Low C 9 2
2 Med D 3 1
1  Hi A 8 1
3  Hi A 9 1
Edit some 2+ years later: It was just asked how to do this by column index. The answer is to simply pass the desired sorting column(s) to the order() function:
R> dd[ order(-dd[,4], dd[,1]), ]
    b x y z
4 Low C 9 2
2 Med D 3 1
1  Hi A 8 1
3  Hi A 9 1
rather than using the name of the column (and with() for easier/more direct access).
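For a self-contained run of both variants, here is a sketch. The data frame is an assumed reconstruction of dd inferred from the printed output (in particular, b is assumed to be an ordered factor with levels Low < Med < Hi):

```r
# Assumed reconstruction of dd, inferred from the printed output
dd <- data.frame(
  b = factor(c("Hi", "Med", "Hi", "Low"),
             levels = c("Low", "Med", "Hi"), ordered = TRUE),
  x = c("A", "D", "A", "C"),
  y = c(8, 3, 9, 9),
  z = c(1, 1, 1, 2)
)

# Sort by z descending, then b ascending, using column names
by_name <- dd[order(-dd$z, dd$b), ]

# Same sort using column positions (z is column 4, b is column 1)
by_index <- dd[order(-dd[, 4], dd[, 1]), ]

print(by_name)
```

The unary minus only works on numeric columns. For a factor or character column, wrapping it as -xtfrm(col) is one way to get a descending sort.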
The definition of order is that a[order(a)] is in increasing order. This works with your example, where the correct order is the fourth, second, first, then third element.
You may have been looking for rank, which returns the rank of each element:
R> a <- c(4.1, 3.2, 6.1, 3.1)
R> order(a)
[1] 4 2 1 3
R> rank(a)
[1] 3 2 4 1
So rank tells you what position each number holds in the sorted sequence, while order tells you which indices to take to get the numbers in ascending order.
plot(a, rank(a)/length(a)) will give a graph of the CDF. To see why order is useful, though, try plot(a, rank(a)/length(a), type="S"), which gives a mess because the data are not in increasing order.
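If you did
R> oo <- order(a)
R> plot(a[oo], rank(a[oo])/length(a), type="S")
or simply
R> plot(a[oo], (1:length(a))/length(a), type="S")
you get a line graph of the CDF.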
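The relationship between the two functions can be checked directly. A small sketch on the same vector a; the names o, r, inverse and Fn are only illustrative, and ecdf() is base R's own empirical CDF constructor rather than anything used in the answer above:

```r
a <- c(4.1, 3.2, 6.1, 3.1)

o <- order(a)   # indices that sort a
r <- rank(a)    # position of each element within the sorted vector

# Indexing by order() sorts the vector
sorted <- a[o]

# With no ties, rank() is the inverse permutation of order()
inverse <- order(o)

# ecdf() from the stats package builds the empirical CDF as a function
Fn <- ecdf(a)
print(Fn(3.2))  # fraction of values <= 3.2
```

Calling plot(Fn) draws the step function without any manual sorting.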
v <- c('a','b','c','e')
'b' %in% v
## returns TRUE
match('b', v)
## returns the first location of 'b', in this case: 2
> x <- sample(1:10)
> x
[1] 4 5 9 3 8 1 6 10 7 2
> match(c(4,8),x)
[1] 1 5
match only returns the first encounter of a match, as you requested.
For multiple matching, %in% is the way to go :
> x <- sample(1:4,10,replace=T)
> x
[1] 3 4 3 3 2 3 1 1 2 2
> which(x %in% c(2,4))
[1] 2 5 9 10
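To compare the three idioms side by side, here is a sketch with x fixed to the values printed above (rather than sampled) so the results are reproducible. The variable names are illustrative:

```r
v <- c("a", "b", "c", "e")
x <- c(3, 4, 3, 3, 2, 3, 1, 1, 2, 2)  # the sampled vector from above, fixed

is_member  <- "b" %in% v             # membership test only: TRUE or FALSE
first_pos  <- match("b", v)          # position of the first match
no_match   <- match("z", v)          # NA when the value is absent
first_each <- match(c(2, 4), x)      # first position of 2, then of 4
all_pos    <- which(x %in% c(2, 4))  # every position holding a 2 or a 4

print(first_each)
print(all_pos)
```

Note that match() returns positions in the order of the values searched for, while which() returns positions in the order they occur in x.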
Here are several ways to do it. All of them are discouraged. Appending to an object in a for loop causes the entire object to be copied on every iteration, which causes a lot of people to say "R is slow", or "R loops should be avoided".
# one way
for (i in 1:length(values))
  vector[i] <- values[i]
# another way
for (i in 1:length(values))
  vector <- c(vector, values[i])
# yet another way?!?
for (v in values)
  vector <- c(vector, v)
# ... more ways
help("append") would have answered your question and saved the time it took you to write this question (but would have caused you to develop bad habits). ;-)
Note that vector <- c() isn't an empty vector; it's NULL. If you want an empty character vector, use vector <- character().
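The difference is easy to verify at the console. A quick sketch; the variable names are illustrative:

```r
empty_null <- c()          # NULL, not a typed vector
empty_chr  <- character()  # zero-length character vector

print(is.null(empty_null))      # TRUE
print(is.character(empty_chr))  # TRUE
print(length(empty_chr))        # 0

# Appending to NULL still works, because c() drops NULL arguments
grown <- c(empty_null, "a")
```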
Also note, as BrodieG pointed out in the comments: if you absolutely must use a for loop, then at least pre-allocate the entire vector before the loop. This will be much faster than appending for larger vectors.
values <- sample(letters, 1e4, TRUE)
vector <- character(0)  # slow
system.time( for (i in 1:length(values)) vector[i] <- values[i] )
# user system elapsed
# 0.340 0.000 0.343
vector <- character(length(values))  # fast(er)
system.time( for (i in 1:length(values)) vector[i] <- values[i] )
# user system elapsed
# 0.024 0.000 0.023
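Where the per-element work is itself vectorized, the loop can often be dropped altogether. A sketch under that assumption, using toupper() as a stand-in for whatever is done to each element; out_loop, out_vec and combined are illustrative names:

```r
values <- sample(letters, 1e4, TRUE)

# Pre-allocated loop. seq_along() is safe even when values is empty,
# whereas 1:length(values) would then iterate over c(1, 0).
out_loop <- character(length(values))
for (i in seq_along(values)) {
  out_loop[i] <- toupper(values[i])
}

# Vectorized equivalent: no loop and no appending
out_vec <- toupper(values)

# Appending a whole vector in one call instead of element by element
combined <- c(out_vec, c("X", "Y"))
```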
> head(Data)
   chr       genome region
1 chr1 hg19_refGene    CDS
2 chr1 hg19_refGene   exon
3 chr1 hg19_refGene    CDS
4 chr1 hg19_refGene   exon
5 chr1 hg19_refGene    CDS
6 chr1 hg19_refGene   exon
You can set it to NULL.
> Data$genome <- NULL
> head(Data)
   chr region
1 chr1    CDS
2 chr1   exon
3 chr1    CDS
4 chr1   exon
5 chr1    CDS
6 chr1   exon
As pointed out in the comments, here are some other possibilities:
Data[2] <- NULL # Wojciech Sobala
Data[[2]] <- NULL # same as above
Data <- Data[,-2] # Ian Fellows
Data <- Data[-2] # same as above
You can remove multiple columns via:
Data[1:2] <- list(NULL) # Marek
Data[1:2] <- NULL # does not work!
Be careful with matrix-subsetting though, as you can end up with a vector:
Data <- Data[,-(2:3)] # vector
Data <- Data[,-(2:3),drop=FALSE] # still a data.frame
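Dropping by name is often more robust than dropping by position, since it does not depend on column order. A sketch; the data frame is an assumed stand-in built to match the head() output shown earlier, and the result names are illustrative:

```r
# Assumed stand-in for Data, matching the printed head()
Data <- data.frame(
  chr    = rep("chr1", 6),
  genome = rep("hg19_refGene", 6),
  region = rep(c("CDS", "exon"), 3),
  stringsAsFactors = FALSE
)

# Drop by name: keep every column except those listed
no_genome <- Data[setdiff(names(Data), "genome")]

# Matrix-style subsetting down to one column collapses to a vector ...
as_vector <- Data[, -(2:3)]

# ... unless drop = FALSE is given
as_frame <- Data[, -(2:3), drop = FALSE]

print(names(no_genome))
```

List-style subsetting with a single index, as in Data[-2], never collapses to a vector, so drop = FALSE only matters for the two-index form.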
string <- "log(M)"
gsub("log", "", string) # Works just fine
gsub("log(", "", string) # breaks
# Error in gsub("log(", "", string) :
# invalid regular expression 'log(', reason 'Missing ')''
In a regular expression, ( opens a group, so a literal parenthesis has to be escaped with a double backslash:
gsub("log\\(", "", string)
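If no regular-expression features are needed, fixed = TRUE is an alternative that avoids escaping entirely. A short sketch comparing the options; the result names are illustrative:

```r
string <- "log(M)"

# Escaped parenthesis inside a regular expression
escaped <- gsub("log\\(", "", string)

# fixed = TRUE treats the pattern as a literal string, so no escaping is needed
literal <- gsub("log(", "", string, fixed = TRUE)

# A capture group strips the whole log( ... ) wrapper and keeps the inside
inner <- gsub("log\\((.*)\\)", "\\1", string)

print(c(escaped, literal, inner))
```

The first two calls leave the closing parenthesis in place; only the capture-group version removes both.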