Problems reading data in R when the data contains single quotes and # characters

Author: 浩瀚之宇 | Published: 2019-04-20 17:22

How I first ran into the problem and fixed it

Single quotes

> mitoCarta2 = read.table("/media/xxx/Human.MitoCarta2.0_subset.txt", comment.char = "#", sep="\t", header=T)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  line 43 did not have 10 elements

Tnon_mito 55 ACPP 5'-NT|ACP-3|ACP3|PAP|P15309 acid phosphatase, prostate ENSG00000014257 chr3 132036210 132087146 NA
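Line 43 (shown above) contains a single quote in 5'-NT. By default read.table treats both " and ' as quote characters, so the unmatched apostrophe swallows the fields that follow it and the line no longer has 10 elements. Removing the single quote from the data fixes the error.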
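Editing the data file is not the only option. A minimal sketch, assuming a small made-up tab-separated sample (the column names and rows below are hypothetical, not the real MitoCarta file): passing quote = "" to read.table turns off quote handling, so the apostrophe is read as ordinary text.

```r
# Hypothetical tab-separated sample with an apostrophe in one field
lines <- c(
  "gene\talias\tchr",
  "ACPP\t5'-NT|ACP-3\tchr3",
  "GRPEL2\tMt-GrpE\tchr5"
)
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# quote = "" disables quote processing, so 5'-NT stays a plain string
df <- read.table(tmp, sep = "\t", header = TRUE, quote = "",
                 stringsAsFactors = FALSE)
print(df)
```

This keeps the source file untouched, which matters when the file is shared or re-downloaded later.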

The # character

> mitoCarta2 = read.table("/media/xxx/Human.MitoCarta2.0_subset.txt", sep="\t", header=T)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  line 16193 did not have 10 elements

Tmito 134266 GRPEL2 Mt-GrpE#2|Q8TAA5 GrpE-like 2, mitochondrial (E. coli) ENSG00000164284 chr5 148724976 148734146 Mitochondria (APE, Supportive)
Tnon_mito 195814 SDR16C5 RDH#2|RDH-E2|RDHE2|Q8N3Y7 short chain dehydrogenase/reductase family 16C, member 5 ENSG00000170786 chr8 57212569 57233241 NA
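Both lines above contain a # inside the alias field (Mt-GrpE#2 and RDH#2). read.table uses comment.char = "#" by default, so everything after the # is discarded as a comment and the line comes up short. A minimal sketch of one way around this, again on a hypothetical made-up sample rather than the real file: set comment.char = "" so that # is kept as data.

```r
# Hypothetical tab-separated sample with '#' inside a field
lines <- c(
  "gene\talias\tchr",
  "GRPEL2\tMt-GrpE#2|Q8TAA5\tchr5",
  "SDR16C5\tRDH#2|RDH-E2\tchr8"
)
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# comment.char = "" turns off comment handling; quote = "" also guards
# against stray apostrophes elsewhere in the file
df2 <- read.table(tmp, sep = "\t", header = TRUE, quote = "",
                  comment.char = "", stringsAsFactors = FALSE)
print(df2)
```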

The quote problem, a newly found fix: reading the text file with read.csv instead of read.table no longer raises the error

> hsa_gene_info = read.table("/media/xxx/hsa_gene_info.txt", sep="\t",header=F)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  line 47 did not have 8 elements

Inspecting row 47:

> hsa_gene_info[47,]
   V1   V2              V3               V4 V5     V6
47 55 ACPP ENSG00000014257 5'-NT|ACP-3|ACP3  3 3q22.1
                           V7             V8
47 acid phosphatase, prostate protein-coding
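When the error message names only one line, it can help to scan the whole file for other rows that may trip up read.table. A rough sketch, assuming a hypothetical four-column sample (the rows are made up): count.fields with quote and comment handling switched off reports the true number of fields per line, and a simple grep flags lines containing ' or #.

```r
# Hypothetical four-column, tab-separated sample
lines <- c(
  "55\tACPP\t5'-NT|ACP-3|ACP3\tprotein-coding",
  "134266\tGRPEL2\tMt-GrpE#2\tprotein-coding",
  "1\tA1BG\tA1B|ABG\tprotein-coding"
)
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Field counts with quotes and comments disabled: the "real" layout
n_fields <- count.fields(tmp, sep = "\t", quote = "", comment.char = "")

# Rows whose field count differs from the expected 4 (structurally broken)
bad_rows <- which(n_fields != 4)

# Rows containing characters that read.table treats specially by default
suspect <- grep("['#]", readLines(tmp))
print(suspect)
```

If bad_rows is empty but suspect is not, the file itself is well formed and only the quote and comment settings of the reader need changing.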

Reading the same file with read.csv raises no error:

> hsa_gene_info = read.csv("/media/xxx/hsa_gene_info.txt", sep="\t",header=F)
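The likely reason this works is that read.csv is a wrapper around read.table with different defaults: quote = "\"" (only the double quote is a quote character), comment.char = "" and fill = TRUE. A minimal sketch on a hypothetical made-up sample, showing that read.table with those arguments spelled out gives the same result:

```r
# Hypothetical headerless, tab-separated sample containing both ' and #
lines <- c(
  "55\tACPP\t5'-NT|ACP-3|ACP3\tprotein-coding",
  "134266\tGRPEL2\tMt-GrpE#2\tprotein-coding"
)
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# read.csv with a tab separator, as in the call above
via_csv <- read.csv(tmp, sep = "\t", header = FALSE,
                    stringsAsFactors = FALSE)

# read.table with read.csv's defaults written out explicitly
via_table <- read.table(tmp, sep = "\t", header = FALSE, quote = "\"",
                        comment.char = "", fill = TRUE,
                        stringsAsFactors = FALSE)

print(identical(via_csv, via_table))
```

One caveat: because fill = TRUE pads short rows with blanks or NA instead of stopping, read.csv can hide rows that really are malformed, so it is worth checking the row and column counts after loading.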

Article link: https://www.haomeiwen.com/subject/pnylgqtx.html