INFO fields
Name | Brief description |
---|---|
AC | allele count in genotypes, for each ALT allele, in the same order as listed |
AF | allele frequency for each ALT allele in the same order as listed (use this when estimated from primary data, not called genotypes) |
AN | total number of alleles in called genotypes |
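To make the relationship between these fields concrete, here is a minimal sketch that derives AC and AN for a single site from a list of GT strings. The function name `allele_counts` and the sample genotypes are made up for illustration. The ratio AC/AN is a genotype-based frequency, so it is not necessarily the value a caller writes to AF, which the table reserves for estimates from primary data.

```python
import re

def allele_counts(genotypes, n_alt):
    """Return (AC, AN, AC/AN) for one site from a list of GT strings.

    genotypes: GT strings such as "0/1" or "1|2" (hypothetical input)
    n_alt:     number of ALT alleles listed at the site
    """
    ac = [0] * n_alt   # one count per ALT allele, in ALT order
    an = 0             # number of called (non-missing) alleles
    for gt in genotypes:
        for allele in re.split(r"[/|]", gt):
            if allele == ".":
                continue          # missing alleles count toward neither AC nor AN
            an += 1
            idx = int(allele)
            if idx > 0:           # 0 is REF, 1 is the first ALT, and so on
                ac[idx - 1] += 1
    freqs = [c / an for c in ac] if an else [None] * n_alt
    return ac, an, freqs

ac, an, freqs = allele_counts(["0/1", "1|2", "0/0", "./."], n_alt=2)
print(ac, an, freqs)
```

The missing genotype `./.` is skipped entirely, which is why AN counts only alleles in called genotypes.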
Genotype fields
GT : genotype, encoded as allele values separated by either / (unphased) or | (phased). The allele values are 0 for the reference allele (what is in the REF field), 1 for the first allele listed in ALT, 2 for the second allele listed in ALT, and so on.
For diploid calls, examples could be 0/1, 1|0, or 1/2, etc.
- 0/0 : the sample is homozygous reference
- 0/1 : the sample is heterozygous, carrying 1 copy of each of the REF and ALT alleles
- 1/1 : the sample is homozygous alternate
For haploid calls, e.g. on Y, male nonpseudoautosomal X, or mitochondrion, only one allele value should be given; a triploid call might look like 0/0/1.
If a call cannot be made for a sample at a given locus, ‘.’ should be specified for each missing allele in the GT field (for example ‘./.’ for a diploid genotype and ‘.’ for a haploid genotype).
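The GT rules above can be sketched as a small parser. This is only an illustration: the function name `describe_gt` and the labels it returns (`hom_ref`, `het`, `hom_alt`, and so on) are assumptions of this sketch, not terms defined by the VCF format.

```python
import re

def describe_gt(gt):
    """Summarise a GT string: ploidy, phasing, and a simple call type."""
    alleles = re.split(r"[/|]", gt)
    ploidy = len(alleles)
    phased = "|" in gt
    if any(a == "." for a in alleles):
        kind = "missing"                      # at least one allele not called
    elif ploidy == 1:
        kind = "hap_ref" if alleles[0] == "0" else "hap_alt"
    elif all(a == "0" for a in alleles):
        kind = "hom_ref"                      # e.g. 0/0
    elif len(set(alleles)) == 1:
        kind = "hom_alt"                      # e.g. 1/1
    else:
        kind = "het"                          # e.g. 0/1, 1/2, 0/0/1
    return {"ploidy": ploidy, "phased": phased, "kind": kind}

for gt in ["0/0", "0/1", "1|0", "1/2", "1/1", "1", "0/0/1", "./.", "."]:
    print(gt, describe_gt(gt))
```

In this sketch a partially missing genotype such as `./1` is also reported as `missing`; a stricter tool might want to treat it separately.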
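Subset VCF
# keep only the samples listed in sample.txt (one sample name per line)
# -Oz writes bgzip-compressed output, so name the file .vcf.gz
bcftools view -S sample.txt input.vcf -Oz -o sample.vcf.gz
# remove the samples listed in sample.txt (the ^ prefix excludes them)
bcftools view -S ^sample.txt input.vcf -Oz -o sample.vcf.gz
# or, with vcftools, name each sample to remove (writes uncompressed VCF)
vcftools --vcf input.vcf --recode --recode-INFO-all --stdout --remove-indv sample1 --remove-indv sample2 --remove-indv sample3 > sample.vcf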