Week 4-Experimental Technologies

Author: 英天 | Published: 2017-08-10 10:46

    Lecture 7 Gathering and Analyzing Large Data Sets

    7.A Experimental Technologies

    1. Genomics

    Started with the development of microarrays
    Extract mRNA → convert to cDNA with reverse transcriptase → couple to dyes → hybridize → visualize → computationally analyze to separate signal from noise
    1) Sequencing the whole genome
    Deep Sequencing - Repeated sequencing of a DNA fragment (a region of interest in a chromosome), giving a substantial increase in sensitivity and accuracy
    SNPs -- Single nucleotide polymorphisms - single base pair variations in the genome that occur at relatively high frequency
    CNVs -- Copy number variations - alterations in DNA structure such that a region of the chromosome is abnormally duplicated or deleted
    Exome Sequencing -- Sequencing the expressed genome: separate the part of the whole genome that codes for proteins (the exons) and then sequence it

    ChIP-Seq - Sequencing transcription-factor-bound DNA
    ChIP - chromatin immunoprecipitation, using an antibody against a transcription factor of interest
    2) RNA-Seq - Sequencing the expressed mRNA
    Extract and fragment RNA → convert fragments to cDNA → sequence the cDNA fragments and map them onto a reference genome
    3) DNA Methylation: Addition of methyl groups to C in DNA in mammals
    Typically the 5 position of C in CpG dinucleotides is methylated, leading to inhibition of gene expression. This is a subject of epigenetics research.

    Detection by genome-wide bisulfite sequencing
    Bisulfite converts C → U but not Me-C
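
    The conversion rule above can be illustrated with a small simulation. This is only a sketch: the reference sequence and the methylated positions are made up, and it assumes the usual readout in which U is read as T after amplification, so any reference C that still reads as C is taken to be methylated.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment plus amplification on one strand.

    Unmethylated C is converted (C -> U, read as T); methylated C is protected.
    """
    converted = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted_read):
    """Return positions where a reference C is still read as C."""
    return [
        i
        for i, (ref_base, read_base) in enumerate(zip(reference, converted_read))
        if ref_base == "C" and read_base == "C"
    ]


# Hypothetical example data: positions 1 and 9 are methylated CpG cytosines.
reference = "ACGTCCGATCGA"
methylated = {1, 9}

read = bisulfite_convert(reference, methylated)
called = call_methylation(reference, read)
print(read)
print(called)
```

    Real reads also carry sequencing errors and incomplete conversion, so actual pipelines estimate a methylation fraction per site from many reads rather than making a yes/no call from one.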

    MicroRNAs (miRNAs)
    Small (21-25 nucleotide) RNAs that regulate gene expression. Can be sequenced using RNA-Seq starting with size-selected RNAs. Around 1100 human miRNAs, named for example mir-001, mir-123 or mir-500

    2. Proteomics

    Phosphoproteomics: measuring peptides phosphorylated on Ser, Thr or Tyr

    3. Metabolomics: Study of the full set of metabolites found in a cell, tissue, or organism. Useful in understanding why phenotypic changes do or do not occur

    7.B Analyzing Large Data Sets

    1. Heatmap:

    From HHMI: a free 26-slide tutorial on how to analyze DNA microarray data
    http://www.hhmi.org/biointeractive/howanalyze-dna-microarray-data

    2. Statistical tests:

    T-tests can be used to test whether two sets of data are significantly different from each other. Generally used if the test statistic follows a normal distribution
    ANOVA (analysis of variance): commonly used to test the null hypothesis and determine whether there is a difference between any two groups when an experiment has more than two groups. Significance is set at a user-defined value, e.g. a p value of 0.05 or 0.01
    Mann–Whitney: a non-parametric test of the null hypothesis. Non-parametric means there is no assumption regarding the distribution of the test statistic
    Cluster analysis: putting entities (e.g. genes) into groups such that entities within a group are more closely related to each other than to entities in another group. Often used to identify groups of genes expressed (or repressed) under a specified condition (perturbation, duration of treatment, etc.)
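
    To make the difference between a parametric and a rank-based test concrete, here is a minimal sketch that computes the two test statistics by hand using only the standard library. The expression values are made up, and the Welch (unequal-variance) form of the t statistic is an assumption, since the notes do not say which variant is meant. Turning a statistic into a p value needs the reference distribution; in practice a library such as SciPy (scipy.stats.ttest_ind, scipy.stats.mannwhitneyu) would return both.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """t statistic for two independent samples, not assuming equal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def mann_whitney_u(a, b):
    """U statistic for sample a: number of pairs with x > y, ties counted as 0.5.

    Only the ordering of the values matters, not their distribution.
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Hypothetical expression values for one gene in two conditions.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treated = [4.0, 5.0, 6.0, 7.0, 8.0]

t_stat = welch_t(control, treated)
u_control = mann_whitney_u(control, treated)
u_treated = mann_whitney_u(treated, control)
print(t_stat, u_control, u_treated)
```

    The two U values always sum to the number of pairs (here 5 × 5 = 25); a U far from half of that indicates that one group tends to rank above the other.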

    3. Gene-Set Enrichment Analysis:

    4. Cufflinks and Cuffdiff

    An open-source program that assembles RNA-Seq reads aligned to a reference genome into transcripts and estimates their relative abundance
    Cuffdiff can be used to detect changes in expression levels of individual transcripts: http://cufflinks.cbcb.umd.edu
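
    Relative abundance from RNA-Seq is commonly reported as FPKM (fragments per kilobase of transcript per million mapped fragments). The sketch below shows only that normalization and a log2 fold change between two conditions. The counts are made up, and this is not how Cufflinks or Cuffdiff work internally: they fit a statistical model and test for significance rather than dividing raw counts.

```python
from math import log2


def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def log2_fold_change(abundance_a, abundance_b):
    """log2 ratio of condition B over condition A."""
    return log2(abundance_b / abundance_a)


# Hypothetical transcript: 2000 bp long, 10 million mapped fragments per sample.
untreated = fpkm(500, 2000, 10_000_000)
treated = fpkm(2000, 2000, 10_000_000)
change = log2_fold_change(untreated, treated)
print(untreated, treated, change)
```

    Dividing by transcript length and sequencing depth is what makes values comparable between long and short transcripts and between samples sequenced to different depths.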

    5. Genome-wide Association Studies (GWAS)

    Identification of variations in DNA sequence that are associated with increased risk of a disease
    Most often focused on SNPs
    Define phenotype: categorical or quantitative

    Assemble patient population for control and disease group

    Sequence the whole genome, or for better-established cases use SNP chips
    Use an appropriate statistical test to establish the association of SNPs with increased risk of disease
    Bush W.S. and Moore J.H. (2012) PLoS Comput Biol 8(12): e1002822
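
    For a categorical (case/control) phenotype, one common choice of test is a chi-square test on the allele counts of a single SNP. The sketch below assumes that choice and uses made-up counts; the notes only say "appropriate statistical test", so other tests are equally possible. The p value uses the exact relation for one degree of freedom, p = erfc(sqrt(chi2 / 2)).

```python
from math import erfc, sqrt


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square statistic for a 2x2 table of allele counts (1 degree of freedom)."""
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))


def odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Odds of carrying the alternate allele in cases relative to controls."""
    return (case_alt * control_ref) / (case_ref * control_alt)


# Hypothetical allele counts for one SNP.
counts = dict(case_alt=60, case_ref=40, control_alt=40, control_ref=60)
chi2 = allelic_chi_square(**counts)
p = p_value_1df(chi2)
ratio = odds_ratio(**counts)
print(chi2, p, ratio)
```

    A real study repeats such a test for a very large number of SNPs, so the per-SNP significance threshold has to be corrected for multiple testing.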

    6. Proteomics Technologies

    7. Gene Ontology

    A bioinformatics resource that allows you to categorize genes and gene products (proteins): www.geneontology.org
    It contains three categories: ‘Biological Process’, ‘Cellular Component’, ‘Molecular Function’
    Each of these categories is organized in a hierarchical manner:

    • More general terms are called Parents; the more specific terms beneath them are called Children
    • The relationship between Parents and Children is further characterized by GO relations (e.g.: ‘is a’, ‘part of’, ‘has part’, ‘regulates’)
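
    The Parent/Child structure can be pictured as a graph in which each term points to its parents through a named relation. The sketch below is a toy example: the term names and the relations between them are chosen for illustration and are not copied from the ontology itself.

```python
# Toy hierarchy: child term -> list of (relation, parent term).
# Illustrative only; not taken from the real Gene Ontology files.
PARENTS = {
    "biological process": [],
    "cellular process": [("is_a", "biological process")],
    "signaling": [("is_a", "biological process")],
    "signal transduction": [
        ("is_a", "cellular process"),
        ("part_of", "signaling"),
    ],
    "MAPK cascade": [("is_a", "signal transduction")],
}


def ancestors(term, parents=PARENTS):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


found = ancestors("MAPK cascade")
print(sorted(found))
```

    Because a term may have more than one parent, the structure is a directed acyclic graph rather than a simple tree, which is why the traversal tracks the terms it has already seen.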

    8.A Network Building & Analysis and Data Organization

    1. Graph Theory
    2. Bayesian Networks
    3. Networks: Undirected graphs, directed graphs, sign-specified directed graphs
    4. Networks relevant to cellular systems biology: Cell signaling networks, PPI, Gene regulatory networks
    5. Bioinformatics
      Genes- Genomics
      DNA Sequences and Sequence Analysis - GenBank
      Proteins
      Database of Protein Structures - PDB
      Protein characteristics - UniProtKB
      National Center for Biotechnology Information at the National Library of Medicine www.ncbi.nlm.nih.gov
    6. Database of Cell Signaling

    KEGG: Kyoto Encyclopedia of Genes and Genomes, a database of biological functions and systems including pathways
    Pathway Commons: Biological pathways from multiple organisms
    GEO: Gene Expression Omnibus, a genomics database supported by NCBI with microarray and sequence-based data
    OMIM: Online Mendelian Inheritance in Man - catalog of human genes and genetic disorders and traits
    ENCODE: Encyclopedia of DNA Elements – all functional elements in the human DNA sequence
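
    The graph types listed under item 3 differ mainly in what is stored per edge. As a minimal sketch with hypothetical node names: an undirected graph suits protein-protein interactions, while a sign-specified directed graph records both the direction and whether an edge is activating (+1) or inhibiting (-1), which lets the net effect along a signaling path be computed.

```python
from collections import defaultdict


def build_undirected(edges):
    """Adjacency sets for an undirected graph (e.g. protein-protein interactions)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def build_signed_directed(edges):
    """Adjacency dicts for a signed directed graph.

    Each edge is (source, target, sign) with sign +1 (activation) or -1 (inhibition).
    """
    adjacency = defaultdict(dict)
    for source, target, sign in edges:
        adjacency[source][target] = sign
    return adjacency


def path_sign(graph, path):
    """Net effect along a path: the product of the edge signs."""
    sign = 1
    for source, target in zip(path, path[1:]):
        sign *= graph[source][target]
    return sign


# Hypothetical nodes and edges, for illustration only.
ppi = build_undirected([("Receptor", "KinaseA"), ("KinaseA", "TF")])
signaling = build_signed_directed([
    ("Receptor", "KinaseA", +1),
    ("KinaseA", "Phosphatase", -1),
    ("Phosphatase", "TF", -1),
])
net_effect = path_sign(signaling, ["Receptor", "KinaseA", "Phosphatase", "TF"])
print(sorted(ppi["KinaseA"]), net_effect)
```

    Here two inhibitory steps in a row give a net positive effect on the final node, something an undirected or unsigned representation cannot express.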

    8.B Building Networks from Large Datasets

    1. Genes2Networks and Lists2Networks
      Combines lists of genes and proteins from an experiment with a background network of all known interactions (for species of interest) to produce a network of interest
    2. Tracing Pathways with ChEA and KEA
    3. From Expression Patterns to Regulatory Networks-Expression2Kinases X2K
    4. Visualization of Networks
      Pajek - http://vlado.fmf.uni-lj.si/pub/networks/doc/gd.01/Pajek2.png
      Cytoscape
    5. Visualizing Large-scale Dynamics
      GATE: Grid Analysis of Time Series Expression
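
    The idea in item 1, combining an experimental list with a background network of known interactions, can be sketched as follows. This is a deliberately simplified rule and not the published Genes2Networks algorithm: it keeps the seed genes plus any background node that interacts directly with at least two seeds. All gene names are placeholders.

```python
def subnetwork(seeds, background_edges, min_seed_neighbors=2):
    """Extract the part of a background network that connects a seed list.

    Keeps seed nodes plus intermediates that interact with at least
    `min_seed_neighbors` seeds (a simplification chosen for this sketch).
    """
    seeds = set(seeds)
    neighbors = {}
    for a, b in background_edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    intermediates = {
        node
        for node, adjacent in neighbors.items()
        if node not in seeds and len(adjacent & seeds) >= min_seed_neighbors
    }
    keep = seeds | intermediates
    edges = sorted((a, b) for a, b in background_edges if a in keep and b in keep)
    return edges, intermediates


# Placeholder background network and seed list.
background = [
    ("A", "B"), ("A", "X"), ("X", "C"),
    ("C", "Y"), ("Y", "Z"), ("B", "Z"),
]
edges, intermediates = subnetwork(["A", "B", "C"], background)
print(edges, intermediates)
```

    In this example X is kept because it links seeds A and C, while Y and Z each touch only one seed and are dropped.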
