Zhang A. Advanced analysis of gene expression microarray data (Singapore, 2006). - ОГЛАВЛЕНИЕ / CONTENTS
Навигация

Архив выставки новых поступлений | Отечественные поступления | Иностранные поступления | Сиглы
ОбложкаZhang A. Advanced analysis of gene expression microarray data. - Singapore: World Scientific, 2006. - xv, 339 p.: ill. - Bibliogr.: p.307-330. - Ind.: p.331-338. - ISBN 981-256-645-7
 

Место хранения: 021 | Институт цитологии и генетики CO РАН | Новосибирск | Библиотека

Оглавление / Contents
 
Preface	 ...................................................... vii

1. Introduction ................................................. 1
   1.1  The Microarray: Key to Functional Genomics and
        Systems Biology ......................................... 1
   1.2  Applications of Microarray .............................. 2
        1.2.1  Gene Expression Profiles in Different Tissues .... 3
        1.2.2  Developmental Genetics ........................... 3
        1.2.3  Gene Expression Patterns in Model Systems ........ 3
        1.2.4  Differential Gene Expression Patterns in
               Diseases ......................................... 4
        1.2.5  Gene Expression Patterns in Pathogens ............ 5
        1.2.6  Gene Expression in Response to Drug Treatments ... 6
        1.2.7  Genotypic Analysis ............................... 7
        1.2.8  Mutation Screening of Disease Genes .............. 7
   1.3  Framework of Microarray Data Analysis ................... 8
   1.4  Summary ................................................ 11
2.  Basic Concepts of Molecular Biology ........................ 13
   2.1  Introduction ........................................... 13
   2.2  Cells .................................................. 13
   2.3  Proteins ............................................... 15
   2.4  Nucleic Acids .......................................... 19
        2.4.1  DNA ............................................. 19
        2.4.2  RNA ............................................. 22
   2.5  Central Dogma of Molecular Biology ..................... 22
        2.5.1  Genes and the Genetic Code ...................... 23
        2.5.2  Transcription and Gene Expression ............... 25
        2.5.3  Translation and Protein Synthesis ............... 26
   2.6  Genotype and Phenotype ................................. 27
   2.7  Summary ................................................ 30
3.  Overview of Microarray Experiments ......................... 31
   3.1  Introduction ........................................... 31
   3.2  Microarray Chip Manufacture ............................ 32
        3.2.1  Deposition-Based Manufacture .................... 33
        3.2.2  In Situ Manufacture ............................. 34
               3.2.2.1  The Affymetrix GeneChip ................ 35
   3.3  Steps of Microarray Experiments ........................ 36
        3.3.1  Sample Preparation and Labeling ................. 36
        3.3.2  Hybridization ................................... 39
        3.3.3  Image Scanning .................................. 39
   3.4  Image Processing ....................................... 40
   3.5  Microarray Data Cleaning and Preprocessing ............. 42
        3.5.1  Data Transformation ............................. 42
        3.5.2  Missing Value Estimation ........................ 43
   3.6  Data Normalization ..................................... 45
        3.6.1  Global Normalization Approaches ................. 46
               3.6.1.1  Standardization ........................ 46
               3.6.1.2  Iterative linear regression ............ 46
        3.6.2  Intensity-Dependent Normalization ............... 47
               3.6.2.1  LOWESS: Locally weighted linear
                        regression ............................. 47
               3.6.2.2  Distribution normalization ............. 49
   3.7  Summary ................................................ 49
4  Analysis of Differentially-Expressed Genes .................. 51
   4.1  Introduction ........................................... 51
   4.2  Basic Concepts in Statistics ........................... 53
        4.2.1  Statistical Inference ........................... 53
        4.2.2  Hypothesis Test ................................. 54
   4.3  Fold Change Methods .................................... 56
        4.3.1  fc-fold Change .................................. 56
        4.3.2  Unusual Ratios .................................. 57
        4.3.3  Model-Based Methods ............................. 60
   4.4  Parametric Tests ....................................... 62
        4.4.1  Paired t-Test ................................... 62
        4.4.2  Unpaired t-Test ................................. 63
        4.4.3  Variants of t-Test .............................. 64
   4.5  Non-Parametric Tests ................................... 65
        4.5.1  Classical Non-Parametric Statistics ............. 65
        4.5.2  Other Non-Parametric Statistics ................. 66
        4.5.3  Bootstrap Analysis .............................. 67
   4.6  Multiple Testing ....................................... 69
        4.6.1  Family-Wise Error Rate .......................... 70
               4.6.1.1  Sidak correction and Bonferroni
                        correction ............................. 70
               4.6.1.2  Holm's step-wise correction ............ 71
        4.6.2  False Discovery Rate ............................ 71
        4.6.3  Permutation Correction .......................... 72
        4.6.4  SAM: Significance Analysis of Microarrays ....... 73
   4.7  ANOVA: Analysis of Variance ............................ 77
        4.7.1  One-Way ANOVA ................................... 79
        4.7.2  Two-Way ANOVA ................................... 80
   4.8  Summary ................................................ 82
5  Gene-Based Analysis ......................................... 83
   5.1  Introduction ........................................... 83
   5.2  Proximity Measurement for Gene Expression Data ......... 85
        5.2.1  Euclidean Distance .............................. 85
        5.2.2  Correlation Coefficient ......................... 86
               5.2.2.1  Pearson's correlation coefficient ...... 86
               5.2.2.2  Jackknife correlation .................. 88
               5.2.2.3  Spearman's rank-order correlation ...... 88
        5.2.3  Kullback-Leibler Divergence ..................... 88
   5.3  Partition-Based Approaches ............................. 90
        5.3.1  K-means and its Variations ...................... 90
        5.3.2  SOM and its Extensions .......................... 92
        5.3.3  Graph-Theoretical Approaches .................... 94
               5.3.3.1  HCS and CLICK .......................... 94
               5.3.3.2  CAST: Cluster affinity search
                        technique .............................. 96
        5.3.4  Model-Based Clustering .......................... 98
   5.4  Hierarchical Approaches ................................ 99
        5.4.1  Agglomerative Algorithms ........................ 99
        5.4.2  Divisive Algorithms ............................ 102
               5.4.2.1  DAA: Deterministic annealing
                        algorithm ............................. 102
               5.4.2.2  SPC: Super-paramagnetic clustering .... 103
   5.5  Density-Based Approaches .............................. 104
        5.5.1  DBSCAN ......................................... 105
        5.5.2  OPTICS ......................................... 106
        5.5.3  DENCLUE ........................................ 107
   5.6  GPX: Gene Pattern eXplorer ............................ 110
        5.6.1  The Attraction Tree ............................ 115
               5.6.1.1  The distance measure .................. 115
               5.6.1.2  The density definition ................ 116
               5.6.1.3  The attraction tree ................... 118
               5.6.1.4  An example of attraction tree ......... 120
        5.6.2  Interactive Exploration of Coherent Patterns ... 122
               5.6.2.1  Generating the index list ............. 123
               5.6.2.2  The coherent pattern index and its
                        graph ................................. 125
               5.6.2.3  Drilling down to subgroups ............ 126
        5.6.3  Experimental Results ........................... 128
               5.6.3.1  Interactive exploration of Iyer's
                        data and Spellman's data .............. 129
               5.6.3.2  Comparison with other algorithms ...... 129
        5.6.4  Efficiency and Scalability ..................... 134
   5.7  Cluster Validation .................................... 135
        5.7.1  Homogeneity and Separation ..................... 136
        5.7.2  Agreement with Reference Partition ............. 137
        5.7.3  Reliability of Clusters ........................ 138
               5.7.3.1  P-value of a cluster .................. 138
               5.7.3.2  Prediction strength ................... 139
   5.8  Summary ............................................... 139
6  Sample-Based Analysis ...................................... 141
   6.1  Introduction .......................................... 141
   6.2  Selection of Informative Genes ........................ 144
        6.2.1  Supervised Approaches .......................... 145
               6.2.1.1  Differentially expressed genes ........ 145
               6.2.1.2  Gene pairs ............................ 146
               6.2.1.3  Virtual genes ......................... 148
               6.2.1.4  Genetic algorithms .................... 150
        6.2.2  Unsupervised Approaches ........................ 152
               6.2.2.1  PCA: Principal component analysis ..... 152
               6.2.2.2  Gene shaving .......................... 154
   6.3  Class Prediction ...................................... 155
        6.3.1  Linear Discriminant Analysis ................... 155
        6.3.2  Instance-Based Classification .................. 158
               6.3.2.1  KNN: k-Nearest Neighbor ............... 158
               6.3.2.2  Weighted voting ....................... 159
        6.3.3  Decision Trees ................................. 160
        6.3.4  Support Vector Machines ........................ 162
   6.4  Class Discovery ....................................... 163
        6.4.1  Problem statement .............................. 165
        6.4.2  CLIFF: CLustering via Iterative Feature
               Filtering ...................................... 165
               6.4.2.1  The sample-partition process .......... 166
               6.4.2.2  The gene-filtering process ............ 167
        6.4.3  ESPD: Empirical Sample Pattern Detection ....... 168
               6.4.3.1  Measurements for phenotype structure
                        detection ............................. 168
               6.4.3.2  Algorithms ............................ 173
               6.4.3.3  Experimental results .................. 184
   6.5  Classification Validation ............................. 190
        6.5.1  Prediction Accuracy ............................ 190
        6.5.2  Prediction Reliability ......................... 191
   6.6  Summary ............................................... 192
7  Pattern-Based Analysis ..................................... 195
   7.1  Introduction .......................................... 195
   7.2  Mining Association Rules .............................. 197
        7.2.1  Concepts of Association-Rule Mining ............ 198
        7.2.2  The Apriori Algorithm .......................... 200
        7.2.3  The FP-Growth Algorithm ........................ 201
        7.2.4  The CARPENTER Algorithm ........................ 202
        7.2.5  Generating Association Rules in Microarray
               Data ........................................... 204
               7.2.5.1  Rule filtering ........................ 205
               7.2.5.2  Rule grouping ......................... 206
   7.3  Mining Pattern-Based Clusters in Microarray Data ...... 207
        7.3.1  Heuristic Approaches ........................... 208
               7.3.1.1  Coupled two-way clustering (CTWC) ..... 208
               7.3.1.2  Plaid model ........................... 209
               7.3.1.3  Biclustering and δ-Clusters ........... 210
        7.3.2  Deterministic Approaches ....................... 211
               7.3.2.1  δ-pCluster ............................ 211
               7.3.2.2  OP-Cluster ............................ 213
   7.4  Mining Gene-Sample-Time Microarray Data ............... 214
        7.4.1  Three-dimensional Microarray Data .............. 214
        7.4.2  Coherent Gene Clusters ......................... 215
               7.4.2.1  Problem description ................... 217
               7.4.2.2  Maximal coherent sample sets .......... 219
               7.4.2.3  The mining algorithms ................. 222
               7.4.2.4  Experimental results .................. 227
        7.4.3  Tri-Clusters ................................... 232
               7.4.3.1  The tri-cluster model ................. 232
               7.4.3.2  Properties of tri-clusters ............ 234
               7.4.3.3  Mining tri-clusters ................... 235
   7.5  Summary ............................................... 238
8  Visualization of Microarray Data ........................... 239
   8.1  Introduction .......................................... 239
   8.2  Single-Array Visualization ............................ 241
        8.2.1  Box Plot ....................................... 242
        8.2.2  Histogram ...................................... 243
        8.2.3  Scatter Plot ................................... 244
        8.2.4  Gene Pies ...................................... 246
   8.3  Multi-Array Visualization ............................. 247
        8.3.1  Global Visualizations .......................... 247
        8.3.2  Optimal Visualizations ......................... 249
        8.3.3  Projection Visualization ....................... 250
   8.4  VizStruct ............................................. 251
        8.4.1  Fourier Harmonic Projections ................... 253
               8.4.1.1  Discrete-time signal paradigm ......... 253
               8.4.1.2  The Fourier harmonic projection
                        algorithm ............................. 254
        8.4.2  Properties of FHPs ............................. 257
               8.4.2.1  Basic properties ...................... 257
               8.4.2.2  Advanced properties ................... 258
               8.4.2.3  Harmonic equivalency .................. 260
               8.4.2.4  Effects of harmonic twiddle power
                        index ................................. 261
        8.4.3  Enhancements of Fourier Harmonic Projections ... 263
        8.4.4  Exploratory Visualization of Gene Profiling .... 265
               8.4.4.1  Microarray data sets for
                        visualization ......................... 265
               8.4.4.2  Identification of informative genes ... 265
               8.4.4.3  Classifier construction and
                        evaluation ............................ 265
               8.4.4.4  Dimension arrangement ................. 267
               8.4.4.5  Visualization of various data sets .... 270
               8.4.4.6  Comparison of FFHP to Sammon's
                        mapping ............................... 275
        8.4.5  Confirmative Visualization of Gene Time-
               series ......................................... 277
               8.4.5.1  Data sets for visualization ........... 277
               8.4.5.2  The harmonic projection approach ...... 278
               8.4.5.3  Rat kidney data set ................... 278
               8.4.5.4  Yeast-A data set ...................... 279
               8.4.5.5  Yeast-B data set ...................... 282
   8.5  Summary ............................................... 282
9  New Trends in Mining Gene Expression Microarray Data ....... 285
   9.1  Introduction .......................................... 285
   9.2  Meta-Analysis of Microarray Data ...................... 285
        9.2.1  Meta-Analysis of Differential Genes ............ 286
        9.2.2  Meta-Analysis of Co-Expressed Genes ............ 287
   9.3  Semi-Supervised Clustering ............................ 288
        9.3.1  General Semi-Supervised Clustering
               Algorithms ..................................... 289
        9.3.2  A Seed-Generation Approach ..................... 291
               9.3.2.1  Seed-generation methods ............... 291
               9.3.2.2  Pattern-selection rules ............... 292
               9.3.2.3  The framework for the seed-
                        generation approach ................... 295
   9.4  Integration of Gene Expression Data with Other Data ... 296
        9.4.1  A Probabilistic Model for Joint Mining ......... 299
        9.4.2  A Graph-Based Model for Joint Mining ........... 300
   9.5  Summary ............................................... 304
10 Conclusion ................................................. 305

Bibliography .................................................. 307

Index ......................................................... 331


Архив выставки новых поступлений | Отечественные поступления | Иностранные поступления | Сиглы
 

[О библиотеке | Академгородок | Новости | Выставки | Ресурсы | Библиография | Партнеры | ИнфоЛоция | Поиск]
  © 1997–2024 Отделение ГПНТБ СО РАН  

Документ изменен: Wed Feb 27 14:20:32 2019. Размер: 21,008 bytes.
Посещение N 1853 c 20.10.2009