Parida L. Pattern discovery in bioinformatics (Boca Raton, 2008). - ОГЛАВЛЕНИЕ / CONTENTS
Навигация

Архив выставки новых поступлений | Отечественные поступления | Иностранные поступления | Сиглы
ОбложкаParida L. Pattern discovery in bioinformatics: theory & algorithms. - Boca Raton: Chapman & Hall / CRC, 2008. - 526 p.: ill. - (Chapman & Hall / CRC mathematical and computational biology series). - Ref.: p.503-513. - Ind.: p.515-526. - ISBN 1-58488-549-1
 

Оглавление / Contents
 
1.  Introduction ................................................ 1
    1.1.  Ubiquity of Patterns .................................. 1
    1.2.  Motivations from Biology .............................. 2
    1.3.  The Need for Rigor .................................... 2
    1.4.  Who is a Reader of this Book? ......................... 3
          1.4.1.  About this book ............................... 4

I.   THE FUNDAMENTALS ........................................... 7

2.  Basic Algorithmics .......................................... 9
    2.1.  Introduction .......................................... 9
    2.2.  Graphs ................................................ 9
    2.3.  Tree Problem 1: Minimum Spanning Tree ................ 14
          2.3.1.  Prim's algorithm ............................. 17
    2.4.  Tree Problem 2: Steiner Tree ......................... 21
    2.5.  Tree Problem 3: Minimum Mutation Labeling ............ 22
          2.5.1.  Fitch's algorithm ............................ 23
    2.6.  Storing & Retrieving Elements ........................ 27
    2.7.  Asymptotic Functions ................................. 30
    2.8.  Recurrence Equations ................................. 32
          2.8.1.  Counting binary trees ........................ 34
          2.8.2.  Enumerating unrooted trees
                  (Priifer sequence) ........................... 36
    2.9.  NP-Complete Class of Problems ........................ 40
    2.10. Exercises ............................................ 41

3.  Basic Statistics ........................................... 47
    3.1.  Introduction ......................................... 47
    3.2.  Basic Probability .................................... 48
          3.2.1.  Probability space foundations ................ 48
          3.2.2.  Multiple events (Bayes' theorem) ............. 50
          3.2.3.  Inclusion-exclusion principle ................ 51
          3.2.4.  Discrete probability space ................... 54
          3.2.5.  Algebra of random variables .................. 57
          3.2.6.  Expectations ................................. 58
          3.2.7.  Discrete probability distribution
                  (binomial, Poisson) .......................... 60
          3.2.8.  Continuous probability distribution
                  (normal) ..................................... 64
          3.2.9.  Continuous probability space (O is K) ........ 66
    3.3.  The Bare Trnth abont Inferential Statistics .......... 69
          3.3.1.  Probability distribution invariants .......... 70
          3.3.2.  Samples & summary statistics ................. 72
          3.3.3.  The central limit theorem .................... 77
          3.3.4.  Statistical significance (p-value) ........... 80
    3.4.  Summary .............................................. 82
    3.5.  Exercises ............................................ 82

4.  What Are Patterns? ......................................... 89
    4.1.  Introduction ......................................... 89
    4.2.  Common Thread ........................................ 90
    4.3.  Pattern Duality ...................................... 90
          4.3.1.  Operators on p ............................... 92
    4.4.  Irredundant Patterns ................................. 92
          4.4.1.  Special case: maximality ..................... 93
          4.4.2.  Transitivity of redundancy ................... 95
          4.4.3.  Uniqueness property .......................... 95
          4.4.4.  Case studies ................................. 96
    4.5.  Constrained Patterns ................................. 99
    4.6.  When is a Pattern Specification Nontrivial? .......... 99
    4.7.  Classes of Patterns ................................. 100
    4.8.  Exercises ........................................... 103

II.  PATTERNS ON LINEAR STRINGS ............................... 111

5.  Modeling the Stream of Life ............................... 113
    5.1.  Introduction ........................................ 113
    5.2.  Modeling a Biopolymer ............................... 113
          5.2.1.  Repeats in DNA .............................. 114
          5.2.2.  Directionality of biopolymers ............... 115
          5.2.3.  Modeling a random permutation ............... 117
          5.2.4.  Modeling a random string .................... 119
    5.3.  Bernoulli Scheme .................................... 120
    5.4.  Markov Chain ........................................ 121
          5.4.1.  Stationary distribution ..................... 123
          5.4.2.  Computing probabilities ..................... 127
    5.5.  Hidden Markov Model (HMM) ........................... 128
          5.5.1.  The decoding problem (Viterbi algorithm) .... 130
    5.6.  Comparison of the Schemes ........................... 133
    5.7.  Conclusion .......................................... 133
    5.8.  Exercises ........................................... 134

6.  String Pattern Specifications ............................. 139
    6.1.  Introduction ........................................ 139
    6.2.  Notation ............................................ 140
    6.3.  Solid Patterns ...................................... 142
          6.3.1.  Maximality .................................. 144
          6.3.2.  Counting the maximal patterns ............... 144
    6.4.  Rigid Patterns ...................................... 149
          6.4.1.  Maximal rigid patterns ...................... 150
          6.4.2.  Enumerating maximal rigid patterns .......... 152
          6.4.3.  Density-constrained patterns ................ 156
          6.4.4.  Quorum-constrained patterns ................. 157
          6.4.5.  Large-|Σ| input ............................. 158
          6.4.6.  Irredundant patterns ........................ 160
    6.5.  Extensible Patterns ................................. 164
          6.5.1.  Maximal extensible patterns ................. 165
    6.6.  Generalizations ..................................... 165
          6.6.1.  Homologous sets ............................. 165
          6.6.2.  Sequence on reals ........................... 167
    6.7.  Exercises ........................................... 170

7.  Algorithms & Pattern Statistics ........................... 183
    7.1.  Introduction ........................................ 183
    7.2.  Discovery Algorithm ................................. 183
    7.3.  Pattern Statistics .................................. 191
    7.4.  Rigid Patterns ...................................... 191
    7.5.  Extensible Patterns ................................. 193
          7.5.1.  Nondegenerate extensible patterns ........... 194
          7.5.2.  Degenerate extensible patterns .............. 196
          7.5.3.  Correction factor for the dot character ..... 197
    7.6.  Measure of Surprise ................................. 198
          7.6.1.  z-score ..................................... 199
          7.6.2.  χ-square ratio .............................. 199
          7.6.3.  Interplay of combinatorics & statistics ..... 200
    7.7.  Applications ........................................ 201
    7.8.  Exercises ........................................... 203

8.  Motif Learning ............................................ 213
    8.1.  Introduction: Local Multiple Alignment .............. 213
    8.2.  Probabilistic Model: Motif Profile .................. 214
    8.3.  The Learning Problem ................................ 215
    8.4.  Importance Measure .................................. 216
          8.4.1.  Statistical significance .................... 216
          8.4.2  Information content .......................... 219
    8.5.  Algorithms to Learn a Motif Profile ................. 220
    8.6.  An Expectation Maximization Framework ............... 222
          8.6.1.  The initial estimate ρ0 ..................... 222
          8.6.2.  Estimating z given ρ ........................ 223
          8.6.3.  Estimating ρ given z ........................ 224
    8.7.  A Gibbs Sampling Strategy ........................... 227
          8.7.1.  Estimating ρ given an alignment ............. 227
          8.7.2.  Estimating background probabilities
                  given Z ..................................... 228
          8.7.3.  Estimating Z given ρ ........................ 228
    8.8.  Interpreting the Motif Profile in Terms of ρ ........ 229
    8.9.  Exercises ........................................... 230

9.  The Subtle Motif .......................................... 235
    9.1.  Introduction: Consensus Motif ....................... 235
    9.2.  Combinatorial Model: Subtle Motif ................... 236
    9.3.  Distance between Motifs ............................. 238
    9.4.  Statistics of Subtle Motifs ......................... 240
    9.5.  Performance Score ................................... 245
    9.6.  Enumeration Schemes ................................. 246
          9.6.1.  Neighbor enumeration (exact) ................ 246
          9.6.2.  Submotif enumeration (inexact) .............. 249
    9.7.  A Combinatorial Algorithm ........................... 252
    9.8.  A Probabilistic Algorithm ........................... 255
    9.9.  A Modular Solution .................................. 257
    9.10. Conclusion .......................................... 259
    9.11. Exercises ........................................... 260

III. PATTERNS ON META-DATA .................................... 263

10. Permutation Patterns ...................................... 265
    10.1. Introduction ........................................ 265
          10.1.1. Notation .................................... 266
    10.2. How Many Permutation Patterns? ...................... 267
    10.3. Maximality .......................................... 268
          10.3.1. P=1: Linear notation & PQ trees ............. 269
          10.3.2. P>1: Linear notation? ....................... 271
    10.4. Parikh Mapping-based Algorithm ...................... 273
          10.4.1. Tagging technique ........................... 275
          10.4.2. Time complexity analysis .................... 275
    10.5. Intervals ........................................... 278
          10.5.1. The naive algorithm ......................... 280
          10.5.2. The Uno-Yagiura RC algorithm ................ 281
    10.6. Intervals to PQ Trees ............................... 294
          10.6.1. Irreducible intervals ....................... 295
          10.6.2. Encoding intervals as a PQ tree ............. 297
    10.7. Applications ........................................ 307
          10.7.1. Case study I: Human and rat ................. 308
          10.7.2. Case study II: E. Coli K-12 and
                  B. Subtilis ................................. 309
    10.8. Conclusion .......................................... 311
    10.9. Exercises ........................................... 312

11. Permutation Pattern Probabilities ......................... 323
    11.1. Introduction ........................................ 323
    11.2. Unstructured Permutations ........................... 323
          11.2.1. Multinomial coefficients .................... 325
          11.2.2. Patterns with multiplicities ................ 328
    11.3. Structured Permutations ............................. 329
          11.3.1. P-arrangement ............................... 330
          11.3.2. An incremental method ....................... 331
          11.3.3. An upper bound on P-arrangements** .......... 336
          11.3.4. A lower bound on P-arrangements ............. 341
          11.3.5. Estimating the number of frontiers .......... 342
          11.3.6. Combinatorics to probabilities .............. 345
    11.4. Exercises ........................................... 346

12. Topological Motifs ........................................ 355
    12.1. Introduction ........................................ 355
          12.1.1. Graph notation .............................. 355
    12.2. What Are Topological Motifs? ........................ 356
          12.2.1. Combinatorics in topologies ................. 357
          12.2.2. Input with self-isomorphisms ................ 358
    12.3. The Topological Motif ............................... 359
          12.3.1. Maximally ................................... 367
    12.4. Compact Topological Motifs .......................... 369
          12.4.1. Occurrence-isomorphisms ..................... 369
          12.4.2. Vertex indistinguishability ................. 372
          12.4.3. Compact list ................................ 373
          12.4.4. Compact vertex, edge & motif ................ 373
          12.4.5. Maximal compact lists ....................... 374
          12.4.6. Conjugates of compact lists ................. 374
          12.4.7. Characteristics of compact lists ............ 378
          12.4.8. Maximal operations on compact lists ......... 380
          12.4.9. Maximal subsets of location lists ........... 381
          12.4.10.Binary relations on compact lists ........... 384
          12.4.11.Compact motifs from compact lists ........... 384
    12.5. The Discovery Method ................................ 392
          12.5.1. The algorithm ............................... 393
    12.6. Related Classical Problems .......................... 399
    12.7. Applications ........................................ 400
    12.8. Conclusion .......................................... 402
    12.9. Exercises ........................................... 402

13. Set-Theoretic Algorithmic Tools ........................... 417
    13.1. Introduction ........................................ 417
    13.2. Some Basic Properties of Finite Sets ................ 418
    13.3. Partial Order Graph G(S, E) of Sets ................. 419
          13.3.1. Reduced partial order graph ................. 420
          13.3.2. Straddle graph .............................. 421
    13.4. Boolean Closure of Sets ............................. 423
          13.4.1. Intersection closure ........................ 423
          13.4.2. Union closure ............................... 424
    13.5. Consecutive (Linear) Arrangement of Set Members ..... 426
          13.5.1. PQ trees .................................... 426
          13.5.2. Straddling sets ............................. 429
    13.6. Maximal Set Intersection Problem (maxSIP) ........... 434
          13.6.1. Ordered enumeration trie .................... 435
          13.6.2. Depth first traversal of the trie ........... 436
    13.7. Minimal Set Intersection Problem (minSIP) ........... 447
          13.7.1. Algorithm ................................... 447
          13.7.2. Minimal from maximal sets ................... 448
    13.8. Multi-Sets .......................................... 450
          13.8.1. Ordered enumeration trie of multi-sets ...... 451
          13.8.2. Enumeration algorithm ....................... 453
    13.9. Adapting the Enumeration Scheme ..................... 455
    13.10.Exercises ........................................... 458

14. Expression & Partial Order Motifs ......................... 469
    14.1. Introduction ........................................ 469
          14.1.1. Motivation .................................. 470
    14.2. Extracting (Monotone CNF) Boolean Expressions ....... 471
          14.2.1. Extracting biclusters ....................... 475
          14.2.2. Extracting patterns in microarrays .......... 478
    14.3. Extracting Partial Orders ........................... 480
          14.3.1. Partial orders .............................. 480
          14.3.2. Partial order construction problem .......... 481
          14.3.3. Excess in partial orders .................... 483
    14.4. Statistics of Partial Orders ........................ 485
          14.4.1. Computing Cex(B) ............................ 489
    14.5. Redescriptions ...................................... 493
    14.6. Application: Partial Order of Expressions ........... 494
    14.7. Summary ............................................. 495
    14.8. Exercises ........................................... 496

References .................................................... 503

Index ......................................................... 515


Архив выставки новых поступлений | Отечественные поступления | Иностранные поступления | Сиглы
 

[О библиотеке | Академгородок | Новости | Выставки | Ресурсы | Библиография | Партнеры | ИнфоЛоция | Поиск]
  © 1997–2024 Отделение ГПНТБ СО РАН  

Документ изменен: Wed Feb 27 14:20:24 2019. Размер: 20,813 bytes.
Посещение N 2265 c 06.10.2009