I. Introduction ................................................. 1
1. Introduction ................................................. 3
1.1. Defining the Area ....................................... 3
1.2. A Typical Architecture of a Multimedia Data Mining
System .................................................. 7
1.3. The Content and the Organization of This Book ........... 8
1.4. The Audience of This Book .............................. 10
1.5. Further Readings ....................................... 11
II. Theory and Techniques ...................................... 13
2. Feature and Knowledge Representation for Multimedia Data .... 15
2.1. Introduction ........................................... 15
2.2. Basic Concepts ......................................... 16
2.2.1. Digital Sampling ................................ 17
2.2.2. Media Types ..................................... 18
2.3. Feature Representation ................................. 22
2.3.1. Statistical Features ............................ 23
2.3.2. Geometric Features .............................. 29
2.3.3. Meta Features ................................... 32
2.4. Knowledge Representation ............................... 32
2.4.1. Logic Representation ............................ 33
2.4.2. Semantic Networks ............................... 34
2.4.3. Frames .......................................... 36
2.4.4. Constraints ..................................... 38
2.4.5. Uncertainty Representation ...................... 41
2.5. Summary ................................................ 44
3. Statistical Mining Theory and Techniques .................... 45
3.1. Introduction ........................................... 45
3.2. Bayesian Learning ...................................... 47
3.2.1. Bayes Theorem ................................... 47
3.2.2. Bayes Optimal Classifier ........................ 49
3.2.3. Gibbs Algorithm ................................. 50
3.2.4. Naive Bayes Classifier .......................... 50
3.2.5. Bayesian Belief Networks ........................ 52
3.3. Probabilistic Latent Semantic Analysis ................. 56
3.3.1. Latent Semantic Analysis ........................ 57
3.3.2. Probabilistic Extension to Latent Semantic
Analysis ........................................ 58
3.3.3. Model Fitting with the EM Algorithm ............. 60
3.3.4. Latent Probability Space and Probabilistic
Latent Semantic Analysis ........................ 61
3.3.5. Model Overfitting and Tempered EM ............... 62
3.4. Latent Dirichlet Allocation for Discrete Data
Analysis ............................................... 63
3.4.1. Latent Dirichlet Allocation ..................... 64
3.4.2. Relationship to Other Latent Variable Models .... 66
3.4.3. Inference in LDA ................................ 69
3.4.4. Parameter Estimation in LDA ..................... 70
3.5. Hierarchical Dirichlet Process ......................... 72
3.6. Applications in Multimedia Data Mining ................. 73
3.7. Support Vector Machines ................................ 74
3.8. Maximum Margin Learning for Structured Output Space .... 81
3.9. Boosting ............................................... 88
3.10. Multiple Instance Learning ............................ 91
3.10.1. Establish the Mapping between the Word Space
and the Image-VRep Space ........................ 93
3.10.2. Word-to-Image Querying ......................... 95
3.10.3. Image-to-Image Querying ........................ 95
3.10.4. Image-to-Word Querying ......................... 96
3.10.5. Multimodal Querying ............................ 96
3.10.6. Scalability Analysis ........................... 97
3.10.7. Adaptability Analysis .......................... 97
3.11. Semi-Supervised Learning ............................. 101
3.11.1. Supervised Learning ........................... 104
3.11.2. Semi-Supervised Learning ...................... 106
3.11.3. Semiparametric Regularized Least Squares ...... 109
3.11.4. Semiparametric Regularized Support Vector
Machines ....................................... 111
3.11.5. Semiparametric Regularization Algorithm ....... 113
3.11.6. Transductive Learning and Semi-Supervised
Learning ....................................... 113
3.11.7. Comparisons with Other Methods ................ 114
3.12. Summary .............................................. 115
4. Soft Computing Based Theory and Techniques ................. 117
4.1. Introduction .......................................... 117
4.2. Characteristics of the Paradigms of Soft Computing .... 118
4.3. Fuzzy Set Theory ...................................... 119
4.3.1. Basic Concepts and Properties of Fuzzy Sets .... 119
4.3.2. Fuzzy Logic and Fuzzy Inference Rules .......... 123
4.3.3. Fuzzy Set Application in Multimedia Data
Mining ......................................... 124
4.4. Artificial Neural Networks ............................ 125
4.4.1. Basic Architectures of Neural Networks ......... 125
4.4.2. Supervised Learning in Neural Networks ......... 131
4.4.3. Reinforcement Learning in Neural Networks ...... 136
4.5. Genetic Algorithms .................................... 140
4.5.1. Genetic Algorithms in a Nutshell ............... 140
4.5.2. Comparison of Conventional and Genetic
Algorithms for an Extremum Search .............. 145
4.6. Summary ............................................... 150
III. Multimedia Data Mining Application Examples ........... 153
5. Image Database Modeling - Semantic Repository Training ..... 155
5.1. Introduction .......................................... 155
5.2. Background ............................................ 156
5.3. Related Work .......................................... 157
5.4. Image Features and Visual Dictionaries ................ 159
5.4.1. Image Features ................................. 159
5.4.2. Visual Dictionary .............................. 160
5.5. α-Semantics Graph and Fuzzy Model for Repositories .... 163
5.5.1. α-Semantics Graph .............................. 163
5.5.2. Fuzzy Model for Repositories ................... 166
5.6. Classification Based Retrieval Algorithm .............. 168
5.7. Experiment Results .................................... 170
5.7.1. Classification Performance on a Controlled
Database ....................................... 170
5.7.2. Classification Based Retrieval Results ......... 172
5.8. Summary ............................................... 180
6. Image Database Modeling - Latent Semantic Concept
Discovery .................................................. 181
6.1. Introduction .......................................... 181
6.2. Background and Related Work ........................... 182
6.3. Region Based Image Representation ..................... 185
6.3.1. Image Segmentation ............................. 185
6.3.2. Visual Token Catalog ........................... 188
6.4. Probabilistic Hidden Semantic Model ................... 191
6.4.1. Probabilistic Database Model ................... 191
6.4.2. Model Fitting with EM .......................... 192
6.4.3. Estimating the Number of Concepts .............. 194
6.5. Posterior Probability Based Image Mining and
Retrieval ............................................. 194
6.6. Approach Analysis ..................................... 196
6.7. Experimental Results .................................. 199
6.8. Summary ............................................... 205
7. A Multimodal Approach to Image Data Mining and Concept
Discovery .................................................. 209
7.1. Introduction .......................................... 209
7.2. Background ............................................ 210
7.3. Related Work .......................................... 211
7.4. Probabilistic Semantic Model .......................... 213
7.4.1. Probabilistically Annotated Image Model ........ 213
7.4.2. EM Based Procedure for Model Fitting ........... 215
7.4.3. Estimating the Number of Concepts .............. 216
7.5. Model Based Image Annotation and Multimodal Image
Mining and Retrieval .................................. 217
7.5.1. Image Annotation and Image-to-Text Querying .... 217
7.5.2. Text-to-Image Querying ......................... 218
7.6. Experiments ........................................... 219
7.6.1. Dataset and Feature Sets ....................... 220
7.6.2. Evaluation Metrics ............................. 221
7.6.3. Results of Automatic Image Annotation .......... 221
7.6.4. Results of Single Word Text-to-Image
Querying ....................................... 224
7.6.5. Results of Image-to-Image Querying ............. 224
7.6.6. Results of Performance Comparisons with Pure
Text Indexing Methods .......................... 226
7.7. Summary ............................................... 228
8. Concept Discovery and Mining in a Video Database ........... 231
8.1. Introduction .......................................... 231
8.2. Background ............................................ 232
8.3. Related Work .......................................... 233
8.4. Video Categorization .................................. 235
8.4.1. Naive Bayes Classifier ......................... 237
8.4.2. Maximum Entropy Classifier ..................... 238
8.4.3. Support Vector Machine Classifier .............. 240
8.4.4. Combination of Meta Data and Content Based
Classifiers .................................... 241
8.5. Query Categorization .................................. 242
8.6. Experiments ........................................... 244
8.6.1. Data Sets ...................................... 244
8.6.2. Video Categorization Results ................... 246
8.6.3. Query Categorization Results ................... 251
8.6.4. Search Relevance Results ....................... 253
8.7. Summary ............................................... 255
9. Concept Discovery and Mining in an Audio Database .......... 257
9.1. Introduction .......................................... 257
9.2. Background and Related Work ........................... 258
9.3. Feature Extraction .................................... 260
9.4. Classification Method ................................. 263
9.5. Experimental Results .................................. 263
9.6. Summary ............................................... 269
References ................................................. 271
Index ......................................................... 291