Dragan Obradovic
obrad at
Mon Mar 11 10:41:29 EST 1996
"An Information-Theoretic Approach to Neural Computing"
Gustavo Deco and Dragan Obradovic
(Springer Verlag)
Full details at:
ISBN 0-387-94666-7
Neural networks provide a powerful new technology to model and control
nonlinear and complex systems. In this book, the authors present a
detailed formulation of neural networks from the information-theoretic
viewpoint. They show how this perspective provides new insights into
the design theory of neural networks. In particular, they show how these
methods may be applied to supervised and unsupervised learning,
including feature extraction, linear and nonlinear independent
component analysis, and Boltzmann machines.
Readers are assumed to have a basic understanding of neural networks,
but all the relevant concepts from information theory are carefully
introduced and explained. Consequently, readers from several different
scientific disciplines, notably cognitive scientists, engineers,
physicists, statisticians, and computer scientists, will find this book
a valuable introduction to the topic.
Acknowledgments vi
Foreword vii
CHAPTER 1 Introduction 1
CHAPTER 2 Preliminaries of Information Theory and Neural Networks 7
Elements of Information Theory 8
Entropy and Information 8
Joint Entropy and Conditional Entropy 9
Kullback-Leibler Entropy 9
Mutual Information 10
Differential Entropy, Relative Entropy and Mutual Information 11
Chain Rules 13
Fundamental Information Theory Inequalities 15
Coding Theory 21
Elements of the Theory of Neural Networks 23
Neural Network Modeling 23
Neural Architectures 24
Learning Paradigms 27
Feedforward Networks: Backpropagation 28
Stochastic Recurrent Networks: Boltzmann Machine 31
Unsupervised Competitive Learning 35
Biological Learning Rules 36
PART I: Unsupervised Learning
CHAPTER 3 Linear Feature Extraction: Infomax Principle 41
Principal Component Analysis: Statistical Approach 42
PCA and Diagonalization of the Covariance Matrix 42
PCA and Optimal Reconstruction 45
Neural Network Algorithms and PCA 51
Information Theoretic Approach: Infomax 57
Minimization of Information Loss Principle and Infomax Principle 58
Upper Bound of Information Loss 59
Information Capacity as a Lyapunov Function of the General Stochastic Approximation 61
CHAPTER 4 Independent Component Analysis: General Formulation and Linear Case 65
ICA-Definition 67
General Criteria for ICA 68
Cumulant Expansion Based Criterion for ICA 69
Mutual Information as Criterion for ICA 73
Linear ICA 79
Gaussian Input Distribution and Linear ICA 81
Networks With Anti-Symmetric Lateral Connections 84
Networks With Symmetric Lateral Connections 86
Examples of Learning with Symmetric and Anti-Symmetric Networks 89
Learning in Gaussian ICA with Rotation Matrices: PCA 91
Relationship Between PCA and ICA in Gaussian Input Case 93
Linear Gaussian ICA and the Output Dimension Reduction 94
Linear ICA in Arbitrary Input Distribution 95
Some Properties of Cumulants at the Output of a Linear Transformation 95
The Edgeworth Expansion Criteria and Theorem 4.6.2 99
Algorithms for Output Factorization in the Non-Gaussian Case 100
Experimental Results of Linear ICA Algorithms in the Non-Gaussian Case 102
CHAPTER 5 Nonlinear Feature Extraction: Boolean Stochastic Networks 109
Infomax Principle for Boltzmann Machines 110
Learning Model 110
Examples of Infomax Principle in Boltzmann Machine 113
Redundancy Minimization and Infomax for the Boltzmann Machine 119
Learning Model 119
Numerical Complexity of the Learning Rule 124
Factorial Learning Experiments 124
Receptive Fields Formation from a Retina 129
Appendix 132
CHAPTER 6 Nonlinear Feature Extraction: Deterministic Neural Networks 135
Redundancy Reduction by Triangular Volume Conserving Architectures 136
Networks with Linear, Sigmoidal and Higher Order Activation Functions 140
Simulations and Results 142
Unsupervised Modeling of Chaotic Time Series 146
Dynamical System Modeling 147
Redundancy Reduction by General Symplectic Architectures 156
General Entropy Preserving Nonlinear Maps 156
Optimizing a Parameterized Symplectic Map 157
Density Estimation and Novelty Detection 159
Example: Theory of Early Vision 163
Theoretical Background 164
Retina Model 165
PART II: Supervised Learning
CHAPTER 7 Supervised Learning and Statistical Estimation 169
Statistical Parameter Estimation - Basic Definitions 171
Cramer-Rao Inequality for Unbiased Estimators 172
Maximum Likelihood Estimators 175
Maximum Likelihood and the Information Measure 176
Maximum A Posteriori Estimation 178
Extensions of MLE to Include Model Selection 179
Akaike's Information Theoretic Criterion (AIC) 179
Minimal Description Length and Stochastic Complexity 183
Generalization and Learning on the Same Data Set 185
CHAPTER 8 Statistical Physics Theory of Supervised Learning and Generalization 187
Statistical Mechanics Theory of Supervised Learning 188
Maximum Entropy Principle 189
Probability Inference with an Ensemble of Networks 192
Information Gain and Complexity Analysis 195
Learning with Higher Order Neural Networks 198
Partition Function Evaluation 198
Information Gain in Polynomial Networks 202
Numerical Experiments 203
Learning with General Feedforward Neural Networks 205
Partition Function Approximation 205
Numerical Experiments 207
Statistical Theory of Unsupervised and Supervised Factorial Learning 208
Statistical Theory of Unsupervised Factorial Learning 208
Duality Between Unsupervised and Maximum Likelihood Based Supervised Learning 213
CHAPTER 9 Composite Networks 219
Cooperation and Specialization in Composite Networks 220
Composite Models as Gaussian Mixtures 222
CHAPTER 10 Information Theory Based Regularizing Methods 225
Theoretical Framework 226
Network Complexity Regulation 226
Network Architecture and Learning Paradigm 227
Applications of the Mutual Information Based Penalty Term 231
Regularization in Stochastic Potts Neural Network 237
Neural Network Architecture 237
Simulations 239
References 243
Index 259
Ordering information:
US $49.95, DM 76
Dr. Gustavo Deco and Dr. Dragan Obradovic
Siemens AG
ZFE T SN 4 Corporate Research and Development
Otto-Hahn-Ring 6
D-81739 Munich
Germany
Phone: +49/89/636-49499
Fax: +49/89/636-49767
E-Mail: Dragan.Obradovic at
        Gustavo.Deco at