We investigate the complexity of neural learning problems in the form of
three basic questions---representation, modification, and construction---related
to algorithms that learn from examples in neural networks.
The first part is concerned with representations of example sets by
weight vectors. We establish bounds on the range of integer weights of
McCulloch-Pitts neurons computing linearly separable Boolean functions
in the case that these functions are incompletely specified. To this end, we
introduce the concept of weight complexity as a measure of how difficult
it is to represent linear dichotomies of sets of binary words by
McCulloch-Pitts neurons. We give upper and lower bounds for the weight
complexity of linear dichotomies in terms of their cardinality. For completely
specified functions, the established upper bound improves on the previously
known bound by a factor of 2. Concerning lower
bounds we develop methods to construct linear dichotomies of linear cardinality
having exponential weight complexity. Furthermore, we investigate weight
complexity in the average case and generalize the previously known lower
bound for completely specified functions to linear dichotomies. Finally,
we consider linear dichotomies consisting of binary words with bounded
Hamming weight and show that their weight complexity can be exponential
even if their Hamming weight is at most 3. For Hamming weight at most 2,
we prove that exponential weight complexity is equivalent to requiring
an exponential threshold.
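For orientation, the basic notions can be sketched as follows (precise definitions are given in Chapter 2; the particular form of the weight complexity measure shown here is an illustrative assumption, not necessarily the one used in the thesis). A McCulloch-Pitts neuron with integer weights w_1, ..., w_n and threshold t computes the Boolean function

    f(x_1, ..., x_n) = 1  if  w_1 x_1 + ... + w_n x_n >= t,   and 0 otherwise.

A dichotomy (P, N) of binary words of length n is linear if some such neuron outputs 1 on every word in P and 0 on every word in N; its weight complexity then captures the size of the integer weights needed to achieve this, for instance the minimum over all separating pairs (w, t) of the largest |w_i|.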
The second part considers learning as the modification of representations.
We define identification criteria, weaker than exact identification, for simple
learning rules in McCulloch-Pitts neurons. We show that the Perceptron
learning rule cannot satisfy any of these criteria in a polynomial number
of learning steps. For other known learning rules based on the error correction
principle we show similar results and survey further issues already treated
in the literature.
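For reference, the Perceptron learning rule mentioned above is an error-correction rule: whenever the current weights misclassify a presented example, the weights and the threshold are shifted toward classifying that example correctly. The following minimal Python sketch is illustrative only (the names, the update convention, and the termination bound are assumptions, not taken from the thesis):

    def perceptron_rule(examples, n, max_passes=10000):
        """Error-correction learning for a McCulloch-Pitts neuron.

        examples: list of (x, label) with x a 0/1 tuple of length n
                  and label in {0, 1}.
        Returns integer weights w and threshold t consistent with the
        examples, or None if max_passes sweeps were not enough.
        """
        w = [0] * n
        t = 0
        for _ in range(max_passes):
            updated = False
            for x, label in examples:
                out = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= t else 0
                if out != label:  # error: correct toward the example
                    sign = 1 if label == 1 else -1
                    w = [wi + sign * xi for wi, xi in zip(w, x)]
                    t -= sign  # raising the output means lowering the threshold
                    updated = True
            if not updated:  # a full error-free sweep: weights are consistent
                return w, t
        return None

The negative results of Chapter 3 concern the number of such correction steps: even identification criteria weaker than exact identification cannot be met within a polynomial number of steps.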
The last part is devoted to the general problem of constructing representations
efficiently. We focus on the task of mapping example sets to weights in
neural architectures. The consistency problem and the approximation problem
are variants of this task asking for the mere existence of appropriate
weights. In the thesis, we propose two parameters, coincidence and heaviness,
to characterize example sets. Using both to restrict the set of instances,
we create from each problem an infinite class of problems whose complexity
may vary with the choice of these parameters. For architectures whose
consistency and approximation problems are known to be NP-complete, we show
that NP-completeness already occurs when coincidence and heaviness are constant. We determine
some of these constants optimally, thereby proving interesting dichotomy
theorems. Finally, for two-layered architectures with more than two hidden
nodes, we show that NP-completeness of the consistency problem holds for a
larger class of network functions than previously known.
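The consistency problem asks, for a fixed architecture, whether weights exist under which the network classifies a given example set without error. As a hedged illustration for the simplest setting, a single McCulloch-Pitts neuron whose integer weights and threshold are bounded by a constant B, the following brute-force Python sketch decides this question (illustrative only and exponential in n; the thesis is concerned with when no efficient algorithm can exist):

    from itertools import product

    def consistent_neuron_exists(examples, n, B=1):
        """Decide, by exhaustive search, whether some McCulloch-Pitts neuron
        with weights and threshold in {-B, ..., B} classifies all examples
        correctly. examples: list of (x, label), x a 0/1 tuple of length n."""
        for w_and_t in product(range(-B, B + 1), repeat=n + 1):
            w, t = w_and_t[:n], w_and_t[n]
            if all((1 if sum(wi * xi for wi, xi in zip(w, x)) >= t else 0) == label
                   for x, label in examples):
                return True
        return False

For a single neuron with unbounded weights, consistency reduces to linear programming and is decidable in polynomial time; the hardness results of Chapter 4 concern settings such as bounded weights and layered architectures (Sections 4.2 to 4.4).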
Some NP-completeness results obtained in the thesis are based on two
new variants of the satisfiability problem. We give their NP-completeness
proofs in the Appendix. They may also be interesting in their own right
and helpful in the search for further NP-complete problems.
Table of Contents
1 Introduction
1.1 Notation
1.2 McCulloch-Pitts Neurons and Linearly Separable Functions
1.3 Learning from Examples
1.4 Computational Complexity
2 Linearly Separable Sets and Their Weight Complexity
2.1 Properties of Linearly Separable Sets
2.2 Weight Complexity
2.3 Upper Bounds
2.4 Lower Bounds
2.5 Average Case Weight Complexity
2.6 Weight Complexity of Light Sets
2.7 Remarks and Open Questions
3 Simple Learning Rules for McCulloch-Pitts Neurons
3.1 Learning from Errors: The Perceptron Learning Rule
3.2 Exact Identification
3.3 pac-Identification
3.4 Identification of Variable Order
3.5 Identification of Signs
3.6 Other Simple Learning Rules
3.7 Remarks and Open Questions
4 Consistency Problems
4.1 Neural Variants
4.2 Neurons with Bounded Weights
4.3 Two-layered Architectures
4.4 More than Two Hidden Nodes
4.5 Cascade Architectures
4.6 An Approximation Problem
4.7 Remarks and Open Questions
A Two NP-complete Satisfiability Problems
A.1 The 1-Disjoint Positive 1-IN-3SAT Problem
A.2 The 1-Disjoint 3SET-SPLITTING Problem
Summary
References
Author Index
Acknowledgements