NEURAL NETWORKS FOR STATISTICAL PATTERN RECOGNITION

ICT International Doctoral School, Trento (Jan 28th - Feb 1st, 2019)



Aims and scope

The course reviews the main facets of parametric and nonparametric statistical pattern recognition, as well as the fundamentals of supervised and unsupervised learning in neural networks (whether shallow or deep). Proper probabilistic interpretations of neural nets are given. Traditional and leading-edge algorithms for the neural estimation of posterior probabilities, scaled likelihoods, and probability density functions are presented, fitting the optimality criteria (i.e., maximum a posteriori or maximum likelihood) that underlie robust statistical pattern recognition. Examples, results of simulations, and real-life applications are discussed.


Handbooks and bibliography

1. R. O. Duda, P. E. Hart & D. G. Stork, "Pattern Classification" (2nd ed.). Wiley, 2001.

2. C. M. Bishop, "Neural Networks for Pattern Recognition". Oxford University Press, 1995.

3. I. Goodfellow, Y. Bengio & A. Courville, "Deep Learning". MIT Press, 2016.

4. E. Trentin, L. Lusnig & F. Cavalli, "Parzen Neural Networks: Fundamentals, Properties, and an Application to Forensic Anthropology". Neural Networks 97: 137-151, 2018.

5. E. Trentin, "Soft-Constrained Neural Networks for Nonparametric Density Estimation". Neural Processing Letters 48(2): 915-932, 2018.

6. E. Trentin, "Maximum-Likelihood Estimation of Neural Mixture Densities: Model, Algorithm, and Preliminary Experimental Evaluation". Proc. of ANNPR 2018: 178-189, Springer, 2018.

7. E. Trentin, S. Scherer & F. Schwenker, "Emotion Recognition from Speech Signals via a Probabilistic Echo-State Network". Pattern Recognition Letters 66: 4-12, 2015.

8. M. Bongini, L. Rigutini & E. Trentin, "Recursive Neural Networks for Density Estimation Over Generalized Random Graphs". IEEE Trans. Neural Netw. Learning Syst. 29(11): 5441-5458, 2018.


Lecture Hours

Mon to Fri, 2pm-5.30pm (lectures are held in the meeting room, ground floor, FBK northern building).


Useful links and stuff

Note: you can download the slides of the course by clicking on the corresponding items in the PROGRAM below.

1) UCI Machine Learning Repository of benchmark datasets for classification and regression problems, including documentation and bibliography.

2) Original neural network software simulator for Linux (NeuroSimulator.tgz). Save the file in a directory, enter the directory, and type "tar -xvzf NeuroSimulator.tgz".

3) Web page with a list of neural simulators for different operating systems (commercial/public-domain).


PROGRAM (TENTATIVE)

1. Review of Bayes decision theory. Pattern classification. Feature extraction. Bayes theorem, optimality of Bayes decision rule, rewriting of the discriminant functions into equivalent forms, case studies.
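
As a toy illustration of the Bayes/MAP decision rule reviewed in this item (a self-study sketch, not course material): two classes with Gaussian class-conditional densities, where all priors and parameters below are made-up values.

    import numpy as np

    def gaussian_pdf(x, mu, sigma):
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    # Two hypothetical classes with assumed priors and class-conditional densities
    priors = {"c1": 0.7, "c2": 0.3}
    params = {"c1": (0.0, 1.0), "c2": (2.5, 0.8)}   # (mean, std), illustrative values

    def map_decision(x):
        # Bayes/MAP rule: pick the class maximizing p(x | c) * P(c);
        # the evidence p(x) is a common factor and can be dropped.
        scores = {c: gaussian_pdf(x, *params[c]) * priors[c] for c in priors}
        return max(scores, key=scores.get)

    print(map_decision(0.2))  # -> 'c1'
    print(map_decision(2.4))  # -> 'c2'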

2. Review of artificial neural networks (ANN). Definitions, MLPs and deep architectures, supervised learning, mixtures of experts, autoencoders, application to pattern recognition, universality, estimation of class-posterior probabilities, estimation of scaled-likelihoods, radial basis functions.
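
A minimal sketch of the posterior-estimation property touched on in this item: an MLP with softmax outputs emits a valid probability vector over classes and, when trained with cross-entropy on 1-of-K targets, its outputs approximate the class posteriors P(c | x). The weights below are untrained random placeholders.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative layer sizes: 2 inputs, 8 hidden units, 3 classes
    W1, b1 = rng.normal(size=(8, 2)), np.zeros(8)
    W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)

    def softmax(z):
        z = z - z.max()            # subtract max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    def mlp_posteriors(x):
        h = np.tanh(W1 @ x + b1)   # hidden layer
        return softmax(W2 @ h + b2)

    p = mlp_posteriors(np.array([0.5, -1.0]))
    print(p, p.sum())              # a valid probability vector (sums to 1)

Dividing each estimated posterior by the corresponding class prior then yields the scaled likelihoods mentioned above.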

3. Parametric estimation techniques. Equivalence between supervised and unsupervised setups. Maximum likelihood (ML) approach. Mixture densities and GMMs. From GMMs to k-means clustering to competitive neural nets (a worked EM sketch follows this item).
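
A textbook EM sketch for ML estimation of a two-component univariate GMM (generic numpy code, not the course software; data and initializations are synthetic):

    import numpy as np

    rng = np.random.default_rng(1)
    # Synthetic 1-D data from two Gaussians (ground truth unknown to the learner)
    x = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(1.5, 1.0, 300)])

    # Initial guesses for a 2-component GMM
    w = np.array([0.5, 0.5])          # mixing weights
    mu = np.array([-1.0, 1.0])
    sigma = np.array([1.0, 1.0])

    def gauss(x, m, s):
        return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

    for _ in range(50):
        # E-step: responsibilities gamma[k, i] = P(component k | x_i)
        joint = w[:, None] * gauss(x[None, :], mu[:, None], sigma[:, None])
        gamma = joint / joint.sum(axis=0)
        # M-step: re-estimate parameters by responsibility-weighted ML
        nk = gamma.sum(axis=1)
        w = nk / len(x)
        mu = (gamma * x).sum(axis=1) / nk
        sigma = np.sqrt((gamma * (x - mu[:, None]) ** 2).sum(axis=1) / nk)

    print(w, mu, sigma)  # should approach the generating parameters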

4. Nonparametric estimation. General framework. Parzen window, k_n-nearest neighbor. Examples. Pros and cons. Nearest-neighbor and k-nearest-neighbor classifiers.
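
A minimal Parzen-window sketch with a Gaussian kernel, to make the framework concrete (sample and bandwidths below are illustrative):

    import numpy as np

    def parzen_pdf(x, data, h):
        """Parzen-window pdf estimate at x, with a Gaussian kernel of bandwidth h."""
        z = (x - data) / h
        kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
        return kernels.mean() / h

    rng = np.random.default_rng(2)
    sample = rng.normal(0.0, 1.0, 500)
    for h in (0.1, 0.3, 1.0):
        print(h, parzen_pdf(0.0, sample, h))  # true N(0,1) pdf at 0 is about 0.399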

5. Density estimation via Parzen neural networks (PNN). Training algorithm. Practical matters, model selection via cross-validated likelihood, application to pattern classification. Overview of the theoretical properties of PNNs (complexity, modeling capabilities, asymptotic convergence in probability). Graphical demos. Application to sex determination from human crania.
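
The PNN training algorithm itself is specified in reference 4; the sketch below only illustrates the generic model-selection criterion named in this item, using a Parzen bandwidth as a stand-in for the model being selected by cross-validated likelihood on held-out data:

    import numpy as np

    def parzen_pdf(x, data, h):
        z = (x - data) / h
        return (np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)).mean() / h

    rng = np.random.default_rng(3)
    data = rng.normal(0.0, 1.0, 400)
    train, valid = data[:300], data[300:]

    # Pick the candidate maximizing the held-out average log-likelihood
    candidates = (0.05, 0.1, 0.2, 0.4, 0.8)
    scores = {h: np.mean([np.log(parzen_pdf(v, train, h)) for v in valid])
              for h in candidates}
    best = max(scores, key=scores.get)
    print(best, scores[best])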

6. Nonparametric pdf estimates via soft-constrained ANNs that satisfy Kolmogorov's axioms of probability. Markov chain Monte Carlo computation of the numeric integral of the ANN output; a technique for sampling from the ANN. Results of simulations.
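
The published approach (reference 5) relies on MCMC; as a simplified stand-in, the sketch below normalizes a nonnegative network output by a plain Monte Carlo estimate of its integral over a bounded support. The "network" here is a hypothetical closed-form surrogate, not a trained ANN.

    import numpy as np

    rng = np.random.default_rng(4)

    def net(x):
        # Hypothetical stand-in for a trained ANN with nonnegative output
        return np.exp(-np.abs(x)) * (1.0 + 0.3 * np.sin(3 * x))

    # Plain Monte Carlo estimate of the integral of net(x) over [a, b]
    a, b = -10.0, 10.0
    u = rng.uniform(a, b, 100_000)
    integral = (b - a) * net(u).mean()

    def pdf(x):
        # Normalized output: nonnegative and integrating to (approximately) 1
        return net(x) / integral

    print(integral, pdf(0.0))
    # (sampling from pdf could then proceed, e.g., by rejection sampling)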

7. Neural Mixture Models (NMM) for the estimation of mixture densities. The relevance of mixture densities, difference between mixture densities and mixture density models, ML soft-constrained training of the NMM. Results of simulations.
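
For reference, the standard mixture-density formalism behind this item, in textbook notation; in an NMM each component density p_k is realized by a neural network:

    p(x \mid \theta) = \sum_{k=1}^{K} \pi_k \, p_k(x \mid \theta_k),
    \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,

    \log L(\theta) = \sum_{i=1}^{n} \log p(x_i \mid \theta).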

8. Sequence processing: hard-constrained RBF-based ML density estimation over sequences of patterns encoded via the Echo State Network. Application to emotion recognition from speech signals.
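
A generic echo-state reservoir sketch (not the probabilistic ESN of reference 7): a fixed random recurrent reservoir, rescaled to spectral radius below 1 for the echo-state property, maps a variable-length sequence to a fixed-size state that a downstream density estimator can consume. All sizes and scalings below are illustrative.

    import numpy as np

    rng = np.random.default_rng(5)

    # Illustrative sizes: 1-D input, 100-unit reservoir
    n_in, n_res = 1, 100
    W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    W = rng.normal(size=(n_res, n_res))
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius < 1

    def encode(sequence):
        """Map a variable-length sequence to the final reservoir state."""
        x = np.zeros(n_res)
        for u in sequence:
            x = np.tanh(W_in @ np.atleast_1d(u) + W @ x)
        return x

    state = encode(np.sin(np.linspace(0, 6, 50)))
    print(state.shape)  # fixed-size encoding of the whole sequence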

9. Graph processing: hard-constrained RBF-based ML density estimation over graphs (i.e., structured patterns, or relations) encoded via recursive/graph neural networks. Applications to density estimation over graphs, graph clustering, and graph classification.
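
A generic neighborhood-aggregation sketch in the spirit of recursive/graph neural networks (not the model of reference 8): each step updates a node code from its own code and the sum of its neighbors' codes; averaging the node codes then yields a fixed-size encoding of the whole graph. The graph, features, and weights below are made up.

    import numpy as np

    rng = np.random.default_rng(6)

    # Toy graph: adjacency list and 2-D node features (illustrative values)
    adj = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
    feat = {v: rng.normal(size=2) for v in adj}

    W_self = rng.normal(size=(2, 2))
    W_nbr = rng.normal(size=(2, 2))

    def message_passing_step(h):
        """One generic neighborhood-aggregation step (GNN-style)."""
        return {v: np.tanh(W_self @ h[v] + W_nbr @ sum(h[u] for u in adj[v]))
                for v in adj}

    h = message_passing_step(message_passing_step(feat))   # two steps
    graph_code = np.mean([h[v] for v in adj], axis=0)      # fixed-size graph encoding
    print(graph_code.shape)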


