How many kinds of NNs exist?

There are many, many kinds of NNs by now. Nobody knows exactly how many. New ones (or at least variations of old ones) are invented every week. Below is a collection of some of the most well-known methods, not claiming to be complete.

The two main kinds of learning algorithms are supervised and unsupervised.

 o In supervised learning, the correct results (target values, desired outputs) are known and are given to the NN during training so that the NN can adjust its weights to try to match its outputs to the target values. After training, the NN is tested by giving it only input values, not target values, and seeing how close it comes to outputting the correct target values.

 o In unsupervised learning, the NN is not provided with the correct results during training. Unsupervised NNs usually perform some kind of data compression, such as dimensionality reduction or clustering. See "What does unsupervised learning learn?"

The distinction between supervised and unsupervised methods is not always clear-cut. An unsupervised method can learn a summary of a probability distribution, and that summarized distribution can then be used to make predictions. Furthermore, supervised methods come in two subvarieties: auto-associative and hetero-associative. In auto-associative learning, the target values are the same as the inputs, whereas in hetero-associative learning, the targets are generally different from the inputs. Many unsupervised methods are equivalent to auto-associative supervised methods. For more details, see "What does unsupervised learning learn?"

Two major kinds of network topology are feedforward and feedback.

 o In a feedforward NN, the connections between units do not form cycles. Feedforward NNs usually produce a response to an input quickly. Most feedforward NNs can be trained using a wide variety of efficient conventional numerical methods (e.g. see "What are conjugate gradients, Levenberg-Marquardt, etc.?") in addition to algorithms invented by NN researchers.

 o In a feedback or recurrent NN, there are cycles in the connections. In some feedback NNs, each time an input is presented, the NN must iterate for a potentially long time before it produces a response. Feedback NNs are usually more difficult to train than feedforward NNs.

Some kinds of NNs (such as those with winner-take-all units) can be implemented as either feedforward or feedback networks.
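To make the supervised/feedforward terminology concrete, here is a minimal sketch of supervised learning in a feedforward network: a single linear unit trained with the Widrow-Hoff delta rule. It is an illustration only, not part of the original FAQ; it assumes Python with NumPy, and the data, learning rate, and number of epochs are invented for the example.

    import numpy as np

    # Invented training data: 100 cases, 2 input variables, 1 quantitative target.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 2))            # input values
    t = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0  # target values known during training

    w = np.zeros(2)   # connection weights (no cycles, so this is a feedforward net)
    b = 0.0           # bias weight
    lr = 0.1          # learning rate

    for epoch in range(200):
        y = X @ w + b                    # forward pass: the response in one sweep
        err = t - y                      # how far the outputs are from the targets
        w += lr * (X.T @ err) / len(X)   # delta rule: adjust weights toward the targets
        b += lr * err.mean()

    # "Testing": present inputs only and see how close the outputs come to the targets.
    print("mean absolute error:", np.abs(t - (X @ w + b)).mean())

An unsupervised method applied to the same inputs would see only X, not t, and would instead summarize the inputs themselves, for example by clustering or dimensionality reduction.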
NNs also differ in the kinds of data they accept. Two major kinds of data are categorical and quantitative.

 o Categorical variables take only a finite (technically, countable) number of possible values, and there are usually several or more cases falling into each category. Categorical variables may have symbolic values (e.g., "male" and "female", or "red", "green" and "blue") that must be encoded into numbers before being given to the network (see "How should categories be encoded?"). Both supervised learning with categorical target values and unsupervised learning with categorical outputs are called "classification."

 o Quantitative variables are numerical measurements of some attribute, such as length in meters. The measurements must be made in such a way that at least some arithmetic relations among the measurements reflect analogous relations among the attributes of the objects that are measured. For more information on measurement theory, see the Measurement Theory FAQ at ftp://ftp.sas.com/pub/neural/measurement.html. Supervised learning with quantitative target values is called "regression."

Some variables can be treated as either categorical or quantitative, such as number of children or any binary variable. Most regression algorithms can also be used for supervised classification by encoding categorical target values as 0/1 binary variables and using those binary variables as target values for the regression algorithm. The outputs of the network are then estimates of posterior probabilities when any of the most common training methods are used.
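As an illustration of turning symbolic category values into 0/1 binary variables, here is a minimal sketch, again assuming Python with NumPy; the variable and its values are invented for the example.

    import numpy as np

    # An invented categorical variable with symbolic values.
    color = ["red", "green", "blue", "green", "red"]

    # One 0/1 binary variable per category (a "1-of-C" style encoding).
    categories = sorted(set(color))              # ['blue', 'green', 'red']
    targets = np.array([[1.0 if c == cat else 0.0 for cat in categories]
                        for c in color])

    print(categories)
    print(targets)
    # Each row is one case; these 0/1 columns can serve as target values for a
    # regression-type network, as described above.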
Here are some well-known kinds of NNs:

1. Supervised
   1. Feedforward
      o Linear
        + Hebbian - Hebb (1949), Fausett (1994)
        + Perceptron - Rosenblatt (1958), Minsky and Papert (1969/1988), Fausett (1994)
        + Adaline - Widrow and Hoff (1960), Fausett (1994)
        + Higher Order - Bishop (1995)
        + Functional Link - Pao (1989)
      o MLP: Multilayer perceptron - Bishop (1995), Reed and Marks (1999), Fausett (1994)
        + Backprop - Rumelhart, Hinton, and Williams (1986)
        + Cascade Correlation - Fahlman and Lebiere (1990), Fausett (1994)
        + Quickprop - Fahlman (1989)
        + RPROP - Riedmiller and Braun (1993)
      o RBF networks - Bishop (1995), Moody and Darken (1989), Orr (1996)
        + OLS: Orthogonal Least Squares - Chen, Cowan and Grant (1991)
      o CMAC: Cerebellar Model Articulation Controller - Albus (1975), Brown and Harris (1994)
      o Classification only
        + LVQ: Learning Vector Quantization - Kohonen (1988), Fausett (1994)
        + PNN: Probabilistic Neural Network - Specht (1990), Masters (1993), Hand (1982), Fausett (1994)
      o Regression only
        + GRNN: General Regression Neural Network - Specht (1991), Nadaraya (1964), Watson (1964)
   2. Feedback - Hertz, Krogh, and Palmer (1991), Medsker and Jain (2000)
      o BAM: Bidirectional Associative Memory - Kosko (1992), Fausett (1994)
      o Boltzmann Machine - Ackley et al. (1985), Fausett (1994)
      o Recurrent time series
        + Backpropagation through time - Werbos (1990)
        + Elman - Elman (1990)
        + FIR: Finite Impulse Response - Wan (1990)
        + Jordan - Jordan (1986)
        + Real-time recurrent network - Williams and Zipser (1989)
        + Recurrent backpropagation - Pineda (1989), Fausett (1994)
        + TDNN: Time Delay NN - Lang, Waibel and Hinton (1990)
   3. Competitive
      o ARTMAP - Carpenter, Grossberg and Reynolds (1991)
      o Fuzzy ARTMAP - Carpenter, Grossberg, Markuzon, Reynolds and Rosen (1992), Kasuba (1993)
      o Gaussian ARTMAP - Williamson (1995)
      o Counterpropagation - Hecht-Nielsen (1987; 1988; 1990), Fausett (1994)
      o Neocognitron - Fukushima, Miyake, and Ito (1983), Fukushima (1988), Fausett (1994)
2. Unsupervised - Hertz, Krogh, and Palmer (1991)
   1. Competitive
      o Vector Quantization
        + Grossberg - Grossberg (1976)
        + Kohonen - Kohonen (1984)
        + Conscience - Desieno (1988)
      o Self-Organizing Map
        + Kohonen - Kohonen (1995), Fausett (1994)
        + GTM - Bishop, Svensén and Williams (1997)
        + Local Linear - Mulier and Cherkassky (1995)
      o Adaptive resonance theory
        + ART 1 - Carpenter and Grossberg (1987a), Moore (1988), Fausett (1994)
        + ART 2 - Carpenter and Grossberg (1987b), Fausett (1994)
        + ART 2-A - Carpenter, Grossberg and Rosen (1991a)
        + ART 3 - Carpenter and Grossberg (1990)
        + Fuzzy ART - Carpenter, Grossberg and Rosen (1991b)
      o DCL: Differential Competitive Learning - Kosko (1992)
   2. Dimension Reduction - Diamantaras and Kung (1996)
      o Hebbian - Hebb (1949), Fausett (1994)
      o Oja - Oja (1989)
      o Sanger - Sanger (1989)
      o Differential Hebbian - Kosko (1992)
   3. Autoassociation
      o Linear autoassociator - Anderson et al. (1977), Fausett (1994)
      o BSB: Brain State in a Box - Anderson et al. (1977), Fausett (1994)
      o Hopfield - Hopfield (1982), Fausett (1994)
3. Nonlearning
   1. Hopfield - Hertz, Krogh, and Palmer (1991)
   2. Various networks for optimization - Cichocki and Unbehauen (1993)
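Several of the networks listed above (e.g., Hopfield, BAM, the Boltzmann machine) are feedback networks, which, as noted earlier, may have to iterate before producing a response. The following minimal sketch of a small nonlearning Hopfield net shows that iterate-until-stable behavior; it assumes Python with NumPy, and the stored patterns and the corrupted probe are invented for the example.

    import numpy as np

    # Two invented +/-1 patterns to store.
    patterns = np.array([[ 1, -1,  1, -1,  1, -1],
                         [ 1,  1,  1, -1, -1, -1]])

    # Hebbian outer-product weights with no self-connections.
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0.0)

    # Present a corrupted version of the first pattern and let the feedback
    # dynamics iterate until the state stops changing (the network's response).
    state = np.array([1, -1, 1, -1, 1, 1])           # last unit flipped
    for _ in range(20):
        new_state = np.where(W @ state >= 0, 1, -1)  # one synchronous update sweep
        if np.array_equal(new_state, state):
            break
        state = new_state

    print(state)   # recovers the first stored pattern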
References:

Ackley, D.H., Hinton, G.E., and Sejnowski, T.J. (1985), "A learning algorithm for Boltzmann machines," Cognitive Science, 9, 147-169.
Albus, J.S. (1975), "New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC)," Transactions of the ASME Journal of Dynamic Systems, Measurement, and Control, September 1975, 220-227.
Anderson, J.A., and Rosenfeld, E., eds. (1988), Neurocomputing: Foundations of Research, Cambridge, MA: The MIT Press.
Anderson, J.A., Silverstein, J.W., Ritz, S.A., and Jones, R.S. (1977), "Distinctive features, categorical perception, and probability learning: Some applications of a neural model," Psychological Review, 84, 413-451. Reprinted in Anderson and Rosenfeld (1988).
Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford: Oxford University Press.
Bishop, C.M., Svensén, M., and Williams, C.K.I. (1997), "GTM: A principled alternative to the self-organizing map," in Mozer, M.C., Jordan, M.I., and Petsche, T. (eds.), Advances in Neural Information Processing Systems 9, Cambridge, MA: The MIT Press, pp. 354-360. Also see http://www.ncrg.aston.ac.uk/GTM/
Brown, M., and Harris, C. (1994), Neurofuzzy Adaptive Modelling and Control, NY: Prentice Hall.
Carpenter, G.A., Grossberg, S. (1987a), "A massively parallel architecture for a self-organizing neural pattern recognition machine," Computer Vision, Graphics, and Image Processing, 37, 54-115.
Carpenter, G.A., Grossberg, S. (1987b), "ART 2: Self-organization of stable category recognition codes for analog input patterns," Applied Optics, 26, 4919-4930.
Carpenter, G.A., Grossberg, S. (1990), "ART 3: Hierarchical search using chemical transmitters in self-organizing pattern recognition architectures," Neural Networks, 3, 129-152.
Carpenter, G.A., Grossberg, S., Markuzon, N., Reynolds, J.H., and Rosen, D.B. (1992), "Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps," IEEE Transactions on Neural Networks, 3, 698-713.
Carpenter, G.A., Grossberg, S., Reynolds, J.H. (1991), "ARTMAP: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network," Neural Networks, 4, 565-588.
Carpenter, G.A., Grossberg, S., Rosen, D.B. (1991a), "ART 2-A: An adaptive resonance algorithm for rapid category learning and recognition," Neural Networks, 4, 493-504.
Carpenter, G.A., Grossberg, S., Rosen, D.B. (1991b), "Fuzzy ART: Fast stable learning and categorization of analog patterns by an adaptive resonance system," Neural Networks, 4, 759-771.
Chen, S., Cowan, C.F.N., and Grant, P.M. (1991), "Orthogonal least squares learning for radial basis function networks," IEEE Transactions on Neural Networks, 2, 302-309.
Cichocki, A., and Unbehauen, R. (1993), Neural Networks for Optimization and Signal Processing, NY: John Wiley & Sons, ISBN 0-471-93010-5.
Desieno, D. (1988), "Adding a conscience to competitive learning," Proc. Int. Conf. on Neural Networks, I, 117-124, IEEE Press.
Diamantaras, K.I., and Kung, S.Y. (1996), Principal Component Neural Networks: Theory and Applications, NY: Wiley.
Elman, J.L. (1990), "Finding structure in time," Cognitive Science, 14, 179-211.
Fahlman, S.E. (1989), "Faster-Learning Variations on Back-Propagation: An Empirical Study," in Touretzky, D., Hinton, G., and Sejnowski, T., eds., Proceedings of the 1988 Connectionist Models Summer School, Morgan Kaufmann, 38-51.
Fahlman, S.E., and Lebiere, C. (1990), "The Cascade-Correlation Learning Architecture," in Touretzky, D.S. (ed.), Advances in Neural Information Processing Systems 2, Los Altos, CA: Morgan Kaufmann Publishers, pp. 524-532.
Fausett, L. (1994), Fundamentals of Neural Networks, Englewood Cliffs, NJ: Prentice Hall.
Fukushima, K., Miyake, S., and Ito, T. (1983), "Neocognitron: A neural network model for a mechanism of visual pattern recognition," IEEE Transactions on Systems, Man, and Cybernetics, 13, 826-834.
Fukushima, K. (1988), "Neocognitron: A hierarchical neural network capable of visual pattern recognition," Neural Networks, 1, 119-130.
Grossberg, S. (1976), "Adaptive pattern classification and universal recoding: I. Parallel development and coding of neural feature detectors," Biological Cybernetics, 23, 121-134.
Hand, D.J. (1982), Kernel Discriminant Analysis, Research Studies Press.
Hebb, D.O. (1949), The Organization of Behavior, NY: John Wiley & Sons.
Hecht-Nielsen, R. (1987), "Counterpropagation networks," Applied Optics, 26, 4979-4984.
Hecht-Nielsen, R. (1988), "Applications of counterpropagation networks," Neural Networks, 1, 131-139.
Hecht-Nielsen, R. (1990), Neurocomputing, Reading, MA: Addison-Wesley.
Hertz, J., Krogh, A., and Palmer, R. (1991), Introduction to the Theory of Neural Computation, Redwood City, CA: Addison-Wesley.
Hopfield, J.J. (1982), "Neural networks and physical systems with emergent collective computational abilities," Proceedings of the National Academy of Sciences, 79, 2554-2558. Reprinted in Anderson and Rosenfeld (1988).
Jordan, M.I. (1986), "Attractor dynamics and parallelism in a connectionist sequential machine," in Proceedings of the Eighth Annual Conference of the Cognitive Science Society, pp. 531-546, Lawrence Erlbaum.
Kasuba, T. (1993), "Simplified Fuzzy ARTMAP," AI Expert, 8, 18-25.
Kohonen, T. (1984), Self-Organization and Associative Memory, Berlin: Springer.
Kohonen, T. (1988), "Learning Vector Quantization," Neural Networks, 1 (suppl 1), 303.
Kohonen, T. (1995/1997), Self-Organizing Maps, Berlin: Springer-Verlag. First edition was 1995, second edition 1997. See http://www.cis.hut.fi/nnrc/new_book.html for information on the second edition.
Kosko, B. (1992), Neural Networks and Fuzzy Systems, Englewood Cliffs, NJ: Prentice-Hall.
Lang, K.J., Waibel, A.H., and Hinton, G. (1990), "A time-delay neural network architecture for isolated word recognition," Neural Networks, 3, 23-44.
Masters, T. (1993), Practical Neural Network Recipes in C++, San Diego: Academic Press.
Masters, T. (1995), Advanced Algorithms for Neural Networks: A C++ Sourcebook, NY: John Wiley and Sons, ISBN 0-471-10588-0.
Medsker, L.R., and Jain, L.C., eds. (2000), Recurrent Neural Networks: Design and Applications, Boca Raton, FL: CRC Press, ISBN 0-8493-7181-3.
Minsky, M.L., and Papert, S.A. (1969/1988), Perceptrons, Cambridge, MA: The MIT Press (first edition, 1969; expanded edition, 1988).
Moody, J., and Darken, C.J. (1989), "Fast learning in networks of locally-tuned processing units," Neural Computation, 1, 281-294.
(1988), "ART 1 and Pattern Clustering," in Touretzky, D., Hinton, G. and Sejnowski, T., eds., Proceedings of the 1988 Connectionist Models Summer School, 174-185, San Mateo, CA: Morgan Kaufmann. Mulier, F. and Cherkassky, V. (1995), "Self-Organization as an Iterative Kernel Smoothing Process," Neural Computation, 7, 1165-1177. Nadaraya, E.A. (1964) "On estimating regression", Theory Probab. Applic. 10, 186-90. Oja, E. (1989), "Neural networks, principal components, and subspaces," International Journal of Neural Systems, 1, 61-68. Orr, M.J.L. (1996), "Introduction to radial basis function networks," http://www.anc.ed.ac.uk/~mjo/papers/intro.ps or http://www.anc.ed.ac.uk/~mjo/papers/intro.ps.gz Pao, Y. H. (1989), Adaptive Pattern Recognition and Neural Networks, Reading, MA: Addison-Wesley Publishing Company, ISBN 0-201-12584-6. Pineda, F.J. (1989), "Recurrent back-propagation and the dynamical approach to neural computation," Neural Computation, 1, 161-172. Reed, R.D., and Marks, R.J, II (1999), Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks, Cambridge, MA: The MIT Press, ISBN 0-262-18190-8. Riedmiller, M. and Braun, H. (1993), "A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm", Proceedings of the IEEE International Conference on Neural Networks 1993, San Francisco: IEEE. Rosenblatt, F. (1958), "The perceptron: A probabilistic model for information storage and organization in the brain., Psychological Review, 65, 386-408. Rumelhart, D.E., Hinton, G.E., and Williams, R.J. (1986), "Learning internal representations by error propagation", in Rumelhart, D.E. and McClelland, J. L., eds. (1986), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1, 318-362, Cambridge, MA: The MIT Press. Sanger, T.D. (1989), "Optimal unsupervised learning in a single-layer linear feedforward neural network," Neural Networks, 2, 459-473. Specht, D.F. (1990) "Probabilistic neural networks," Neural Networks, 3, 110-118. Specht, D.F. (1991) "A Generalized Regression Neural Network", IEEE Transactions on Neural Networks, 2, Nov. 1991, 568-576. Wan, E.A. (1990), "Temporal backpropagation: An efficient algorithm for finite impulse response neural networks," in Proceedings of the 1990 Connectionist Models Summer School, Touretzky, D.S., Elman, J.L., Sejnowski, T.J., and Hinton, G.E., eds., San Mateo, CA: Morgan Kaufmann, pp. 131-140. Watson, G.S. (1964) "Smooth regression analysis", Sankhy{\=a}, Series A, 26, 359-72. Werbos, P.J. (1990), "Backpropagtion through time: What it is and how to do it," Proceedings of the IEEE, 78, 1550-1560. Widrow, B., and Hoff, M.E., Jr., (1960), "Adaptive switching circuits," IRE WESCON Convention Record. part 4, pp. 96-104. Reprinted in Anderson and Rosenfeld (1988). Williams, R.J., and Zipser, D., (1989), "A learning algorithm for continually running fully recurrent neurla networks," Neural Computation, 1, 270-280. Williamson, J.R. (1995), "Gaussian ARTMAP: A neural network for fast incremental learning of noisy multidimensional maps," Technical Report CAS/CNS-95-003, Boston University, Center of Adaptive Systems and Department of Cognitive and Neural Systems. User Contributions:Comment about this article, ask questions, or add new information about this topic:Top Document: comp.ai.neural-nets FAQ, Part 1 of 7: Introduction Previous Document: Who is concerned with NNs? Next Document: How many kinds of Kohonen networks exist? 
PDP++ is a neural-network simulation system written in C++, developed as an advanced version of the original PDP software from McClelland and Rumelhart's "Explorations in Parallel Distributed Processing Handbook" (1987). The software is designed for both novice users and researchers, providing flexibility and power in cognitive neuroscience studies. Featured in Randall C. O'Reilly and Yuko Munakata's "Computational Explorations in Cognitive Neuroscience" (2000), PDP++ supports a wide range of algorithms. These include feedforward and recurrent error backpropagation, with continuous and real-time models such as Almeida-Pineda. It also incorporates constraint satisfaction algorithms like Boltzmann Machines, Hopfield networks, and mean-field networks, as well as self-organizing learning algorithms, including Self-organizing Maps (SOM) and Hebbian learning. Additionally, it supports mixtures-of-experts models and the Leabra algorithm, which combines error-driven and Hebbian learning with k-Winners-Take-All inhibitory competition. PDP++ is a comprehensive tool for exploring neural network models in cognitive neuroscience.