comp.ai.neural-nets FAQ, Part 2 of 7: Learning
Section - What is PNN?

PNN or "Probabilistic Neural Network" is Donald Specht's term for kernel
discriminant analysis. (Kernels are also called "Parzen windows".) You can
think of it as a normalized RBF network in which there is a hidden unit
centered at every training case. These RBF units are called "kernels" and
are usually probability density functions such as the Gaussian. The
hidden-to-output weights are usually 1 or 0; for each hidden unit, a weight
of 1 is used for the connection going to the output that the case belongs
to, while all other connections are given weights of 0. Alternatively, you
can adjust these weights for the prior probabilities of each class. So the
only weights that need to be learned are the widths of the RBF units. These
widths (often a single width is used) are called "smoothing parameters" or
"bandwidths" and are usually chosen by cross-validation or by more esoteric
methods that are not well-known in the neural net literature; gradient
descent is not used. 
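
To make the description above concrete, here is a minimal sketch of such a
classifier in Python with NumPy. The function name pnn_classify, the
argument names, and the use of a single Gaussian bandwidth sigma are
illustrative assumptions, not part of any standard implementation:

   import numpy as np

   def pnn_classify(X_train, y_train, X_test, sigma, priors=None):
       # One Gaussian kernel ("hidden unit") per training case; the score
       # for a class is its prior times the average kernel density of that
       # class's cases at the test point.
       classes = np.unique(y_train)
       d = X_train.shape[1]
       norm = (2.0 * np.pi * sigma ** 2) ** (-d / 2.0)  # cancels in argmax
       scores = np.empty((X_test.shape[0], classes.size))
       for j, c in enumerate(classes):
           Xc = X_train[y_train == c]
           # squared distances from every test point to every class-c case
           d2 = ((X_test[:, None, :] - Xc[None, :, :]) ** 2).sum(axis=2)
           density = norm * np.exp(-d2 / (2.0 * sigma ** 2)).mean(axis=1)
           prior = priors[j] if priors is not None else len(Xc) / len(X_train)
           scores[:, j] = prior * density
       return classes[scores.argmax(axis=1)]

If priors is omitted, the class frequencies in the training set act as the
priors, which corresponds to the simple 0/1 hidden-to-output weights above.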

Specht's claim that a PNN trains 100,000 times faster than backprop is at
best misleading. While kernel methods are not iterative in the same sense
as backprop, they require you to estimate the kernel bandwidth, and this
requires accessing the data many times. Furthermore, computing a single
output value with kernel methods requires either accessing the entire
training set or clever programming, and either way is much slower than
computing an output with a feedforward net. And there are a variety of
methods for training feedforward nets that are much faster than standard
backprop. So depending on what you are doing and how you do it, PNN may be
either faster or slower than a feedforward net. 
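
As one illustration of where the training time actually goes, here is a
rough sketch of choosing the bandwidth by leave-one-out cross-validated
classification error. It assumes the hypothetical pnn_classify function
sketched above; the candidate grid and the error criterion are arbitrary
choices for the example, not Specht's method:

   import numpy as np

   def loo_error(X, y, sigma):
       # Classify each training case with that case held out.
       wrong = 0
       for i in range(len(X)):
           keep = np.arange(len(X)) != i
           pred = pnn_classify(X[keep], y[keep], X[i:i + 1], sigma)
           wrong += int(pred[0] != y[i])
       return wrong / len(X)

   def choose_sigma(X, y, candidates):
       # Each candidate bandwidth costs another full pass over the data.
       return min(candidates, key=lambda s: loo_error(X, y, s))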

PNN is a universal approximator for smooth class-conditional densities, so
it should be able to solve any smooth classification problem given enough
data. The main drawback of PNN is that, like kernel methods in general, it
suffers badly from the curse of dimensionality. PNN cannot ignore irrelevant
inputs without major modifications to the basic algorithm. So PNN is not
likely to be the top choice if you have more than 5 or 6 nonredundant
inputs. For modified algorithms that deal with irrelevant inputs, see
Masters (1995) and Lowe (1995). 

But if all your inputs are relevant, PNN has the very useful ability to tell
you whether a test case is similar to the training data (i.e., lies in a
region of high estimated density); if not, you are extrapolating and should
view the output classification with skepticism. This ability is of limited
use when you have irrelevant inputs, since the similarity is measured with
respect to all of the inputs, not just the relevant ones. 
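
A small sketch of that check, under the same assumptions as the code above:
estimate the training-set density at each test point and flag points whose
density is lower than the densities typically seen at the training cases
themselves. The percentile threshold is an arbitrary illustrative choice:

   import numpy as np

   def novelty_flags(X_train, X_test, sigma, percentile=1.0):
       n, d = X_train.shape
       norm = (2.0 * np.pi * sigma ** 2) ** (-d / 2.0)
       # Leave-one-out Parzen densities at the training cases themselves.
       d2_tr = ((X_train[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
       np.fill_diagonal(d2_tr, np.inf)          # drop each case's own kernel
       loo = norm * np.exp(-d2_tr / (2.0 * sigma ** 2)).sum(axis=1) / (n - 1)
       threshold = np.percentile(loo, percentile)
       # Parzen density of the full training set at each test point.
       d2_te = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
       dens = norm * np.exp(-d2_te / (2.0 * sigma ** 2)).mean(axis=1)
       # True means the test point lies in a low-density region:
       # the classification there is an extrapolation.
       return dens < threshold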

References: 

   Hand, D.J. (1982) Kernel Discriminant Analysis, Research Studies Press. 

   Lowe, D.G. (1995), "Similarity metric learning for a variable-kernel
   classifier," Neural Computation, 7, 72-85, 
   http://www.cs.ubc.ca/spider/lowe/pubs.html 

   McLachlan, G.J. (1992) Discriminant Analysis and Statistical Pattern
   Recognition, Wiley. 

   Masters, T. (1993) Practical Neural Network Recipes in C++, San Diego:
   Academic Press. 

   Masters, T. (1995) Advanced Algorithms for Neural Networks: A C++
   Sourcebook, NY: John Wiley and Sons, ISBN 0-471-10588-0 

   Michie, D., Spiegelhalter, D.J. and Taylor, C.C. (1994) Machine
   Learning, Neural and Statistical Classification, Ellis Horwood; this book
   is out of print but available online at 
   http://www.amsta.leeds.ac.uk/~charles/statlog/ 

   Scott, D.W. (1992) Multivariate Density Estimation, Wiley. 

   Specht, D.F. (1990) "Probabilistic neural networks," Neural Networks, 3,
   110-118. 

Send corrections/additions to the FAQ Maintainer:
saswss@unx.sas.com (Warren Sarle)