



An introduction to neural networks, exploring their relationship to biological neurons and the role of computational neuroscience. It covers the basics of neural networks, including nodes, activation functions, and types of nodes. The document also discusses the use of neural networks as function approximators and their application to classification tasks. Students will gain a solid foundation in the principles of neural networks and their relevance to artificial intelligence programming.
Neural Networks
Chris Brooks
Department of Computer Science, University of San Francisco
Much of what we've studied so far can be classified as symbolic AI: a focus on symbols and the relations between them (search, logic, decision trees). The underlying assumption is that the manipulation of symbols is the key requirement for intelligent behavior. Neural networks focus on subsymbolic behavior: intelligent behavior emerges from the interaction of simple components.
[Figure: a biological neuron, showing the cell body (soma), nucleus, dendrites, axon, axonal arborization, and synapses to other cells.]
In biological neurons, signals are received by the dendrites and propagated to other neurons via the axon. Signaling and firing are very complex. Thought and behavior are produced through the interaction of thousands of neurons.
Computational neural networks are related to biological neural networks primarily by analogy. Computational neuroscience studies the modeling of biologically plausible neurons; AI researchers are often more interested in developing effective algorithms. As with genetic algorithms, we draw upon ideas that are successful in nature and take the parts that are useful.
Neural networks are composed of nodes. These nodes are connected by links, an abstraction of axons. Each link has an associated weight that indicates the strength of the signal. Each node has a nonlinear activation function, which governs the node's output as a function of the weighted sum of its inputs.
Neural networks are a good fit for problems with these characteristics: many attribute-value pairs; real-valued inputs; a real or discrete target value; noisy or error-containing data; long training time is acceptable; fast evaluation of test cases is needed; and the ability of humans to understand the learned hypothesis is not important.
[Figure: a single unit. Input links feed an input function that computes the weighted sum, an activation function produces the unit's output, and output links carry it forward.]
Each unit computes the weighted sum of its inputs, $in_i = \sum_j W_{j,i} a_j$, and outputs $a_i = g(in_i)$. A bias unit with fixed activation $a_0 = -1$ and bias weight $W_{0,i}$ is used to control the threshold value: how strong the weighted input signal must be for the node to fire.
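To make the unit computation concrete, here is a minimal Python sketch (not from the original slides; the name unit_output and the example weights are our own):

def unit_output(weights, inputs, g):
    # Prepend the bias unit's fixed activation a_0 = -1;
    # weights[0] is then the bias weight W_0i.
    activations = [-1.0] + list(inputs)
    in_i = sum(w * a for w, a in zip(weights, activations))  # in_i = sum_j W_ji * a_j
    return g(in_i)  # a_i = g(in_i)

# A unit with bias weight 0.5 and two input weights of 1.0,
# using a step (threshold) activation:
step = lambda x: 1 if x > 0 else 0
print(unit_output([0.5, 1.0, 1.0], [1, 0], step))  # 1*1 + 0*1 - 0.5 > 0, so it fires: 1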
Any nonlinear function can be used in principle. The two most common functions are: the step function (threshold function), which outputs 1 if the input is positive and 0 otherwise; and the sigmoid/logistic function, $g(x) = \frac{1}{1 + e^{-x}}$, which is continuously differentiable, with rapid change near the threshold and gradual change at the extremes.
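As an illustrative sketch (the names step and sigmoid are our own), the two functions side by side:

import math

def step(x):
    # Threshold function: outputs 1 if the input is positive, 0 otherwise.
    return 1 if x > 0 else 0

def sigmoid(x):
    # Logistic function 1 / (1 + e^(-x)): continuously differentiable.
    return 1.0 / (1.0 + math.exp(-x))

# The sigmoid changes rapidly near the threshold and saturates at the extremes.
for x in (-6, -1, 0, 1, 6):
    print(x, step(x), round(sigmoid(x), 3))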
Neural nets can easily be built to perform standard logical operations using the threshold activation function; change the threshold depending on the function needed. With bias weight $W_0$ and input weights $W_1, W_2$: AND uses $W_0 = 1.5$, $W_1 = 1$, $W_2 = 1$; OR uses $W_0 = 0.5$, $W_1 = 1$, $W_2 = 1$; and NOT uses $W_0 = -0.5$, $W_1 = -1$.
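Assuming the weight values given above, here is a quick sanity check (our own sketch, not from the slides) that threshold units compute these operations:

def threshold_unit(w0, ws, inputs):
    # Bias unit a_0 = -1, so the unit fires when the weighted input exceeds w0.
    total = -w0 + sum(w * x for w, x in zip(ws, inputs))
    return 1 if total > 0 else 0

AND = lambda a, b: threshold_unit(1.5, [1, 1], [a, b])
OR  = lambda a, b: threshold_unit(0.5, [1, 1], [a, b])
NOT = lambda a: threshold_unit(-0.5, [-1], [a])

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))
print("NOT 0:", NOT(0), "NOT 1:", NOT(1))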
We can distinguish between three types of nodes: input nodes, output nodes, and hidden nodes. We can also distinguish between types of networks. In feed-forward networks, signals flow in one direction, with no cycles; recurrent networks have cycles in signal propagation. We'll focus primarily on feed-forward networks.
Feed-forward NNs fall into a family of algorithms called nonlinear function approximators. The output of a NN is a function of its inputs, and the nonlinear activation function allows the representation of complex functions. By adjusting the weights, we change the function being represented. NNs are often used to efficiently approximate complex functions from data.
NNs also perform classification very well: they map inputs into one or more outputs, and the output range is split into discrete "classes". This is very useful for learning tasks where "what to look for" is not known: face recognition, handwriting recognition, driving a car.
What about cases where we can't learn the function exactly, because the function is not linearly separable? In this case, we want to perform as well as possible. We'll interpret this to mean minimizing the sum of squared error over the training set: $E = \frac{1}{2} \sum_{d \in D} (t_d - o_d)^2$, where $t_d$ is the target output and $o_d$ the actual output for each example $d$ in the training set $D$.
We can visualize this as a search through a space of weights. Defining E in this way gives a parabolic error surface with a single global minimum (for linear units). By following the gradient in this space, we find the combination of weights that minimizes error. Use the unthresholded output: what is the real number computed by the weighted sum of the inputs?
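A small sketch of the error computation for a linear unit (the data and the name squared_error are hypothetical):

def squared_error(w, data):
    # E = 1/2 * sum over d of (t_d - o_d)^2, using the unthresholded output o_d.
    E = 0.0
    for x, t in data:
        o = sum(wi * xi for wi, xi in zip(w, x))
        E += 0.5 * (t - o) ** 2
    return E

data = [([1, 0], 2), ([0, 1], -1)]
print(squared_error([0.0, 0.0], data))  # 0.5 * (4 + 1) = 2.5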
Gradient descent requires us to follow the steepest slope down the error surface. We consider the derivative of E with respect to each weight. After the derivation, we find that the update rule (called the delta rule) is: $\Delta w_i = \alpha \sum_{d \in D} (t_d - o_d) x_{id}$, where $D$ is the training set, $t_d$ is the expected output, and $o_d$ is the actual output.
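Here is a sketch of the batch delta rule for a single linear unit (our own illustration; the training data and names are hypothetical, and the bias term is omitted for brevity):

def delta_rule_batch(data, alpha=0.05, epochs=100):
    n = len(data[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        # Accumulate delta_w_i = alpha * sum_d (t_d - o_d) * x_id over the whole set.
        delta = [0.0] * n
        for x, t in data:
            o = sum(wi * xi for wi, xi in zip(w, x))  # unthresholded output
            for i in range(n):
                delta[i] += alpha * (t - o) * x[i]
        w = [wi + di for wi, di in zip(w, delta)]
    return w

# Learn o = 2*x1 - x2 from a few consistent examples.
data = [([1, 0], 2), ([0, 1], -1), ([1, 1], 1), ([2, 1], 3)]
print(delta_rule_batch(data))  # approaches [2, -1]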
Often it is not practical to compute a global weight change for the entire training set. Instead, we want to update the weights incrementally: observe one piece of data, then update. Our update rule is then: $w_i = w_i + \alpha (t - o) x_i$. This is like the perceptron learning rule, except that the unthresholded output is used. A smaller step size ($\alpha$) is typically used, and there are no theoretical guarantees of convergence.
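The incremental version in the same style (again a hypothetical sketch); note the smaller step size:

def delta_rule_incremental(data, alpha=0.01, epochs=200):
    n = len(data[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        for x, t in data:
            o = sum(wi * xi for wi, xi in zip(w, x))  # unthresholded output
            # w_i = w_i + alpha * (t - o) * x_i, applied after each example.
            w = [wi + alpha * (t - o) * xi for wi, xi in zip(w, x)]
    return w

data = [([1, 0], 2), ([0, 1], -1), ([1, 1], 1), ([2, 1], 3)]
print(delta_rule_incremental(data))  # also approaches [2, -1]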
While perceptrons have the advantage of a simple learning algorithm, their computational limitations are a problem. What if we add another "hidden" layer? Computational power increases: with one hidden layer, a network can represent any continuous function; with two hidden layers, any function. Problem: how do we find the correct weights for the hidden nodes?
[Figure: a three-layer feed-forward network. Input units with activations $a_k$ feed hidden units $a_j$ through weights $W_{k,j}$; hidden units feed output units $a_i$ through weights $W_{j,i}$.]
Backpropagation is an extension of the perceptron learning algorithm to deal with multiple layers of nodes. Nodes use the sigmoid activation function: $g(in_i) = \frac{1}{1 + e^{-in_i}}$, with derivative $g'(in_i) = g(in_i)(1 - g(in_i))$. We will still "follow the gradient", where $g'$ gives us the gradient.
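The identity $g'(x) = g(x)(1 - g(x))$ is easy to check numerically (our own sketch):

import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def g_prime(x):
    # Analytic form of the derivative: g'(x) = g(x) * (1 - g(x)).
    return g(x) * (1.0 - g(x))

# Compare against a central finite-difference approximation.
h = 1e-6
for x in (-2.0, 0.0, 2.0):
    numeric = (g(x + h) - g(x - h)) / (2 * h)
    print(x, round(g_prime(x), 6), round(numeric, 6))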
Notation: $a_k$ denotes input activations, $a_j$ hidden activations, and $a_i$ output activations; $W_{k,j}$ and $W_{j,i}$ are the corresponding weights, and $\Delta_i$ is the error term for node $i$.
Updating the input-hidden weights. Idea: each hidden node is responsible for a fraction of the error $\Delta_i$. Divide $\Delta_i$ between the hidden nodes according to the strength of the connection between the hidden and output node: $\Delta_j = g(in_j)(1 - g(in_j)) \sum_i W_{j,i} \Delta_i$. The update rule for the input-hidden weights is then: $W_{k,j} = W_{k,j} + \alpha \cdot input_k \cdot \Delta_j$.
The whole algorithm can be summed up as:
While not done:
  For each example d in the training set:
    Apply the inputs of d and propagate forward.
    For each node i in the output layer:
      $\Delta_i = output_i(1 - output_i)(t_{exp} - output_i)$
    For each hidden node j:
      $\Delta_j = output_j(1 - output_j)\sum_i W_{j,i}\Delta_i$
    Adjust each weight:
      $W_{j,i} = W_{j,i} + \alpha \cdot \Delta_i \cdot input_j$
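Putting the pieces together, here is a compact sketch of the algorithm for one hidden layer and a single output unit (our own illustration; the network shape, names, and XOR data are assumptions, not from the slides):

import math, random

def g(x):
    return 1.0 / (1.0 + math.exp(-x))  # sigmoid activation

def forward(W_kj, W_ji, x):
    a_k = [-1.0] + list(x)  # input activations, with bias unit a_0 = -1
    a_j = [-1.0] + [g(sum(w * a for w, a in zip(wj, a_k))) for wj in W_kj]  # hidden
    a_i = g(sum(w * a for w, a in zip(W_ji, a_j)))  # single output
    return a_k, a_j, a_i

def train_backprop(data, n_hidden=3, alpha=0.5, epochs=5000, seed=1):
    random.seed(seed)
    n_in = len(data[0][0])
    # W_kj[j]: weights from the inputs (plus bias) into hidden node j;
    # W_ji: weights from the hidden nodes (plus bias) into the output node.
    W_kj = [[random.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_hidden)]
    W_ji = [random.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, t in data:
            a_k, a_j, a_i = forward(W_kj, W_ji, x)
            # Output delta: Delta_i = output * (1 - output) * (t - output).
            delta_i = a_i * (1 - a_i) * (t - a_i)
            # Hidden deltas: Delta_j = a_j * (1 - a_j) * W_ji * Delta_i.
            delta_j = [a_j[j + 1] * (1 - a_j[j + 1]) * W_ji[j + 1] * delta_i
                       for j in range(n_hidden)]
            # Each weight: W = W + alpha * Delta(downstream) * input(upstream).
            W_ji = [w + alpha * delta_i * a for w, a in zip(W_ji, a_j)]
            for j in range(n_hidden):
                W_kj[j] = [w + alpha * delta_j[j] * a
                           for w, a in zip(W_kj[j], a_k)]
    return W_kj, W_ji

# Learn XOR, which no single perceptron can represent.
# (With an unlucky seed, training can stall in a local minimum.)
xor = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
W_kj, W_ji = train_backprop(xor)
for x, t in xor:
    print(x, t, round(forward(W_kj, W_ji, x)[2], 2))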
When do we stop training? Options include a fixed number of iterations, total error below a set threshold, or convergence (no further change in the weights).
An advantage of neural networks is that they don't require an explicit description of the relationships between features. They don't really learn explanations, which makes them nicely suited to problems that don't have a symbolic answer. They're very good at finding patterns in input data.