Neural Networks: Understanding the Connection between Biology and Computer Science (Study notes, Computer Science)

An introduction to neural networks, exploring their relationship to biological neurons and the role of computational neuroscience. It covers the basics of neural networks, including nodes, activation functions, and types of nodes. The document also discusses the use of neural networks as function approximators and their application to classification tasks. Students will gain a solid foundation in the principles of neural networks and their relevance to artificial intelligence programming.

Artificial Intelligence Programming
Neural Networks
Chris Brooks
Department of Computer Science
University of San Francisco
19-0: Neural networks
Much of what we've studied so far can be classified as symbolic AI.
Focus on symbols and relations between them: search, logic, decision trees.
The underlying assumption is that manipulation of symbols is the key requirement for intelligent behavior.
Neural networks focus on subsymbolic behavior.
Intelligent behavior emerges from the interaction of simple components.
19-1: Biology vs Computer Science
[Figure: a biological neuron, showing the cell body (soma), nucleus, dendrites, axon, axonal arborization, and synapses with axons from other cells.]
In biological neurons, signals are received by dendrites and propagated to other neurons via the axon.
Signaling and firing is very complex.
Thought and behavior are produced through the interaction of thousands of neurons.
19-2: Biology vs Computer Science
Computational neural networks are related to biological neural networks primarily by analogy.
Computational neuroscience studies the modeling of biologically plausible neurons.
AI researchers are often more interested in developing effective algorithms.
As with genetic algorithms (GAs), we draw upon ideas that are successful in nature and take the parts that are useful.
19-3: Computational Neural Networks
Neural networks are composed of nodes.
These nodes are connected by links, an abstraction of axons.
Each link has an associated weight that indicates the strength of the signal.
Each node has a nonlinear activation function, which governs the node's output as a function of the weighted sum of its inputs.
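As a minimal sketch (the names are illustrative, not from the slides), a single node is a weighted sum of its inputs passed through an activation function g:

    def node_output(weights, inputs, g):
        # Weighted sum of the inputs, passed through the nonlinear activation g.
        in_i = sum(w * a for w, a in zip(weights, inputs))
        return g(in_i)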
19-4: Appropriate tasks for neural learning
Many attribute-value pairs.
Real-valued inputs.
Real or discrete target value.
Noisy or error-containing data.
Long training time OK.
Fast evaluation of test cases needed.
Ability of humans to understand the learned hypothesis is not important.

19-5: Computational Neural Networks
[Figure: a single unit, with input links, an input function, an activation function, and output links. The input function computes $in_i = \sum_j W_{j,i} a_j$; the output is $a_i = g(in_i)$. A bias weight $W_{0,i}$ is attached to a fixed input $a_0 = -1$.]
A bias unit is used to control the threshold value: how strong the weighted input signal must be for the node to fire.
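A small sketch of how the bias can be folded in as an ordinary weight (the step activation and the names are illustrative):

    def node_output_with_bias(weights, inputs):
        # weights[0] is the bias weight W_{0,i}, paired with the fixed input a_0 = -1.
        activations = [-1.0] + list(inputs)
        in_i = sum(w * a for w, a in zip(weights, activations))
        return 1.0 if in_i > 0 else 0.0  # step activation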

19-6: Activation functions
Any nonlinear function can be used in principle. The two most common functions are:
Step function (threshold function): outputs 1 if the input is positive, zero otherwise.
Sigmoid/logistic function: $g(x) = \frac{1}{1 + e^{-x}}$
Continuously differentiable.
Rapid change near the threshold, gradual change at the extremes.
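Both functions, as a minimal sketch:

    import math

    def step(x):
        # Threshold activation: fires (outputs 1) only when the input is positive.
        return 1.0 if x > 0 else 0.0

    def sigmoid(x):
        # Logistic activation: a smooth, differentiable approximation of the step.
        return 1.0 / (1.0 + math.exp(-x))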

19-7: Examples
Neural nets can easily be built to perform some standard logical operations using the threshold activation function:
AND: $W_0 = 1.5$, $W_1 = 1$, $W_2 = 1$
OR: $W_0 = 0.5$, $W_1 = 1$, $W_2 = 1$
NOT: $W_0 = -0.5$, $W_1 = -1$
Change the threshold depending on the function needed.
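These weights can be checked directly; a sketch using the step activation and the $a_0 = -1$ bias convention (the signs on the NOT weights are garbled in the preview, so the standard values are assumed):

    def unit(weights, inputs):
        # weights[0] is the bias weight W_0, paired with the fixed input a_0 = -1.
        in_i = weights[0] * -1.0 + sum(w * a for w, a in zip(weights[1:], inputs))
        return 1 if in_i > 0 else 0

    AND = [1.5, 1.0, 1.0]
    OR = [0.5, 1.0, 1.0]
    NOT = [-0.5, -1.0]

    assert all(unit(AND, [a, b]) == (a and b) for a in (0, 1) for b in (0, 1))
    assert all(unit(OR, [a, b]) == (a or b) for a in (0, 1) for b in (0, 1))
    assert all(unit(NOT, [a]) == 1 - a for a in (0, 1))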

19-8: Types of nodes
We can distinguish between three types of nodes:
Input nodes
Output nodes
Hidden nodes
We can also distinguish between types of networks:
Feed-forward networks: signals flow in one direction, no cycles.
Recurrent networks: cycles in signal propagation.
We'll focus primarily on feedforward networks.

19-9: Feedforward Networks as Function Approximators
Feedforward NNs fall into a family of algorithms called nonlinear function approximators.
The output of a NN is a function of its inputs.
The nonlinear activation function allows the representation of complex functions.
By adjusting the weights, we change the function being represented.
NNs are often used to efficiently approximate complex functions from data.

19-10: Classification with Neural Networks
NNs also perform classification very well.
Map inputs into one or more outputs; the output range is split into discrete "classes".
Very useful for learning tasks where "what to look for" is not known:
face recognition, handwriting recognition, driving a car.

19-17: Gradient Descent and the Delta rule
What about cases where we can't learn the function exactly (the function is not linearly separable)?
In this case, we want to perform as well as possible.
We'll interpret this to mean minimizing the sum of squared error:
$E = \frac{1}{2} \sum_{d \in D} (t_d - o_d)^2$ for each example $d$ in the training set $D$.
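The error measure, as a one-line sketch:

    def squared_error(targets, outputs):
        # E = 1/2 * sum over training examples of (t_d - o_d)^2
        return 0.5 * sum((t - o) ** 2 for t, o in zip(targets, outputs))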

19-18: Gradient Descent and the Delta rule
We can visualize this as a search through a space of weights.
Defining $E$ in this way gives a parabolic error surface with a single global minimum (for linear units).
By following the gradient in this space, we find the combination of weights that minimizes error.
Use the unthresholded output: the real number computed by the weighted sum of the inputs.

19-19: Gradient Descent and the Delta rule
Gradient descent requires us to follow the steepest slope down the error surface.
We consider the derivative of $E$ with respect to each weight.
After derivation, we find that the update rule (called the Delta rule) is:
$\Delta w_i = \alpha \sum_{d \in D} (t_d - o_d) x_{id}$
where $D$ is the training set, $t_d$ is the expected output, and $o_d$ is the actual output.
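A sketch of one batch gradient step under this rule (the data layout is illustrative; each example pairs an input vector with a target):

    def delta_rule_step(weights, examples, alpha):
        # Accumulate alpha * (t_d - o_d) * x_id over the whole training set.
        deltas = [0.0] * len(weights)
        for x, t in examples:
            o = sum(w * xi for w, xi in zip(weights, x))  # unthresholded output
            for i, xi in enumerate(x):
                deltas[i] += alpha * (t - o) * xi
        return [w + d for w, d in zip(weights, deltas)]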

19-20: Incremental learning
Often it is not practical to compute a global weight change for the entire training set.
Instead, we want to update the weights incrementally: observe one piece of data, then update.
Our update rule is then: $\Delta w_i = \alpha (t - o) x_i$
Like the perceptron learning rule, except that the unthresholded output is used.
A smaller step size $\alpha$ is typically used.
No theoretical guarantees of convergence.
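The incremental version, as a sketch:

    def delta_rule_incremental(weights, x, t, alpha):
        # Update after a single observation, using the unthresholded output.
        o = sum(w * xi for w, xi in zip(weights, x))
        return [w + alpha * (t - o) * xi for w, xi in zip(weights, x)]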

19-21: Multilayer Networks
While perceptrons have the advantage of a simple learning algorithm, their computational limitations are a problem.
What if we add another "hidden" layer? Computational power increases:
With one hidden layer, we can represent any continuous function.
With two hidden layers, we can represent any function.
Problem: how do we find the correct weights for the hidden nodes?

19-22: Multilayer Network Example
[Figure: a feedforward network with input units $a_k$, weights $W_{k,j}$ into hidden units $a_j$, and weights $W_{j,i}$ into output units $a_i$.]
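A forward pass through such a network, as a sketch (reusing sigmoid from the earlier sketch; the weight layout is illustrative):

    def forward(x, W_hidden, W_output):
        # W_hidden[j][k] holds W_{k,j}, the weight from input k to hidden unit j;
        # W_output[i][j] holds W_{j,i}, the weight from hidden unit j to output unit i.
        a_hidden = [sigmoid(sum(w * xk for w, xk in zip(ws, x))) for ws in W_hidden]
        a_output = [sigmoid(sum(w * aj for w, aj in zip(ws, a_hidden))) for ws in W_output]
        return a_hidden, a_output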

19-23: Backpropagation
Backpropagation is an extension of the perceptron learning algorithm to deal with multiple layers of nodes.
Nodes use the sigmoid activation function:
$g(input_i) = \frac{1}{1 + e^{-input_i}}$
$g'(input_i) = g(input_i)(1 - g(input_i))$
We will still "follow the gradient", where $g'$ gives us the gradient.
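The derivative identity, as a sketch (reusing sigmoid from above):

    def sigmoid_prime(x):
        # g'(x) = g(x) * (1 - g(x)): the gradient that backpropagation follows.
        gx = sigmoid(x)
        return gx * (1.0 - gx)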

19-24: Backpropagation
Notation:
$h_w(x)$: vector of network outputs
$y$: desired output for a training example
$a_j$: output of the jth hidden unit
$o_i$: output of the ith output unit
$t_i$: target output for the ith training example
Output error for output node $i$: $\Delta_i = (t_i - o_i) \cdot g(input_i) \cdot (1 - g(input_i))$
Weight updating (hidden-output): $W_{j,i} = W_{j,i} + \alpha \cdot a_j \cdot \Delta_i$

19-25: Backpropagation
Updating the input-hidden weights:
Idea: each hidden node is responsible for a fraction of the error $\Delta_i$.
Divide $\Delta_i$ between the hidden nodes according to the strength of the connection between the hidden and output node:
$\Delta_j = g(input_j)(1 - g(input_j)) \sum_i W_{j,i} \Delta_i$
Update rule for the input-hidden weights: $W_{k,j} = W_{k,j} + \alpha \cdot input_k \cdot \Delta_j$

19-26: Backpropagation Algorithm
The whole algorithm can be summed up as:
While not done:
  For each example $d$ in the training set:
    Apply the inputs of $d$ and propagate them forward.
    For each node $i$ in the output layer:
      $\Delta_i = output_i (1 - output_i)(t_{exp} - output_i)$
    For each hidden node $j$:
      $\Delta_j = output_j (1 - output_j) \sum_i W_{j,i} \Delta_i$
    Adjust each weight:
      $W_{j,i} = W_{j,i} + \alpha \cdot \Delta_i \cdot input_j$
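A compact sketch of one such training step for a single example, matching the deltas above (one hidden layer; sigmoid as defined earlier; the names and weight layout are illustrative):

    def backprop_step(x, t, W_hidden, W_output, alpha):
        # Forward pass.
        a_hidden = [sigmoid(sum(w * xk for w, xk in zip(ws, x))) for ws in W_hidden]
        a_output = [sigmoid(sum(w * aj for w, aj in zip(ws, a_hidden))) for ws in W_output]

        # Output errors: Delta_i = o_i * (1 - o_i) * (t_i - o_i).
        delta_out = [o * (1 - o) * (ti - o) for o, ti in zip(a_output, t)]

        # Hidden errors: Delta_j = a_j * (1 - a_j) * sum_i W_{j,i} * Delta_i.
        delta_hidden = [
            a_hidden[j] * (1 - a_hidden[j]) *
            sum(W_output[i][j] * delta_out[i] for i in range(len(delta_out)))
            for j in range(len(a_hidden))]

        # Weight updates (in place): W <- W + alpha * Delta * input.
        for i, ws in enumerate(W_output):
            for j in range(len(ws)):
                ws[j] += alpha * delta_out[i] * a_hidden[j]
        for j, ws in enumerate(W_hidden):
            for k in range(len(ws)):
                ws[k] += alpha * delta_hidden[j] * x[k]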

19-27: Stopping conditions
When to stop training?
A fixed number of iterations.
Total error below a set threshold.
Convergence: no change in the weights.

19-28: Face Recognition with Neural Nets
An advantage of neural networks is that they don't require an explicit description of the relationships between features.
They don't really learn explanations.
They are nicely suited to problems that don't have a symbolic answer.
They're very good at finding patterns in input data.