Speech Recognition and Natural Language Processing - Lecture Slides | CS 2710, Study notes of Computer Science

Material Type: Notes; Class: FOUNDTNS OF ARTIFICL INTELLGNC; Subject: Computer Science; University: University of Pittsburgh; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 09/02/2009

koofers-user-p61-1

CS 2710 Foundations of AI

Lecture 27

Milos Hauskrecht
milos@cs.pitt.edu
5329 Sennott Square

Applied AI topics

Topics in AI

Five main areas:

  • Problem solving and search
  • Logic and knowledge representations
  • Planning
  • Uncertainty
  • Learning

Other topics:

  • AI programming languages
  • Speech recognition
  • Natural language processing
  • Image understanding
  • Robotics

Speech recognition

  • Objective: take an acoustic signal and convert it to text
  • Sampling frequency: < 18 kHz
  • Energy: 8-12 bits
  • Frames: 10 ms long
  • Features: computed for each frame (e.g. energy in some frequency band)
  • Discretize features: e.g. to 256 values (8 bits)
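
The framing and discretization steps can be sketched roughly as follows. This is a minimal illustration, not the lecture's own front end: the 16 kHz sampling rate, the synthetic test tone, the choice of mean energy as the single feature, and the function names `frame_energies` and `discretize` are all assumptions made for the example.

```python
import math

def frame_energies(samples, sample_rate_hz, frame_ms=10):
    """Split samples into non-overlapping frames and return each frame's mean energy."""
    frame_len = int(sample_rate_hz * frame_ms / 1000)
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def discretize(values, levels=256):
    """Map each value linearly onto an integer code in 0 .. levels - 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    scale = (levels - 1) / (hi - lo)
    return [round((v - lo) * scale) for v in values]

# Hypothetical input: 50 ms of a 440 Hz tone with rising amplitude, sampled at 16 kHz.
rate = 16000
signal = [(n / 800) * math.sin(2 * math.pi * 440 * n / rate) for n in range(800)]

energies = frame_energies(signal, rate)   # one feature value per 10 ms frame
codes = discretize(energies)              # 8-bit codes, one per frame
print(codes)
```

A real recognizer would compute several features per frame (for example, energies in different frequency bands) and often uses overlapping frames; the sketch keeps one feature to show only the frame-then-quantize idea.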

Speech recognition

  • We want to determine the sequence of words that is most probable given the input signal:

$$P(\mathit{wordseq} = w \mid \mathit{signal} = s)$$

  • It is easier to define an acoustic model that relates the signal to a given word sequence:

$$P(\mathit{signal} = s \mid \mathit{wordseq} = w)$$

  • This is like a diagnosis problem, so we can use the Bayes rule:

$$P(\mathit{wordseq} = w \mid \mathit{signal} = s) = \frac{P(\mathit{signal} = s \mid \mathit{wordseq} = w)\, P(\mathit{wordseq} = w)}{P(\mathit{signal} = s)}$$

  • Assume we have multiple possible word sequences:

$$w^1, w^2, \ldots, w^k$$

  • The best word sequence:

$$w^{*} = \arg\max_{w^i} P(\mathit{signal} = s \mid \mathit{wordseq} = w^i)\, P(\mathit{wordseq} = w^i)$$
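
The selection of the best word sequence can be sketched in a few lines. The candidate sentences and all probability values below are invented for illustration, and the function names are hypothetical. Note that P(signal = s) is the same for every candidate, so it can be dropped from the arg max.

```python
def best_word_sequence(candidates, acoustic_likelihood, language_prior):
    """Return the candidate w that maximizes P(signal | w) * P(w)."""
    return max(candidates, key=lambda w: acoustic_likelihood[w] * language_prior[w])

def posterior(candidates, acoustic_likelihood, language_prior):
    """Normalize the scores, assuming the candidates are the only possible sequences."""
    scores = {w: acoustic_likelihood[w] * language_prior[w] for w in candidates}
    total = sum(scores.values())
    return {w: score / total for w, score in scores.items()}

# Hypothetical numbers: the acoustic model slightly prefers the second sentence,
# but the prior over word sequences strongly prefers the first.
candidates = ["recognize speech", "wreck a nice beach"]
acoustic_likelihood = {"recognize speech": 0.3, "wreck a nice beach": 0.4}
language_prior = {"recognize speech": 0.01, "wreck a nice beach": 0.0001}

best = best_word_sequence(candidates, acoustic_likelihood, language_prior)
post = posterior(candidates, acoustic_likelihood, language_prior)
print(best, post)
```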

Speech recognition

HMM models of words

$$P(p = p_1 p_2 \ldots p_u \mid \mathit{word} = w_i)$$

  • Example: word: tomato

[Figure: word models for "tomato", one allowing 2 phone sequences and one allowing 4 phone sequences]
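
To make the idea of a word model with several phone sequences concrete, here is a small sketch that enumerates the pronunciations of "tomato" together with their probabilities. It simplifies the word HMM to a chain of independent choices, and the phone symbols and probabilities are assumptions for illustration, not values taken from the slide.

```python
from itertools import product

# Each position lists the alternative phones with hypothetical probabilities.
TOMATO = [
    [("t", 1.0)],
    [("ow", 0.2), ("ah", 0.8)],
    [("m", 1.0)],
    [("ey", 0.5), ("aa", 0.5)],
    [("t", 1.0)],
    [("ow", 1.0)],
]

def phone_sequences(model):
    """Return {phone sequence: probability} for every path through the model."""
    result = {}
    for choice in product(*model):
        phones = tuple(phone for phone, _ in choice)
        prob = 1.0
        for _, p in choice:
            prob *= p
        result[phones] = prob
    return result

sequences = phone_sequences(TOMATO)
for phones, prob in sequences.items():
    print(" ".join(phones), prob)
```

With two positions that each have two alternatives, the model yields four phone sequences whose probabilities sum to one.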

Speech recognition

HMM model of phones

$$P(s = s_1 s_2 \ldots s_r \mid \mathit{phone} = p_q)$$

Example:

Many possible feature sequences: C1 C4 C C1 C1 C4 C C1 C1 C5 C4 C

Speech recognition

  • Finding the most probable path through an HMM for [m]
  • Example: sequence: C1 C3 C4 C
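
One standard way to find the most probable path through an HMM is the Viterbi algorithm; the slide does not name the method, so treat the following as a sketch of one common choice. The three-state phone model (Onset, Mid, End), its transition and emission probabilities, and the observation sequence C1 C3 C4 C6 are all hypothetical values chosen for the example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most probable state path, its probability) for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    first = observations[0]
    best = [{s: (start_p.get(s, 0.0) * emit_p[s].get(first, 0.0), None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prev, score = max(
                ((r, best[-1][r][0] * trans_p[r].get(s, 0.0)) for r in states),
                key=lambda pair: pair[1],
            )
            column[s] = (score * emit_p[s].get(obs, 0.0), prev)
        best.append(column)
    # Pick the best final state, then follow the back-pointers.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return path, prob

# Hypothetical phone model; the exit transition out of End is omitted.
states = ["Onset", "Mid", "End"]
start_p = {"Onset": 1.0}
trans_p = {
    "Onset": {"Onset": 0.3, "Mid": 0.7},
    "Mid": {"Mid": 0.9, "End": 0.1},
    "End": {"End": 0.4},
}
emit_p = {
    "Onset": {"C1": 0.5, "C2": 0.2, "C3": 0.3},
    "Mid": {"C3": 0.2, "C4": 0.7, "C5": 0.1},
    "End": {"C4": 0.1, "C6": 0.5, "C7": 0.4},
}

path, prob = viterbi(["C1", "C3", "C4", "C6"], states, start_p, trans_p, emit_p)
print(path, prob)
```

For long observation sequences, implementations usually work with log probabilities to avoid numerical underflow; the plain products are kept here for readability.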

Natural language processing

Goal: Analyze and interpret text written in a natural language

  • Input: text sentences, coming from:
    • Speech recognition system
    • Optical character recognition (OCR)
    • Documents in the electronic form
  • Output:
    • Knowledge extracted from the text that supports various inferences
  • Processing (multi-step process):
    • Syntactic interpretation (parsing)
    • Semantic interpretation
    • Disambiguation & Incorporation

Image processing and vision

  • Classic image processing problem:
    • Analysis of image and extraction of information from the image
    • Can be used in many applications:
      • Scene analysis
      • Manipulation and navigation tasks
      • Image retrieval
  • Other image processing problems:
    • Image enhancement: degraded image should be improved to restore particular features
    • Storage and Compression: Large amounts of data need to be archived or transmitted
    • Visualization

Image processing

Image is defined by

  • a light intensity function over the image plane

The (continuous) image is typically discretized

  • Image plane is discretized into:
    • Pixels arranged on a rectangular grid
    • Resolution of the grid determines the spatial quality of the discretization
  • Light intensity values are discretized into:
    • Integer values in some interval
  • Typical (black and white) image input:
    • 512x512 pixels
    • Light intensity: 8 bits, i.e. 256 levels of gray
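
The two discretization steps can be illustrated with a short sketch. It assumes the continuous image is given as a function `intensity(x, y)` returning values in [0, 1] for coordinates in the unit square; that representation, the function name `discretize_image`, and the gradient test image are assumptions made for the example. With 8 bits per pixel there are 2^8 = 256 possible intensity values.

```python
def discretize_image(intensity, width, height, bits=8):
    """Sample intensity(x, y) at pixel centers and quantize to integers 0 .. 2**bits - 1."""
    levels = 2 ** bits
    image = []
    for row in range(height):
        y = (row + 0.5) / height
        image.append([
            # Clamp so that an intensity of exactly 1.0 maps to the top level.
            min(levels - 1, int(intensity((col + 0.5) / width, y) * levels))
            for col in range(width)
        ])
    return image

# Hypothetical continuous image: a horizontal gradient from black to white.
image = discretize_image(lambda x, y: x, 512, 512)
print(image[0][0], image[0][255], image[0][511])
```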

Image processing

Analysis of image and extraction of information from the image

  • Segmentation:
    • Division of the image into meaningful entities in the scene
    • Relies heavily on edge detection algorithms
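
As an illustration of edge detection, here is a minimal sketch using the Sobel operator, one common gradient-based detector. The slides do not say which algorithm is meant, so the operator, the threshold value, and the tiny test image are assumptions.

```python
def sobel_edges(image, threshold):
    """Mark interior pixels whose Sobel gradient magnitude exceeds the threshold."""
    height, width = len(image), len(image[0])
    edges = [[False] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            # Horizontal intensity change: right column minus left column.
            gx = (image[r - 1][c + 1] + 2 * image[r][c + 1] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r][c - 1] - image[r + 1][c - 1])
            # Vertical intensity change: bottom row minus top row.
            gy = (image[r + 1][c - 1] + 2 * image[r + 1][c] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r - 1][c] - image[r - 1][c + 1])
            edges[r][c] = (gx * gx + gy * gy) ** 0.5 > threshold
    return edges

# Hypothetical 6x6 image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
edges = sobel_edges(image, threshold=100)
for row in edges:
    print("".join("#" if e else "." for e in row))
```

Only the pixels on either side of the dark-to-bright boundary are marked; a segmentation step would then group regions separated by such edges.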

Image processing and vision

Analysis of image and extraction of information from the image

  • To recognize (identify) the object from the image we need to compare it with the class pattern
  • Problem: The position, orientation and the scale of the object in the scene may vary
  • Solution: Use a set of basic transformations:
    • scaling,
    • translation,
    • rotation of the object
    • Transformations are relatively easy for 2-D objects, much harder for 3-D objects
  • Other problems: light sources and shadows
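
The three basic transformations are straightforward to write down for 2-D points. The sketch below applies them in a fixed order (scale, then rotate about the origin, then translate); that order, the function name `transform`, and the sample points are assumptions made for the example.

```python
import math

def transform(points, scale=1.0, angle=0.0, shift=(0.0, 0.0)):
    """Scale, rotate counterclockwise by angle (radians), then translate 2-D points."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    result = []
    for x, y in points:
        x, y = scale * x, scale * y                           # scaling
        x, y = x * cos_a - y * sin_a, x * sin_a + y * cos_a   # rotation
        result.append((x + shift[0], y + shift[1]))           # translation
    return result

# Hypothetical pattern: two corner points of a unit square.
moved = transform([(1.0, 0.0), (0.0, 1.0)], scale=2.0, angle=math.pi / 2, shift=(3.0, 4.0))
print(moved)
```

Matching an object against a class pattern then amounts to searching for the scale, angle, and shift that bring the two into alignment, which is the part that becomes much harder in 3-D.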