CS 2710 Foundations of AI
Lecture 27
Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square
Applied AI topics
Topics in AI
Five main areas:
- Problem solving and search
- Logic and knowledge representations
- Planning
- Uncertainty
- Learning
Other topics:
- AI programming languages
- Speech recognition
- Natural language processing
- Image understanding
- Robotics
Speech recognition
- Objective: take an acoustic signal and convert it to text
- Sampling: frequency < 18 kHz, energy 8-12 bits
- Frames: 10 ms long
- Features: computed for each frame (e.g. energy in some frequency band)
- Discretized features: e.g. to 256 values (8 bits)
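The front-end steps above (framing, a per-frame feature, discretization) can be sketched roughly as follows. This is a minimal illustration, not the lecture's implementation: the 16 kHz sample rate, the synthetic test tone, the use of mean frame energy as the only feature, and the linear quantizer are all assumptions chosen for brevity.

```python
import math

def frame_energies(samples, sample_rate, frame_ms=10):
    """Split samples into non-overlapping frames and return each frame's mean energy."""
    frame_len = int(sample_rate * frame_ms / 1000)
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def quantize(values, levels=256):
    """Map each value linearly onto an integer code in 0..levels-1 (8 bits for 256 levels)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    scale = (levels - 1) / (hi - lo)
    return [int(round((v - lo) * scale)) for v in values]

# Hypothetical input: 50 ms of a 440 Hz tone sampled at 16 kHz, with rising amplitude.
sample_rate = 16000
samples = [(n / 800) * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(800)]

energies = frame_energies(samples, sample_rate)  # one value per 10 ms frame
codes = quantize(energies)                       # one 8-bit code per frame
```

A real recognizer would compute several features per frame (for example, energies in several frequency bands) and usually quantizes them jointly; the single-feature version here only shows the data flow.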
Speech recognition
- We want to determine the sequence of words that is most probable given the input signal:
  $P(\text{wordseq} = w \mid \text{signal} = s)$
- It is easier to define an acoustic model that relates the signal to a given word sequence:
  $P(\text{signal} = s \mid \text{wordseq} = w)$
- This is like a diagnosis problem, so we can use the Bayes rule:
  $P(\text{wordseq} = w \mid \text{signal} = s) = \dfrac{P(\text{signal} = s \mid \text{wordseq} = w)\, P(\text{wordseq} = w)}{P(\text{signal} = s)}$
- Assume we have multiple possible word sequences:
  $w_1, w_2, \ldots, w_k$
- The best word sequence:
  $\arg\max_{i} P(\text{signal} = s \mid \text{wordseq} = w_i)\, P(\text{wordseq} = w_i)$
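As a small illustration of this selection rule, the sketch below picks the candidate with the largest product of acoustic likelihood and prior. The candidate sentences and all probability values are invented for the example; the denominator P(signal = s) is omitted because it is the same for every candidate.

```python
# Hypothetical scores for one fixed signal s; all numbers are made up for illustration.
candidates = {
    "recognize speech": {"acoustic": 0.0020, "prior": 0.00030},
    "wreck a nice beach": {"acoustic": 0.0025, "prior": 0.00001},
}

def best_word_sequence(candidates):
    """Return the w_i that maximizes P(signal=s | wordseq=w_i) * P(wordseq=w_i)."""
    return max(
        candidates,
        key=lambda w: candidates[w]["acoustic"] * candidates[w]["prior"],
    )

best = best_word_sequence(candidates)
```

Here the second candidate fits the signal slightly better acoustically, but its much smaller prior makes the first candidate the overall winner.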
Speech recognition
HMM models of words
$P(p = p_1 p_2 \ldots p_u \mid \text{word} = w_i)$
[Figure: word HMMs, one with 2 phone sequences and one with 4 phone sequences]
Speech recognition
HMM model of phones
$P(s = s_1 s_2 \ldots s_r \mid \text{phone} = p_q)$
Example:
Many possible feature sequences: C1 C4 C C1 C1 C4 C C1 C1 C5 C4 C …
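One way to evaluate the probability of a feature sequence given a phone HMM is the forward algorithm, which sums over all hidden state paths. The sketch below is illustrative only: the three-state model (onset, mid, end), its transition and emission numbers, and the observation sequence are assumptions, and the model has no explicit exit state.

```python
def forward_probability(obs, states, start_p, trans_p, emit_p):
    """Return P(obs | model) by summing over all hidden state paths."""
    alpha = {s: start_p.get(s, 0.0) * emit_p[s].get(obs[0], 0.0) for s in states}
    for o in obs[1:]:
        alpha = {
            s: emit_p[s].get(o, 0.0)
            * sum(alpha[r] * trans_p[r].get(s, 0.0) for r in states)
            for s in states
        }
    return sum(alpha.values())

# Hypothetical left-to-right phone model over discretized feature codes C1..C7.
states = ["onset", "mid", "end"]
start_p = {"onset": 1.0}
trans_p = {
    "onset": {"onset": 0.3, "mid": 0.7},
    "mid": {"mid": 0.9, "end": 0.1},
    "end": {"end": 1.0},
}
emit_p = {
    "onset": {"C1": 0.5, "C2": 0.2, "C3": 0.3},
    "mid": {"C3": 0.2, "C4": 0.7, "C5": 0.1},
    "end": {"C4": 0.1, "C6": 0.5, "C7": 0.4},
}

p = forward_probability(["C1", "C4", "C6"], states, start_p, trans_p, emit_p)
```

Under these assumed numbers only one path (onset, mid, end) can produce the sequence, so the result is 0.5 x 0.7 x 0.7 x 0.1 x 0.5 = 0.01225.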
Speech recognition
- Finding the most probable path through an HMM for [m]
- Example: sequence: C1 C3 C4 C
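The most probable path is commonly found with the Viterbi algorithm, sketched below. The model parameters are assumptions made up for illustration, and because the last symbol of the slide's example sequence is not recoverable, the observation sequence used here is also an assumed one.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most probable hidden state path for obs."""
    best = {
        s: (start_p.get(s, 0.0) * emit_p[s].get(obs[0], 0.0), [s]) for s in states
    }
    for o in obs[1:]:
        new_best = {}
        for s in states:
            # Best predecessor of s: maximize (path probability) * (transition into s).
            prob, path = max(
                ((best[r][0] * trans_p[r].get(s, 0.0), best[r][1]) for r in states),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * emit_p[s].get(o, 0.0), path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])

# Hypothetical three-state model of a phone; every number is illustrative.
states = ["onset", "mid", "end"]
start_p = {"onset": 1.0}
trans_p = {
    "onset": {"onset": 0.3, "mid": 0.7},
    "mid": {"mid": 0.9, "end": 0.1},
    "end": {"end": 1.0},
}
emit_p = {
    "onset": {"C1": 0.5, "C2": 0.2, "C3": 0.3},
    "mid": {"C3": 0.2, "C4": 0.7, "C5": 0.1},
    "end": {"C4": 0.1, "C6": 0.5, "C7": 0.4},
}

prob, path = viterbi(["C1", "C3", "C4", "C6"], states, start_p, trans_p, emit_p)
```

With these numbers the best path is onset, mid, mid, end: explaining C3 from the mid state (0.7 x 0.2) beats staying in onset (0.3 x 0.3).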
Natural language processing
Goal: Analyze and interpret text written in a natural language
- Input: text sentences, obtained from:
  - A speech recognition system
  - Optical character recognition (OCR)
  - Documents in electronic form
- Output:
  - Knowledge extracted from the text that supports various inferences
- Processing (a multi-step process):
  - Syntactic interpretation (parsing)
  - Semantic interpretation
  - Disambiguation and incorporation
Image processing and vision
- Classic image processing problem:
  - Analysis of the image and extraction of information from it
  - Can be used in many applications:
    - Scene analysis
    - Manipulation and navigation tasks
    - Image retrieval
- Other image processing problems:
  - Image enhancement: a degraded image is improved to restore particular features
  - Storage and compression: large amounts of data need to be archived or transmitted
  - Visualization
Image processing
An image is defined by:
- A light intensity function over the image plane
A (continuous) image is typically discretized:
- The image plane is discretized into:
  - Pixels arranged on a rectangular grid
  - The resolution of the grid determines the spatial quality of the discretization
- Light intensity values are discretized into:
  - Integer values in some interval
- Typical (black and white) image input:
  - 512x512 pixels
  - Light intensity: 8 bits, i.e. 256 levels of gray
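The arithmetic behind this typical input can be checked with a short sketch. The helper names and the convention that continuous intensities lie in [0.0, 1.0] are assumptions made for the example.

```python
def quantize_intensity(value, bits=8):
    """Map a continuous intensity in [0.0, 1.0] to an integer gray level."""
    levels = 2 ** bits
    clipped = min(max(value, 0.0), 1.0)
    return min(int(clipped * levels), levels - 1)

def storage_bytes(width, height, bits=8):
    """Uncompressed size in bytes of a width x height image with the given bit depth."""
    return width * height * bits // 8

gray_levels = 2 ** 8            # number of distinct values representable in 8 bits
size = storage_bytes(512, 512)  # bytes needed for one 512x512 8-bit image
```

An 8-bit intensity gives 2^8 = 256 gray levels, and a 512x512 image at that depth occupies 262,144 bytes (256 KiB) uncompressed.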
Image processing
Analysis of the image and extraction of information from it
- Segmentation:
  - Division of the image into meaningful entities in the scene
  - Relies heavily on edge detection algorithms
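To give a feel for what an edge detector does, here is a deliberately simple sketch that thresholds the forward-difference gradient magnitude. It stands in for the more robust operators used in practice; the tiny test image and the threshold value are assumptions for the example.

```python
def edge_map(image, threshold):
    """Mark pixels whose forward-difference gradient magnitude exceeds threshold."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    # The last row and column have no forward neighbor and stay 0.
    for y in range(height - 1):
        for x in range(width - 1):
            gx = image[y][x + 1] - image[y][x]  # horizontal intensity change
            gy = image[y + 1][x] - image[y][x]  # vertical intensity change
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[y][x] = 1
    return edges

# Hypothetical 4x4 image: a dark region on the left, a bright region on the right.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
edges = edge_map(image, threshold=50)
```

The marked pixels form a vertical line at the boundary between the two regions, which is the kind of evidence a segmentation step can use to separate entities in the scene.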
Image processing and vision
Analysis of the image and extraction of information from it
- To recognize (identify) an object in the image, we need to compare it with the class pattern
- Problem: the position, orientation and scale of the object in the scene may vary
- Solution: use a set of basic transformations:
  - scaling,
  - translation,
  - rotation of the object
- Transformations are relatively easy for 2-D objects, much harder for 3-D objects
- Other problems: light sources and shadows
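For the 2-D case, the three basic transformations can be sketched as below. The function name, the order of operations (scale, then rotate about the origin, then translate), and the sample pattern are assumptions chosen for illustration.

```python
import math

def transform_points(points, scale=1.0, angle=0.0, shift=(0.0, 0.0)):
    """Scale, rotate counterclockwise by angle (radians) about the origin, then translate."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    result = []
    for x, y in points:
        x, y = scale * x, scale * y                          # scaling
        x, y = cos_a * x - sin_a * y, sin_a * x + cos_a * y  # rotation
        result.append((x + shift[0], y + shift[1]))          # translation
    return result

# Hypothetical class pattern: three corners of a unit square.
pattern = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
moved = transform_points(pattern, scale=2.0, angle=math.pi / 2, shift=(3.0, 1.0))
```

Matching an observed object against the pattern then amounts to searching for the scale, angle and shift under which the transformed pattern best lines up with the observation. In 3-D the rotation alone has three degrees of freedom and parts of the object may be hidden, which is one reason the problem becomes much harder.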