Speech Recognition and Natural Language Processing - Lecture Slides | CS 2710, Study notes of Computer Science

University of Pittsburgh (Pitt) - Medical Center-Health System Computer Science

Material Type: Notes; Class: FOUNDTNS OF ARTIFICL INTELLGNC; Subject: Computer Science; University: University of Pittsburgh; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 09/02/2009

koofers-user-p61-1 🇺🇸

CS 2710 Foundations of AI

Lecture 27

Milos Hauskrecht

milos@cs.pitt.edu

5329 Sennott Square

Applied AI topics

Topics in AI

Five main areas:
- Problem solving and search
- Logic and knowledge representations
- Planning
- Uncertainty
- Learning

Other topics:
- AI programming languages
- Speech recognition
- Natural language processing
- Image understanding
- Robotics

Speech recognition

Objective: take an acoustic signal and convert it to text

Sample:
- Frequency: < 18 kHz
- Energy: 8-12 bits

Frames: 10 ms long
Features: computed for each frame (e.g. energy in some frequency band)
Discretize features: e.g. to 256 values (8 bits)
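
The framing and discretization steps can be illustrated with a minimal sketch. The 16 kHz sampling rate, the synthetic test tone, and the helper names `frame_energies` and `discretize` are assumptions made for illustration; the slides only state 10 ms frames and 256 discrete values.

```python
import math

def frame_energies(samples, sample_rate_hz, frame_ms=10):
    """Split samples into non-overlapping frames and return the mean energy of each."""
    frame_len = int(sample_rate_hz * frame_ms / 1000)
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def discretize(values, levels=256):
    """Map each value linearly onto an integer code in 0 .. levels - 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    scale = (levels - 1) / (hi - lo)
    return [round((v - lo) * scale) for v in values]

# Hypothetical input: 50 ms of a 440 Hz tone with rising amplitude, sampled at 16 kHz.
sample_rate_hz = 16000
samples = [(n / 800) * math.sin(2 * math.pi * 440 * n / sample_rate_hz)
           for n in range(800)]

energies = frame_energies(samples, sample_rate_hz)  # one energy value per 10 ms frame
codes = discretize(energies)                        # one 8-bit code per frame
print(codes)
```

A real front end would compute several features per frame (for example energies in multiple frequency bands); a single energy value keeps the sketch short.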

Speech recognition

We want to determine the sequence of words that is most probable given the input signal:

$$P(\mathit{wordseq} = w \mid \mathit{signal} = s)$$

It is easier to define an acoustic model that relates:

$$P(\mathit{signal} = s \mid \mathit{wordseq} = w)$$

This is like a diagnosis problem, so we can use the Bayes rule:

$$P(\mathit{wordseq} = w \mid \mathit{signal} = s) = \frac{P(\mathit{signal} = s \mid \mathit{wordseq} = w)\, P(\mathit{wordseq} = w)}{P(\mathit{signal} = s)}$$

Assume we have multiple possible word sequences:

$$w_1, w_2, \ldots, w_k$$

The best word sequence:

$$\arg\max_{w_i} P(\mathit{signal} = s \mid \mathit{wordseq} = w_i)\, P(\mathit{wordseq} = w_i)$$
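
Because P(signal = s) is the same for every candidate, it can be dropped when picking the best word sequence. The sketch below shows this with two candidate transcriptions; the candidate strings and all probabilities are invented for illustration and do not come from the slides.

```python
def best_word_sequence(acoustic, prior):
    """Return the candidate w maximizing P(signal | w) * P(w), plus all posteriors."""
    scores = {w: acoustic[w] * prior[w] for w in acoustic}
    evidence = sum(scores.values())  # P(signal), restricted to the listed candidates
    posteriors = {w: score / evidence for w, score in scores.items()}
    best = max(scores, key=scores.get)
    return best, posteriors

# Hypothetical acoustic model values P(signal = s | wordseq = w)
acoustic = {"recognize speech": 0.020, "wreck a nice beach": 0.025}
# Hypothetical language model values P(wordseq = w)
prior = {"recognize speech": 0.0010, "wreck a nice beach": 0.0001}

best, posteriors = best_word_sequence(acoustic, prior)
print(best)
```

Here the second candidate fits the signal slightly better, but the prior over word sequences outweighs that difference, so the first candidate wins.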

Speech recognition

HMM models of words

$$P(p = p_1 p_2 \ldots p_u \mid \mathit{word} = w_i)$$

Example: word: tomato

[Figure: HMMs for the word "tomato"; panel captions: 2 phone sequences, 4 phone sequences]
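
A word model of this kind can be sketched as a small graph over phones, where every path from start to end is one pronunciation. The graph below and its probabilities are hypothetical (the figure with the actual model is not recoverable from the text); it is built so that "tomato" has four phone sequences.

```python
# Hypothetical pronunciation graph: state -> list of (next_state, probability).
# Digits only distinguish repeated phones ("t1" and "t2" are both the phone "t").
transitions = {
    "START": [("t1", 1.0)],
    "t1": [("ow1", 0.5), ("ah", 0.5)],
    "ow1": [("m", 1.0)],
    "ah": [("m", 1.0)],
    "m": [("ey", 0.5), ("aa", 0.5)],
    "ey": [("t2", 1.0)],
    "aa": [("t2", 1.0)],
    "t2": [("ow2", 1.0)],
    "ow2": [("END", 1.0)],
}

def phone_sequences(transitions, state="START", prob=1.0, phones=()):
    """Yield (phone_sequence, probability) for every path from state to END."""
    if state == "END":
        yield phones, prob
        return
    for nxt, p in transitions[state]:
        label = () if nxt == "END" else (nxt.rstrip("12"),)
        yield from phone_sequences(transitions, nxt, prob * p, phones + label)

sequences = dict(phone_sequences(transitions))
for phones, prob in sequences.items():
    print(" ".join(phones), prob)
```

Each printed probability is a value of P(p = p_1 p_2 ... p_u | word = tomato) under this made-up model, and the four values sum to 1.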

Speech recognition

HMM model of phones

$$P(s = s_1 s_2 \ldots s_r \mid \mathit{phone} = p_q)$$

Example:
Many possible feature sequences: C1 C4 C C1 C1 C4 C C1 C1 C5 C4 C …

Speech recognition

Finding the most probable path through an HMM for [m]
Example: sequence: C1 C3 C4 C
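
The most probable path through an HMM is commonly found with the Viterbi algorithm. The sketch below assumes a three-state left-to-right phone model (Onset, Mid, End) with invented transition and emission probabilities; the observation sequence is also made up and does not attempt to restore the truncated sequence above.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, state path) of the most probable state sequence."""
    # best[s] = (probability of the best path ending in state s, that path)
    best = {s: (start_p.get(s, 0.0) * emit_p[s].get(observations[0], 0.0), [s])
            for s in states}
    for obs in observations[1:]:
        step = {}
        for s in states:
            # Best predecessor r for state s, before emitting obs.
            prob, path = max(
                ((best[r][0] * trans_p[r].get(s, 0.0), best[r][1]) for r in states),
                key=lambda item: item[0],
            )
            step[s] = (prob * emit_p[s].get(obs, 0.0), path + [s])
        best = step
    return max(best.values(), key=lambda item: item[0])

# Hypothetical phone HMM; none of these numbers come from the slides.
states = ["Onset", "Mid", "End"]
start_p = {"Onset": 1.0}
trans_p = {
    "Onset": {"Onset": 0.3, "Mid": 0.7},
    "Mid": {"Mid": 0.9, "End": 0.1},
    "End": {"End": 1.0},
}
emit_p = {
    "Onset": {"C1": 0.5, "C2": 0.2, "C3": 0.3},
    "Mid": {"C3": 0.2, "C4": 0.7, "C5": 0.1},
    "End": {"C4": 0.1, "C6": 0.5, "C7": 0.4},
}

prob, path = viterbi(["C1", "C2", "C3", "C4", "C6"], states, start_p, trans_p, emit_p)
print(path, prob)
```

With these numbers the best path is Onset, Onset, Mid, Mid, End. For long sequences, implementations usually work with log probabilities to avoid numerical underflow.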

Natural language processing

Goal: analyze and interpret text written in natural language

Input: text sentences, obtained from:
- Speech recognition system
- Optical character recognition (OCR)
- Documents in electronic form
Output:
- Knowledge extracted from the text that supports various inferences
Processing (multi-step process):
- Syntactic interpretation (parsing)
- Semantic interpretation
- Disambiguation & Incorporation

Image processing and vision

Classic image processing problem:
- Analysis of image and extraction of information from the image
- Can be used in many applications:
  - Scene analysis
  - Manipulation and navigation tasks
  - Image retrieval
Other image processing problems:
- Image enhancement: degraded image should be improved to restore particular features
- Storage and Compression: Large amounts of data need to be archived or transmitted
- Visualization

Image processing

An image is defined by a light intensity function over the image plane.
The (continuous) image is typically discretized.
Image plane is discretized into:
- Pixels arranged on a rectangular grid
- Resolution of the grid determines the spatial quality of the discretization
Light intensity values are discretized into:
- Integer values in some interval
Typical (black and white) image input:
- 512x512 pixels
- Light intensity: 8 bits, i.e. 256 levels of gray
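
Both discretization steps can be sketched in a few lines: sample a continuous intensity function on a pixel grid, then quantize each sample to an integer. With 8 bits there are 2^8 = 256 gray levels, and a 512x512 image needs 512 * 512 * 1 byte = 262,144 bytes. The function names and the ramp image below are assumptions for illustration.

```python
def quantize_image(intensity, width, height, bits=8):
    """Sample intensity(x, y) in [0, 1] on a grid and quantize to 2**bits levels."""
    levels = 2 ** bits
    image = []
    for row in range(height):
        y = (row + 0.5) / height  # sample at pixel centers
        line = []
        for col in range(width):
            x = (col + 0.5) / width
            value = min(max(intensity(x, y), 0.0), 1.0)
            line.append(min(int(value * levels), levels - 1))
        image.append(line)
    return image

def gradient(x, y):
    """Hypothetical continuous image: brightness ramps from left to right."""
    return x

image = quantize_image(gradient, 512, 512)
gray_levels = 2 ** 8               # 256 distinct gray values
size_bytes = 512 * 512 * 8 // 8    # 262144 bytes at 8 bits per pixel
print(gray_levels, size_bytes, image[0][0], image[0][511])
```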

Image processing

Analysis of image and extraction of information from the image

Segmentation:
- Division of the image into meaningful entities in the scene
- Relies heavily on edge detection algorithms
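
As one example of an edge detection algorithm, the sketch below applies the Sobel operator and thresholds the gradient magnitude. The slides do not name a specific detector, so the choice of Sobel, the threshold, and the tiny test image are assumptions.

```python
def sobel_edges(image, threshold):
    """Return a binary map: 1 where the Sobel gradient magnitude exceeds threshold."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):          # border pixels are left at 0
        for c in range(1, width - 1):
            # Horizontal intensity change (right column minus left column).
            gx = (image[r - 1][c + 1] + 2 * image[r][c + 1] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r][c - 1] - image[r + 1][c - 1])
            # Vertical intensity change (bottom row minus top row).
            gy = (image[r + 1][c - 1] + 2 * image[r + 1][c] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r - 1][c] - image[r - 1][c + 1])
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[r][c] = 1
    return edges

# Hypothetical 6x6 image: dark left half (0), bright right half (255).
image = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
edges = sobel_edges(image, threshold=128)
for row in edges:
    print(row)
```

The detected edge pixels line up along the boundary between the dark and bright halves, which is the kind of boundary a segmentation step would then use to separate regions.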

Image processing and vision

Analysis of image and extraction of information from the image

To recognize (identify) the object from the image we need to compare it with the class pattern
Problem: The position, orientation and the scale of the object in the scene may vary
Solution: Use a set of basic transformations of the object:
- scaling
- translation
- rotation
Transformations are relatively easy for 2D objects, much harder for 3D objects
Other problems: light sources and shadows
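
For the 2D case, the three basic transformations can be sketched as follows. The order of operations (scale, then rotate about the origin, then translate) and the function name `transform` are choices made for this illustration.

```python
import math

def transform(points, scale=1.0, angle_rad=0.0, shift=(0.0, 0.0)):
    """Scale, then rotate about the origin, then translate each 2D point."""
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    result = []
    for x, y in points:
        x, y = scale * x, scale * y                         # scaling
        x, y = cos_a * x - sin_a * y, sin_a * x + cos_a * y  # rotation
        result.append((x + shift[0], y + shift[1]))         # translation
    return result

# Hypothetical object: the corners of a unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved = transform(square, scale=2.0, angle_rad=math.pi / 2, shift=(3.0, 1.0))
print(moved)
```

Matching an object against a class pattern then amounts to searching for the scale, angle, and shift under which the transformed pattern best fits the observed object.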
