Notes on Machine Learning with Q&A (Study Notes)

This document is a comprehensive collection of Machine Learning questions and answers, designed for exam preparation. It covers a wide range of topics, from the fundamentals of Machine Learning to advanced algorithms, providing clear explanations and solutions to commonly asked questions. The content includes practical problem-solving and in-depth answers to help students grasp key concepts effectively. It is intended for B.Tech students in Computer Science or related fields preparing for exams such as GATE, ESE, or university-level Machine Learning courses.

Index of Topics:

  • Basics of Machine Learning
  • Supervised Learning Algorithms (e.g., Linear Regression, Decision Trees)
  • Unsupervised Learning (e.g., K-Means, PCA)
  • Neural Networks and Deep Learning
  • Clustering and Classification
  • Model Evaluation and Performance Metrics
  • Frequently Asked Questions with Detailed Solutions

Prepared By: SK
Course: Machine Learning / Artificial Intelligence
Format: PDF


1 A) Explain different types of machine learning with suitable example.

A) Types of Machine Learning

  1. Supervised Learning
    • Definition: Training a model on labeled data, where the input-output pairs are known.
    • Examples:
      • Predicting house prices (input: size, location; output: price).
      • Classifying emails as spam or not spam.
    • Common Algorithms: Linear Regression, Logistic Regression, Random Forest, Neural Networks.
  2. Unsupervised Learning
    • Definition: Training a model on unlabeled data to find hidden patterns or intrinsic structures.
    • Examples:
      • Customer segmentation (clustering customers into groups based on behavior).
      • Anomaly detection in credit card transactions.
    • Common Algorithms: K-Means, Hierarchical Clustering, PCA (Principal Component Analysis).
  3. Semi-Supervised Learning
    • Definition: Combines a small amount of labeled data with a large amount of unlabeled data to improve learning accuracy.
    • Examples:
      • Image classification where only a few images are labeled, but many are not.
    • Common Algorithms: Self-training, Graph-based algorithms.
  4. Reinforcement Learning
    • Definition: Training an agent to take actions in an environment to maximize cumulative rewards.
    • Examples:
      • Game-playing AI (like AlphaGo).
      • Autonomous vehicles learning to navigate roads.
    • Common Algorithms: Q-Learning, Deep Q-Networks, Policy Gradients.
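
To make the supervised vs. unsupervised distinction above concrete, here is a minimal scikit-learn sketch. It is not part of the original notes, and all data values are invented for illustration: a regressor fit on labeled house-price pairs, and K-Means grouping unlabeled customer records.

```python
# Hypothetical illustration (not from the notes): supervised vs. unsupervised learning.
import numpy as np
from sklearn.linear_model import LinearRegression  # supervised: needs labels y
from sklearn.cluster import KMeans                 # unsupervised: only features X

# Supervised: house size (m^2) -> price (made-up numbers).
X = np.array([[50], [80], [120], [200]])
y = np.array([100, 160, 240, 400])                 # prices in thousands
model = LinearRegression().fit(X, y)
print(model.predict([[100]]))                      # price estimate for an unseen size

# Unsupervised: group customers by two behaviour features, no labels given.
customers = np.array([[1, 2], [1, 3], [9, 8], [10, 9]])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(customers)
print(labels)                                      # two discovered segments, e.g. [0 0 1 1]
```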

B) What are the various issues in machine learning and how to overcome them?

B) Issues in Machine Learning and How to Overcome Them

  1. Overfitting
    • Description: Model learns noise or specifics of the training data, failing to generalize.
    • Solutions:
      • Use regularization techniques like L1/L2 penalties.
      • Use cross-validation to monitor performance.
      • Increase training data.
      • Continuously monitor and audit models.
  2. Interpretability
    • Description: Models like neural networks are "black-box" and hard to interpret.
    • Solutions:
      • Use explainable AI tools (e.g., SHAP, LIME).
      • Prefer interpretable models for sensitive applications.
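
Two of the overfitting remedies listed above, L1/L2 regularization and cross-validation, can be sketched in a few lines. This is a hypothetical illustration on synthetic data, not code from the notes; it assumes scikit-learn is available.

```python
# Hypothetical sketch: L2 regularization (Ridge) plus cross-validation on synthetic data.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 20))              # few samples, many features: overfitting risk
y = X[:, 0] + 0.1 * rng.normal(size=30)    # only the first feature actually matters

# 5-fold cross-validation reports generalization (R^2) rather than training fit.
plain = cross_val_score(LinearRegression(), X, y, cv=5).mean()
ridge = cross_val_score(Ridge(alpha=1.0), X, y, cv=5).mean()
print(f"plain R^2 = {plain:.2f}, ridge R^2 = {ridge:.2f}")
```

Regularization typically narrows the gap between training and held-out performance; cross-validation is what makes that gap visible in the first place.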

2 A) Explain Probably Approximately Correct Learning.

Probably Approximately Correct (PAC) Learning - Explained in Points

1. What is PAC Learning?

  • Introduced by Leslie Valiant in 1984.
  • Provides a theoretical framework to analyze the sample complexity and generalization of learning algorithms.
  • Ensures that a learning algorithm can:
    • Learn a concept from a finite set of labeled training examples.
    • Do so with high probability and low error.

2. Conditions for PAC Learning

To be considered PAC, a learning algorithm must satisfy the following:

  1. Low Error:
    • The hypothesis should have a small true error, defined as the probability of misclassification on examples from the true distribution.
    • Empirical error (on training data) should closely approximate true error.
  2. Consistency:
    • The hypothesis should classify all training examples correctly.
  3. High Probability:
    • The probability of success should be at least (1 - δ), where δ is the acceptable probability of failure.

3. Key Concepts

  • True Error: Error on the true distribution of data.
  • Empirical Error: Error on the training set.
  • Sample Complexity: Number of labeled examples needed for high confidence and accuracy.
  • Confidence (1 - δ): Probability that the learning process succeeds.
  • Accuracy (1 - ε): ε is the maximum acceptable error of the hypothesis.

4. Advantages of PAC Learning

  1. Theoretical Guarantees:
    • Quantifies how much data is needed for learning with high probability and low error.
  2. Trade-off Analysis:
    • Helps balance hypothesis complexity vs. sample complexity.
  3. Noise Robustness:
    • Analyzes the impact of noise on learning, guiding algorithm improvements.
  4. Computational Complexity:
    • Applied to unsupervised grouping of data points based on similarity.
  5. Reinforcement Learning:
    • Helps in learning policies to maximize rewards.
  6. Active Learning:
    • Guides the selection of the most informative samples to minimize labeled data needs.
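
To make the sample-complexity idea above concrete: for a consistent learner over a finite hypothesis class H, a standard textbook bound (not stated explicitly in the notes) is m ≥ (1/ε)(ln|H| + ln(1/δ)), which guarantees true error at most ε with probability at least 1 - δ. A small sketch:

```python
# Standard PAC bound for a finite hypothesis class (textbook result, added for illustration).
import math

def pac_sample_complexity(h_size: int, epsilon: float, delta: float) -> int:
    """Labeled examples sufficient for error <= epsilon with probability >= 1 - delta."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / epsilon)

# Example: |H| = 1000 hypotheses, 5% error tolerance, 95% confidence.
print(pac_sample_complexity(1000, epsilon=0.05, delta=0.05))   # 199 examples
```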

B) Imagine our hypothesis is not one rectangle but a union of two (or m) rectangles. What is the advantage of such a hypothesis class? Show that any class can be represented by such a hypothesis class with large enough m.

Hypothesis Class with Multiple Rectangles - Easy Explanation

1. Single Rectangle: Basic and Limited

  • A single rectangle can only represent one connected group of positive instances.
  • If the positive instances are scattered or in separate groups, a single rectangle fails to capture the pattern.

2. Multiple Rectangles: More Flexible

  • Using two or more rectangles:
    • You can capture separate clusters of positive instances.
    • Example: One rectangle covers instances near one corner, another rectangle covers instances in a different region.

3. Why Multiple Rectangles Work

  • Each rectangle represents a logical rule (e.g., x between 1 and 3 AND y between 2 and 4).
  • Combining rectangles is like using an OR condition:
    • Rectangle 1 OR Rectangle 2 captures more patterns.
  • With enough rectangles, you can approximate almost any pattern, no matter how irregular.

4. Any Class Can Be Represented

  • In the worst case, each positive instance can have its own rectangle (m = N, where N is the number of positive instances).
  • For simpler patterns, fewer rectangles are enough (see the sketch below).
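
The sketch below (a hypothetical helper, not from the notes) shows how a union of m axis-aligned rectangles acts as an OR of simple interval rules, and why one rectangle per positive point (m = N) can reproduce any labeling.

```python
# Hypothetical sketch: a hypothesis that is a union of m axis-aligned rectangles.
Rect = tuple[float, float, float, float]   # (x_min, x_max, y_min, y_max)

def predict(rectangles: list[Rect], x: float, y: float) -> bool:
    """Positive iff (x, y) falls inside ANY rectangle -- an OR of interval rules."""
    return any(x_lo <= x <= x_hi and y_lo <= y <= y_hi
               for x_lo, x_hi, y_lo, y_hi in rectangles)

# Two disconnected positive clusters need two rectangles; in the worst case a
# degenerate rectangle per positive point (m = N) reproduces any labeling.
h = [(1.0, 3.0, 2.0, 4.0), (7.0, 9.0, 6.0, 8.0)]
print(predict(h, 2.0, 3.0))   # True: inside the first rectangle
print(predict(h, 5.0, 5.0))   # False: between the two clusters
```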

5. Advantages of Using Multiple Rectangles

  1. Captures Complex Patterns:
    • Works for disconnected or irregular clusters of data.
  2. Flexible and Scalable:
    • Add more rectangles for more complicated data.
  3. Logical Representation:
    • The system is like a set of "rules" combined with OR conditions.

3 A) Explain Bayesian Decision Theory. Define (i) Prior Probability, (ii) Conditional Probability, and (iii) Posterior Probability.

3A) Bayesian Decision Theory Explained in Points

  1. Bayesian Decision Theory:
    • A framework for decision-making under uncertainty.
    • Uses probability to make decisions based on prior knowledge and new evidence.
    • Incorporates prior probability, likelihood (conditional probability), and posterior probability to help make informed decisions.

(i) Prior Probability: 
  • Definition: The probability of an event or hypothesis before observing new data.
  • Purpose: Encodes initial beliefs or knowledge about the likelihood of an event.
  • Example:
    • Probability of rain on a random day based on historical weather data (e.g., 30% chance of rain).

(ii) Conditional Probability: 
  • Definition: The probability of an event happening given that another event has already occurred.
  • Purpose: Updates the belief about the event based on new evidence.
  • Formula: P(A|B) = P(A ∩ B) / P(B), the probability of event A given event B.
  • Example:
    • Probability of rain, given that clouds are present, might be different from the prior probability.

(iii) Posterior Probability: 
  • Definition: The updated probability of a hypothesis after observing new evidence.
  • Purpose: Combines the prior probability and new evidence to revise the belief about the event.
  • Formula: P(H|E) = P(E|H) × P(H) / P(E), where H is the hypothesis and E is the observed evidence.
  • Example:
    • If the prior probability of rain is 30% and the evidence (clouds) increases the likelihood of rain, the posterior probability reflects this updated belief (see the worked sketch below).
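
A small worked version of the rain example makes the update explicit. Only the 30% prior comes from the text; the two likelihoods are assumed values chosen for illustration.

```python
# Worked Bayes' rule sketch for the rain/clouds example (likelihoods are assumed).
p_rain = 0.30                # prior P(rain), from the example above
p_clouds_given_rain = 0.90   # assumed likelihood P(clouds | rain)
p_clouds_given_dry = 0.40    # assumed likelihood P(clouds | no rain)

# Evidence P(clouds), then posterior P(rain | clouds) via Bayes' theorem.
p_clouds = p_clouds_given_rain * p_rain + p_clouds_given_dry * (1 - p_rain)
posterior = p_clouds_given_rain * p_rain / p_clouds
print(f"P(rain | clouds) = {posterior:.2f}")   # approx. 0.49, up from the 0.30 prior
```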

| Transaction | Items |
| --- | --- |
| 6 | Milk, Butter |
| 7 | Bread, Eggs |
| 8 | Milk, Butter, Eggs |
| 9 | Bread, Milk, Butter |
| 10 | Bread, Eggs |

There are 10 transactions in total.


Support Calculation

Support for Bread & Butter (Bread, Butter):

  • Occurs in transactions: 1, 3, 5, 9 (4 occurrences).
  • Support(Bread, Butter) = 4/10 = 0.4.

Support for Milk & Eggs (Milk, Eggs):

  • Occurs in transactions: 4, 5, 8 (3 occurrences).
  • Support(Milk, Eggs) = 3/10 = 0.3.

Support for Bread & Milk (Bread, Milk):

  • Occurs in transactions: 1, 2, 5, 9 (4 occurrences).
  • Support(Bread, Milk) = 4/10 = 0.4.

Support for Milk & Butter (Milk, Butter):

  • Occurs in transactions: 1, 5, 6, 8, 9 (5 occurrences).
  • Support(Milk, Butter) = 5/10 = 0.5.

Confidence Calculation

Confidence for Bread → Butter (Bread → Butter):

  • Support(Bread → Butter) = Support(Bread, Butter) = 4/10.
  • Support(Bread) occurs in transactions: 1, 2, 3, 5, 9 (5 occurrences).
  • Confidence(Bread → Butter) = Support(Bread, Butter) / Support(Bread) = 4/5 = 0.8.

Confidence for Milk → Eggs (Milk → Eggs):

  • Support(Milk, Eggs) = 3/10.
  • Support(Milk) occurs in transactions: 1, 2, 5, 6, 8, 9 (6 occurrences).
  • Confidence(Milk → Eggs) = Support(Milk, Eggs) / Support(Milk) = 3/6 = 0.5.

Confidence for Bread → Milk (Bread → Milk):

  • Support(Bread, Milk) = 4/10.
  • Support(Bread) occurs in transactions: 1, 2, 3, 5, 9 (5 occurrences).
  • Confidence(Bread → Milk) = Support(Bread, Milk) / Support(Bread) = 4/5 = 0.8.

Confidence for Milk → Butter (Milk → Butter):

  • Support(Milk, Butter) = 5/10.
  • Support(Milk) occurs in transactions: 1, 2, 5, 6, 8, 9 (6 occurrences).
  • Confidence(Milk → Butter) = Support(Milk, Butter) / Support(Milk) = 5/6 ≈ 0.83.

Conclusion

The calculations for support and confidence are correct. Summary of the results:

  • Support: (Bread, Butter) = 0.4; (Milk, Eggs) = 0.3; (Bread, Milk) = 0.4; (Milk, Butter) = 0.5.
  • Confidence: Bread → Butter = 0.8; Milk → Eggs = 0.5; Bread → Milk = 0.8; Milk → Butter ≈ 0.83.

A short code sketch of the support and confidence formulas follows below.
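
The sketch below is illustrative only: it implements the two formulas directly, but it re-uses just the five transactions visible in this excerpt (6-10), so the printed numbers differ from the 10-transaction results in the text.

```python
# Illustrative support/confidence helpers (only transactions 6-10 from the excerpt).
def support(transactions, itemset):
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    return support(transactions, set(antecedent) | set(consequent)) / support(transactions, antecedent)

transactions = [
    {"Milk", "Butter"},           # 6
    {"Bread", "Eggs"},            # 7
    {"Milk", "Butter", "Eggs"},   # 8
    {"Bread", "Milk", "Butter"},  # 9
    {"Bread", "Eggs"},            # 10
]
print(support(transactions, {"Milk", "Butter"}))       # 3/5 = 0.6
print(confidence(transactions, {"Bread"}, {"Milk"}))   # (1/5) / (3/5) = 0.33...
```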
    • Examples:
      • Linear regression applied to non-linear data.
      • A decision tree with very few splits.
  2. Variance
    • Definition: Variance refers to the error introduced by the model's sensitivity to small fluctuations in the training data.
    • High Variance:
      • The model captures noise along with the data patterns.
      • Leads to overfitting, where the model performs well on training data but poorly on test data.
    • Examples:
      • A deep neural network trained without regularization.
      • A decision tree with too many splits.
  3. The Trade-off
    • Key Idea: To achieve good generalization, a model must balance bias and variance (see the sketch below).
      • High Bias + Low Variance: Simple models with underfitting.
      • Low Bias + High Variance: Complex models with overfitting.
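
The trade-off can be seen numerically with a tiny synthetic experiment (not from the notes): a degree-1 polynomial underfits a curved signal (high bias), while a very high-degree polynomial tends to chase the training noise (high variance), which typically shows up as a larger gap between training and test error.

```python
# Synthetic bias/variance sketch: compare a too-simple and a too-flexible polynomial fit.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(-1, 1, 20))
y_train = np.sin(3 * x_train) + 0.1 * rng.normal(size=20)   # curved signal + noise
x_test = np.linspace(-1, 1, 200)
y_test = np.sin(3 * x_test)                                  # noise-free target

for degree in (1, 15):                                       # high bias vs. high variance
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```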

B) Why is dimensionality reduction important in machine learning? State the pros and cons.

4B) Importance of Dimensionality Reduction in Machine Learning

  1. Reduces Overfitting:
    • Helps to prevent overfitting by simplifying the model and focusing on the most important features.
  2. Improves Computation Time and Efficiency:
    • Reduces the number of features, thus lowering the computational cost and improving processing time.
  3. Better Visualization:
    • Makes high-dimensional data easier to visualize, especially in 2D or 3D space.
  4. Eliminates Redundancy:
    • Removes highly correlated features, improving model performance by focusing on unique, valuable data.
  5. Feature Extraction:
    • Creates new features from the original dataset, retaining essential information for model performance.

Pros of Dimensionality Reduction

  6. Reduced Storage Requirements:
    • Less data means less storage space is needed for large datasets.
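
To illustrate the points above in code, the sketch below (synthetic data, scikit-learn assumed; not part of the notes) compresses 10 correlated features to 2 principal components while retaining most of the variance.

```python
# Hypothetical PCA sketch: reduce 10 correlated features to 2 components.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
factors = rng.normal(size=(100, 2))                        # 2 true underlying factors
X = factors @ rng.normal(size=(2, 10)) + 0.05 * rng.normal(size=(100, 10))

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)                           # shape (100, 2)
print(X_reduced.shape, round(pca.explained_variance_ratio_.sum(), 3))  # ~0.99 variance kept
```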