Regularization in Machine Learning: Study Notes (Introduction to Machine Learning)

Regularization in machine learning is a technique used to prevent overfitting by adding a penalty to the model's complexity. Overfitting occurs when a model learns not only the underlying patterns in the training data but also the noise, resulting in poor performance on new, unseen data. Regularization helps in creating simpler models that generalize better to new data.
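To make this concrete, here is a minimal sketch (not taken from these notes; the synthetic sine data, polynomial degree, and alpha value are illustrative assumptions) comparing an unregularized high-degree polynomial fit against an L2-regularized one in scikit-learn:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Noisy sine-shaped data: a flexible model can memorize the noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, model in [("unregularized", LinearRegression()),
                    ("L2-regularized", Ridge(alpha=1.0))]:
    # Degree-15 polynomial features give the model plenty of room to overfit.
    pipe = make_pipeline(PolynomialFeatures(degree=15), StandardScaler(), model)
    pipe.fit(X_train, y_train)
    print(name, "test MSE:", mean_squared_error(y_test, pipe.predict(X_test)))
```

The penalized model typically achieves the lower test error here, which is the generalization benefit the notes describe.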

The main regularization techniques are:

  1. L1 Regularization (Lasso)
  2. L2 Regularization (Ridge)
  3. Elastic Net Regularization
1. L1 Regularization (Lasso)

  • Mechanism: Adds the absolute values of the coefficients to the loss function: Cost Function = Loss + λ∑ |Wi|. Here, λ is the regularization parameter (a hyperparameter that can take any value from 0 to infinity) and Wi are the model coefficients.
  • Effect: L1 regularization tends to produce sparse models, where some of the coefficients are exactly zero. This property makes Lasso useful for feature selection, as it effectively removes less important features.
  • Use Cases: When you want to perform feature selection, or when you have a large number of features and suspect that many of them are irrelevant.
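A minimal scikit-learn sketch of this sparsity effect; the synthetic dataset and the alpha value (scikit-learn's name for λ) are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 20 features, but only 5 actually influence the target.
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

lasso = Lasso(alpha=1.0)  # alpha plays the role of lambda in the cost function
lasso.fit(X, y)

# Many coefficients are driven exactly to zero -- implicit feature selection.
print("non-zero coefficients:", np.sum(lasso.coef_ != 0), "of", X.shape[1])
```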

2. L2 Regularization (Ridge)

  • Mechanism: Adds the squared values of the coefficients to the loss function: Cost Function = Loss + λ∑ (Wi)².
  • Effect: L2 regularization distributes the penalty among all coefficients, leading to smaller coefficients overall but rarely zeroing them out completely. It helps in stabilizing the model by reducing the variance.
  • Use Cases: When you have many features and believe that most of them contribute to the outcome, but you want to avoid large coefficients that might make the model sensitive to small changes in the input data.
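A minimal sketch of this shrinkage behavior with scikit-learn's Ridge; the dataset and alpha values are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Larger alpha shrinks all coefficients toward zero, but rarely to exactly zero.
for alpha in [0.01, 1.0, 100.0]:
    ridge = Ridge(alpha=alpha).fit(X, y)
    print(f"alpha={alpha:6.2f}  max |coef| = {np.abs(ridge.coef_).max():8.2f}  "
          f"zero coefs = {np.sum(ridge.coef_ == 0)}")
```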

3. Elastic Net Regularization

  • Mechanism: Combines both L1 and L2 penalties: Cost Function = Loss + λ1∑ |Wi| + λ2∑ (Wi)², where λ1 and λ2 control the strength of the L1 and L2 terms respectively.
  • Effect: Elastic Net benefits from both L1 and L2 regularization, promoting sparsity while also maintaining some stability in the model.
  • Use Cases: When you have highly correlated features, Elastic Net can help by selecting groups of correlated features together. It is also useful when you want a balance between feature selection (L1) and coefficient shrinkage (L2).
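A minimal sketch with scikit-learn's ElasticNet; note that scikit-learn parameterizes the mix with a single alpha plus an l1_ratio rather than two separate λ values, and the values below are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

# l1_ratio=1.0 is pure Lasso, 0.0 is pure Ridge; 0.5 is an equal mix.
enet = ElasticNet(alpha=1.0, l1_ratio=0.5)
enet.fit(X, y)
print("non-zero coefficients:", np.sum(enet.coef_ != 0), "of", X.shape[1])
```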
Benefits of regularization:

  1. Prevents Overfitting: By adding a penalty for large coefficients, regularization discourages the model from fitting the noise in the training data, leading to better generalization.
  2. Simplifies the Model: Especially with L1 regularization, unimportant features can be removed, resulting in a simpler and more interpretable model.
  3. Stabilizes Predictions: Regularization reduces the variance of the model's predictions, making them more stable and less sensitive to fluctuations in the input data.

In the context of linear regression, the standard objective function without regularization is: Cost Function = ∑ (yi − ŷi)², where yi are the actual values and ŷi are the predicted values. With regularization, this becomes:

• L1 Regularization: Cost Function = ∑ (yi − ŷi)² + λ∑ |Wi|

• L2 Regularization: Cost Function = ∑ (yi − ŷi)² + λ∑ (Wi)²

• Elastic Net: Cost Function = ∑ (yi − ŷi)² + λ1∑ |Wi| + λ2∑ (Wi)²
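These cost functions are simple enough to compute directly; the sketch below does so with NumPy for an invented set of actual values, predictions, coefficients, and λ values (all numbers are illustrative assumptions):

```python
import numpy as np

y     = np.array([3.0, -0.5, 2.0, 7.0])   # actual values y_i
y_hat = np.array([2.5,  0.0, 2.0, 8.0])   # predicted values y-hat_i
w     = np.array([0.5, -1.2, 0.0, 2.0])   # model coefficients W_i
lam1, lam2 = 0.1, 0.1                      # regularization strengths

loss    = np.sum((y - y_hat) ** 2)               # sum of squared errors
l1_cost = loss + lam1 * np.sum(np.abs(w))        # Lasso cost
l2_cost = loss + lam2 * np.sum(w ** 2)           # Ridge cost
en_cost = loss + lam1 * np.sum(np.abs(w)) + lam2 * np.sum(w ** 2)  # Elastic Net

print(f"L1: {l1_cost:.3f}  L2: {l2_cost:.3f}  Elastic Net: {en_cost:.3f}")
```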

Regularization is a standard feature in many machine learning libraries. For example:

  • Ridge Regression: sklearn.linear_model.Ridge
  • Lasso Regression: sklearn.linear_model.Lasso
  • Elastic Net: sklearn.linear_model.ElasticNet
  • Keras: you can add L1, L2, or both penalties to a layer's weights using the kernel_regularizer parameter.
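A minimal Keras sketch (layer sizes, input dimension, and penalty strengths are illustrative assumptions) showing all three penalty types attached via kernel_regularizer:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    keras.Input(shape=(20,)),  # 20 input features (assumed)
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(0.01)),   # L2 (Ridge-style)
    layers.Dense(32, activation="relu",
                 kernel_regularizer=regularizers.l1(0.01)),   # L1 (Lasso-style)
    layers.Dense(1,
                 kernel_regularizer=regularizers.l1_l2(l1=0.01, l2=0.01)),  # both
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```

During training, each penalty term is added to the loss, discouraging large weights in exactly the way the cost functions above describe.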