
IIT BOMBAY

Assignment #2

Points: 80

Course: CS 725 – Instructor: Preethi Jyothi
Due date: 11:59 pm, October 28, 2024

General Instructions

  • Download the file assgmt2.tgz from Moodle and extract the file to get a directory named assgmt2 with all the necessary files within.
  • Here are the TAs to approach for each part of the assignment:
    - Part I: Tejomay, Sabyasachi, Sona
    - Part II: Darshan, Snegha, Sabyasachi
    - Part III: Poulami, Sameer
    - Part IV: Amruta, Harsh
    - Part V: Soumen
  • For your final submission, create a directory named assgmt2 with the following internal directory structure:

assgmt2/
|
+- nn_template.py
+- part2.pdf
+- loss_bgd_1.png, loss_bgd_2.png, loss_bgd_3.png
+- loss_adam.png
+- convolution_1d_template.py
+- template_2dconv.py
+- part5.py [EXTRA CREDIT]
+- kaggle.csv [EXTRA CREDIT]

Compress your submission directory using the command: tar -cvzf [rollno1]_[rollno2].tgz assgmt2 and upload this .tgz to Moodle. Make sure the filename is the roll numbers of all team members delimited by "_". This submission is due on or before 11:59 pm on Oct 28, 2024. No extensions will be entertained.

  • STRICTLY FOLLOW THE SUBMISSION GUIDELINES. Any deviation from these guidelines will result in penalties.

Part I: Implement a Feedforward Neural Network (25 points)

For this problem, you will implement a feedforward neural network training algorithm from scratch. This network will have:

  • An input layer
  • One or more hidden layers with configurable dimensions (via hidden_dims) and activation functions (via activations)
  • A single output node with a sigmoid activation and a binary cross-entropy loss

nn_template.py outlines the structure of the neural network with support for various activation functions and optimizers. Your task is to complete the missing portions of the code, labeled as TODOs in nn_template.py. This neural network is designed to classify data into binary classes (denoted by 0, 1).

Firstly, implement the sigmoid and tanh functions and their derivatives. These will be used in both the forward and the backward passes of the network. Complete the missing functions sigmoid(x) in TODO 1a, sigmoid_derivative(x) in TODO 1b, tanh(x) in TODO 1c and tanh_derivative(x) in TODO 1d. relu(x) and relu_derivative(x) are already implemented. Note that the sigmoid function is σ(x) = 1 / (1 + e^(−x)), and its derivative is σ′(x) = σ(x) · (1 − σ(x)). The tanh function is tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)), and its derivative is tanh′(x) = 1 − tanh²(x). [2 pts]
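As a quick reference, here is a minimal numpy sketch of these four functions; the names match the TODOs above, but the exact signatures and conventions in nn_template.py may differ.

import numpy as np

def sigmoid(x):
    # sigma(x) = 1 / (1 + e^(-x)), applied elementwise
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # sigma'(x) = sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh(x):
    # tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
    return np.tanh(x)

def tanh_derivative(x):
    # tanh'(x) = 1 - tanh(x)^2
    return 1.0 - np.tanh(x) ** 2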

Next, implement the following functions within the NN class.

  • init: Initialize the weights and biases for all layers, including the output layer. (For example, in the case of a single hidden layer, there are two sets of weights corresponding to the two affine layers between (1) the input and the hidden layer and (2) the hidden layer and the output node.) Use random initializations for all these weights by sampling from a Gaussian distribution with a mean of 0 and a standard deviation of 1. This is TODO 2. [3 pts]
  • forward: Compute the activations for all the hidden nodes and the output node. Use the activation function corresponding to each layer as provided in the parameter activations. Use separate variables to compute the weighted sum of inputs coming into each node (e.g., z₁ = w₁x + b₁), and the output after applying the activation function to the weighted sum (e.g., a₁ = g(z₁)). This completes TODO 3a and TODO 3b. [6 pts]
  • backward: In this function, you will compute gradients of the loss function with respect to all the weights and biases. Recall the expression for the binary cross-entropy loss for predictions ŷ and target values y, given by L(y, ŷ):

L(y, ŷ) = −[y log(ŷ) + (1 − y) log(1 − ŷ)]

The derivative of this loss with respect to ŷ is:

∂L/∂ŷ = −y/ŷ + (1 − y)/(1 − ŷ)

  • step_bgd: Implement the code for a vanilla batch gradient descent update within this function using a static learning rate. This corresponds to a gd_flag value of 1. This will complete TODO 5a. Update the weights and biases of the affine layers by performing a gradient descent step, using their corresponding gradients, delta_weights and delta_biases, respectively. [2 pts]

train: Over num_epochs, the function train calls the forward and backward passes and computes gradients for all the training examples in a batch (stored in X and y) and updates the weights using an optimizer. The training loop also calculates train and test losses after every epoch.

If your implementation is correct, your code will converge in less than 30 epochs using batch_size = 100 and your test accuracy will be a perfect 1.0. During evaluation, we will also check your code on a new dataset.
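To make the vanilla batch gradient descent update concrete, here is a minimal sketch of what step_bgd could look like; the attribute names self.weights and self.biases and the learning_rate argument are placeholders that may not match nn_template.py exactly, while delta_weights and delta_biases are the gradients produced by backward.

def step_bgd(self, learning_rate, delta_weights, delta_biases):
    # One vanilla gradient descent step with a static learning rate:
    # parameter <- parameter - learning_rate * gradient, layer by layer.
    for i in range(len(self.weights)):
        self.weights[i] -= learning_rate * delta_weights[i]
        self.biases[i] -= learning_rate * delta_biases[i]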

Part II: Explainability in Neural Networks (20 points)

All modern neural networks, despite exhibiting remarkable performance, lack interpretability or explainability.[a] In this section, we will aim to design neural networks for two toy datasets that are both optimal (in not needing more layers than required for perfect separability) and interpretable.

You are given two binary classification datasets shown in Figure 2 and Figure 3. Points within the blue regions are labeled 1 and the rest are labeled 0. Assume that points on the boundaries are labeled 1. Design an optimal neural network for each dataset that perfectly classifies the points, and provide written explanations of what each neuron is aiming to do.

Note the following points when designing your neural network:

  1. Your solution has to be a simple feed-forward neural network, with the minimum number of hidden layers. (NOTE: Partial points will be awarded for non-optimal but correct solutions that have more than the minimum number of layers.)
  2. Include an explanation of what each neuron in the hidden layers is doing.
  3. Inputs to the NN are x₁ and x₂. No transformations to the inputs are allowed.
  4. Assume the following threshold-based activation function g(x; T) for all the neurons in your neural network:

g(x; T) = 1 if x ≥ T, and 0 otherwise

  5. No skip or self connections are allowed. That is, neurons in the i-th layer are only connected to neurons in the (i − 1)-th layer.

[Figure residue: axis ticks from 0 to 5, boundary lines x₁ + x₂ = 1, x₁ + x₂ = 4, x₁ + x₂ = 6 and x₁ + x₂ = 9, and two network diagrams labeled (A) and (B) with inputs x₁, x₂ and the threshold activation g(x; T); see the Figure 2 caption below.]

Figure 2: Bands of blue [8 pts]

[Figure 3 plot: a blue region on the (x₁, x₂) plane with both axes ranging from 0 to 10.]

Figure 3: Catch the star [12 pts]

Here’s a sample problem and its solution to illustrate what we expect in your answer. Note that you need to submit one pdf file titled part2.pdf that contains drawings of both neural networks and explanations for the hidden neurons. Feel free to use your code written in Part I to validate your answers. However, DO NOT submit any of these files. We will be grading solely based on your written solutions in the pdf.
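For intuition only (the numbers below are illustrative and not a solution to either dataset), a single neuron under the threshold activation implements a half-plane test; for instance, a neuron with incoming weights (1, 1) and threshold T = 4 fires exactly for points on or above the line x₁ + x₂ = 4. A quick numpy check:

import numpy as np

def g(x, T):
    # Threshold activation: 1 if x >= T, else 0
    return np.where(x >= T, 1, 0)

def neuron(x1, x2, w1=1.0, w2=1.0, T=4.0):
    # Fires when w1*x1 + w2*x2 >= T, i.e. on or above the line x1 + x2 = 4
    return g(w1 * x1 + w2 * x2, T)

print(neuron(3, 2))  # 1, since 3 + 2 = 5 >= 4
print(neuron(1, 1))  # 0, since 1 + 1 = 2 < 4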

[a] There is an entire sub-field of deep learning that focuses on explainability. Here's a survey paper.

Part IV: De-convoluting Convolutions (30 points)

1D Convolutions. The 1D convolution in machine learning is actually the discrete convolution between two functions f and g, defined on the set of integers, Z. The convolution of f and g is defined as:

(f ∗ g)[n] = ∑_{m=−∞}^{∞} f[m] g[n − m]

In the context of machine learning, since we deal with finite-length arrays, the sum above is computed only for those indices that are within the bounds of the arrays. (For indices that are outside the bounds, say an m greater than the length of the array f, the corresponding value of the discrete function is taken to be zero.) We will call f above the "input" of the convolution, and g the "kernel" or "filter" that is performing the convolution.

Padding. Let us quickly walk you through the concept of padding. Depending on what we want the output size to be, we can pad the input of the convolution (see Figure 4).

  • Full padding implies that the input is padded to the largest possible extent to overlap with the kernel.
  • Valid padding implies that the kernel is always positioned inside the input.
  • Same padding is used to obtain an output of the same size as the input. Imagine a 1 × 3 kernel and a 1 × 7 input array in Figure 4, to get the 1D analogues of padding.

Stride. Next, let us look at stride. The stride of a convolution operation dictates how many positions the kernel moves at each step. Stride is typically used to reduce the dimensions of the input; see Figure 5. Note that the stride and kernel size are two independent parameters. If the kernel size in Figure 5 were 4, and assuming valid padding, the output size would be 2 × 2. Finally, if f and g are arrays of length L_f and L_g respectively, then the length of the convolution of f and g with stride 1 and full padding is L_f + L_g − 1.

(A) 1D Conv. Implement a function Convolution_1D in the file convolution_1d_template.py that takes in two 1-dimensional numpy arrays, and computes the 1D convolution of the two arrays, with a given stride and 'valid' or 'full' as possible padding modes. [6 pts]

Figure 4: Full, same and valid padding

Figure 5: Illustration of a convolution with stride 2
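A minimal numpy sketch of the kind of function Part (A) asks for is shown below; the handling of the stride and of the 'valid'/'full' modes is one straightforward interpretation, and the exact signature expected in convolution_1d_template.py may differ.

import numpy as np

def Convolution_1D(f, g, stride=1, padding="valid"):
    # Discrete convolution (f * g)[n] = sum_m f[m] * g[n - m], evaluated at
    # the output positions allowed by the padding mode and the stride.
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    if padding == "full":
        # Pad f with len(g) - 1 zeros on each side so every partial overlap
        # counts; with stride 1 this yields the length L_f + L_g - 1 noted above.
        f = np.pad(f, (len(g) - 1, len(g) - 1))
    elif padding != "valid":
        raise ValueError("padding must be 'valid' or 'full'")
    g_flipped = g[::-1]  # convolution flips the kernel before sliding it
    out = []
    for start in range(0, len(f) - len(g) + 1, stride):
        out.append(np.dot(f[start:start + len(g)], g_flipped))
    return np.array(out)

For example, Convolution_1D([1, 2, 3], [1, 1], padding="full") returns [1, 3, 5, 3], matching np.convolve([1, 2, 3], [1, 1]).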

(B) Two dice. You are given two dice, an n-faced die and an m-faced die. Both dice are biased, that is, the probabilities of landing on each face are not equal. You are given the probability mass functions of the two dice as two numpy arrays. For example, consider a 6-faced die A and a 3-faced die B with the following probability mass functions:

pA = [0.1, 0.2, 0.3, 0.1, 0.1, 0.2] pB = [0.3, 0.4, 0.3]

The two dice are rolled together. Let us try to compute the probability that the sum of the faces of the two dice is equal to k. As a concrete example, for dice A and B above, let us try to compute the probability that the sum of the faces is equal to 7. The possibilities and their associated probabilities are:

(6A, 1B): 0.2 × 0.3 = 0.06
(5A, 2B): 0.1 × 0.4 = 0.04
(4A, 3B): 0.1 × 0.3 = 0.03

Summing these, the probability that the two faces add up to 7 is 0.06 + 0.04 + 0.03 = 0.13.
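Computing this probability for every possible sum is exactly a discrete convolution of the two probability mass functions, so the example above can be checked with a few lines of numpy (shown here purely as an illustration, not as the required deliverable):

import numpy as np

pA = np.array([0.1, 0.2, 0.3, 0.1, 0.1, 0.2])  # P(A = 1), ..., P(A = 6)
pB = np.array([0.3, 0.4, 0.3])                  # P(B = 1), P(B = 2), P(B = 3)

# Full 1D convolution gives P(A + B = 2), ..., P(A + B = 9) in order.
p_sum = np.convolve(pA, pB)
print(p_sum[7 - 2])  # P(A + B = 7) = 0.06 + 0.04 + 0.03 = 0.13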

Figure 6: Illustration of edge detection using convolution.

Figure 7: Illustration of noise removal.

(E) Image Denoising. Various filters can be used to pick up high and low frequency components of an image. Thus, they can be used for image denoising: noise in images is usually high frequency, and certain filters can be used to remove this high frequency component. You are given an image with noise added to it, noisycutebird.png. Using only our helper functions load_image and save_image and the numpy library, write a function remove_noise to remove this noise while maintaining the sharpness of the image edges. See Figure 7. This function must take a square image patch as input, apply the filtering operation to the patch and return a single pixel value. Pass this function to your movePatchOverImg function with the provided image and save the resulting image. [2 pts]
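One common edge-preserving choice for such a per-patch filter is a median filter; this is an assumption for illustration, not necessarily the intended solution.

import numpy as np

def remove_noise(patch):
    # Replace the centre pixel by the median of the square patch: this
    # suppresses isolated high-frequency noise while keeping edges sharper
    # than a plain mean would.
    return np.median(patch)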

(F) Unsharp masking. Unsharp masking is a technique for creating the illusion of a sharp image by enhancing the intensity of the edges of the image. A mask of the edges is created and added to the image. We will do this using Gaussian blur, which is a 2D convolution of an image with a Gaussian kernel:

gaussian(x, y) = (1 / (2πσ²)) · exp( −[ (x − (size − 1)/2)² + (y − (size − 1)/2)² ] / (2σ²) )

Figure 8: Illustration of unsharp masking

  1. Write a function create_gaussian_kernel that takes as input a filter size (an odd number) and a σ, and returns a square Gaussian kernel of that size. Ensure that the kernel values add up to 1. (A minimal sketch appears after this list.) [2 pts]
  2. Write a function gaussian_blur to apply Gaussian blur to an image. This function must take a square image patch as input, generate a Gaussian kernel of size 25 and σ = 1.0 using create_gaussian_kernel, apply the kernel to the patch, and return a single pixel value. [2 pts]
  3. Write a function unsharp_masking that takes as input an image and a scaling factor and performs unsharp masking using the steps illustrated in Figure 8, to return the "sharp" image. Your unsharp_masking function should convert the image to grayscale. Use your movePatchOverImg function with your gaussian_blur function and a kernel size of 25. Take measures to prevent overflow of array values. Perform unsharp masking on the given image with an appropriate scaling factor and save the output image. [5 pts]

Use only our helper functions load_image and save_image and the numpy library. For all the tasks in this part, your saved output images should be grayscale.
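The following sketch shows one way to realize steps 1 and 2, assuming the centering (size − 1)/2 from the formula above; the exact behavior expected by the autograder may differ.

import numpy as np

def create_gaussian_kernel(size, sigma):
    # Build a size x size Gaussian kernel centred at (size - 1) / 2 and
    # normalize it so that its values add up to 1.
    center = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size]
    kernel = np.exp(-((x - center) ** 2 + (y - center) ** 2) / (2.0 * sigma ** 2))
    kernel /= 2.0 * np.pi * sigma ** 2
    return kernel / kernel.sum()

def gaussian_blur(patch):
    # Apply a size-25, sigma = 1.0 Gaussian kernel to a 25 x 25 patch and
    # return the single blurred pixel value.
    kernel = create_gaussian_kernel(25, 1.0)
    return float(np.sum(patch * kernel))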

Kaggle Competition: Extra Credit (5 points)

Challenge: Classification from Image Features.

Overview. In this Kaggle competition, you will solve a realistic classification task to predict a class label (0 to 99) based on a set of visual features extracted from the publicly available ImageNet dataset. The final model will be evaluated on a test dataset via Kaggle, and test performance will be measured using Accuracy.

Competition link: You can join the competition on Kaggle: IIT Bombay CS 725 Assignment 2 (Autumn 2024). Please sign up on Kaggle using your IITB LDAP email ID, with your Kaggle "Display Name" set to the roll number of any member in your team. This is important for us to identify you on the leaderboard.

Dataset description. You are given three CSV files:

  • train.csv: This file contains the training data with 64 features and a corresponding target label for each entry.
  • test.csv: This file contains the test data with 64 features but without the target label.
  • sample.csv: This file contains the submission format with a predicted label for the test data. You will have to submit such a file with your test predictions.

Each row in the data files represents an instance with the following columns:
  • ID: A unique identifier for each data point.
  • feature_0, feature_1, ..., feature_63: The 64 features extracted from the dataset.
  • label: The target label for each data point (only in train.csv).

Task description. Implement a classification model for the given problem. You are free to use any of the predefined neural network layers from PyTorch or other ML libraries with any choice of optimizers and regularization techniques. You do not need to stick to the code you've written in Part I. Tune the hyperparameters on a held-out set from train.csv to achieve the best model performance on the test set. Predict the target label on the test dataset.
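As a starting point only, a minimal PyTorch pipeline for this task might look like the sketch below; the column names follow the dataset description above (check sample.csv for the exact submission header), while the network width, optimizer settings and number of epochs are arbitrary assumptions to be tuned on a held-out split.

import pandas as pd
import torch
import torch.nn as nn

train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")
feature_cols = [f"feature_{i}" for i in range(64)]

X = torch.tensor(train[feature_cols].values, dtype=torch.float32)
y = torch.tensor(train["label"].values, dtype=torch.long)
X_test = torch.tensor(test[feature_cols].values, dtype=torch.float32)

# A small MLP mapping 64 features to 100 class logits; width/depth are assumptions.
model = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 100))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(50):  # number of epochs is an arbitrary choice
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    preds = model(X_test).argmax(dim=1).numpy()

# Write predictions in the format of sample.csv (ID, label).
pd.DataFrame({"ID": test["ID"], "label": preds}).to_csv("kaggle.csv", index=False)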

Evaluation. The performance of your model will be evaluated based on the classification accuracy calculated on the test dataset (automatically via Kaggle). Your model will be evaluated on the provided test set, where a random 50% of the examples are marked as private and the remaining are public. The final evaluation will be based on the private part of the test set, which will be revealed via the private leaderboard after the competition concludes.

Submission. Submit your source file named part5.py and a CSV file kaggle.csv with your predicted labels for the test dataset, following the format in sample.csv. This is an extra credit problem. Top-scoring performers on the "Private Leaderboard" (with a fairly relaxed threshold determined after the deadline passes) will be awarded up to 5 extra points.