Table of Contents:

  • Generating some data
  • Initializing the parameters, computing the class scores, computing the loss
  • Computing the analytic gradient with backpropagation
  • Performing a parameter update
  • Putting it all together: training a Softmax classifier, training a neural network

In this section we’ll walk through a complete implementation of a toy Neural Network in 2 dimensions. We’ll first implement a simple linear classifier and then extend the code to a 2-layer Neural Network. As we’ll see, this extension is surprisingly simple and very few changes are necessary.

Let's generate a classification dataset that is not easily linearly separable. Our favorite example is the spiral dataset, which can be generated as follows:
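A minimal NumPy sketch of such a generator (the exact constants, e.g. 100 points per class and the noise level, are assumptions):

```python
import numpy as np

N = 100  # number of points per class
D = 2    # dimensionality
K = 3    # number of classes
X = np.zeros((N * K, D))            # data matrix (each row = single example)
y = np.zeros(N * K, dtype='uint8')  # class labels
for j in range(K):
    ix = range(N * j, N * (j + 1))
    r = np.linspace(0.0, 1, N)                                          # radius
    t = np.linspace(j * 4, (j + 1) * 4, N) + np.random.randn(N) * 0.2   # theta
    X[ix] = np.c_[r * np.sin(t), r * np.cos(t)]
    y[ix] = j
```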

[Figure: the toy spiral dataset with three classes.]

Normally we would want to preprocess the dataset so that each feature has zero mean and unit standard deviation, but in this case the features are already in a nice range from -1 to 1, so we skip this step.

Training a Softmax Linear Classifier

Let's first train a Softmax classifier on this classification dataset. As we saw in the previous sections, the Softmax classifier has a linear score function and uses the cross-entropy loss. The parameters of the linear classifier consist of a weight matrix W and a bias vector b for each class. Let's first initialize these parameters to be random numbers:
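A sketch of the initialization (the 0.01 scale on the weights is an assumption):

```python
W = 0.01 * np.random.randn(D, K)
b = np.zeros((1, K))
```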

Recall that D = 2 is the dimensionality and K = 3 is the number of classes.

Since this is a linear classifier, we can compute all class scores very simply in parallel with a single matrix multiplication:
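Using the W and b initialized above (this matches the expression np.dot(X, W) + b quoted later in the text):

```python
# compute class scores for a linear classifier
scores = np.dot(X, W) + b
```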

In this example we have 300 2-D points, so after this multiplication the array scores will have size [300 x 3], where each row gives the class scores corresponding to the 3 classes (blue, red, yellow).

The second key ingredient we need is a loss function, which is a differentiable objective that quantifies our unhappiness with the computed class scores. Intuitively, we want the correct class to have a higher score than the other classes. When this is the case, the loss should be low and otherwise the loss should be high. There are many ways to quantify this intuition, but in this example let's use the cross-entropy loss that is associated with the Softmax classifier. Recall that if \(f\) is the array of class scores for a single example (e.g. array of 3 numbers here), then the Softmax classifier computes the loss for that example as:
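Writing \(y_i\) for the index of the correct class of example \(i\):

$$
L_i = -\log\left(\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}\right)
$$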

We can see that the Softmax classifier interprets every element of \(f\) as holding the (unnormalized) log probabilities of the three classes. We exponentiate these to get (unnormalized) probabilities, and then normalize them to get probabilities. Therefore, the expression inside the log is the normalized probability of the correct class. Note how this expression works: this quantity is always between 0 and 1. When the probability of the correct class is very small (near 0), the loss will go towards (positive) infinity. Conversely, when the correct class probability goes towards 1, the loss will go towards zero because \(\log(1) = 0\). Hence, the expression for \(L_i\) is low when the correct class probability is high, and it’s very high when it is low.

Recall also that the full Softmax classifier loss is then defined as the average cross-entropy loss over the training examples and the regularization:
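Concretely, for \(N\) training examples and regularization strength \(\lambda\):

$$
L = \underbrace{\frac{1}{N}\sum_i L_i}_{\text{data loss}} + \underbrace{\frac{1}{2}\lambda \sum_k \sum_l W_{k,l}^2}_{\text{regularization loss}}
$$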

Given the array of scores we’ve computed above, we can compute the loss. First, the way to obtain the probabilities is straightforward:
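A sketch of this step (the names num_examples and exp_scores are assumptions):

```python
num_examples = X.shape[0]
# exponentiate to get unnormalized probabilities
exp_scores = np.exp(scores)
# normalize each row so it sums to one: shape [300 x 3]
probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
```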

We now have an array probs of size [300 x 3], where each row now contains the class probabilities. In particular, since we’ve normalized them every row now sums to one. We can now query for the log probabilities assigned to the correct classes in each example:
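Using NumPy's integer array indexing to pick out, for each row, the probability assigned to the correct class:

```python
correct_logprobs = -np.log(probs[range(num_examples), y])
```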

The array correct_logprobs is a 1D array of the negative log probabilities assigned to the correct class of each example. The full loss is then the average of these values plus the regularization loss:
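A sketch of the loss computation (the names data_loss and reg_loss are assumptions; reg holds the regularization strength, as described below):

```python
# average cross-entropy loss and regularization
data_loss = np.sum(correct_logprobs) / num_examples
reg_loss = 0.5 * reg * np.sum(W * W)
loss = data_loss + reg_loss
```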

In this code, the regularization strength \(\lambda\) is stored in reg. The convenience factor of 0.5 multiplying the regularization will become clear in a second. Evaluating this in the beginning (with random parameters) might give us loss = 1.1, which is -np.log(1.0/3), since with small initial random weights all probabilities assigned to all classes are about one third. We now want to make the loss as low as possible, with loss = 0 as the absolute lower bound. The lower the loss, the higher the probabilities assigned to the correct classes for all examples.

Computing the Analytic Gradient with Backpropagation

We have a way of evaluating the loss, and now we have to minimize it. We’ll do so with gradient descent. That is, we start with random parameters (as shown above), and evaluate the gradient of the loss function with respect to the parameters, so that we know how we should change the parameters to decrease the loss. Let's introduce the intermediate variable \(p\), which is a vector of the (normalized) probabilities. The loss for one example is:
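That is, with \(p\) defined componentwise:

$$
p_k = \frac{e^{f_k}}{\sum_j e^{f_j}} \qquad\qquad L_i = -\log\left(p_{y_i}\right)
$$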

We now wish to understand how the computed scores inside \(f\) should change to decrease the loss \(L_i\) that this example contributes to the full objective. In other words, we want to derive the gradient \( \partial L_i / \partial f_k \). The loss \(L_i\) is computed from \(p\), which in turn depends on \(f\). It’s a fun exercise for the reader to use the chain rule to derive the gradient, but it turns out to be extremely simple and interpretable in the end, after a lot of things cancel out:
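After the cancellations, the gradient takes the form (where \(\mathbb{1}(\cdot)\) is the indicator function):

$$
\frac{\partial L_i}{\partial f_k} = p_k - \mathbb{1}(y_i = k)
$$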

Notice how elegant and simple this expression is. Suppose the probabilities we computed were p = [0.2, 0.3, 0.5], and that the correct class was the middle one (with probability 0.3). According to this derivation the gradient on the scores would be df = [0.2, -0.7, 0.5]. Recalling the interpretation of the gradient, we see that this result is highly intuitive: increasing the first or last element of the score vector f (the scores of the incorrect classes) leads to an increased loss (due to the positive signs +0.2 and +0.5) - and increasing the loss is bad, as expected. However, increasing the score of the correct class has a negative influence on the loss. The gradient of -0.7 is telling us that increasing the correct class score would lead to a decrease of the loss \(L_i\), which makes sense.

All of this boils down to the following code. Recall that probs stores the probabilities of all classes (as rows) for each example. To get the gradient on the scores, which we call dscores , we proceed as follows:
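A sketch of this step (the division by num_examples averages the gradient over the batch, matching the \(\frac{1}{N}\) in the data loss):

```python
dscores = probs
dscores[range(num_examples), y] -= 1
dscores /= num_examples
```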

Lastly, we had that scores = np.dot(X, W) + b , so armed with the gradient on scores (stored in dscores ), we can now backpropagate into W and b :
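A sketch of the backward pass through the linear layer:

```python
dW = np.dot(X.T, dscores)
db = np.sum(dscores, axis=0, keepdims=True)
dW += reg * W  # don't forget the regularization gradient
```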

Here we see that we have backpropagated through the matrix multiply operation, and also added the contribution from the regularization. Note that the regularization gradient has the very simple form reg*W since we used the constant 0.5 for its loss contribution (i.e. \(\frac{d}{dw} ( \frac{1}{2} \lambda w^2) = \lambda w\)). This is a common convenience trick that simplifies the gradient expression.

Now that we’ve evaluated the gradient we know how every parameter influences the loss function. We will now perform a parameter update in the negative gradient direction to decrease the loss:
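A sketch of the update (the learning-rate variable step_size is an assumption):

```python
# vanilla gradient descent: step in the negative gradient direction
W += -step_size * dW
b += -step_size * db
```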

Putting all of this together, here is the full code for training a Softmax classifier with Gradient descent:
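A sketch of how the pieces above fit together (the hyperparameter values and the 200-iteration budget are assumptions):

```python
# initialize parameters randomly
W = 0.01 * np.random.randn(D, K)
b = np.zeros((1, K))

# hyperparameters (assumed values)
step_size = 1e-0
reg = 1e-3  # regularization strength

num_examples = X.shape[0]
for i in range(200):

    # evaluate class scores, [N x K]
    scores = np.dot(X, W) + b

    # compute the class probabilities
    exp_scores = np.exp(scores)
    probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

    # compute the loss: average cross-entropy loss and regularization
    correct_logprobs = -np.log(probs[range(num_examples), y])
    data_loss = np.sum(correct_logprobs) / num_examples
    reg_loss = 0.5 * reg * np.sum(W * W)
    loss = data_loss + reg_loss
    if i % 10 == 0:
        print("iteration %d: loss %f" % (i, loss))

    # compute the gradient on the scores
    dscores = probs
    dscores[range(num_examples), y] -= 1
    dscores /= num_examples

    # backpropagate the gradient to the parameters (W, b)
    dW = np.dot(X.T, dscores)
    db = np.sum(dscores, axis=0, keepdims=True)
    dW += reg * W  # regularization gradient

    # perform a parameter update
    W += -step_size * dW
    b += -step_size * db
```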

Running this prints the output:

We see that we’ve converged to something after about 190 iterations. We can evaluate the training set accuracy:
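A sketch of the evaluation:

```python
# evaluate training set accuracy
scores = np.dot(X, W) + b
predicted_class = np.argmax(scores, axis=1)
print('training accuracy: %.2f' % (np.mean(predicted_class == y)))
```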

This prints 49% . Not very good at all, but also not surprising given that the dataset is constructed so it is not linearly separable. We can also plot the learned decision boundaries:

[Figure: decision boundaries learned by the linear classifier.]

Clearly, a linear classifier is inadequate for this dataset and we would like to use a Neural Network. One additional hidden layer will suffice for this toy data. We will now need two sets of weights and biases (for the first and second layers):
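A sketch of the initialization (the hidden layer size of 100 is an assumption):

```python
h = 100  # size of the hidden layer (an assumption)
W = 0.01 * np.random.randn(D, h)
b = np.zeros((1, h))
W2 = 0.01 * np.random.randn(h, K)
b2 = np.zeros((1, K))
```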

The forward pass to compute scores now changes form:
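A sketch of the new forward pass:

```python
# evaluate class scores with a 2-layer Neural Network
hidden_layer = np.maximum(0, np.dot(X, W) + b)  # ReLU activation
scores = np.dot(hidden_layer, W2) + b2
```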

Notice that the only change from before is one extra line of code, where we first compute the hidden layer representation and then the scores based on this hidden layer. Crucially, we’ve also added a non-linearity, which in this case is a simple ReLU that thresholds the activations of the hidden layer at zero.

Everything else remains the same. We compute the loss based on the scores exactly as before, and get the gradient for the scores dscores exactly as before. However, the way we backpropagate that gradient into the model parameters now changes form, of course. First let's backpropagate the second layer of the Neural Network. This looks identical to the code we had for the Softmax classifier, except we’re replacing X (the raw data) with the variable hidden_layer:
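A sketch of the backward pass into the second layer's parameters:

```python
# backpropagate into parameters W2 and b2
dW2 = np.dot(hidden_layer.T, dscores)
db2 = np.sum(dscores, axis=0, keepdims=True)
```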

However, unlike before we are not yet done, because hidden_layer is itself a function of other parameters and the data! We need to continue backpropagation through this variable. Its gradient can be computed as:
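A sketch of this step (the name dhidden is an assumption):

```python
# gradient on the outputs of the hidden layer
dhidden = np.dot(dscores, W2.T)
```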

Now we have the gradient on the outputs of the hidden layer. Next, we have to backpropagate the ReLU non-linearity. This turns out to be easy because ReLU during the backward pass is effectively a switch. Since \(r = max(0, x)\), we have that \(\frac{dr}{dx} = 1(x > 0) \). Combined with the chain rule, we see that the ReLU unit lets the gradient pass through unchanged if its input was greater than 0, but kills it if its input was less than zero during the forward pass. Hence, we can backpropagate the ReLU in place simply with:
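In code:

```python
# backprop the ReLU non-linearity: zero out the gradient wherever the forward input was <= 0
dhidden[hidden_layer <= 0] = 0
```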

And now we finally continue to the first layer weights and biases:
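A sketch of the final step (as before, the regularization gradients reg*W and reg*W2 should also be added to dW and dW2):

```python
# finally backpropagate into W and b
dW = np.dot(X.T, dhidden)
db = np.sum(dhidden, axis=0, keepdims=True)
```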

We’re done! We have the gradients dW,db,dW2,db2 and can perform the parameter update. Everything else remains unchanged. The full code looks very similar:
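A sketch of the full loop, mirroring the Softmax version above but with the two-layer forward and backward passes swapped in (the hyperparameter values and iteration count are assumptions):

```python
# initialize parameters randomly
h = 100  # size of the hidden layer (an assumption)
W = 0.01 * np.random.randn(D, h)
b = np.zeros((1, h))
W2 = 0.01 * np.random.randn(h, K)
b2 = np.zeros((1, K))

# hyperparameters (assumed values)
step_size = 1e-0
reg = 1e-3

num_examples = X.shape[0]
for i in range(10000):

    # forward pass: hidden layer with ReLU, then class scores
    hidden_layer = np.maximum(0, np.dot(X, W) + b)
    scores = np.dot(hidden_layer, W2) + b2

    # class probabilities and loss (same as for the Softmax classifier)
    exp_scores = np.exp(scores)
    probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
    correct_logprobs = -np.log(probs[range(num_examples), y])
    data_loss = np.sum(correct_logprobs) / num_examples
    reg_loss = 0.5 * reg * np.sum(W * W) + 0.5 * reg * np.sum(W2 * W2)
    loss = data_loss + reg_loss
    if i % 1000 == 0:
        print("iteration %d: loss %f" % (i, loss))

    # gradient on the scores
    dscores = probs
    dscores[range(num_examples), y] -= 1
    dscores /= num_examples

    # backprop: second layer, hidden layer, ReLU, then first layer
    dW2 = np.dot(hidden_layer.T, dscores)
    db2 = np.sum(dscores, axis=0, keepdims=True)
    dhidden = np.dot(dscores, W2.T)
    dhidden[hidden_layer <= 0] = 0
    dW = np.dot(X.T, dhidden)
    db = np.sum(dhidden, axis=0, keepdims=True)

    # add the regularization gradients
    dW2 += reg * W2
    dW += reg * W

    # parameter update
    W += -step_size * dW
    b += -step_size * db
    W2 += -step_size * dW2
    b2 += -step_size * db2
```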

This prints:

The training accuracy is now:
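A sketch of the evaluation, using the two-layer forward pass:

```python
hidden_layer = np.maximum(0, np.dot(X, W) + b)
scores = np.dot(hidden_layer, W2) + b2
predicted_class = np.argmax(scores, axis=1)
print('training accuracy: %.2f' % (np.mean(predicted_class == y)))
```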

This prints 98%! We can also visualize the decision boundaries:

[Figure: decision boundaries learned by the 2-layer Neural Network.]

We’ve worked with a toy 2D dataset and trained both a linear network and a 2-layer Neural Network. We saw that the change from a linear classifier to a Neural Network involves very few changes in the code. The score function changes its form (1 line of code difference), and the backpropagation changes its form (we have to perform one more round of backprop through the hidden layer to the first layer of the network).

  • You may want to look at this IPython Notebook code rendered as HTML .
  • Or download the ipynb file

Deep Neural Network Implementation Using PyTorch

Introduction.

🚀 Dive into the Exciting World of Deep Neural Networks with PyTorch! 🤖🔥

Hey there, fellow tech enthusiast! 🤓 Ever felt like PyTorch is a bit of a puzzle, unlike its more user-friendly counterparts? Well, fret not, because we’re here to spice up your deep learning journey! 🎉

⚡ Introducing the “PyTorch Deep Dive” Series ⚡

Buckle up, because we’re taking you on a whirlwind adventure through the realm of implementing Deep Neural Networks using PyTorch. 🌟 Whether you’re a fresh-faced researcher venturing into the deep learning domain or a pro switching up your frameworks, this series has your back!

📚 It’s Not Your Typical Tutorial 📚

Before we get all technical, let’s set the stage. We’re not going to baby you through PyTorch basics. This isn’t your run-of-the-mill PyTorch tutorial; this is where we show you how to wield the power of PyTorch to craft magnificent Deep Neural Networks. So, if you’ve got your basics down with libraries like NumPy and Matplotlib, you’re golden!

🧠 Deep Dive, Literally 🧠

Hold up! This isn’t Deep Learning 101 either. We’re diving straight into the meaty part – building the real deal! So if you’re nodding your head to terms like Neural Networks and the magic of backpropagation, you’re all set to sail.

🌈 Practical Deep Learning FTW 🌈

Picture this: in our first tutorial, we’re going hands-on with a super simple deep neural network example using PyTorch. 🚀 Trust us, learning through examples is the secret sauce! 🍔🍟 After all, who doesn’t love breaking down complex problems with relatable examples?

🎮 Tutorial Level 1: PyTorch Essentials 🎮

Ready, steady, go! 🏁 We’re kicking things off by giving you the lowdown on PyTorch – why you should use it, what makes it tick, and why it’s your trusty sidekick for the deep learning journey. Then, brace yourself for some action! We’re jumping into a fun classification task. Think of it as a sneak peek into the PyTorch universe: how models are born, trained, and put to the test.

Feeling a bit overwhelmed? No worries at all! Everyone starts somewhere, and this is your starting line. With a sprinkle of patience, a dash of trial and error, and a dollop of dedication, you’ll be whizzing through this in no time.

🚀 Elevate Your Neural Network Game 🚀

Here’s the deal: we’re covering the nitty-gritty, from foundational concepts to crafting your network architecture. By the end of this tutorial, you’ll be waving your PyTorch wand to create powerful deep learning models. 🪄✨

Ready to embark on this journey? We thought so! Head over to the Google Colab notebook at the link below and let’s dive into the world of PyTorch-powered Deep Neural Networks together:

🔗 Notebook Link

Let’s make those neurons dance and those models shine! 🕺💃💻

0. What’s the Hype about PyTorch?

PyTorch, the brainchild of the whizzes at Facebook’s AI Research lab (FAIR), is THE open-source framework empowering deep learning daredevils like you. 🎩✨ Whether you’re a research maestro or a coding ninja, PyTorch is your trusty sidekick for crafting and taming deep neural networks that conquer complexity like champs.

0.1 The “Why-Is-PyTorch-Awesome” Showdown

Hold onto your code hats, because PyTorch packs a punch like no other! Here are the dazzling stars that set PyTorch on the red carpet of deep learning fame:

Python Power-Up : PyTorch speaks fluent Python and cozies up with its libraries, making it the ultimate wingman for your Python-powered projects. 🐍📚

Facebook’s Fav : You know it’s a superstar when even Facebook themselves use it for their deep learning endeavors. 📸👑

User-Friendly Vibes : PyTorch brings the party with an API so intuitive, even your pet parrot could grasp it. 🦜💃

Graphs on the Fly : Ever seen a graph build itself while the code dances? PyTorch’s dynamic computational graphs are the coolest party trick in town. 🕺📊

Speed Racer : PyTorch zips through computations faster than a caffeinated cheetah, giving you a turbo-charged coding experience. ⚡🐆

GPU Awesomeness : With CUDA at its side, PyTorch flexes its muscles on GPUs for lightning-fast execution. Cue applause for speed and power! 🚀💥

Community Carnival : Join the PyTorch parade with a bustling community that brings you libraries, tools, and pre-trained models galore! From torchvision for vision quests to transformers for language escapades, PyTorch’s got you covered. 🎉📚

Deploy Delights : PyTorch’s toolkit ensures your models hit the streets with style. Whether you’re sending them to mobile realms or web wonderlands, PyTorch makes sure they’re dressed to impress. 🚀🌐

0.2 The “Code It Like It’s Hot” Vibe

Welcome to the land of Pythonic perfection! 🎉🐍 PyTorch embraces Python’s charm, giving you a playground of simplicity and elegance. It’s a place where you can code complex computations with the grace of a swan in a tutu. 🦢✨

0.3 PyTorch vs. the Universe

Ah, the grand showdown! PyTorch takes on the titans like TensorFlow, Keras, and Caffe. While each has its own fan club, PyTorch stands out like a supernova in a deep learning galaxy:

Dynamic vs. Static : PyTorch’s dynamic graph brings the house down, letting you tweak, twist, and turn your models on the fly. Static graphs, eat your heart out! 💃🎩

Pythonic Powers : PyTorch speaks your language, letting you express complex ideas with Pythonic flair. Say goodbye to tangled code vines! 🐍✨

Debugging Sorcery : PyTorch has your back with debugging tricks up its sleeves. Spot bugs, squash them, and watch your models shine! 🐞🔮

Behold the enchanting journey of implementing deep learning in PyTorch, as illustrated below:

  • Data preparation : Load data from files or libraries; handle missing values and outliers; normalize or scale features if necessary; convert categorical variables to numerical representations.
  • Train-validation-test split : Divide data into training, validation, and test sets; shuffle the data to ensure randomness; determine appropriate ratios for each set.
  • Architecture design : Decide on the type of neural network (e.g., feedforward, convolutional, recurrent); determine the number of layers and units per layer; choose activation functions for hidden layers.
  • Model definition : Create a custom class inheriting from nn.Module; define layers and operations in the __init__ method; implement the forward pass in the forward method.
  • Training setup : Select a loss function for the task; choose an optimizer; define the learning rate and other hyperparameters.
  • Batching : Divide data into batches for efficient computation; use DataLoader for automatic batching and shuffling.
  • Training loop : Pass a batch through the model to obtain predictions; compare predictions to actual target values; compute the loss using the selected loss function; calculate gradients using automatic differentiation; backpropagate gradients through the network; use the optimizer to update weights based on gradients.
  • Validation loop : Similar to the training loop but with validation data; calculate validation loss and any desired metrics; evaluate the model’s performance on unseen data.
  • Early stopping : Track validation loss throughout epochs; compare current loss to previous values; implement a stopping criterion to prevent overfitting; restore the model to the best-performing state.
  • Testing : Similar to the validation and training loops; calculate test loss and relevant metrics; assess the final model’s performance on unseen data.
  • Visualization : Create graphs of loss and metrics over epochs; display model predictions alongside actual data; gain insights into how the model behaves.
  • Saving : Save the model.

1. Prerequisites

Before diving into deep neural network implementation with PyTorch, it is essential to have a basic understanding of the following concepts:

  • Python programming language
  • Machine learning fundamentals
  • Neural networks and their architectures

2. Getting Started with PyTorch

Installation.

To install PyTorch, you can use pip or conda . Open a terminal and run the following command:

Also, you can find a comprehensive getting-started tutorial on the official page .

Importing Required Libraries

In your Python script or notebook, import the necessary libraries as follows:
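A typical set of imports for the rest of this tutorial (exactly which modules you need depends on the steps you follow):

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms
import numpy as np
import matplotlib.pyplot as plt
```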

Setting up the GPU (Optional)

If you have access to a GPU, PyTorch can leverage its power to accelerate computations. To utilize the GPU, ensure that you have the appropriate NVIDIA drivers installed. You can then enable GPU support in PyTorch by adding the following code:
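A minimal sketch of the device setup:

```python
# use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
```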

3. Dataset Preparation

In the following, we will explore the intricate process of dataset loading, preprocessing, and train-validation-test splitting. We will focus on the MNIST dataset as an illustrative example. By following these steps, you will gain an in-depth understanding of dataset preparation in the context of deep learning.

Data Loading

Loading the dataset is the initial step in the data preparation pipeline. PyTorch’s torchvision module provides a range of functions for automatically downloading and loading popular datasets, including MNIST and CIFAR. The MNIST dataset comprises grayscale images of handwritten digits, with each image labeled from 0 to 9.

To load the MNIST dataset using PyTorch, execute the following code:
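A sketch of the download-and-load step (the root directory "./data" is an assumption; download=True fetches the files on first use):

```python
from torchvision import datasets, transforms

transform = transforms.ToTensor()  # minimal transform; the normalization pipeline comes later

train_dataset = datasets.MNIST(root="./data", train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root="./data", train=False, download=True, transform=transform)
```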

In the above code, we utilize the datasets.MNIST class to download and load the MNIST dataset. The root parameter specifies the directory where the dataset will be stored. By setting train=True , we load the training set, while train=False indicates the test set.

You can visualize the dataset by:
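For example, one way to peek at a few samples (the grid size is arbitrary):

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 6, figsize=(10, 2))
for idx, ax in enumerate(axes):
    image, label = train_dataset[idx]       # image is a [1, 28, 28] tensor
    ax.imshow(image.squeeze(), cmap="gray")
    ax.set_title(str(label))
    ax.axis("off")
plt.show()
```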

Once the dataset is loaded, it becomes readily available for further processing and model training.

Data Preprocessing

Data preprocessing is a critical step that ensures the dataset is in a suitable format for training a deep neural network. Common preprocessing techniques include normalization, resizing, and data augmentation. PyTorch’s transforms module provides a range of transformation functions to facilitate these preprocessing operations.

Normalization

Normalization is a fundamental preprocessing technique that scales the pixel values to a standardized range. It helps to alleviate the impact of different scales and improves the convergence of the training process. For the MNIST dataset, we can normalize the pixel values to a range of [-1, 1].

To create a preprocessing pipeline specific to the MNIST dataset, use the following code:
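A sketch of the pipeline described below (mean 0.5 and standard deviation 0.5, per the text):

```python
transform = transforms.Compose([
    transforms.ToTensor(),                 # PIL image -> float tensor in [0, 1]
    transforms.Normalize((0.5,), (0.5,)),  # (x - 0.5) / 0.5 -> values in [-1, 1]
])

train_dataset = datasets.MNIST(root="./data", train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root="./data", train=False, download=True, transform=transform)
```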

In the code snippet above, we use the ToTensor transformation to convert the PIL images in the dataset to tensors, enabling efficient processing within the deep neural network. Subsequently, the Normalize transformation scales the pixel values by subtracting the mean (0.5) and dividing by the standard deviation (0.5), resulting in a range of [-1, 1].

By applying these transformations, the dataset is effectively preprocessed and ready for training.

Resizing

Resizing is a preprocessing technique commonly employed when input images have varying dimensions. It ensures that all images within the dataset possess consistent dimensions, simplifying subsequent processing steps. However, for the MNIST dataset, the images are already of uniform size (28x28 pixels), so resizing is not necessary in this case.

Data Augmentation

Data augmentation is a powerful technique used to artificially increase the diversity of the training dataset. By applying random transformations to the training images, we introduce additional variations, thereby enhancing the model’s ability to generalize. Common data augmentation techniques include random cropping, flipping, rotation, and scaling.

For the MNIST dataset, data augmentation may not be necessary due to its relatively large size and the inherent variability of the handwritten digits. However, it is worth noting that data augmentation can be beneficial for more complex datasets where additional variations can help improve model performance.

Train-Validation-Test Split

Splitting the dataset into distinct subsets for training, validation, and testing is essential for assessing and fine-tuning the performance of the deep neural network. The train-validation-test split allows us to train the model on one subset, tune hyperparameters on another, and evaluate the final model’s generalization on the independent test set.

PyTorch’s torch.utils.data.random_split utility simplifies this process. We can split the MNIST dataset into training, validation, and test sets using the following code:
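A sketch of a three-way split of the training set (the 80/10/10 proportions are assumptions):

```python
from torch.utils.data import random_split

train_frac, val_frac = 0.8, 0.1   # the remaining 10% becomes the test split

n_total = len(train_dataset)
n_train = int(train_frac * n_total)
n_val = int(val_frac * n_total)
n_test = n_total - n_train - n_val

train_set, val_set, test_set = random_split(train_dataset, [n_train, n_val, n_test])
```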

In the code above, we first import the random_split function from torch.utils.data . We then define the desired proportions for the train, validation, and test sets. The sizes of each split are computed based on these proportions. Finally, we perform the random split on the training set, generating separate datasets for training, validation, and testing.

By splitting the dataset in this manner, we ensure that the model is trained on a sufficiently large training set, validated on a smaller validation set to monitor performance, and tested on an independent test set to evaluate generalization.

4. Model Architecture Design

Designing the architecture of your deep neural network involves selecting the number of layers, their sizes, and the activation functions. Additionally, you need to choose a suitable loss function and optimizer for training the model. Proper selection of hyperparameters is crucial for model performance.

Neural Network Layers

PyTorch provides a variety of layer types, such as fully connected layers ( nn.Linear ), convolutional layers ( nn.Conv2d ), and recurrent layers ( nn.RNN ). These layers can be stacked together to form a deep neural network architecture.

The list of available neural network layers, including but not limited to:

  • Convolutional Layers ( nn.Conv1d , nn.Conv2d , nn.Conv3d )
  • Linear Layers ( nn.Linear )
  • Recurrent Layers ( nn.RNN , nn.LSTM , nn.GRU )
  • Dropout Layers ( nn.Dropout , nn.Dropout2d , nn.Dropout3d )
  • Normalization Layers ( nn.BatchNorm1d , nn.BatchNorm2d , nn.LayerNorm )
  • Activation Layers ( nn.ReLU , nn.Sigmoid , nn.Tanh , nn.LeakyReLU )
  • Pooling Layers ( nn.MaxPool1d , nn.MaxPool2d , nn.AvgPool1d , nn.AvgPool2d )
  • Embedding Layers ( nn.Embedding )

Activation Functions

Activation functions introduce non-linearity to the model. PyTorch offers a wide range of activation functions, including ReLU ( nn.ReLU ), sigmoid ( nn.Sigmoid ), and tanh ( nn.Tanh ).

Here is a list of available activation functions in PyTorch:

  • ReLU: nn.ReLU
  • Leaky ReLU: nn.LeakyReLU
  • PReLU: nn.PReLU
  • ELU: nn.ELU
  • SELU: nn.SELU
  • GELU: nn.GELU
  • Sigmoid: nn.Sigmoid
  • Tanh: nn.Tanh
  • Softmax: nn.Softmax
  • LogSoftmax: nn.LogSoftmax

Loss Functions

The choice of a loss function depends on the task you are trying to solve. PyTorch provides various loss functions, such as mean squared error ( nn.MSELoss ), cross-entropy loss ( nn.CrossEntropyLoss ), and binary cross-entropy loss ( nn.BCELoss ).

Here is a list of available loss functions in PyTorch:

  • BCELoss: nn.BCELoss
  • BCEWithLogitsLoss: nn.BCEWithLogitsLoss
  • CrossEntropyLoss: nn.CrossEntropyLoss
  • CTCLoss: nn.CTCLoss
  • HingeEmbeddingLoss: nn.HingeEmbeddingLoss
  • KLDivLoss: nn.KLDivLoss
  • L1Loss: nn.L1Loss
  • MarginRankingLoss: nn.MarginRankingLoss
  • MSELoss: nn.MSELoss
  • MultiLabelMarginLoss: nn.MultiLabelMarginLoss
  • MultiLabelSoftMarginLoss: nn.MultiLabelSoftMarginLoss
  • MultiMarginLoss: nn.MultiMarginLoss
  • NLLLoss: nn.NLLLoss
  • PoissonNLLLoss: nn.PoissonNLLLoss
  • SmoothL1Loss: nn.SmoothL1Loss
  • SoftMarginLoss: nn.SoftMarginLoss
  • TripletMarginLoss: nn.TripletMarginLoss

Optimizers

Optimizers are responsible for updating the model’s parameters based on the computed gradients during training. PyTorch includes popular optimizers like stochastic gradient descent ( optim.SGD ), Adam ( optim.Adam ), and RMSprop ( optim.RMSprop ).

Here is a list of available optimizers in PyTorch:

  • SGD: torch.optim.SGD
  • Adam: torch.optim.Adam
  • RMSprop: torch.optim.RMSprop
  • Adagrad: torch.optim.Adagrad
  • Adadelta: torch.optim.Adadelta
  • AdamW: torch.optim.AdamW
  • SparseAdam: torch.optim.SparseAdam
  • ASGD: torch.optim.ASGD
  • LBFGS: torch.optim.LBFGS
  • Rprop: torch.optim.Rprop

Hyperparameters

Hyperparameters define the configuration of your model and training process. Examples include the learning rate, batch size, number of epochs, regularization strength, and more. Proper tuning of hyperparameters significantly impacts the model’s performance.

5. Building the Deep Neural Network Model

To build and train a deep learning model in PyTorch follow the steps outlined below:

Step 1: Define the Model Architecture

  • Start by defining the architecture of your deep learning model. Create a subclass of the torch.nn.Module class and implement the model’s structure in the __init__ method and the forward pass in the forward method. Specify the layers, activation functions, and any other relevant components of your model.

Step 2: Instantiate the Model

  • Once the model class is defined, create an instance of the model by instantiating the class with the appropriate parameters. This includes specifying the input size, hidden layer size, and number of output classes.

Step 3: Define the Loss Function

  • To train the model, define a loss function that measures the difference between the predicted output and the true labels. Choose an appropriate loss function based on the problem at hand.

Step 4: Define the Optimizer

  • Select an optimizer that will update the model’s parameters based on the computed gradients during training. Set the learning rate and other relevant parameters for the optimizer.

Step 5: Train the Model

  • Iterate over the mini-batches of data (batch size) from the training dataset.
  • Perform a forward pass, feeding the input data through the model to obtain predictions.
  • Calculate the loss by comparing the predictions with the true labels using the defined loss function.
  • Perform a backward pass to compute the gradients of the loss with respect to the model’s parameters.
  • Use the optimizer to update the model’s parameters based on the gradients.
  • Optionally, track and log the training loss or any other relevant metrics.

Step 6: Evaluate the Model

  • After training, evaluate the model’s performance on unseen data to assess its generalization ability. Iterate over the validation or test dataset and calculate relevant metrics, such as accuracy, to measure the model’s performance.

Step 7: Save and Load the Model (Optional)

  • If desired, save the trained model to disk for future use or deployment. Use the torch.save() function to save the model’s state dictionary. Later, load the saved model using torch.load() to create an instance of the model and load the saved state.

Throughout the process, ensure that the data is appropriately preprocessed, such as scaling, normalizing, or applying any necessary transformations, to ensure compatibility with the model.

By following these steps, you can build and train a deep learning model in PyTorch. Each step contributes to the overall process of model development, training, evaluation, and potentially saving or loading the model for later use.

When comparing the steps involved in building and training a deep learning model in PyTorch and Keras, both frameworks offer distinct advantages and considerations.

Flexibility and Control : PyTorch provides a low-level and flexible approach to model development, allowing fine-grained control over the architecture. Its dynamic computational graph enables dynamic network structures and control flow, making it ideal for advanced research and experimentation.

Python Integration : PyTorch seamlessly integrates with the Python ecosystem, leveraging popular libraries like NumPy and SciPy. This integration facilitates efficient data processing, scientific computing, and visualization, empowering researchers with a wide range of tools.

Rich Ecosystem : PyTorch benefits from an active community, resulting in a rich ecosystem of libraries, tools, and pre-trained models. This vibrant community ensures a steady influx of advancements and resources that can be readily utilized.

User-Friendly Interface : Keras offers a high-level API built on top of TensorFlow, providing a simplified and intuitive interface. Its design philosophy prioritizes simplicity and ease of use, making it highly accessible for beginners and enabling rapid model iteration.

Quick Prototyping : Keras abstracts away low-level details, allowing for rapid prototyping and easy experimentation. It provides pre-defined models and modules for common deep learning tasks, facilitating quick implementation and practical applications.

In summary, PyTorch’s flexibility and Pythonic approach make it an excellent choice for advanced research and customization, while Keras’s simplicity and abstraction make it preferable for practical applications and rapid prototyping. The choice between the two frameworks depends on the project requirements and the desired trade-off between flexibility and ease of use.

In the following, we will delve into how to build and train a model using PyTorch.

Defining the Model Class

In the process of building a deep learning model, defining the model class is a fundamental step. This involves creating a class that inherits from the nn.Module base class provided by PyTorch. The model class serves as a blueprint for the architecture of the neural network and encapsulates the layers and the forward propagation logic.

To define the model class, you need to perform the following steps:

Create a class that inherits from nn.Module , such as NeuralNetwork(nn.Module) .

  • By inheriting from nn.Module , you can leverage the functionalities and features provided by PyTorch for model construction and training.

Define the architecture of the neural network within the __init__ method.

  • def __init__(self, ... ) : In the __init__ method, you specify the layers and their configurations. This is where you instantiate and define the individual layers of your network. Parameters such as input_size, hidden_size, and output_size can be used to configure the model.

super(NeuralNetwork, self).__init__() can be used to initialize the parent class ( nn.Module ).

In Python, when a class inherits from another class, it is important to call the constructor of the parent class to properly initialize its attributes and functionalities. By using super(NeuralNetwork, self).__init__() , we explicitly call the constructor of the nn.Module class, which is the parent class of our custom model class ( NeuralNetwork ). This ensures that the necessary initialization steps defined in the parent class are executed before any additional initialization specific to the child class.

Implement the forward propagation logic in the forward method.

  • The forward method defines how the input flows through the network and produces the output.
  • You apply the necessary activation functions and combine the layers to form the desired network architecture.
  • Ensure that the forward pass is defined sequentially, specifying the sequence of operations that transform the input into the output.

Here’s an example of a simple fully connected neural network class:
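A minimal sketch of such a class for MNIST-sized inputs (the single hidden layer and the flattening step are assumptions):

```python
class NeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(NeuralNetwork, self).__init__()   # initialize the nn.Module parent class
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = x.view(x.size(0), -1)  # flatten [batch, 1, 28, 28] -> [batch, 784]
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out
```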

Initializing the Model

To initialize an instance of the model class, you need to specify the input size, hidden size, and output size. For example:
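A sketch (the hidden size of 128 is an assumption; 784 and 10 follow from the 28x28 images and the ten digit classes):

```python
input_size = 28 * 28   # flattened MNIST image
hidden_size = 128      # an assumption
output_size = 10       # ten digit classes

model = NeuralNetwork(input_size, hidden_size, output_size).to(device)
```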

6. Training the Model

Setting up training parameters.

Before training the model, you need to define the training parameters such as the learning rate, batch size, and number of epochs. You also need to specify the loss function and optimizer.

Defining the Loss Function

Choose an appropriate loss function based on your task. For example, if you are solving a multi-class classification problem, you can use the cross-entropy loss:
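For the multi-class MNIST setting sketched here:

```python
criterion = nn.CrossEntropyLoss()
```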

Selecting the Optimizer

Select an optimizer and provide the model parameters to optimize. For example, to use the Adam optimizer:
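A sketch (the learning rate of 0.001 is an assumption):

```python
optimizer = optim.Adam(model.parameters(), lr=0.001)
```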

Define the Number of Epochs

The number of epochs determines how many times the model iterates over the entire training dataset. It is essential to find the right balance: too few epochs may result in underfitting, while too many can lead to overfitting. Researchers employ techniques like early stopping and cross-validation to determine the optimal number of epochs. By fine-tuning this parameter, models can achieve optimal performance without unnecessary computational overhead.

Define the Batch Size

The batch size refers to the number of samples processed in each iteration during training. It plays a crucial role in balancing computational efficiency and model performance. Selecting an appropriate batch size is important to ensure efficient memory utilization and computational speed. A small batch size allows the model to update parameters more frequently but may result in noisy gradients and slower convergence. Conversely, a large batch size reduces noise but may lead to longer training times and potential memory limitations. Researchers often experiment with different batch sizes to find the optimal trade-off between accuracy and computational efficiency for their specific problem. It is important to consider hardware limitations, model complexity, and available computational resources when defining the batch size for training deep learning models.

Create Dataset Loaders

Dataset loaders, such as train_loader and validation_loader, play a crucial role in deep learning. They enable efficient loading and processing of training and validation datasets, respectively. The train_loader iterates through the training data with a specified batch size, allowing for effective model training. The validation_loader evaluates the model’s performance on unseen data. By using dataset loaders, researchers can handle large datasets and ensure effective training and evaluation of deep learning models.
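A sketch of the loaders built from the splits above (the batch size of 64 is an assumption):

```python
from torch.utils.data import DataLoader

batch_size = 64  # an assumption

train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
validation_loader = DataLoader(val_set, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)
```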

Training the Model

The training loop typically involves iterating over the dataset, performing forward and backward propagation, and updating the model’s parameters. Here’s an example of a basic training loop using PyTorch:
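A sketch of such a loop, assuming the model, criterion, optimizer, and loaders defined above (the epoch count is an assumption):

```python
num_epochs = 10  # an assumption

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()              # reset gradients from the previous step
        outputs = model(inputs)            # forward pass
        loss = criterion(outputs, labels)  # compare predictions to targets
        loss.backward()                    # backpropagation
        optimizer.step()                   # parameter update

        running_loss += loss.item()

    # simple validation pass at the end of each epoch
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, labels in validation_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            _, predicted = torch.max(outputs, 1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)

    print(f"Epoch {epoch + 1}/{num_epochs}, "
          f"train loss: {running_loss / len(train_loader):.4f}, "
          f"val accuracy: {correct / total:.4f}")
```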

Monitoring Training Progress

During training, it’s helpful to monitor the training loss, validation loss, and accuracy. You can use these metrics to assess the model’s performance and make improvements as necessary. You can also visualize the metrics using libraries like matplotlib or tensorboard .

7. Evaluating the Model

Testing the model.

Once you have trained the model, you can evaluate its performance on unseen data. Create a separate test data loader and use the trained model to make predictions. Compare the predictions with the ground truth labels to compute metrics such as accuracy, precision, recall, and F1 score.

Also, you can test the model by showing an image along with the predicted label, you can select a random sample from the test dataset and visualize the image using matplotlib . Here’s an example of how you can achieve this:
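A sketch, assuming the test_set and model defined above:

```python
import random

model.eval()
idx = random.randrange(len(test_set))
image, label = test_set[idx]

with torch.no_grad():
    output = model(image.unsqueeze(0).to(device))   # add a batch dimension
    predicted = output.argmax(dim=1).item()

plt.imshow(image.squeeze(), cmap="gray")
plt.title(f"True: {label}, Predicted: {predicted}")
plt.axis("off")
plt.show()
```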

Model Evaluation Metrics

The choice of evaluation metrics depends on the specific task you are solving. For classification problems, common metrics include accuracy, precision, recall, and F1 score. For regression tasks, metrics like mean squared error (MSE) or mean absolute error (MAE) are often used.

To visualize the performance of a model on the test dataset, you can create a confusion matrix and a classification report. These visualizations provide insights into how well the model is performing for each class in the test dataset.

The following code demonstrates how to evaluate the performance of a deep learning model using the test dataset. It utilizes the sklearn.metrics module to compute the confusion matrix and classification report, which provide insights into the model’s predictions and overall performance.
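A sketch of that evaluation loop, assuming the test_loader defined above:

```python
from sklearn.metrics import confusion_matrix, classification_report

model.eval()
predictions, true_labels = [], []

with torch.no_grad():  # no gradients needed during evaluation
    for inputs, labels in test_loader:
        inputs = inputs.to(device)
        outputs = model(inputs)
        _, predicted = torch.max(outputs, 1)
        predictions.extend(predicted.cpu().numpy())
        true_labels.extend(labels.numpy())

cm = confusion_matrix(true_labels, predictions)
report = classification_report(true_labels, predictions)
print(cm)
print(report)
```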

First, the model is set to evaluation mode using model.eval() . Then, the code iterates through the test loader, disabling gradient computation for efficiency. In each iteration, the model performs a forward pass on the inputs and obtains the predicted class labels. These predicted labels are appended to the predictions list, while the true labels are appended to the true_labels list.

Once all predictions and true labels are collected, the code proceeds to compute the confusion matrix and classification report using the confusion_matrix and classification_report functions from sklearn.metrics . The confusion matrix provides a tabular representation of the model’s predictions versus the true labels, while the classification report offers metrics such as precision, recall, and F1 score for each class.

To create a visually appealing confusion matrix, you can use the heatmap function from the seaborn library. This will allow you to plot the confusion matrix as a color-coded image, making it easier to interpret.

Here’s an example of how to create a beautiful confusion matrix using seaborn :
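A sketch, reusing the predictions and true_labels collected above:

```python
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(true_labels, predictions)
class_names = [str(i) for i in range(10)]   # MNIST digit classes

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=class_names, yticklabels=class_names, ax=ax)
ax.set_xlabel('Predicted label')
ax.set_ylabel('True label')
ax.set_title('Confusion Matrix')
plt.xticks(rotation=45)   # rotate tick labels for better visibility
plt.show()
```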

In this code, we first compute the confusion matrix using the confusion_matrix function from scikit-learn. Then, we create a figure and axes for the plot using plt.subplots . We use the sns.heatmap function to create a heatmap of the confusion matrix, with annotations to display the values in each cell. We customize the colormap ( cmap ) to use the ‘Blues’ color scheme and set fmt='d' to display the cell values as integers.

We set the axis labels and title, and customize the tick labels to match the class labels. Finally, we rotate the tick labels for better visibility and display the plot using plt.show() .

This code will produce an informative visualization of the confusion matrix, allowing you to easily interpret the model’s performance on the test dataset.

8. Improving Model Performance

To improve the performance of your deep neural network model, you can employ various techniques.

Regularization Techniques

Regularization helps prevent overfitting. Techniques like L1 and L2 regularization (weight decay), dropout, and batch normalization can be applied to the model.

Hyperparameter Tuning

Tuning hyperparameters is crucial for achieving optimal performance. You can use techniques like grid search, random search, or more advanced methods like Bayesian optimization to find the best combination of hyperparameters.

Data Augmentation

Data augmentation involves applying transformations to the training data to increase its diversity. Techniques like random cropping, flipping, rotation, and scaling can be used to augment the dataset, thereby improving generalization.

Transfer Learning

Transfer learning allows you to leverage pre-trained models trained on large datasets and adapt them to your specific task. By using pre-trained models as a starting point, you can significantly reduce training time and achieve good performance with limited labeled data.

9. Saving and Loading Models

Saving the model.

You can save the trained model’s parameters to disk for future use or deployment. PyTorch provides the torch.save function to save the model. Here’s an example:
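A minimal sketch (the file name "model.pth" is an assumption):

```python
torch.save(model.state_dict(), "model.pth")
```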

Loading the Model

To load the saved model, create an instance of the model class and load the saved parameters. Here’s an example:
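A minimal sketch, matching the saving example above:

```python
model = NeuralNetwork(input_size, hidden_size, output_size)
model.load_state_dict(torch.load("model.pth"))
model.to(device)
model.eval()  # switch to evaluation mode for inference
```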

10. Conclusion

In this tutorial, we covered the complete process of implementing a deep neural network using PyTorch. We explored data loading and preprocessing, model architecture design, training, evaluation, and techniques for improving model performance. By leveraging PyTorch’s capabilities, you can build and train powerful deep learning models for a wide range of tasks.

11. References

Here are some references that you may find useful for further exploration:

  • PyTorch documentation: https://pytorch.org/docs/
  • PyTorch tutorials: https://pytorch.org/tutorials/
  • Official PyTorch GitHub repository: https://github.com/pytorch/pytorch

Arman Asgharpoor Golroudbari

Space-ai researcher.

My research interests revolve around planetary rovers and spacecraft vision-based navigation.

Deep-Learning-Specialization

Coursera Deep Learning Specialization: Improving Deep Neural Networks - Hyperparameter Tuning, Regularization and Optimization.

This course will teach you the “magic” of getting deep learning to work well. Rather than the deep learning process being a black box, you will understand what drives performance, and be able to more systematically get good results. You will also learn TensorFlow.

  • Understand industry best-practices for building deep learning applications.
  • Be able to effectively use the common neural network “tricks”, including initialization, L2 and dropout regularization, Batch normalization, gradient checking.
  • Be able to implement and apply a variety of optimization algorithms, such as mini-batch gradient descent, Momentum, RMSprop and Adam, and check for their convergence.
  • Understand new best-practices for the deep learning era of how to set up train/dev/test sets and analyze bias/variance
  • Be able to implement a neural network in TensorFlow.

Week 1: Practical aspects of Deep Learning

Key concepts of week 1.

  • Recall that different types of initializations lead to different results
  • Recognize the importance of initialization in complex neural networks.
  • Recognize the difference between train/dev/test sets
  • Diagnose the bias and variance issues in your model
  • Learn when and how to use regularization methods such as dropout or L2 regularization.
  • Understand experimental issues in deep learning such as Vanishing or Exploding gradients and learn how to deal with them
  • Use gradient checking to verify the correctness of your backpropagation implementation

Assignment of Week 1

  • Quiz 1: Practical aspects of deep learning
  • Programming Assignment: Initialization
  • Programming Assignment: Regularization
  • Programming Assignment: Gradient_Checking

Week 2: Optimization algorithms

Key concepts of week 2.

  • Remember different optimization methods such as (Stochastic) Gradient Descent, Momentum, RMSProp and Adam
  • Use random mini-batches to accelerate the convergence and improve the optimization
  • Know the benefits of learning rate decay and apply it to your optimization

Assignment of Week 2

  • Quiz 2: Optimization algorithms
  • Programming Assignment: Optimization

Week 3: Hyperparameter tuning, Batch Normalization and Programming Frameworks

Key concepts of week 3.

  • Master the process of hyperparameter tuning

Assignment of Week 3

  • Quiz 3: Hyperparameter tuning, Batch Normalization, Programming Frameworks
  • Programming Assignment: Tensorflow

Course Certificate

Certificate

Neural Networks Assignment

Backpropagation (20 points)

Assuming that $x \in \mathbb{R}^n$, backpropagate the following network to find

$$\frac{\partial L}{\partial \mathbf w}$$

[Figure: the network to backpropagate.]

Tensorflow Playground (60 points, 10 points per question)

The TensorFlow Playground is a handy neural network simulator built by the TensorFlow team. In this exercise, you will train several binary classifiers in just a few clicks, and tweak the model’s architecture and its hyperparameters to gain some intuition on how neural networks work and what their hyperparameters do.

Try training the default neural network by clicking the Run button (top left). Notice how it quickly finds a good solution for the classification task. The neurons in the first hidden layer have learned simple patterns, while the neurons in the second hidden layer have learned to combine the simple patterns of the first hidden layer into more complex patterns. Why is that?

Activation functions. Try replacing the tanh activation function with a ReLU activation function, and train the network again. Does it find the solution faster or slower? Why is that?

The risk of local minima. Modify the network architecture to have just one hidden layer with three neurons. Train it multiple times (to reset the network weights, click the Reset button next to the Play button). Why does the training time vary so widely?

Remove one neuron to keep just two. Notice that the neural network is now incapable of finding a good solution - why is that?

Set the number of neurons to eight, and train the network several times. Has the training time (time to convergence) improved? Why is that?

Select the spiral dataset (the bottom-right dataset under “DATA”), and change the network architecture to have four hidden layers with eight neurons each. Notice that training takes much longer and often gets stuck on plateaus for long periods of time. Also notice that the neurons in the highest layers (on the right) tend to evolve faster than the neurons in the lowest layers (on the left). Can you explain how this may be related to gradient flow through the network?

Tensorflow API (20 points)

(5 points) Submit your notebook URL that allows the notebook here to be executed.

(15 points) Start reducing the number of cats in the dataset and plot the accuracy of predicting the cat class as the cat population becomes 90%, 70%, 50%, 30%, and 10% of the original. For each population size, present the hyperparameter-optimized result using AutoKeras . Explain your findings.


      "Building your Deep Neural Network Step by Step"

    "building your deep neural network step by step", building your deep neural network: step by step.

Welcome to your week 4 assignment (part 1 of 2)! You have previously trained a 2-layer Neural Network (with a single hidden layer). This week, you will build a deep neural network, with as many layers as you want!

  • In this notebook, you will implement all the functions required to build a deep neural network.
  • In the next assignment, you will use these functions to build a deep neural network for image classification.

After this assignment you will be able to:

  • Use non-linear units like ReLU to improve your model
  • Build a deeper neural network (with more than 1 hidden layer)
  • Implement an easy-to-use neural network class

Let’s get started!

1 - Packages

Let’s first import all the packages that you will need during this assignment.

  • numpy is the main package for scientific computing with Python.
  • matplotlib is a library to plot graphs in Python.
  • dnn_utils provides some necessary functions for this notebook.
  • testCases provides some test cases to assess the correctness of your functions
  • np.random.seed(1) is used to keep all the random function calls consistent. It will help us grade your work. Please don’t change the seed.

2 - Outline of the Assignment

To build your neural network, you will be implementing several “helper functions”. These helper functions will be used in the next assignment to build a two-layer neural network and an L-layer neural network. Each small helper function you will implement will have detailed instructions that will walk you through the necessary steps. Here is an outline of this assignment; you will:

  • We give you the ACTIVATION function (relu/sigmoid).
  • Combine the previous two steps into a new [LINEAR->ACTIVATION] forward function.
  • Compute the loss.
  • Complete the LINEAR part of a layer’s backward propagation step.
  • We give you the gradient of the ACTIVATION function (relu_backward/sigmoid_backward)
  • Combine the previous two steps into a new [LINEAR->ACTIVATION] backward function.
  • Stack [LINEAR->RELU] backward L-1 times and add [LINEAR->SIGMOID] backward in a new L_model_backward function
  • Finally update the parameters.

Figure 1

Note that for every forward function, there is a corresponding backward function. That is why at every step of your forward module you will be storing some values in a cache. The cached values are useful for computing gradients. In the backpropagation module you will then use the cache to calculate the gradients. This assignment will show you exactly how to carry out each of these steps.

3 - Initialization

You will write two helper functions that will initialize the parameters for your model. The first function will be used to initialize parameters for a two layer model. The second one will generalize this initialization process to $L$ layers.

3.1 - 2-layer Neural Network

Exercise : Create and initialize the parameters of the 2-layer neural network.

Instructions :

  • The model’s structure is: LINEAR -> RELU -> LINEAR -> SIGMOID .
  • Use random initialization for the weight matrices. Use np.random.randn(shape)*0.01 with the correct shape.
  • Use zero initialization for the biases. Use np.zeros(shape) .

Expected output :

3.2 - L-layer Neural Network

table1

Exercise : Implement initialization for an L-layer Neural Network.

The model’s structure is [LINEAR -> RELU] $ \times$ (L-1) -> LINEAR -> SIGMOID . I.e., it has $L-1$ layers using a ReLU activation function followed by an output layer with a sigmoid activation function.

Use random initialization for the weight matrices. Use np.random.randn(shape) * 0.01 .

Use zeros initialization for the biases. Use np.zeros(shape) .

4 - Forward propagation module

4.1 - Linear Forward

Now that you have initialized your parameters, you will do the forward propagation module. You will start by implementing some basic functions that you will use later when implementing the model. You will complete three functions in this order:

  • LINEAR
  • LINEAR -> ACTIVATION where ACTIVATION will be either ReLU or Sigmoid
  • [LINEAR -> RELU] $\times$ (L-1) -> LINEAR -> SIGMOID (whole model)

The linear forward module (vectorized over all the examples) computes the following equations:
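In the layer-superscript notation used in this assignment (a sketch), the vectorized linear step for layer \(l\) is:

$$
Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}, \qquad \text{where } A^{[0]} = X.
$$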

Exercise : Build the linear part of forward propagation.

4.2 - Linear-Activation Forward

In this notebook, you will use two activation functions:

For more convenience, you are going to group two functions (Linear and Activation) into one function (LINEAR->ACTIVATION). Hence, you will implement a function that does the LINEAR forward step followed by an ACTIVATION forward step.

Note : In deep learning, the “[LINEAR->ACTIVATION]” computation is counted as a single layer in the neural network, not two layers.

d) L-Layer Model

[Figure: the [LINEAR -> RELU] $\times$ (L-1) -> LINEAR -> SIGMOID model.]

Exercise : Implement the forward propagation of the above model.

  • Use the functions you had previously written
  • Use a for loop to replicate [LINEAR->RELU] (L-1) times
  • Don’t forget to keep track of the caches in the “caches” list. To add a new value c to a list , you can use list.append(c) .

5 - Cost function

Now you will implement forward and backward propagation. You need to compute the cost, because you want to check if your model is actually learning.
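For the binary, sigmoid-output setting of this assignment, the cross-entropy cost over \(m\) examples takes the standard form (a sketch, with \(a^{[L](i)}\) the output activation for example \(i\)):

$$
J = -\frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} \log a^{[L](i)} + \left(1 - y^{(i)}\right) \log\left(1 - a^{[L](i)}\right) \right)
$$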

Expected Output :

6 - Backward propagation module

Just like with forward propagation, you will implement helper functions for backpropagation. Remember that back propagation is used to calculate the gradient of the loss function with respect to the parameters.


Now, similar to forward propagation, you are going to build the backward propagation in three steps:

  • LINEAR backward
  • LINEAR -> ACTIVATION backward where ACTIVATION computes the derivative of either the ReLU or sigmoid activation
  • [LINEAR -> RELU] $\times$ (L-1) -> LINEAR -> SIGMOID backward (whole model)

6.1 - Linear backward

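For reference, the three linear-backward formulas (in the assignment's notation, with \(m\) the number of examples) are:

$$
dW^{[l]} = \frac{1}{m}\, dZ^{[l]} A^{[l-1]\,T}, \qquad
db^{[l]} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[l](i)}, \qquad
dA^{[l-1]} = W^{[l]\,T} dZ^{[l]}
$$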

Exercise : Use the 3 formulas above to implement linear_backward().

6.2 - Linear-Activation backward

Next, you will create a function that merges the two helper functions: linear_backward and the backward step for the activation linear_activation_backward .

To help you implement linear_activation_backward , we provided two backward functions:

  • sigmoid_backward : Implements the backward propagation for SIGMOID unit. You can call it as follows:
  • relu_backward : Implements the backward propagation for RELU unit. You can call it as follows:

Exercise : Implement the backpropagation for the LINEAR->ACTIVATION layer.

Expected output with sigmoid:

Expected output with relu

6.3 - L-Model Backward


Initializing backpropagation :
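For the sigmoid output with the cross-entropy cost above, the gradient of the cost with respect to AL can be written as follows (a sketch in the notebook's NumPy style, assuming arrays Y and AL):

```python
# derivative of the cross-entropy cost with respect to AL (sigmoid output)
dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
```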

You can then use this post-activation gradient dAL to keep going backward. As seen in Figure 5, you can now feed in dAL into the LINEAR->SIGMOID backward function you implemented (which will use the cached values stored by the L_model_forward function). After that, you will have to use a for loop to iterate through all the other layers using the LINEAR->RELU backward function. You should store each dA, dW, and db in the grads dictionary. To do so, use this formula :

Expected Output

6.4 - Update Parameters

In this section you will update the parameters of the model, using gradient descent:
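With learning rate \(\alpha\), the update for every layer \(l\) is:

$$
W^{[l]} = W^{[l]} - \alpha \, dW^{[l]}, \qquad b^{[l]} = b^{[l]} - \alpha \, db^{[l]}
$$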

Exercise : Implement update_parameters() to update your parameters using gradient descent.

7 - Conclusion

Congrats on implementing all the functions required for building a deep neural network!

We know it was a long assignment but going forward it will only get better. The next part of the assignment is easier.

In the next assignment you will put all these together to build two models:

  • A two-layer neural network
  • An L-layer neural network

You will in fact use these models to classify cat vs non-cat images!


Assignment #3 - Convolutional Neural Networks

Released on Monday 1398/2/9

Download [ problems ] [ show preview ] [ solutions ]

Late Policy

  • You have 8 free late days.
  • You can use late days for assignments. A late day extends the deadline 24 hours.
  • Once you have used all 8 late days, the penalty is 10% for each additional late day.

In this assignment, you will become familiar with convolutional neural networks, batch normalization, autoencoders, Deconv layer, and an interesting application of CNN in computer vision.

  • Students who audit this course should submit their assignments to be qualified for attending the rest of the sessions.
  • Any detected copying will zero that assignment's grade and will also be counted as two negative assignment points toward your final score.
  • Click on Download [problems] to obtain the assignment jupyter notebook.
  • Go to https://colab.research.google.com/ .
  • Switch to the Upload tab, choose assignment_03.ipynb and click upload.
  • Now you're ready to go.

Follow notebook instructions to submit your assignment!


CSC413/2516 Winter 2023 Neural Networks and Deep Learning

Course syllabus and policies: Course handout .

Teaching staff:

  • Jimmy Ba, Tues 5-6pm, PT290C
  • Bo Wang, Fri 11-noon
  • Head TA: John Giorgi and Rex Ma

Contact emails:

Please do not email the instructor or the TAs about the class directly to their personal accounts. Use the ticketing system for most of the requests, e.g., enrollment, coursework extensions, late submissions, etc.

Piazza: Students are encouraged to sign up for Piazza to join course discussions. If your question is about the course material and doesn’t give away any hints for the homework, please post to Piazza so that the entire class can benefit from the answer.

Lecture and tutorial hours:

Online lectures and tutorials: Access to online lectures and tutorials will be communicated via the course mailing list. Course videos and materials belong to your instructor, the University, and/or other sources depending on the specific facts of each situation, and are protected by copyright. Do not download, copy, or share any course or student materials or videos without the explicit permission of the instructor. For questions about recording and use of videos in which you appear, please contact your instructor.

Accessibility: The CSC413 teaching team is fully committed to ensuring accessibility for all our students. For students looking for additional academic accommodations or accessibility services registration, please visit www.accessibility.utoronto.ca . Students are encouraged to review the course syllabus at the beginning of a course and discuss questions regarding their accommodations with their Accessibility Advisor. Once registered, students should send the Letter of Academic Accommodations to our ticketing system at [email protected] as soon as possible, and no later than Friday, January 27th, 2023.

Waitlist and enrollment: CSC413 and CSC2516 have had long waiting lists for the last few years. The hard enrollment cap is determined by the teaching resources available at the department level. Note that waitlists typically expire one week after the course starts. Once waitlists are removed, students are responsible for trying to enroll in the course on ACORN on a first-come, first-served basis. If you have further questions, please get in touch with the CS undergrad office or the CS graduate office .

Announcements:

  • Jan 16 : Assignment 1 handout and the starter code are now online. Make sure you create a copy in your own Drive before making edits, or else the changes will not be saved. Assignment 1 is due Feb 3rd.
  • Jan 29 : Assignment 2 handout and the starter code are now online. Make sure you create a copy in your own Drive before making edits, or else the changes will not be saved. Assignment 2 is due Feb 24th.
  • Feb 2 : Assignment 2 handout updated to Version 1.1
  • Feb 9 : Assignment 2 handout updated to Version 1.2
  • Feb 14 : Assignment 2 handout updated to Version 1.3
  • Mar 4 : Assignment 3 handout and the starter code nmt.ipynb , bert.ipynb , clip.ipynb are up. Assignment 3 is now due Mar 31st.
  • Mar 17 : Assignment 4 handout and the starter code gnn.ipynb , dqn.ipynb are up. Assignment 4 is now due Apr 11th.
  • Mar 21 : Assignment 4 handout updated to Version 1.1

Course Overview:

It is very hard to hand-design programs to solve many real-world problems, e.g. distinguishing images of cats vs. dogs. Machine learning algorithms allow computers to learn from example data and produce a program that does the job. Neural networks are a class of machine learning algorithms originally inspired by the brain, but which have recently seen a lot of success at practical applications. They’re at the heart of production systems at companies like Google and Facebook for image processing, speech-to-text, and language understanding. This course gives an overview of both the foundational ideas and the recent advances in neural net algorithms.

Assignments:

Lateness and grace days: Every student has a total of 7 grace days to extend the coursework deadlines through the semester. Each grace day allows for a 24 hours deadline extension without late penalty. That is, you may apply the grace days on a late submission to remove its late penalty. The maximum extension you may request is up to the remaining grace days you have banked up. We will keep track of the grace days on MarkUs. After the grace period, assignments will be accepted up to 3 days late, but 10% will be deducted for each day late, rounded up to the nearest day. After that, submissions will not be accepted and will receive a score of 0.

Midterm Online Quiz: Feb. 9

The midterm online quiz will cover the lecture materials up to Lecture 4 and Assignment 2 (written part only). The quiz will be hosted on Quercus for 24 hours. The exact details will be announced soon.

Calendar:

The suggested readings are included to help you understand the course material. They are not required, i.e., you are only responsible for the material covered in lecture. Most of the suggested readings listed are more advanced than the corresponding lecture, and are of interest if you want to know where our knowledge comes from or to follow current frontiers of research.
