Create_minibatch

Why use mini-batching?

The Udacity SDC (Self-Driving Car) course includes a module on mini-batching and why it is used.

Mini-batching is a technique for training on subsets of the dataset instead of all the data at one time. This provides the ability to train a model, even if a computer lacks the memory to store the entire dataset.

Mini-batching is computationally inefficient, since you can’t calculate the loss simultaneously across all samples. However, this is a small price to pay in order to be able to run the model at all.
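Because the last batch is usually smaller than the others, the graph has to accept batches of varying size. A minimal sketch of how this is typically handled in TensorFlow 1.x (the version used in the course), assuming MNIST-style shapes matching the tensors discussed below, is to leave the batch dimension as None:

import tensorflow as tf

# Sketch (TensorFlow 1.x): the None batch dimension lets each mini-batch,
# including a smaller final batch, be fed without rebuilding the graph.
n_input = 784    # 28 x 28 pixel images, matching train_features below
n_classes = 10   # matching train_labels below

features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])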

Question 1 Calculate the memory size of train_features, train_labels, weights, and bias in bytes. Ignore memory for overhead, just calculate the memory required for the stored data.

You may have to look up how much memory a float32 requires (it is 4 bytes).

train_features Shape: (55000, 784) Type: float32

train_labels Shape: (55000, 10) Type: float32

weights Shape: (784, 10) Type: float32

bias Shape: (10,) Type: float32

How many bytes of memory does train_features need? 172480000
How many bytes of memory does train_labels need? 2200000
How many bytes of memory does weights need? 31360
How many bytes of memory does bias need? 40

The total memory space required for the inputs, weights, and bias is around 174 megabytes, which isn't that much memory. You could train this whole dataset on most CPUs and GPUs.
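To check these numbers: a float32 occupies 4 bytes, so the memory for each tensor is the product of its shape dimensions times 4.

# Verifying the quiz answers: bytes = (number of elements) * 4
print(55000 * 784 * 4)  # train_features -> 172480000 bytes
print(55000 * 10 * 4)   # train_labels   -> 2200000 bytes
print(784 * 10 * 4)     # weights        -> 31360 bytes
print(10 * 4)           # bias           -> 40 bytes
# Total is roughly 174.7 MB.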

But larger datasets that you’ll use in the future will be measured in gigabytes or more. It’s possible to purchase more memory, but it’s expensive. A Titan X GPU with 12 GB of memory costs over $1,000.

Instead, in order to run large models on your machine, you’ll learn how to use mini-batching.

Let’s look at how you implement mini-batching in TensorFlow.

import math


def batches(batch_size, features, labels):
    """
    Create batches of features and labels
    :param batch_size: The batch size
    :param features: List of features
    :param labels: List of labels
    :return: Batches of (Features, Labels)
    """
    assert len(features) == len(labels)

    output_batches = []

    sample_size = len(features)
    # Slice the features and labels into consecutive mini-batches;
    # the final batch may be smaller than batch_size.
    for start_i in range(0, sample_size, batch_size):
        end_i = start_i + batch_size
        output_batches.append([features[start_i:end_i], labels[start_i:end_i]])

    return output_batches
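With batches() defined, a typical TensorFlow 1.x training loop feeds each mini-batch through feed_dict. The following is only a minimal sketch: the features and labels placeholders, the optimizer, and train_features/train_labels are assumed to come from a model defined elsewhere.

import tensorflow as tf

batch_size = 128  # assumed value; anything that fits in memory works

# Assumed to exist: features, labels (placeholders), optimizer,
# train_features, train_labels.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for batch_features, batch_labels in batches(batch_size, train_features, train_labels):
        sess.run(optimizer, feed_dict={features: batch_features, labels: batch_labels})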


from quiz import batches
from pprint import pprint

# 4 Samples of features
example_features = [
    ['F11', 'F12', 'F13', 'F14'],
    ['F21', 'F22', 'F23', 'F24'],
    ['F31', 'F32', 'F33', 'F34'],
    ['F41', 'F42', 'F43', 'F44']]

# 4 Samples of labels
example_labels = [
    ['L11', 'L12'],
    ['L21', 'L22'],
    ['L31', 'L32'],
    ['L41', 'L42']]

# PPrint prints data structures like 2d arrays, so they are easier to read
pprint(batches(3, example_features, example_labels))
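With a batch size of 3 and 4 samples, batches() returns two batches: the first holds three samples of features and labels, and the last holds the remaining one. The printed structure (whitespace reformatted slightly for readability) is:

[[[['F11', 'F12', 'F13', 'F14'],
   ['F21', 'F22', 'F23', 'F24'],
   ['F31', 'F32', 'F33', 'F34']],
  [['L11', 'L12'], ['L21', 'L22'], ['L31', 'L32']]],
 [[['F41', 'F42', 'F43', 'F44']],
  [['L41', 'L42']]]]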