Deep Learning with Javascript 2.1.3

Posted on March 28, 2020

Notes on chapter two of Deep Learning with JavaScript.

2.1.3 Creating and Formatting the Data
2.1.4 Defining a Simple Model
2.1.5 Fitting the Model to the Training Data

Goals

Cover tensors, modeling, and optimization: what they are, how they work, and how to use them appropriately.

2.1.3 Creating and Formatting the Data

Below are the steps for taking a file size and producing an accurate prediction of download time. There is a one-to-one correspondence between the elements of the two arrays sizeMB and timeSec, so each file size is paired with the download time measured for it. There are two sets of this data, trainData and testData: the first contains the examples used for training, and the second contains examples held back for testing. They are kept separate so that the model is evaluated on data it has not seen during training. The input is called the features and the output is called the target. This example has exactly one feature, sizeMB, and one target, timeSec. The goal is to train the model so that it can accurately predict download time for future inputs.
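As a small illustration of the one-to-one correspondence, the sketch below pairs each feature with its target. It uses only the first three training values from this chapter, and toExamples is a hypothetical helper name chosen for these notes, not something from the book or from TensorFlow.js.

```javascript
// First three training values; the full arrays hold 20 entries each.
const sample = {
  sizeMB:  [0.080, 9.000, 0.001],
  timeSec: [0.135, 0.739, 0.067]
};

// Hypothetical helper: pair element i of the features with element i of the targets.
function toExamples(data) {
  if (data.sizeMB.length !== data.timeSec.length) {
    throw new Error('features and targets must have the same length');
  }
  return data.sizeMB.map((sizeMB, i) => ({feature: sizeMB, target: data.timeSec[i]}));
}

const examples = toExamples(sample);
console.log(examples[1]); // { feature: 9, target: 0.739 }
```

The length check is the part worth keeping: if the two arrays ever drift out of step, every feature after the gap is matched with the wrong target.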

Workflow

Train a neural network to make accurate predictions of timeSec given sizeMb.

Convert JavaScript data structures into tensors, the format that TensorFlow.js understands.
    
const trainData = {
  sizeMB:  [0.080, 9.000, 0.001, 0.100, 8.000, 5.000, 0.100, 6.000, 0.050, 0.500,
        0.002, 2.000, 0.005, 10.00, 0.010, 7.000, 6.000, 5.000, 1.000, 1.000],
  timeSec: [0.135, 0.739, 0.067, 0.126, 0.646, 0.435, 0.069, 0.497, 0.068, 0.116,
        0.070, 0.289, 0.076, 0.744, 0.083, 0.560, 0.480, 0.399, 0.153, 0.149]
};
const testData = {
  sizeMB:  [5.000, 0.200, 0.001, 9.000, 0.002, 0.020, 0.008, 4.000, 0.001, 1.000,
            0.005, 0.080, 0.800, 0.200, 0.050, 7.000, 0.005, 0.002, 8.000, 0.008],
  timeSec: [0.425, 0.098, 0.052, 0.686, 0.066, 0.078, 0.070, 0.375, 0.058, 0.136,
            0.052, 0.063, 0.183, 0.087, 0.066, 0.558, 0.066, 0.068, 0.610, 0.057]
};

const trainXs = tf.tensor2d(trainData.sizeMB, [20, 1]);
const trainYs = tf.tensor2d(trainData.timeSec, [20, 1]);
const testXs = tf.tensor2d(testData.sizeMB, [20, 1]);
const testYs = tf.tensor2d(testData.timeSec, [20, 1]);

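Tensors are a generalization of matrices to an arbitrary number of dimensions. The number of dimensions, together with the size of each dimension, is called the tensor's shape. In the context of tensors, a dimension is often called an axis. Each tensor above has shape [20, 1]: 20 examples along the first axis and one value per example along the second.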
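To make shape concrete without relying on the library, here is a plain JavaScript sketch that computes the shape of a nested array. shapeOf and toColumn are hypothetical helpers written for these notes; TensorFlow.js tracks the shape itself and exposes it on each tensor as its shape property.

```javascript
// Hypothetical helper: walk down the first element of each level to read off the axis sizes.
// Assumes the nested array is rectangular (every row has the same length).
function shapeOf(value) {
  const shape = [];
  let level = value;
  while (Array.isArray(level)) {
    shape.push(level.length);
    level = level[0];
  }
  return shape;
}

// Hypothetical helper: turn a flat list of n numbers into n rows of one value each,
// which is the [n, 1] layout requested from tf.tensor2d above.
function toColumn(values) {
  return values.map((v) => [v]);
}

console.log(shapeOf(5));                               // [] (a scalar has no axes)
console.log(shapeOf([0.080, 9.000, 0.001]));           // [ 3 ]
console.log(shapeOf(toColumn([0.080, 9.000, 0.001]))); // [ 3, 1 ]
```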

Produce predictions from the test set.
Measure how close those predictions are to the target for the test data.

2.1.4 Defining a Simple Model

In the context of deep learning the function from input features to targets is called a model. Sometimes the word network is used in place of the word model. This model will be an implementation of linear regression.

Regression in machine learning means the model outputs real-valued numbers and attempts to match the training targets. This is in contrast to classification, in which the model outputs a choice from a set of options.

Linear regression is a specific type of model whose output, as a function of its input, is a straight line. An important property of models is that they are tunable, meaning the input-to-output computation can be adjusted. Because this model is linear, the relationship is always a straight line, but the slope and y-intercept can be adjusted to better fit the data.

const model = tf.sequential();
model.add(tf.layers.dense({inputShape: [1], units: 1}));

The core building block of a neural network is the layer.

A layer is a data-processing module that acts as a tunable function from tensor to tensor. The network here consists of a single dense layer. At its core, the dense layer is a tunable multiply-add function. The layer in the example above places a constraint on its input, defined by the inputShape parameter: it expects each example to be a 1D tensor with exactly one value. The output is always a 1D tensor for each example, and the size of that dimension is set by the units parameter. We want only one number because we are predicting a single value, timeSec. The linear model is:

y = m * x + b
/* or... */
timeSec = kernel * sizeMB + bias

For the training data, the variables timeSec and sizeMB are fixed. The kernel and bias are the model’s parameters, and they are initialized randomly. Those initial values will not produce the desired results. Instead, good values for these parameters (also called weights) have to be learned from the data, and the search for them is called the training process.
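The multiply-add at the heart of the dense layer can be sketched in plain JavaScript. This is only an illustration of the formula above, not how TensorFlow.js implements a layer; makeDenseUnit is a hypothetical name, and the parameter values are picked by hand to stand in for an untrained starting point and for a better fit.

```javascript
// Hypothetical helper: a one-input, one-unit "dense layer" is just kernel * x + bias.
function makeDenseUnit(kernel, bias) {
  return (sizeMB) => kernel * sizeMB + bias;
}

// Arbitrary starting parameters, standing in for an untrained layer.
const untrained = makeDenseUnit(-0.5, 0.0);
// Hand-picked parameters that follow the training data more closely.
const tuned = makeDenseUnit(0.07, 0.1);

// One training example from above: a 5 MB file took 0.435 seconds.
console.log(untrained(5.0)); // -2.5, far from 0.435
console.log(tuned(5.0));     // close to 0.45
```

Both functions have the same form; only the two numbers differ. Training is the search for the pair of numbers that makes the predictions land near the targets.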

To find a good setting for kernel and bias, we need two things:
- A measure that tells us how well a given set of weights performs.
- A method that updates the current weights so that next time the model performs better according to the measure above.

To make the network ready for training, we need to pick a measure and an update method. In TensorFlow.js this is called the model compilation step. This step accepts a loss function, which measures the error of the network. A lower loss is better, and reducing it moves the model toward the best fit. In general, we should be able to plot the loss over time and see it decrease. If the model trains for a long time and the loss does not decrease, the model is not learning to fit the data.

The compilation step also accepts an optimizer, the algorithm by which the network updates its weights based on the data and the loss function.

2.1.5 Fitting the Model to the Training Data



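const modelOutput = [1.1, 2.2, 3.3, 3.6];
const targets =     [1.0, 2.0, 3.0, 4.0];

/* meanAbsoluteError = average(absolute(modelOutput - targets)) */
const absErrors = modelOutput.map((output, i) => Math.abs(output - targets[i]));
const meanAbsoluteError = absErrors.reduce((sum, e) => sum + e, 0) / absErrors.length;
/* this gives meanAbsoluteError of about 0.25 (up to floating-point rounding) */

/* or computed w/ tf.js lib, for a baseline that always predicts 0.295,
   which is approximately the mean of trainData.timeSec */
tf.mean(tf.abs(tf.sub(testData.timeSec, 0.295))).print();

/* compile the model with the SGD optimizer and mean absolute error as the loss */
model.compile({optimizer: 'sgd', loss: 'meanAbsoluteError'});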

SGD here stands for stochastic gradient descent, which uses calculus to determine which adjustments to the weights will reduce the loss. With the loss and optimizer chosen, the model is ready to be fit to the training data.
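To show what reducing the loss by adjusting the weights can look like, here is a plain JavaScript sketch of gradient descent on the mean absolute error for the linear model. It is a simplification built on several assumptions: it uses the whole training set for every update rather than the random batches that give SGD its name, the learning rate and step count are arbitrary picks, and the function names are hypothetical. It is not TensorFlow.js's implementation.

```javascript
const sizeMB  = [0.080, 9.000, 0.001, 0.100, 8.000, 5.000, 0.100, 6.000, 0.050, 0.500,
                 0.002, 2.000, 0.005, 10.00, 0.010, 7.000, 6.000, 5.000, 1.000, 1.000];
const timeSec = [0.135, 0.739, 0.067, 0.126, 0.646, 0.435, 0.069, 0.497, 0.068, 0.116,
                 0.070, 0.289, 0.076, 0.744, 0.083, 0.560, 0.480, 0.399, 0.153, 0.149];

// Mean absolute error of the line kernel * x + bias on the training data.
function meanAbsoluteError(kernel, bias) {
  let total = 0;
  for (let i = 0; i < sizeMB.length; i++) {
    total += Math.abs(kernel * sizeMB[i] + bias - timeSec[i]);
  }
  return total / sizeMB.length;
}

// One update. The derivative of |error| is sign(error), so each example
// contributes sign(error) * x to the kernel gradient and sign(error) to the bias gradient.
function gradientStep(weights, learningRate) {
  let gradKernel = 0;
  let gradBias = 0;
  for (let i = 0; i < sizeMB.length; i++) {
    const error = weights.kernel * sizeMB[i] + weights.bias - timeSec[i];
    gradKernel += Math.sign(error) * sizeMB[i];
    gradBias += Math.sign(error);
  }
  const n = sizeMB.length;
  return {
    kernel: weights.kernel - learningRate * gradKernel / n,
    bias: weights.bias - learningRate * gradBias / n
  };
}

// Start from zero weights (an arbitrary choice) and take many small steps.
let weights = {kernel: 0, bias: 0};
const initialLoss = meanAbsoluteError(weights.kernel, weights.bias);
for (let step = 0; step < 2000; step++) {
  weights = gradientStep(weights, 0.001);
}
const finalLoss = meanAbsoluteError(weights.kernel, weights.bias);
console.log({initialLoss, finalLoss, weights});
```

Each step moves the weights a small distance in the direction that lowers the loss, which is why the loss printed at the end should be well below the starting value.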

Training a model is done by calling the model’s fit() method. We pass in sizeMB as the input and timeSec as the desired output. We also specify the epochs option to tell the model to go through the training data 10 times. In deep learning, each iteration through the complete training data is called an epoch.

The evaluate() method is similar to fit(), but it doesn’t update the weights. When we provide test data to this method, it returns the loss averaged across all the examples. Because the loss here is mean absolute error, this is the same measure we computed by hand above.
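Putting the steps from these notes together, the whole workflow might look like the sketch below. It assumes tf is the TensorFlow.js namespace, passed in as an argument so the same function works whether the library comes from a script tag or from a package import. trainAndEvaluate is a hypothetical name, and the 10 epochs match the number mentioned above.

```javascript
// Hypothetical helper. `tf` is assumed to be the TensorFlow.js namespace;
// `trainData` and `testData` are objects shaped like the ones defined earlier.
async function trainAndEvaluate(tf, trainData, testData) {
  // Convert the plain arrays into [numExamples, 1] tensors.
  const trainXs = tf.tensor2d(trainData.sizeMB, [trainData.sizeMB.length, 1]);
  const trainYs = tf.tensor2d(trainData.timeSec, [trainData.timeSec.length, 1]);
  const testXs = tf.tensor2d(testData.sizeMB, [testData.sizeMB.length, 1]);
  const testYs = tf.tensor2d(testData.timeSec, [testData.timeSec.length, 1]);

  // One dense layer with one unit: timeSec = kernel * sizeMB + bias.
  const model = tf.sequential();
  model.add(tf.layers.dense({inputShape: [1], units: 1}));
  model.compile({optimizer: 'sgd', loss: 'meanAbsoluteError'});

  // fit() updates the weights; it is asynchronous, so wait for it to finish.
  await model.fit(trainXs, trainYs, {epochs: 10});

  // evaluate() only measures the loss on the test data; no weights change.
  const testLoss = model.evaluate(testXs, testYs);
  return testLoss.dataSync()[0];
}
```

Calling it might look like trainAndEvaluate(tf, trainData, testData).then((loss) => console.log(loss)). The exact number will differ from run to run because the weights start from random values.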