Notes on chapter two of Deep Learning with JavaScript

**Goals**

Cover tensors, modeling, and optimization: what they are, how they work, and how to use them appropriately.

### 2.1.3 Creating and Formatting the Data

Below are the steps for taking a file size and producing an accurate prediction of download time. There is a one-to-one correspondence between the elements of the two arrays `timeSec` and `sizeMB`, which allows a prediction of `timeSec` to be made from a `sizeMB` entry. There are two sets of this data, `trainData` and `testData`: the first contains the _examples_ used for training, and the second contains the _examples_ used for testing. They are kept separate so that the model is judged on new data when making a prediction. The output is called the *target*. The inputs are called *features*. This example has exactly one *feature*, `sizeMB`, and one *target*, `timeSec`. The goal is to train the model so that it can accurately predict download time for future inputs.

##### Workflow

Train a neural network to make accurate predictions of `timeSec` given `sizeMB`.

###### Convert JavaScript data structures into tensors, the format that TensorFlow.js understands

```
// Assumes TensorFlow.js is already loaded and available as `tf`.
// Parallel arrays: sizeMB[i] is the file size and timeSec[i] its download time.
const trainData = {
  sizeMB: [0.080, 9.000, 0.001, 0.100, 8.000, 5.000, 0.100, 6.000, 0.050, 0.500,
           0.002, 2.000, 0.005, 10.00, 0.010, 7.000, 6.000, 5.000, 1.000, 1.000],
  timeSec: [0.135, 0.739, 0.067, 0.126, 0.646, 0.435, 0.069, 0.497, 0.068, 0.116,
            0.070, 0.289, 0.076, 0.744, 0.083, 0.560, 0.480, 0.399, 0.153, 0.149]
};
const testData = {
  sizeMB: [5.000, 0.200, 0.001, 9.000, 0.002, 0.020, 0.008, 4.000, 0.001, 1.000,
           0.005, 0.080, 0.800, 0.200, 0.050, 7.000, 0.005, 0.002, 8.000, 0.008],
  timeSec: [0.425, 0.098, 0.052, 0.686, 0.066, 0.078, 0.070, 0.375, 0.058, 0.136,
            0.052, 0.063, 0.183, 0.087, 0.066, 0.558, 0.066, 0.068, 0.610, 0.057]
};

// Convert each array into a 2D tensor of shape [20, 1]: 20 examples, 1 value each.
const trainXs = tf.tensor2d(trainData.sizeMB, [20, 1]);
const trainYs = tf.tensor2d(trainData.timeSec, [20, 1]);
const testXs = tf.tensor2d(testData.sizeMB, [20, 1]);
const testYs = tf.tensor2d(testData.timeSec, [20, 1]);
```

Tensors are a generalization of matrices to an arbitrary number of dimensions. The number of dimensions and size of each dimension is called its *shape*. In the context of tensors a dimension is often called an *axis*.
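To make *shape* concrete, here is a small plain-JavaScript sketch (not the TensorFlow.js API) that reads the shape off a nested array by following the first element along each axis. The helper name `shapeOf` is hypothetical, and the sketch assumes the nested array is rectangular.

```javascript
// Hypothetical helper: infer the shape of a rectangular nested array.
// Assumes every sub-array along a given axis has the same length.
function shapeOf(value) {
  const shape = [];
  let current = value;
  while (Array.isArray(current)) {
    shape.push(current.length); // size of this axis
    current = current[0];       // descend into the next axis
  }
  return shape;
}

// A flat list of 3 numbers has one axis of size 3.
const flat = [0.08, 9.0, 0.001];

// Wrapping each number in its own array adds a second axis of size 1,
// which mirrors the [N, 1] shape used for the tensors above.
const column = flat.map((v) => [v]);

console.log(shapeOf(flat));          // shape with one axis: 3
console.log(shapeOf(column));        // shape with two axes: 3 and 1
console.log(shapeOf(column).length); // number of axes
```

The length of the shape array is the number of axes, so a plain number has an empty shape.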

###### Produce predictions from the test set

###### Measure how close those predictions are to the target for the test data

### 2.1.4 Defining a Simple Model

In the context of deep learning the function from input features to targets is called a *model*. Sometimes the word *network* is used in place of the word *model*. This model will be an implementation of linear regression.

*Regression* in machine learning means the model outputs real-valued numbers and attempts to match the training targets. This is in contrast to *classification*, in which the model outputs a choice from a set of options.

Linear regression is a specific type of model whose output, as a function of its input, plots as a straight line. An important property of models is that they are tunable: the input-output computation can be adjusted. Because the model is linear, the result is always a straight line, but the slope and y-intercept can be adjusted to better fit the relationship.

```
const model = tf.sequential();
model.add(tf.layers.dense({inputShape: [1], units: 1}));
```

The core building block of a neural network is the *layer*: a data processing module that is a tunable function from tensor to tensor. This network consists of a single dense layer. At its core, the dense layer is a tunable multiply-add function. The layer in the example above has a constraint on its input, defined by the parameter `inputShape`. This means it expects a 1D tensor with exactly one value. The output is always a 1D tensor for each example, and the size of that dimension is set by the `units` parameter. We only want one number because we are trying to predict one number, `timeSec`. The linear model is:

```
y = m * x + b
/* or... */
timeSec = kernel * sizeMB + bias
```
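The multiply-add above can be written as an ordinary function. This is a plain-JavaScript stand-in for a dense layer with one input and one unit, not TensorFlow.js code; the function name `predictTimeSec` and the parameter values are assumptions picked by hand for illustration.

```javascript
// Plain-JavaScript stand-in for a dense layer with one input and one unit.
// `kernel` and `bias` are the tunable parameters (the weights).
function predictTimeSec(sizeMB, kernel, bias) {
  return kernel * sizeMB + bias; // multiply, then add
}

// Illustrative parameter values, chosen by hand rather than learned.
const kernel = 0.07;
const bias = 0.1;

// Prediction for a 5 MB file under these parameters: 0.07 * 5 + 0.1.
console.log(predictTimeSec(5.0, kernel, bias));

// Tuning means changing the parameters, not the code: a different
// kernel and bias describe a different straight line.
console.log(predictTimeSec(5.0, 0.05, 0.2));
```

With a file size of zero the prediction equals the bias, which is why the bias plays the role of the y-intercept and the kernel the role of the slope.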

For the training data, the values of `timeSec` and `sizeMB` are fixed. The kernel and bias are the model's parameters and are initially chosen at random. Random values will not produce the desired results. The search for good values of these parameters, or *weights*, is called the training process, in which the model has to learn from the data in order to achieve good results.

To find a good setting for kernel and bias we need two things:

- A measure that tells us how well a given set of weights performs
- A method that updates the current weights so that next time the model performs better according to the measure above

In order to make the network ready for training we need to pick a measure and an update method. In TensorFlow.js this is called the model compilation step. This step accepts a *loss function*, which measures the error of the network. A lower loss is better, and reducing it moves the model toward the best *fit*. In general we should be able to plot the loss over time and see it decreasing. If the model trains for a long time and the loss does not decrease, the model is not learning to fit the data.

The compilation step also accepts an *optimizer*: the algorithm by which the network updates its weights based on the data and the loss function.
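As a rough illustration of how a loss ranks candidate weights, the plain-JavaScript sketch below scores two hand-picked (kernel, bias) pairs with mean absolute error on the first five training examples. The helper names and both candidates are assumptions for illustration, and this is not what TensorFlow.js runs internally.

```javascript
// Mean absolute error between predictions and targets (plain JavaScript).
function meanAbsoluteError(predictions, targets) {
  let total = 0;
  for (let i = 0; i < targets.length; i++) {
    total += Math.abs(predictions[i] - targets[i]);
  }
  return total / targets.length;
}

// First five training examples from the data above.
const sizeMB = [0.080, 9.000, 0.001, 0.100, 8.000];
const timeSec = [0.135, 0.739, 0.067, 0.126, 0.646];

// Loss of the linear model for a given (kernel, bias) pair.
function lossFor(kernel, bias) {
  const predictions = sizeMB.map((x) => kernel * x + bias);
  return meanAbsoluteError(predictions, timeSec);
}

// Two hand-picked candidates (assumptions for illustration).
const lossA = lossFor(1.0, 0.0);  // slope far too steep
const lossB = lossFor(0.07, 0.1); // much closer to the data

console.log(lossA, lossB);
console.log(lossB < lossA ? 'candidate B fits better' : 'candidate A fits better');
```

The measure only says which candidate is better. Deciding how to move from one candidate to a better one is the optimizer's job.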

### 2.1.5 Fitting the Model to the Training Data

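```
// Worked example of mean absolute error:
// meanAbsoluteError = average(absolute(modelOutput - targets))
const modelOutput = [1.1, 2.2, 3.3, 3.6];
const targets = [1.0, 2.0, 3.0, 4.0];
tf.mean(tf.abs(tf.sub(modelOutput, targets))).print(); // approximately 0.25

// Baseline computed with the tf.js library: the error on the test set when
// always predicting 0.295, roughly the mean of trainData.timeSec.
tf.mean(tf.abs(tf.sub(testData.timeSec, 0.295))).print();

// Compile the model with an optimizer and a loss function.
model.compile({optimizer: 'sgd', loss: 'meanAbsoluteError'});
```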

SGD here stands for *stochastic gradient descent*, which uses calculus to determine which adjustments to the weights will reduce the loss. During training this adjustment is repeated many times. Once compiled with a loss and an optimizer, the model is ready to be fit to the training data.
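A single gradient-descent update for the linear model under mean absolute error can be sketched in plain JavaScript. This is an illustration of the idea, not the TensorFlow.js optimizer: it uses all five examples at once instead of a random batch, and the starting weights, the learning rate of 0.01, and the helper names are assumptions.

```javascript
// Mean absolute error of timeSec = kernel * sizeMB + bias on (xs, ys).
function maeLoss(xs, ys, kernel, bias) {
  let total = 0;
  for (let i = 0; i < xs.length; i++) {
    total += Math.abs(kernel * xs[i] + bias - ys[i]);
  }
  return total / xs.length;
}

// One update: compute the gradient of the loss with respect to each
// weight, then move each weight a small step against its gradient.
function gradientStep(xs, ys, params, learningRate) {
  let gradKernel = 0;
  let gradBias = 0;
  for (let i = 0; i < xs.length; i++) {
    const error = params.kernel * xs[i] + params.bias - ys[i];
    const sign = Math.sign(error); // derivative of |error| w.r.t. error
    gradKernel += sign * xs[i];
    gradBias += sign;
  }
  gradKernel /= xs.length;
  gradBias /= xs.length;
  return {
    kernel: params.kernel - learningRate * gradKernel,
    bias: params.bias - learningRate * gradBias,
  };
}

// First five training examples from the data above.
const xs = [0.080, 9.000, 0.001, 0.100, 8.000];
const ys = [0.135, 0.739, 0.067, 0.126, 0.646];

const before = { kernel: 1.0, bias: 0.0 };        // arbitrary starting weights
const after = gradientStep(xs, ys, before, 0.01); // 0.01 is an assumed learning rate

console.log(maeLoss(xs, ys, before.kernel, before.bias)); // loss before the step
console.log(maeLoss(xs, ys, after.kernel, after.bias));   // loss after the step
```

For these starting weights the second value is lower than the first: the step shrinks the overly steep slope. The stochastic variant would compute the same gradient on a randomly chosen subset of the examples.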

Training a model is done by calling the model’s `fit()` method. We pass in the `sizeMB` tensor as the input and the `timeSec` tensor as the desired output. We also specify the `epochs` setting to tell the model that we want to go through the training data 10 times. In deep learning, one iteration through the complete training data is called an *epoch*.
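To show what "10 epochs" means, the plain-JavaScript sketch below makes ten full passes over a few training examples and updates the weights after every example. It is not how TensorFlow.js implements `fit()`; the learning rate of 0.001, the zero starting weights, and the fixed visiting order are all assumptions made to keep the example small and repeatable.

```javascript
// First six training examples from the data above.
const trainSizeMB = [0.080, 9.000, 0.001, 0.100, 8.000, 5.000];
const trainTimeSec = [0.135, 0.739, 0.067, 0.126, 0.646, 0.435];

// Mean absolute error of the current weights on the training examples.
function meanAbsError(kernel, bias) {
  let total = 0;
  for (let i = 0; i < trainSizeMB.length; i++) {
    total += Math.abs(kernel * trainSizeMB[i] + bias - trainTimeSec[i]);
  }
  return total / trainSizeMB.length;
}

let kernel = 0; // arbitrary starting weights
let bias = 0;
const learningRate = 0.001; // assumed value for illustration
const epochs = 10;
const initialLoss = meanAbsError(kernel, bias);
const lossPerEpoch = [];

for (let epoch = 0; epoch < epochs; epoch++) {
  // One epoch: visit every training example exactly once.
  for (let i = 0; i < trainSizeMB.length; i++) {
    const error = kernel * trainSizeMB[i] + bias - trainTimeSec[i];
    const sign = Math.sign(error); // gradient direction for absolute error
    kernel -= learningRate * sign * trainSizeMB[i];
    bias -= learningRate * sign;
  }
  lossPerEpoch.push(meanAbsError(kernel, bias)); // record loss after this pass
}

console.log(initialLoss);
console.log(lossPerEpoch); // one loss value per epoch
```

Recording one loss value per epoch gives exactly the series one would plot to check that the loss is going down over time.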

The `evaluate()` method is similar to `fit()`; however, it doesn't update the weights. When given test data, it returns the value of the loss function averaged across all the examples. This is another way of saying it provides the mean absolute error, since that is the loss chosen at compilation. Above we computed the mean absolute error by hand.
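Evaluation can be sketched in plain JavaScript as computing the average loss on the test data for fixed weights. The function name `evaluateMae` is hypothetical, and this is not the TensorFlow.js `evaluate()` implementation. The "model" here is the constant baseline from the by-hand calculation: it ignores file size and always predicts 0.295 seconds.

```javascript
// Test set from the notes above.
const testData = {
  sizeMB: [5.000, 0.200, 0.001, 9.000, 0.002, 0.020, 0.008, 4.000, 0.001, 1.000,
           0.005, 0.080, 0.800, 0.200, 0.050, 7.000, 0.005, 0.002, 8.000, 0.008],
  timeSec: [0.425, 0.098, 0.052, 0.686, 0.066, 0.078, 0.070, 0.375, 0.058, 0.136,
            0.052, 0.063, 0.183, 0.087, 0.066, 0.558, 0.066, 0.068, 0.610, 0.057],
};

// Hypothetical helper: average absolute error of fixed weights on (xs, ys).
// It only reads the weights and never changes them.
function evaluateMae(weights, xs, ys) {
  let total = 0;
  for (let i = 0; i < xs.length; i++) {
    const prediction = weights.kernel * xs[i] + weights.bias;
    total += Math.abs(prediction - ys[i]);
  }
  return total / xs.length; // loss averaged over all test examples
}

// Baseline: kernel 0 ignores the input, bias 0.295 is the constant prediction
// (approximately the mean of trainData.timeSec). Frozen to make the
// "no weight updates" property explicit.
const baseline = Object.freeze({ kernel: 0, bias: 0.295 });

const testLoss = evaluateMae(baseline, testData.sizeMB, testData.timeSec);
console.log(testLoss.toFixed(4)); // 0.2202
```

A trained model is only useful if its test loss comes in below this baseline figure.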