Notes on chapter four of Deep Learning with JavaScript

### 4.0.0

- Know how images and other perceptual data such as audio are represented as multidimensional tensors
- Know what convnets are, how they work, and what makes them well suited to machine learning with images
- Know how to write and train a convnet to solve the task of classifying handwritten digits
- Know how to train models with Node.js
- Know how to use convnets for spoken word recognition

## 4.1 Vectors to Tensors: Representing Image Data

Representing an image requires a 3D tensor. The first two dimensions are the height and width of the image, whereas the third dimension is the color channel, which has a size of 3 for RGB images. Grayscale or black-and-white images only need a channel dimension of size 1.

This method of encoding is called height-width-channel, or **HWC**.
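As a rough illustration of the height-width-channel layout, here is a minimal sketch in plain JavaScript with no library. The pixel values and the helper names `shapeOf` and `hwcIndex` are made up for this note and are not from the book.

```javascript
// A tiny 2x3 RGB image stored as nested arrays: image[row][col][channel].
// Pixel values are invented for illustration.
const image = [
  [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
  [[10, 20, 30], [40, 50, 60], [70, 80, 90]],
];

// Infer the shape of a regular (non-ragged) nested array.
function shapeOf(nested) {
  const shape = [];
  let level = nested;
  while (Array.isArray(level)) {
    shape.push(level.length);
    level = level[0];
  }
  return shape;
}

// Position of (row, col, channel) in a flat, row-major HWC buffer.
function hwcIndex(row, col, channel, width, channels) {
  return (row * width + col) * channels + channel;
}

const [height, width, channels] = shapeOf(image);
const flat = image.flat(2); // row-major flattening keeps HWC order

console.log(height, width, channels); // 2 3 3
console.log(flat[hwcIndex(1, 2, 0, width, channels)]); // 70
```

The channel index varies fastest in the flat buffer, which is what "channel last" means in HWC.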

Multiple images can be batched together using a tensor shape of **NHWC**, where **N** is the number of images in the batch.

### 4.1.1 The MNIST Dataset

A dataset is said to be balanced when it has approximately equal numbers of examples for each category in the classification.
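One way to check balance is to count the examples per class. The sketch below uses invented labels, and the `tolerance` threshold is a hypothetical criterion chosen for this note, not a standard definition.

```javascript
// Count how many examples fall into each class label (labels are 0-based integers).
function classCounts(labels, numClasses) {
  const counts = new Array(numClasses).fill(0);
  for (const label of labels) {
    counts[label] += 1;
  }
  return counts;
}

// Hypothetical balance criterion: the largest class is at most
// `tolerance` times the size of the smallest one.
function isBalanced(counts, tolerance = 1.5) {
  const min = Math.min(...counts);
  const max = Math.max(...counts);
  return min > 0 && max / min <= tolerance;
}

const sampleLabels = [0, 1, 2, 0, 1, 2, 0, 1, 2, 2]; // made-up labels, 3 classes
const sampleCounts = classCounts(sampleLabels, 3);

console.log(sampleCounts);               // [ 3, 3, 4 ]
console.log(isBalanced(sampleCounts));   // true
console.log(isBalanced([90, 5, 5]));     // false
```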

## 4.2 Convnets

Given the problem of classifying an image of a handwritten digit, we know both the input and the output format. The input will be in the *NHWC* format, and the output will be a tensor of shape `[null, 10]`, where `null` stands for the undetermined batch size and each row of 10 values is a one-hot encoding indicating which digit is represented in the image.
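The one-hot output format can be sketched in plain JavaScript as follows. The helpers `oneHot` and `argMax` are written for this note (they are not library calls), and the labels are made up.

```javascript
const NUM_DIGITS = 10;

// Turn a digit label (0-9) into a length-10 one-hot vector.
function oneHot(digit, numClasses = NUM_DIGITS) {
  if (!Number.isInteger(digit) || digit < 0 || digit >= numClasses) {
    throw new RangeError(`label out of range: ${digit}`);
  }
  const vec = new Array(numClasses).fill(0);
  vec[digit] = 1;
  return vec;
}

// Recover the digit from a one-hot (or score) vector: index of the largest value.
function argMax(vec) {
  return vec.indexOf(Math.max(...vec));
}

const digitLabels = [3, 0, 9]; // made-up batch of N = 3 labels
const targets = digitLabels.map((d) => oneHot(d)); // shape [3, 10]

console.log(targets.length, targets[0].length); // 3 10
console.log(JSON.stringify(targets[0]));        // [0,0,0,1,0,0,0,0,0,0]
console.log(targets.map((v) => argMax(v)));     // [ 3, 0, 9 ]
```

Here the batch size is 3, which is the value the `null` dimension takes once a concrete batch is supplied.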

Convnets stands for convolutional¹ networks.

¹ Convolution is just a type of mathematical operation.
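To give a feel for the operation, here is a minimal sketch of a 2D convolution over a single-channel image, assuming no padding and a stride of 1. The function name `conv2d`, the image patch, and the kernel are all invented for this note. Like most deep learning libraries, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```javascript
// Slide a small kernel over a single-channel image (no padding, stride 1).
// Each output value is the sum of element-wise products between the kernel
// and the image window it currently covers.
function conv2d(input, kernel) {
  const outH = input.length - kernel.length + 1;
  const outW = input[0].length - kernel[0].length + 1;
  const output = [];
  for (let i = 0; i < outH; i++) {
    const row = [];
    for (let j = 0; j < outW; j++) {
      let sum = 0;
      for (let ki = 0; ki < kernel.length; ki++) {
        for (let kj = 0; kj < kernel[0].length; kj++) {
          sum += input[i + ki][j + kj] * kernel[ki][kj];
        }
      }
      row.push(sum);
    }
    output.push(row);
  }
  return output;
}

// Made-up 3x4 grayscale patch with a vertical edge between columns 1 and 2.
const patch = [
  [0, 0, 9, 9],
  [0, 0, 9, 9],
  [0, 0, 9, 9],
];

// 2x2 kernel that responds to left-to-right increases in brightness.
const edgeKernel = [
  [-1, 1],
  [-1, 1],
];

console.log(JSON.stringify(conv2d(patch, edgeKernel))); // [[0,18,0],[0,18,0]]
```

The output is largest exactly where the window straddles the edge, which hints at why such kernels are useful for picking out local image features.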