3.3.2 Softmax Activation
The softmax activation function is the result of
softmax(x1,x2,x3,...,xn) = [ exp(n1)/ exp(n1) + exp(n2) + ... + exp(xn), exp(n2)/ exp(n2) + exp(n3) + ... + exp(xn) ]
3.3.3 Categorical Cross Entropy
The loss function for multiclass classification
Accuracy is a good metric for performance however as a loss function it suffers from the same zero gradient issue as binary classification.
Categorical crossentropy is a generalization of the binary crossentropy into the cases where there are more than two categories.