I am training a feedforward network (feedforwardnet) with gradient descent (traingd) as the backpropagation algorithm to predict the times table.
X = [repmat([1:10]', 10, 1) repelem([1:10]', 10)];
y = X(:, 1) .* X(:, 2);
net = feedforwardnet(8); % Create a neural network with 8 neurons in the hidden layer
net.layers{1}.transferFcn = 'logsig'; % Hidden layer activation function set to logsig
net.trainFcn = 'traingd'; % Set backpropagation algorithm to gradient descent
net.divideParam.trainRatio = 0.6;
net.divideParam.testRatio = 0.2;
net.divideParam.valRatio = 0.2;
[net, TR] = train(net, X', y'); % Train the network
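By "lowest validation error" below I mean the minimum validation performance in the training record TR returned by train, which I read off like this:

min(TR.vperf) % Validation error per epoch; TR.best_vperf holds the same minimum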
When I first train the network this way, the lowest validation error it reaches is far too high.

But if I then switch the training algorithm to Levenberg-Marquardt (trainlm), train the network, switch back to gradient descent (traingd), and train again, the lowest validation error becomes reasonable.
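For concreteness, here is a minimal sketch of that two-pass sequence (X and y are the same as above; any parameters not shown are left at their defaults):

net = feedforwardnet(8); % Fresh network, same architecture as above
net.layers{1}.transferFcn = 'logsig';
net.divideParam.trainRatio = 0.6;
net.divideParam.testRatio = 0.2;
net.divideParam.valRatio = 0.2;

net.trainFcn = 'trainlm'; % First pass: Levenberg-Marquardt
[net, TR1] = train(net, X', y');

net.trainFcn = 'traingd'; % Switch back to gradient descent
[net, TR2] = train(net, X', y'); % Second pass on the same net object

min(TR2.vperf) % Lowest validation error now looks reasonable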
Why does training fail the first time I train the network with gradient descent?

