I have a neural network that takes 100 plate appearances by an MLB player as a time series. Each sequence is paired with a single continuous target: the player's OPS (a baseball statistic) over the following 100 plate appearances. Essentially, I'm using a player's plate appearance outcomes to predict their future performance.
However, baseball is subject to a lot of random noise. A batter can go into a slump for no reason other than random chance, and future performance can vary significantly as well. If possible, I’d like the neural network to output its confidence in its performance prediction alongside the prediction itself.
Is there any documented way to do this? Currently, the network outputs a single number, its performance prediction, and I'm using squared error as my loss function. However, this doesn't capture the network's confidence in its own prediction.
I thought about discretizing the output and using softmax over, say, a dozen or so buckets. However, softmax is TOO categorical for this purpose. If the NN says there is a 50% probability that future performance falls within [0.5, 0.6) and a 50% probability that it falls within [0.6, 0.7), but actual performance is 0.45, the loss from the [0.6, 0.7) bucket should be weighted higher than the loss from the [0.5, 0.6) bucket, because the values in that bucket are farther from the actual performance. Softmax (with the usual cross-entropy loss) doesn't capture this: it treats every wrong bucket the same, regardless of how far it is from the actual value.
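To make the concern concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration only: the ten buckets of width 0.1, the helper names `bucket_index`, `cross_entropy`, and `expected_abs_distance`, and the two made-up predictions. I put 20% of the mass on the true bucket in both predictions (unlike my example above, where the true bucket gets 0% and cross-entropy would be infinite), so that the two losses are finite and comparable. The distance-weighted number is just a rough measure of the behavior I'm after, not a loss I know to be standard.

```python
import math

# Assumed buckets for illustration: ten buckets of width 0.1 covering [0.0, 1.0).
EDGES = [i / 10 for i in range(11)]
CENTERS = [(lo + hi) / 2 for lo, hi in zip(EDGES[:-1], EDGES[1:])]


def bucket_index(value):
    """Return the index of the bucket [lo, hi) that contains value."""
    for i, (lo, hi) in enumerate(zip(EDGES[:-1], EDGES[1:])):
        if lo <= value < hi:
            return i
    raise ValueError("value outside bucket range")


def cross_entropy(probs, actual):
    """Categorical cross-entropy: depends only on the mass in the true bucket."""
    return -math.log(probs[bucket_index(actual)])


def expected_abs_distance(probs, actual):
    """Probability-weighted distance from each bucket center to the actual value."""
    return sum(p * abs(c - actual) for p, c in zip(probs, CENTERS))


actual = 0.45  # falls in bucket [0.4, 0.5), index 4

# Both predictions put 0.2 on the true bucket; they differ only in where the rest goes.
near = [0, 0, 0, 0, 0.2, 0.8, 0, 0, 0, 0]  # remaining mass on [0.5, 0.6)
far = [0, 0, 0, 0, 0.2, 0, 0, 0, 0, 0.8]   # remaining mass on [0.9, 1.0)

print(cross_entropy(near, actual), cross_entropy(far, actual))
print(expected_abs_distance(near, actual), expected_abs_distance(far, actual))
```

Cross-entropy comes out identical for both predictions, while the distance-weighted measure is much larger for the prediction whose wrong mass sits far from 0.45. That gap is what I would like the loss to reflect.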
Is there a loss function engineered for such a purpose?