Thanks for your re-implementation of temperature scaling! Although I understand this work is from a few years back, I am still wondering why the temperature T, which is initialized to a positive value, never turns negative while the NLL is being optimized.
Both your implementation and https://github.com/gpleiss/temperature_scaling (as mentioned in Guo et al., 2017) seem to have no obvious mechanism that can prevent the temperature T from turning negative.
However, if I am not mistaken, T must stay positive for the calibrated model to keep the predictions of the original uncalibrated classifier: dividing the logits by a negative T reverses their ordering, so a class with a negative logit can end up with a larger probability than a class with a positive logit.
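To make the concern concrete, here is a minimal pure-Python sketch (not taken from either repository). The logits, the temperature values, and the helper names `softmax`, `argmax`, and `positive_temperature` are all made up for illustration. It shows that a positive T keeps the argmax while a negative T moves it to the class with the smallest logit. The last helper shows one possible guard, optimizing log T instead of T, which is only an assumption about how positivity could be enforced and not something either implementation is known to do.

```python
import math


def softmax(logits, temperature):
    """Softmax of logits / temperature, with max-subtraction for stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def positive_temperature(log_t):
    """Hypothetical guard: treat log T as the free parameter, so T = exp(log T) > 0."""
    return math.exp(log_t)


logits = [2.0, -1.0, 0.5]  # made-up logits; class 0 is the original prediction

pos = softmax(logits, 1.5)   # positive T: ordering of the logits is preserved
neg = softmax(logits, -1.5)  # negative T: ordering of the logits is reversed

print(argmax(logits), argmax(pos), argmax(neg))  # prints "0 0 1"
```

With T = -1.5 the predicted class moves from index 0 (logit 2.0) to index 1 (logit -1.0), which is exactly the prediction change described above.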