I have seen many research papers use SGD or RMSProp instead of Adam.
To my knowledge, Adam is considered the best default choice.

Moreover, since Adam offers an adaptive learning rate for every single parameter, I see no reason why anyone would use any other optimizer. Please enlighten me, fellow machine learning scientists.
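
To make the comparison concrete, here is a minimal sketch of the two update rules in plain Python. The function names `sgd_step` and `adam_step` are made up for illustration and are not taken from any library. The Adam defaults are the commonly cited ones (beta1 = 0.9, beta2 = 0.999, eps = 1e-8), and the example gradients are arbitrary numbers chosen to show the difference in scaling.

```python
import math

def sgd_step(params, grads, lr=0.01):
    """Plain SGD: one global learning rate shared by every parameter."""
    return [p - lr * g for p, g in zip(params, grads)]

def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. m and v hold per-parameter moment estimates; t is the 1-based step count."""
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # running mean of gradients
        v_i = beta2 * v_i + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m_i / (1 - beta1 ** t)             # bias correction
        v_hat = v_i / (1 - beta2 ** t)
        # Effective step size is scaled separately for each parameter
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v

# Two parameters whose gradients differ by four orders of magnitude (arbitrary values)
params = [1.0, 1.0]
grads = [100.0, 0.01]

sgd_params = sgd_step(params, grads, lr=0.01)
adam_params, m, v = adam_step(params, grads, [0.0, 0.0], [0.0, 0.0], t=1, lr=0.01)

print("SGD: ", sgd_params)   # step size follows the raw gradient magnitude
print("Adam:", adam_params)  # both parameters move by roughly lr on the first step
```

With SGD the first parameter moves ten thousand times further than the second, while Adam normalizes each step by that parameter's own gradient history. That per-parameter scaling is what I mean by Adam adapting the learning rate.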

Source link
( https://www.reddit.com/r//comments/90xpb6/_why_do__use__or_any/)

