I've been working on a side project of mine where I train an algorithm to play a video game, just to see how well it works. The neural network decides what actions to perform in-game, then evaluates the outcomes of those actions. It stores a snapshot of the scenario it was in at the time of executing the action, along with a value that indicates the success level of that action. This forms the basis of my training set.
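
Concretely, each stored example looks roughly like the sketch below (the names `Experience`, `state`, `action`, and `success` are just placeholders of mine, nothing formal):

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Experience:
    """One training example: the game state when an action was taken,
    the action itself, and a scalar for how well it turned out."""
    state: np.ndarray   # snapshot of the in-game scenario
    action: int         # index of the action the network chose
    success: float      # evaluated success level of that action


# The online training set is just a growing list of these records.
dataset: list[Experience] = []
```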

I've noticed something happen that I can only describe as "learned constriction": once the NN learns, enough times, that something doesn't work in a given scenario, it just never tries it again. This game has a lot of randomness in it, so the data it produces is quite noisy. Rather than just lowering the NN's learning rate, I would like to implement something that makes it keep experimenting with the actions it tries.

I'm wondering if there's something like a "standard procedure" for this type of scenario, to combat incorrect modeling of noisy data in an online learning setting. Is simple occasional random selection of which action to perform enough? Ideally I would be able to measure its in-game performance while also experimenting, but that's the best-case scenario.
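
To make the question concrete, the "occasional random selection" I have in mind is something like epsilon-greedy with a decaying epsilon. A rough, untested sketch (`q_values` here stands in for whatever per-action scores the network outputs):

```python
import random

import numpy as np


def choose_action(q_values: np.ndarray, step: int,
                  eps_start: float = 0.3, eps_end: float = 0.02,
                  decay_steps: int = 50_000) -> int:
    """Epsilon-greedy: with probability eps take a uniformly random
    action, otherwise take the network's best-scoring action.
    Epsilon decays linearly so exploration tapers off over time."""
    frac = min(step / decay_steps, 1.0)
    eps = eps_start + frac * (eps_end - eps_start)
    if random.random() < eps:
        return random.randrange(len(q_values))  # explore
    return int(np.argmax(q_values))             # exploit
```

The decay schedule is the part I'm least sure about: a high epsilon early keeps it exploring, and a low epsilon later lets me measure actual performance, but picking the schedule feels ad hoc.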

submitted by /u/Drag0nDr0p
Source: https://www.reddit.com/r/MachineLearning/comments/8dxpao/p_preventing_learned_constriction/
