If you imagine the life of a machine learning researcher, you might think it’s quite glamorous. You’ll program self-driving cars, work for the biggest names in tech, and your software could even lead to the downfall of humanity. So cool! But, as a new survey of data scientists and machine learners shows, those expectations need adjusting, because the biggest challenge in these professions is something quite mundane: cleaning dirty data.

This comes from a survey conducted by data science community Kaggle (which was acquired by Google earlier this year). Some 16,700 of the site’s 1.3 million members responded to the questionnaire, and when asked about the biggest barriers faced at work, the most common answer was “dirty data,” followed by a lack of talent in the field.

But what exactly is dirty data, and why is it such a problem?


