Tracking Big Data’s Evolution in Data Science, AI & Machine Learning
Machines have always fascinated Piatetsky-Shapiro – ever since he was a kid reading stories about robots by Isaac Asimov and other sci-fi authors.
He discovered his love for programming while studying computer science at Technion, where he spent a few weeks in the summer of his first year programming a computer (in APL) to play battleships. “I was soundly defeated by my own program,” he says. “That gave me an appreciation for the abilities of technology. I became more interested in creating programs than playing them.”
Piatetsky-Shapiro’s passion for understanding data and helping others stay up to date on developments in databases led him to launch the first Knowledge Discovery in Databases workshop in 1989, which later grew into full-fledged KDD conferences.
In 1993, after the third KDD workshop, he started KDnuggets News, an e-newsletter focused on data mining and knowledge discovery. The first issue went to 50 researchers who attended the workshop. Today, the KDnuggets brand has more than 200,000 subscribers across email, Twitter, Facebook and LinkedIn. With over 500,000 visitors in October 2017, KDnuggets.com has become a go-to resource for data science and analytics news, software, jobs, courses, education and more.
Piatetsky-Shapiro is one of the leading voices in Big Data – a field he says is somewhat amorphous, encapsulating infrastructure and database management, and closely connected to data science, machine learning and artificial intelligence.
(Note: what is now called “data science” was earlier called “data mining” or “knowledge discovery,” but the terms refer to the same field, dedicated to analyzing and understanding data and extracting useful knowledge from it.)
Exciting AI Advances with Deep Learning
“What is really most exciting now is deep learning,” he says.
While the concept of multi-level (deep) neural networks has been around since the 1960s, there wasn’t enough data, computing power or clever algorithms to use them effectively. But in the past few years, this approach – rebranded as “deep learning” – has finally had sufficient data and processing power, and it has been achieving amazing feats almost every week.
Examples of Deep Learning Breakthroughs
There are many examples of deep learning being deployed today.
Consumers who speak to their smartphone assistants like Siri or Cortana, or to Amazon Alexa or Google Home, are getting good results thanks to deep learning.
Google’s recent progress in machine translation is another big advance made possible by deep learning.
Computers used to translate text using hand-crafted rules derived by thousands of linguistic experts. In 2016, however, powered by large amounts of text and an advanced deep learning network, Google switched to Google Neural Machine Translation, which eliminates all manual rules and translates entire sentences at a time. This has significantly improved the quality of translations.
Finally, Piatetsky-Shapiro mentioned AlphaGo, a computer program developed by Google DeepMind to play the ancient Chinese game of Go. In 2016, AlphaGo, trained partly on thousands of human championship games, defeated world champion Lee Sedol 4:1.
In 2017, an improved version called AlphaGo Zero combined deep learning and reinforcement learning methods and learned to play from scratch, entirely by self-play. After three days and a few million games, the new version reached the level of the program that defeated the Go world champion in 2016. After 40 days, AlphaGo Zero achieved a superhuman level and defeated the previous version 100:0.
Today, it’s considered the strongest Go player in history.
Causes for Concern
“It’s very exciting and it’s also very scary,” Piatetsky-Shapiro says. As AlphaGo Zero improved its game play, it increasingly chose moves very different from those human experts would make.
The period of time when humans and computers collaborate to solve problems might not last very long. It’s not a matter of if, but when, computers will be able to do jobs better than us. The question we should be asking now is: what will humans do?
In the short term, Piatetsky-Shapiro says he’s concerned about the use of technology to automate repetitive tasks previously done by humans. Even devices with limited intelligence will be able to complete jobs that are structured and require a lot of repetition. For example, toll booths on the Mass Pike were removed, and the collectors’ jobs were replaced by E-ZPass radio transponders and cameras that photograph license plates and recognize the numbers – a limited form of computer vision.
With regard to developments in Artificial General Intelligence (AGI) – machine intelligence that is able to perform the same intellectual tasks that humans can – Piatetsky-Shapiro tends to side with entrepreneur Elon Musk and physicist Stephen Hawking: it could put humanity at risk.
“I think we are not likely to have AGI in the next 10 years, but people, in general, have very poor track record of predicting long-term events.”
— KDnuggets (@kdnuggets) January 5, 2018
Even if the probability of AGI is small, its impact could be huge. A program like AlphaGo Zero demonstrates that computers can achieve superhuman ability in a relatively narrow field and that, once they do, they are probably using a different logic than we do, Piatetsky-Shapiro says.
“What if AGI values are not aligned with what we want to do? That’s a serious problem.”
While the AI Now Institute was founded this year at NYU to address the problem of incorporating values training in AI, Piatetsky-Shapiro says he doesn’t think we should pretend there would be any guarantees about the way programs behave. Just like a parent can’t guarantee their children won’t rebel against the values they’re raised with, we shouldn’t assume machines would always follow the rules we put in place.
“If it is really intelligent, it will have its own values.”
How Businesses Should Approach Artificial Intelligence
There are no best practices yet for companies wanting to incorporate AI and machine learning into their business strategies, because the technology has only been viable for a few years. With that in mind, brands need to be aware of both the capabilities and the limitations of these tools.
Piatetsky-Shapiro shared three guidelines to keep in mind when using AI:
- In order to use these tools effectively, companies need large sets of data – at least 100,000 examples. The more recent the data and the more frequent the data, the more effective the predictions will be.
- Make sure there are people in the organization who understand the technology and know what will lead to development.
- Have realistic expectations. There’s a lot of randomness when it comes to predicting human behavior. If you build a model that gives you perfect predictions, chances are you probably have false predictors.
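The third guideline can be turned into a quick sanity check. The sketch below is an illustration only, not a method described in the interview: it flags any single low-cardinality feature that predicts the label almost perfectly on its own, which is a common sign of a "false predictor" (a field that is only recorded after the outcome is known). The function names, the 0.99 threshold and the toy churn data are all assumptions made up for this example.

```python
from collections import Counter, defaultdict


def single_feature_accuracy(values, labels):
    """Accuracy of predicting the label from one feature alone,
    by guessing the majority label seen for each feature value."""
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1
    correct = sum(c.most_common(1)[0][1] for c in counts.values())
    return correct / len(labels)


def suspicious_features(rows, labels, threshold=0.99):
    """Names of features that predict the label almost perfectly by themselves.
    The threshold is an arbitrary, illustrative cut-off."""
    names = rows[0].keys()
    return sorted(
        name
        for name in names
        if single_feature_accuracy([row[name] for row in rows], labels) >= threshold
    )


# Hypothetical churn data: 1 = customer left, 0 = customer stayed.
labels = [0, 1, 0, 1, 1, 0, 0, 1]
plans = ["basic", "basic", "pro", "pro", "basic", "pro", "basic", "pro"]
rows = [
    # cancellation_code is only filled in after a customer has already left,
    # so it "predicts" churn perfectly and would be useless in production.
    {"plan": plan, "cancellation_code": "C1" if label else "none"}
    for plan, label in zip(plans, labels)
]

print(suspicious_features(rows, labels))
```

A check like this only makes sense for features with a handful of distinct values; a unique ID column would also score perfectly, for the trivial reason that every value appears once. Anything it flags deserves a conversation with whoever owns the data before the model is trusted.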
To better manage and leverage all the data they’re collecting, Piatetsky-Shapiro recommends enabling more interactive access.
“I think the approach of dumping everything together in one Data Lake and hoping you’ll discover something is probably not very useful,” he says.
Instead, set specific goals and look at the data with those goals in mind. Look at what gives the best return on investment and what gives value. Many of the Big Data projects that create big data lakes are not able to show ROI.
“Start with business value and proceed from there,” he says.
Finally, invest in good quality data visualization. Humans are still the best at interpreting data, so the visuals should clearly present patterns that allow business stakeholders to make better decisions.
In tomorrow’s part 2 of this interview, Piatetsky-Shapiro discusses self-driving Artificial Intelligence and how businesses can approach it.
For more Big Data insights, check out our report, 2018 Big Data Trends: Liberate, Integrate & Trust, to see what every business needs to know in the upcoming year, including 5 key trends to watch for in the next 12 months!