I’m trying to create a 4-node Hadoop cluster on AWS using the Cloudera distribution for learning purposes.
I’ve done it manually (without Cloudera) and ended up spending more time learning about Linux, networking, and matching compatible software versions than actually using Hadoop, MapReduce, and Hive, which is what I really want to focus on.
I’d much rather learn these Hadoop technologies in a multi-node cluster environment than on a single-node VM. I feel like I’m not really learning everything when it’s just a single node. I also want to work with the Cloudera distro in particular and use HUE.
I’ve found videos and articles from a few years ago that mention being able to set up a Cloudera Live account and run it on AWS, but I can’t find anything like that on Cloudera’s website anymore.
Does anyone have advice? Can this be done relatively simply and at low cost? Are there better ways to practice on a multi-node cluster using Cloudera?