Ostensibly, Microsoft previously considered its competition to be on-premises Hadoop implementations. But it now offers pricing that is far more competitive with Amazon Web Services‘ (AWS’) Elastic MapReduce (EMR), while still offering a 3-nines service level agreement (SLA) as a differentiator.
The pricing changes, highlighted in a blog post by Microsoft’s Rimma Nehme and detailed on a separate page, offer varying price cuts depending on the virtual machine type used for the head and worker nodes in the HDInsight cluster. Price cuts are up to 52 percent, Microsoft says, while the service itself remains largely the same.
In addition, for those customers wishing to run data science workloads with code written in R, the surcharge for running R Server in a distributed fashion on an HDI cluster has been cut by 80 percent, down to just $0.016 (i.e. 1.6 US cents) per CPU core, per hour.
Microsoft points out that because of Azure’s numerous global data centers (regions), HDI is available at more points of presence than any other cloud Hadoop service. In addition to Azure’s mainstream cloud, the service is also available on its US government cloud and on so-called sovereign clouds, including those in Germany and China. Per various regulatory requirements, the sovereign clouds run in facilities operated by local partners, rather than Microsoft itself.
In other news
Microsoft has a few other announcements to go with the price change:
- The introduction, available in preview, of HDInsight Enterprise Security Package, which integrates Microsoft Active Directory with Apache Ranger. This is essentially a re-branding of HDInsight’s Premium cluster tier
- General availability of its Apache Kafka cluster type, which had been in preview until recently
- General availability of HDInsight’s integration with Azure Log Analytics
- Public preview of support for Power BI DirectQuery, specifically against Hive LLAP, used in HDI “Interactive Query” cluster types
- New HDInsight add-in developer tools for IntelliJ, Eclipse and Visual Studio Code (Microsoft’s cross-platform code editor for MacOS, Linux and Windows). The IntelliJ and Eclipse tooling include the ability to submit and debug distributed Spark code right from those development environments. The VS Code tooling allows for interactive execution of PySpark (a Python library for Apache Spark) code.
I haven’t crunched the numbers yet, but HDInsight may still be slightly higher in price than AWS’ EMR, and is likely even a good chunk more than Google’s Cloud Data Proc. But Microsoft is betting that this new, lower pricing, combined with its SLA, global availability, and multiple cluster types, including R Server, will catch customers’ attention.
Microsoft is also likely putting its faith in HDI’s unique integrations with Power BI, Azure Data Lake Store and Azure Blob Storage to make a very compelling offering to Microsoft shops, as well as to platform-neutral customers looking to shift analytics workloads from running on-premises to the public cloud.
Bigdata and data center