It’s taken Microsoft quite a while to get traction in the cloud, and even longer for it to get its cloud data story right. For the longest time, things weren’t looking good. I say that as someone who has worked with – and at various times championed – Microsoft technology for most of my career. As much as I’ve wanted Microsoft to do well in the cloud data arena, I thought it was doomed to an eternity of near misses.
But things have been steadily improving since the summer, especially in the last few weeks. The glass that was half empty in the spring is now nearly full, with a complete HDInsight Big Data service based on Hadoop 2.0; an able machine learning service called Azure Machine Learning; a document store NoSQL database called DocumentDB; a publish-subscribe service for capturing streaming data called Event Hubs; a service for processing and analyzing that data called Azure Stream Analytics; a data transformation workflow service called Data Factory; and an eponymous Search service with Elasticsearch at its core.
Beyond all of these “house brand” products, partnerships announced in the past two weeks mean that customers can, or soon will be able to, spin up Hadoop clusters based on Cloudera’s Distribution of Hadoop (CDH) and Hortonworks Data Platform (HDP), running on either Linux or Windows. IBM’s Cloudant NoSQL database, based on BigCouch and Apache CouchDB, is also available, as is IBM’s relational database standby, DB2. Oracle and DataStax provide access to Oracle 12c and Cassandra on Azure, and other partners allow customers to run MySQL, PostgreSQL and MongoDB.