Data Lake Stories of Relevance to Retail


The offline recommendations engine batch-processes user telemetry to build
multiple machine learning models, which are then hosted by the online
recommendations service. User interaction telemetry is retained long term in
Azure Data Lake Store. Competing versions of the user and product vector
models are generated in Azure HDInsight using the Apache Spark MLlib machine
learning library, together with Python LightFM and TensorFlow. These models
are then bulk-loaded by Azure Data Factory into Azure Cosmos DB.
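
As a rough illustration of the batch step described above, the sketch below
aggregates interaction telemetry, learns user and product vectors, and shapes
them as JSON documents ready for a bulk load. It is not the implementation used
in the story: it swaps in a plain-Python SGD matrix factorization in place of
Spark MLlib, LightFM, or TensorFlow, and the event fields, the document layout,
and the names build_interactions, factorize, and to_documents are all
assumptions made for the example.

```python
import json
import random
from collections import Counter


def build_interactions(events):
    """Aggregate raw telemetry events into (user, product) -> interaction count."""
    return Counter((e["user"], e["product"]) for e in events)


def factorize(interactions, dim=4, epochs=200, lr=0.05, reg=0.01, seed=0):
    """Learn user and product vectors with plain SGD matrix factorization.

    Stand-in technique for illustration only; not MLlib, LightFM, or TensorFlow.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _ in interactions})
    products = sorted({p for _, p in interactions})
    user_vecs = {u: [rng.uniform(0.0, 0.5) for _ in range(dim)] for u in users}
    prod_vecs = {p: [rng.uniform(0.0, 0.5) for _ in range(dim)] for p in products}
    for _ in range(epochs):
        for (u, p), count in interactions.items():
            uv, pv = user_vecs[u], prod_vecs[p]
            # Error between the observed count and the current dot-product score.
            err = count - sum(a * b for a, b in zip(uv, pv))
            for k in range(dim):
                u_k, p_k = uv[k], pv[k]
                uv[k] += lr * (err * p_k - reg * u_k)
                pv[k] += lr * (err * u_k - reg * p_k)
    return user_vecs, prod_vecs


def to_documents(vectors, kind, model_version):
    """Shape vectors as JSON-serializable documents (hypothetical layout)."""
    return [
        {"id": f"{kind}:{key}", "kind": kind, "modelVersion": model_version, "vector": vec}
        for key, vec in sorted(vectors.items())
    ]


# Hypothetical telemetry: each event is one user interaction with a product.
events = [
    {"user": "u1", "product": "p1"},
    {"user": "u1", "product": "p1"},
    {"user": "u1", "product": "p2"},
    {"user": "u2", "product": "p2"},
    {"user": "u2", "product": "p3"},
]

interactions = build_interactions(events)
user_vecs, prod_vecs = factorize(interactions)
documents = (
    to_documents(user_vecs, "user", "v1") + to_documents(prod_vecs, "product", "v1")
)
print(json.dumps(documents[0], indent=2))
```

Tagging each document with a model version is one simple way to keep competing
models side by side in the serving store, so the online service can choose
which version to read.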

Ignition AI:

Turning marketing from an art into a precise science requires ingesting vast
amounts of live information from numerous sources, such as social media,
browsing data, and previous purchasing patterns, and deploying highly
specialised technology. Ignition AI uses Microsoft Azure Data Lake to manage
the data because it provides a cloud-based platform to store, process, and
analyse data of any size or type at high speed.


Acxiom is using Azure Data Lake and Azure HDInsight to build a
data exchange platform and marketplace where it can share its data with third


Damco’s disruptive Logistics application makes extensive use of
Microsoft Azure cloud technologies to improve insight into potential supply
chain problems, reduce the amount of manual effort involved when workarounds
for disruptions are necessary, and proactively notify clients. The app uses
Azure Event Hubs and Azure Stream Analytics to combine and analyze external
data from news and weather feeds and internal data from the Damco supply chain
management solution. All of this data is ingested, parsed, processed, and
stored using Azure Data Factory, Azure Data Lake, and Azure SQL Data Warehouse,
and it is visualized so it can be easily turned into actionable business


Sustainalytics, a global responsible investment research firm, adopted a broad
range of Microsoft Azure services to process and integrate a large volume of
structured and unstructured, high- and low-velocity data as part of its
research activity. Sustainalytics built a data lake on top of Azure HDInsight
to consolidate and streamline structured and unstructured data from 20
different sources.