IL - Designing and Implementing Big Data Analytics Solutions - Exam 70-475

Course Overview

This course provides a comprehensive introduction to designing and implementing big data solutions on Microsoft Azure. The course covers the major steps common to any analytics pipeline: ingest, processing, storage and analysis. Coverage includes designing for the range of processing options found in analytics solutions: batch processing, interactive analytics and real-time processing. Security, encryption and data governance capabilities are also covered. With the major components understood, the course then turns to adding intelligence to the pipeline through the application of machine learning. The course concludes with the options available to operationalize end-to-end analytics solutions.


  • 5 Days

What You Will Learn

  • Understand the key capabilities of several Azure Data, Storage, Analytics and Intelligence services
  • Understand the core storage services including Data Lake Store, Blob Storage, HDFS, Event Hubs and IoT Hub
  • Understand core processing services including HDInsight, Stream Analytics, SQL Data Warehouse and Data Lake Analytics
  • Understand how to operationalize data pipelines with Data Factory
  • Understand common architectures including Lambda and Kappa architectures
  • Understand how to manage and secure the data solution

Who this course is designed for

  • Data Professionals, Data Scientists

Module 1: Overview of the Azure Analytics Platform

In this module, students will learn the basics of analytics pipeline terminology and where the Microsoft Azure services fit. This module introduces the Lambda Architecture, which is used as a reference architecture for building an analytics data pipeline.
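
As a rough illustration of the Lambda Architecture named above, the sketch below keeps a batch view that is recomputed from the full dataset and a speed view that is updated per event, and merges the two at query time. It is a minimal in-memory sketch, not course material: the function names, the page-count example and the sample events are all hypothetical, and no Azure service is involved.

```python
from collections import Counter

def build_batch_view(master_dataset):
    """Batch layer: recompute the view from the complete, immutable dataset."""
    return Counter(event["page"] for event in master_dataset)

def update_speed_view(speed_view, event):
    """Speed layer: incrementally fold in an event the batch layer has not seen yet."""
    speed_view[event["page"]] += 1

def query(batch_view, speed_view, page):
    """Serving layer: merge the batch view and the speed view at query time."""
    return batch_view[page] + speed_view[page]

# Hypothetical sample data: events already covered by the last batch run...
master_dataset = [{"page": "home"}, {"page": "home"}, {"page": "docs"}]
batch_view = build_batch_view(master_dataset)

# ...and events that arrived after that batch run.
speed_view = Counter()
for recent_event in [{"page": "home"}, {"page": "pricing"}]:
    update_speed_view(speed_view, recent_event)

total_home = query(batch_view, speed_view, "home")
total_pricing = query(batch_view, speed_view, "pricing")
```

In this sketch the speed view only has to cover the gap since the last batch run; once a new batch view is built, the speed view can be reset.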

Module 2: Bulk and relational ingest

In this module, students will be introduced to the tools and protocols available for loading data from bulk and relational sources into an Azure-based analytics pipeline.

Module 3: Ingest storage

In this module, students will be introduced to the Microsoft Azure services that support batch storage of ingested data: Azure Storage Blobs, Data Lake Store and HDFS.

Module 4: Batch Processing

In this module, students will be introduced to some of the services offered by Microsoft Azure that support the batch processing of data at scale. Topics include the application of HDInsight to perform batch processing with MapReduce, Tez and Spark. Similarly, SQL Data Warehouse is introduced to support processing of data present in batch storage.

Module 5: Interactive Processing & Querying

In this module, students will be introduced to the services that enable lower-latency, interactive querying of big data. Students will learn various options for querying data using SQL. Services covered include Azure SQL Data Warehouse, HDInsight with Spark SQL, HDInsight with HBase/Phoenix and Data Lake Analytics with U-SQL.

Module 6: Real-Time Ingest & Storage

In this module, students will learn about the protocols for real-time ingest, including HTTP, AMQP and MQTT, and the storage of data received using queue-based services including Event Hubs and IoT Hub.

Module 7: Real-time Processing

In this module, the student will learn about different services and capabilities of Azure for processing ingested real-time data. Key concepts such as tuple-at-a-time and micro-batch processing are introduced. Services covered include HDInsight with Apache Storm, HDInsight with Storm/Trident, HDInsight with Spark Streaming, WebJobs, Azure Functions, and Stream Analytics.
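
The difference between the two processing models mentioned above can be sketched in a few lines. This is a simplified illustration, not how any of the listed services is implemented: the function names and sample readings are hypothetical, and batches are cut by event count here, whereas real micro-batch engines typically cut them by time interval.

```python
from typing import Callable, Iterable, Iterator, List

def tuple_at_a_time(events: Iterable[int], handle: Callable[[int], int]) -> Iterator[int]:
    """Process each event individually as soon as it arrives."""
    for event in events:
        yield handle(event)

def micro_batch(events: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Group incoming events into small batches and emit each batch as a unit."""
    batch: List[int] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch

# Hypothetical sensor readings arriving as a stream.
readings = [3, 5, 2, 8, 1]

# Tuple-at-a-time: one result per event.
per_event = list(tuple_at_a_time(readings, lambda value: value * 2))

# Micro-batch: one aggregate per small batch of events.
batch_sums = [sum(batch) for batch in micro_batch(readings, batch_size=2)]
```

Tuple-at-a-time processing tends to favor low per-event latency, while micro-batching trades some latency for throughput and simpler batch-style aggregation.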

Module 8: Intelligence & Machine Learning

In this module, students will learn the fundamentals of machine learning using Azure Machine Learning. Topics covered include ML Studio, Training Experiments, Predictive Experiments and operationalizing experiments with Web Services and Cortana Intelligence components.

Module 9: Data Pipelines

This module will help the student pull all the pieces together into a pipeline managed under a single pane of glass by using Azure Data Factory.

Module 10: Security & Governance

In this concluding module, the student will look horizontally across the data pipeline to understand how to secure the data at rest and in transit, as well as how to enable governance and discovery with services such as Azure Data Catalog.

Delivery Options

  • Onsite Instructor-Led (private)
  • Virtual Instructor-Led (private)
  • Onsite Instructor-Led (open)
  • Virtual Instructor-Led (open)
