Official Description

9.0 Continuing Education Units (CEUs)

Overview of various aspects of large data sets and how they are managed both on site and in the Cloud. Emphasis on hands-on experience from data ingestion to analysis of large data sets, both data-at-rest and data-in-motion (streaming data), including defining Big Data and its 5 V's: Volume, Velocity, Variety, Veracity, and Value.


Supplementary Information

24 hours of lectures and 66 hours of independent study; course includes synchronous and asynchronous activities.

Topics Covered

  • Distributed Storage & Distributed Processing
  • Analyzing Structured & Unstructured Data at Scale
  • Row-Oriented and Column-Oriented File Formats
  • Main Categories of NoSQL Databases
  • Ingesting Data at Scale (High Velocity & Large Volume)
  • Apache Hadoop and Apache Spark ecosystems (On-premises and in the Cloud)
  • Scala language overview

Learning Outcomes

The course is designed to enable you to:

  • Use Hadoop & Spark to store and process data at scale (using MapReduce and Spark)
  • Use Hive, Impala and Spark SQL to analyze data at scale (using Hive Query Language, Scala, Python)
  • Use Pig to analyze unstructured data at scale (using Pig Latin language)
  • Improve query performance (using the Avro and Parquet file formats)
  • Import and export data at scale (using Sqoop)
  • Install and configure ODBC/JDBC connectors to connect third-party tools (MS Excel, Tableau Software, MicroStrategy, etc.)
  • Build NiFi data flows to ingest, transform and route data at scale
  • Implement a real-time dashboard (using NiFi, HBase, Kafka, Banana Solr Dashboard)


This course is supported by DataCamp, the most intuitive learning platform for data science. Learn R, Python and SQL the way you learn best, through a combination of short expert videos and hands-on exercises. Take more than 100 courses by expert instructors on topics such as importing data, data visualization and machine learning, and learn faster through immediate, personalized feedback on every exercise.

Section(s) offered
Section Title: Data at Scale
Schedule and Location:
  • Online Course, fixed date: 6:00PM to 9:00PM, Jan 11, 2024 to Mar 21, 2024
  • Online Course, fixed date: 6:00PM to 8:00PM, Mar 28, 2024
Course Fee(s): Tuition Fee non-credit $1,482.60
Drop Request Deadline: Oct 01, 2023 to Jan 18, 2024
Transfer Request Deadline: Oct 01, 2023 to Jan 18, 2024
Withdrawal Request Deadline: Jan 18, 2024 to Jan 25, 2024
Section Notes

A minimum number of registrations is required for this course section to be offered. The School reserves the right to cancel any course section when a minimum number of registrations has not been reached 7 days prior to the start date. In the event of a cancellation, the course fee will be refunded in full.

Course Format

This online course combines weekly live, instructor-led online sessions from 6:00 to 7:30 PM with self-directed learning activities and assignments.

Course Drop/Withdrawal Policy

  • Any time prior to the 1st class: Course Drop Period with Full Refund.
  • After the 1st and before the 2nd class: Course Withdrawal Period with Full Refund.
  • After the 2nd and before the 3rd class: Course Withdrawal with No Refund.

