Northeastern University
Data Warehousing and Integration Part 2


Included with Coursera Plus

Gain insight into a topic and learn the fundamentals.
1 week to complete
Under 10 hours per week
Flexible schedule
Learn at your own pace

Skills you'll gain

  • CI/CD
  • Scalability
  • Data Warehousing
  • Amazon Redshift
  • Data Governance
  • Data Quality
  • Cloud Computing
  • Data Pipelines
  • Data Architecture
  • Extract, Transform, Load
  • Infrastructure as Code (IaC)
  • Analytics
  • Database Architecture and Administration
  • Amazon S3
  • Cloud Computing Architecture
  • Data Transformation
  • DevOps
  • Data Integration

Key details

Shareable certificate

Add to your LinkedIn profile

Recently updated!

August 2025

Assessments

9 assignments

Taught in English


There are 6 modules in this course

In this module, you'll learn about ETL (Extract, Transform, Load) processes, an essential part of Data Warehousing and Data Integration solutions. ETL processes can be complex and costly, but effective design and modeling can significantly reduce development and maintenance costs. You'll be introduced to the basics of Business Process Modeling Notation (BPMN), which is crucial for modeling business processes, and its key components: flow objects, gateways, events, and artifacts. You will then explore how BPMN can be adapted for the conceptual modeling of ETL tasks, with a particular focus on differentiating control tasks from data tasks. Control tasks manage the orchestration of ETL processes, while data tasks handle data manipulation; both are critical in conceptualizing ETL workflows. By the end of this module, you'll gain a solid understanding of how to design ETL processes using BPMN, enabling greater flexibility and adaptability across various tools.
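The control-task/data-task distinction can be made concrete outside of any diagramming tool. The sketch below is a minimal Python illustration (not taken from the course, and using hypothetical task names): the data tasks manipulate records, while the single control task only sequences and invokes them.

```python
from typing import Callable, List

# Data tasks: manipulate records (the "what" of the ETL flow).
def extract_orders() -> List[dict]:
    # Hypothetical source; a real task would query a database or read files.
    return [{"order_id": 1, "amount": "19.99"}, {"order_id": 2, "amount": "5.00"}]

def cast_amounts(rows: List[dict]) -> List[dict]:
    return [{**r, "amount": float(r["amount"])} for r in rows]

def load_orders(rows: List[dict]) -> None:
    print(f"loading {len(rows)} rows into the warehouse")

# Control task: orchestrates the sequence of steps (the "when" of the flow)
# without touching row contents itself.
def run_pipeline(steps: List[Callable]) -> None:
    data = None
    for step in steps:
        data = step(data) if data is not None else step()

run_pipeline([extract_orders, cast_amounts, load_orders])
```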

What's included

2 videos, 8 readings, 2 assignments

In this module, you will dive into Talend Studio, a powerful Eclipse-based data integration platform that transforms complex ETL operations into intuitive visual workflows. By exploring Talend's drag-and-drop interface, you will learn to navigate the core components of the platform. You'll master fundamental ETL operations by studying essential components like tMap for complex data transformations and joins, tJoin for straightforward data linking, and various input/output components for connecting to databases, files, and APIs. By the end of the module, you will understand how Talend automatically generates executable Java code from visual designs, enabling you to create scalable, production-ready data integration solutions that can handle both batch processing and real-time data scenarios across diverse technological environments.
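Talend itself generates Java from its visual jobs, but the behavior of a tMap-style lookup join with an output expression can be sketched conceptually. The Python snippet below is an illustrative analogue only; the column names, lookup data, and reject handling are assumptions, not Talend-generated code.

```python
# Conceptual equivalent of a tMap lookup join plus a field transformation.
customers = {101: "Acme Corp", 102: "Globex"}          # lookup input
orders = [
    {"order_id": 1, "customer_id": 101, "amount": 250.0},
    {"order_id": 2, "customer_id": 103, "amount": 75.0},  # no matching customer
]

joined, rejects = [], []
for row in orders:
    name = customers.get(row["customer_id"])
    if name is None:
        rejects.append(row)  # analogous to an inner-join reject output
        continue
    joined.append({
        "order_id": row["order_id"],
        "customer": name,
        "amount_usd": round(row["amount"], 2),  # expression applied to an output column
    })

print(joined)
print(rejects)
```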

What's included

3 readings, 1 assignment

In this module, we transition from on-premises Data Warehousing to Data Engineering. While Data Engineering has its roots in Data Warehousing, it encompasses much more. We’ll explore the key enablers of this evolution, specifically cloud computing and DevOps. You will learn about the benefits of cloud development, including enhanced scalability, cost efficiency, and flexibility in data operations. We will also dive into how traditional IT infrastructure components—such as security, networking, and compute resources—are redefined in cloud environments using AWS. Additionally, you'll gain an understanding of DevOps in the cloud, focusing on the use of virtual machines and containers to streamline continuous integration and deployment. We will cover key DevOps practices like Infrastructure as Code (IaC), CI/CD pipelines, and automated testing, emphasizing their role in ensuring consistency, faster development cycles, and secure applications. You will then explore what data engineering entails and the skills required to become a data engineer. Finally, we’ll introduce the concept of the data engineering lifecycle and its different phases, focusing on the first two: Data Generation and Storage.
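To make the Infrastructure as Code idea concrete, here is a minimal sketch assuming the AWS CDK v2 Python bindings are installed; the stack and bucket names are illustrative and not part of the course materials.

```python
# Minimal Infrastructure-as-Code sketch using the AWS CDK for Python (v2).
# Deploying this app with `cdk deploy` would provision the bucket;
# all names are placeholders.
from aws_cdk import App, Stack, RemovalPolicy
from aws_cdk import aws_s3 as s3
from constructs import Construct

class DataPlatformStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # Declaring the bucket in code keeps infrastructure versioned and
        # reviewable alongside application code, the core idea behind IaC.
        s3.Bucket(
            self, "RawDataBucket",
            versioned=True,
            removal_policy=RemovalPolicy.DESTROY,
        )

app = App()
DataPlatformStack(app, "DataPlatformStack")
app.synth()
```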

What's included

1 video, 12 readings, 2 assignments

In this module, we will explore the next two phases of the data engineering lifecycle: Ingestion and Transformation. Data ingestion refers to the process of moving data from source systems into storage, making it available for processing and analysis. As you delve into the reading, you will examine key ingestion patterns, including batch versus streaming ingestion, synchronous versus asynchronous methods, and push, pull, and hybrid approaches. You’ll also explore essential engineering considerations such as scalability, reliability, and data quality management, along with the challenges posed by schema changes. The reading will introduce various technologies that enable data ingestion, such as JDBC/ODBC, Change Data Capture (CDC), APIs, and event-streaming platforms like Kafka. We then shift focus to the transformation phase of the lifecycle, exploring different types of transformations that integrate complex business logic into data pipelines. At the end of the module, we will focus on data architecture and implementing good architecture principles to build scalable and reliable data pipelines.
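As a rough contrast between the batch (pull) and streaming (push-style consumption) patterns described above, the sketch below assumes a reachable PostgreSQL source, a Kafka broker, and the pandas, SQLAlchemy (with a Postgres driver), pyarrow, and kafka-python packages; every connection detail is a placeholder.

```python
# Batch ingestion: pull a snapshot from a source database on a schedule.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:pass@source-db:5432/sales")  # placeholder DSN
daily_snapshot = pd.read_sql(
    "SELECT * FROM orders WHERE order_date = CURRENT_DATE", engine
)
daily_snapshot.to_parquet("orders_snapshot.parquet")  # would land in object storage in practice

# Streaming ingestion: consume events continuously as producers push them.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders",                            # placeholder topic
    bootstrap_servers="broker:9092",     # placeholder broker
    value_deserializer=lambda v: json.loads(v),
)
for event in consumer:
    print("received order event:", event.value)  # would be written to storage or a stream processor
```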

What's included

4 videos, 12 readings, 2 assignments, 2 app items

In this module, we will explore data characteristics and how they drive infrastructure decisions. In today’s data-driven world, understanding the properties of your data is essential for designing robust data pipelines. We’ll go over key characteristics like volume, which refers to the size of datasets, and velocity, which concerns how frequently new data is generated. We’ll also take a look at variety, which focuses on data formats and sources, and veracity, which emphasizes data accuracy and trustworthiness. The ultimate goal is to uncover value from data through insightful analysis. As we delve into pipeline design, you'll learn how these characteristics influence key decisions, such as the choice of storage, processing, and analytics tools. We will also cover essential AWS services like Amazon S3, Glue, and Athena, exploring how they support scalable and flexible data engineering. By the end of this module, you’ll have a comprehensive understanding of how to build effective data solutions to meet both technical and business needs.
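A minimal sketch of the query side of that S3/Glue/Athena stack, assuming boto3 credentials are configured and that the Glue database, table, and S3 result location named below already exist (all names are placeholders):

```python
# Run a serverless SQL query with Athena over data catalogued in Glue and stored in S3.
import boto3

athena = boto3.client("athena", region_name="us-east-1")

response = athena.start_query_execution(
    QueryString="SELECT event_type, COUNT(*) AS n FROM events GROUP BY event_type",
    QueryExecutionContext={"Database": "analytics_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/queries/"},
)
print("query execution id:", response["QueryExecutionId"])
# A real pipeline would poll get_query_execution() until the query succeeds,
# then read results with get_query_results() or directly from S3.
```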

What's included

6 readings, 1 assignment

Welcome to the final stage of the data engineering lifecycle: serving data. In this module, we will focus on how to effectively serve data for analytics, machine learning (ML), and reverse ETL to ensure that the data products you design are reliable, actionable, and trusted by stakeholders. Key topics include setting SLAs, identifying use cases, evolving data products with feedback, standardizing data definitions, and exploring delivery methods such as file exchanges, databases, and streaming systems. We’ll also cover the use of reverse ETL to improve business processes and discuss the importance of context for choosing the best visualization type and tools. We then delve into KPIs and metrics and how to classify them, including how to identify robust KPIs based on the business context. Finally, we will focus on creating intuitive dashboards by choosing the right analysis, visualizations, and metrics to showcase based on the business context and audience involved. By the end of this module, you will understand how to design and serve data solutions that drive meaningful action and are trusted by end users.
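As a small illustration of turning raw records into a KPI that could feed a dashboard or be synced back to an operational tool via reverse ETL, here is a pandas sketch with invented data and column names:

```python
# Compute a simple KPI (7-day revenue per region) from raw order records.
import pandas as pd

orders = pd.DataFrame({
    "region": ["NA", "NA", "EU", "EU", "EU"],
    "order_date": pd.to_datetime(
        ["2025-08-01", "2025-08-03", "2025-08-02", "2025-08-05", "2025-07-20"]
    ),
    "amount": [120.0, 80.0, 200.0, 50.0, 300.0],
})

cutoff = orders["order_date"].max() - pd.Timedelta(days=7)
kpi = (
    orders[orders["order_date"] > cutoff]
    .groupby("region", as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "revenue_7d"})
)
print(kpi)
# A dashboard would visualize this table; a reverse ETL job might sync it
# into a CRM so operational teams see the same numbers the warehouse reports.
```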

What's included

11 readings, 1 assignment

Instructor

Venkat Krishnamurthy
Northeastern University
3 courses, 341 learners


