Big Data Management

2025/2026

Recommended prerequisite for participation in the module

The module builds on fundamental concepts from data structures and algorithms, databases, programming, and cloud technologies. It also draws on competences from distributed systems and networking.

Content, progress and pedagogy of the module

This course focuses on the techniques and tools required to manage, process, and optimize large-scale data workflows. Students will explore advanced data storage concepts, including data lakes, lakehouses, and efficient file formats (e.g., Parquet, ORC). The course delves into both batch and real-time data processing frameworks (e.g., Apache Hadoop, Flink), teaching students to design ETL workflows and handle high-throughput, low-latency data streams. Topics also include scalable database management systems for unstructured and semi-structured data (e.g., HBase, Cassandra) and workflow orchestration tools (e.g., Apache Airflow). Practical exercises emphasize performance tuning and optimization, equipping students with the skills to build robust, efficient, and scalable big data systems for a variety of real-world applications.
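
As an illustration of the workflow orchestration covered in the course, the sketch below defines a minimal daily ETL pipeline with Apache Airflow. It is only a sketch under stated assumptions: the DAG id, task names, and the bodies of the extract/transform/load steps are hypothetical placeholders, and the parameter names follow the Airflow 2.x API.

    # Minimal, hypothetical ETL pipeline sketch using Apache Airflow (2.x API).
    # The DAG id, task names, and step bodies are illustrative placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # Pull raw records from a source system (e.g., an API or object store).
        print("extracting raw data")

    def transform():
        # Clean and reshape the raw records into an analysis-ready layout.
        print("transforming data")

    def load():
        # Write the transformed records to the target store (e.g., a data lake).
        print("loading data")

    with DAG(
        dag_id="daily_etl_sketch",  # hypothetical name
        start_date=datetime(2025, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        # Linear dependency chain: extract, then transform, then load.
        extract_task >> transform_task >> load_task

Once such a file is placed in Airflow's DAG folder, the scheduler runs the three tasks in order once per day; the same structure scales to the larger, branching pipelines discussed in the course.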

Learning objectives

Knowledge

  • Must have knowledge about core data management concepts such as big data and data lakes. 

  • Must have knowledge about data streams. 

  • Understand the foundational principles of managing large-scale data, including storage, processing, and retrieval. 

  • Learn methods for structuring and optimizing data storage to support analytical and operational needs (a short storage sketch follows this list). 

  • Explore techniques for handling both batch and real-time data processing in distributed systems. 

  • Comprehend the architecture and functionality of scalable database systems for unstructured and semi-structured data. 

  • Understand strategies for optimizing data workflows, focusing on scalability, reliability, and performance. 
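
To make the storage objectives above concrete, the following sketch writes and reads a columnar Parquet file with the pyarrow library, one common way to work with the file formats named in the course description. The column names and values are hypothetical.

    # Hypothetical sketch: columnar storage with Parquet via pyarrow.
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Build a small in-memory table; column names and values are made up.
    events = pa.table({
        "user_id": [1, 2, 1, 3],
        "amount": [9.5, 3.0, 12.25, 7.75],
    })

    # Write with snappy compression, a typical choice for analytical workloads.
    pq.write_table(events, "events.parquet", compression="snappy")

    # The columnar layout lets a reader fetch only the columns it needs,
    # which is the main optimization such formats offer for analytics.
    amounts_only = pq.read_table("events.parquet", columns=["amount"])
    print(amounts_only)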

Skills

  • Design and implement workflows for data extraction, transformation, and storage across diverse datasets. 

  • Query and analyze large-scale data efficiently using optimized storage and processing techniques. 

  • Develop workflows to process high-throughput, low-latency data streams in real time (see the streaming sketch after this list). 

  • Implement scalable and reliable database systems for managing diverse data types. 

  • Monitor, troubleshoot, and optimize data workflows to meet performance and business objectives.
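
As a framework-agnostic illustration of the stream-processing skill above, the sketch below aggregates a simulated event stream in tumbling one-second windows using only the Python standard library. A production system would use a stream processor such as Apache Flink; the event fields here are hypothetical.

    # Hypothetical sketch: tumbling-window aggregation over a simulated stream.
    from collections import defaultdict

    # Simulated events as (timestamp in seconds, value) pairs; values are made up.
    stream = [(0.1, 4), (0.4, 2), (1.2, 7), (1.9, 1), (2.5, 5)]

    WINDOW_SECONDS = 1.0
    window_sums = defaultdict(int)

    for timestamp, value in stream:
        # Assign each event to the tumbling window that contains it.
        window_index = int(timestamp // WINDOW_SECONDS)
        window_sums[window_index] += value

    for window_index, total in sorted(window_sums.items()):
        print(f"window [{window_index}s, {window_index + 1}s): sum={total}")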

Competences

  • Design and manage comprehensive systems for processing and analyzing large-scale datasets. 

  • Evaluate and select appropriate data management techniques and tools for specific scenarios. 

  • Lead the development of scalable, efficient data pipelines that handle batch and real-time workflows effectively. 

  • Innovate and adapt data management practices to emerging technologies and evolving industry needs.

Type of instruction

The instruction will combine lectures, invited talks, assignments, and exercises.

Exam

Name of exam: Big Data Management
Type of exam: Written or oral exam
ECTS: 5
Permitted aids: With certain aids; see exam specification
Assessment: 7-point grading scale
Type of grading: Internal examination
Criteria of assessment: The criteria of assessment are stated in the Examination Policies and Procedures

Facts about the module

Danish title: Big data management
Module code: ESNCEKK2K2
Module type: Course
Duration: 1 semester
Semester: Spring
ECTS: 5
Language of instruction: English
Empty-place Scheme: Yes
Location of the lecture: Campus Copenhagen

Organisation

Education owner: Master of Science (MSc) in Engineering (Computer Engineering)
Study Board: Study Board of Electronics and IT
Department: Department of Electronic Systems
Faculty: The Technical Faculty of IT and Design
