Module Details

Module Code: DATA
Module Title: Data Engineering
Module Level: 8
Credits: 5
Module Coordinator: Nigel Whyte
Module Author: Paul Barry
Domains:  
Module Description: To provide an overview of modern data engineering practices, tools, and methods.
 
Learning Outcomes
On successful completion of this module the learner will be able to:
# Learning Outcome Description
LO1 Clean and wrangle data from multiple sources into a usable state.
LO2 Organize the collection, processing, and storage of data from different data sources.
LO3 Design and build ETL and ELT processes and pipelines.
Dependencies
Module Recommendations

This is prior learning (or a practical skill) that is recommended before enrolment in this module.

No recommendations listed
Co-requisite Modules
No Co-requisite modules listed
Additional Requisite Information
No Co Requisites listed
 
Indicative Content
Data Formats
Understanding Internet data types: MIME, quoted-printable, Base64 (and others). Data sources: TXT, CSV, JSON, web data, APIs, ERP, CRM, databases. Structured, semi-structured, and unstructured data.
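As an illustration of the encodings named above, the following is a minimal Python sketch using only the standard library. The sample text and file name are made up for the example; it shows one possible way to explore these formats, not prescribed module material.

```python
import base64
import mimetypes
import quopri

# Hypothetical sample text containing one non-ASCII character.
text = "Café data"
raw = text.encode("utf-8")

# Base64: every 3 input bytes become 4 ASCII characters.
b64 = base64.b64encode(raw)

# Quoted-printable: printable ASCII is left as-is, other bytes become =XX.
qp = quopri.encodestring(raw)

# Both encodings are reversible back to the original bytes.
assert base64.b64decode(b64) == raw
assert quopri.decodestring(qp) == raw

# MIME type guessed from a (hypothetical) file name.
mime, _ = mimetypes.guess_type("report.json")

print(b64, qp, mime)
```

Base64 suits arbitrary binary data, while quoted-printable keeps mostly-ASCII text readable after encoding.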
Data Storage
SQL Databases, Document Databases, Graph Databases, Data Warehouses, Data Lakes, Dataframes.
ETL/ELT
Extract, Transform, and Load (ETL) and Extract, Load, and Transform (ELT): data cleaning, munging, parsing, converting, mining, and saving.
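The extract, transform, and load steps can be sketched as a small Python pipeline. This is a minimal sketch under stated assumptions: the two inline sources, the `people` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a real target store.

```python
import csv
import io
import json
import sqlite3

# Hypothetical raw inputs standing in for two different data sources.
CSV_SOURCE = "name,age\n Alice ,34\nBob,\n"
JSON_SOURCE = '[{"name": "carol", "age": "29"}]'


def extract():
    """Extract: read records from a CSV source and a JSON source."""
    rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    rows.extend(json.loads(JSON_SOURCE))
    return rows


def transform(rows):
    """Transform: trim and normalise names, convert ages, drop incomplete rows."""
    clean = []
    for row in rows:
        name = str(row.get("name") or "").strip().title()
        age = str(row.get("age") or "").strip()
        if name and age.isdigit():
            clean.append((name, int(age)))
    return clean


def load(records, conn):
    """Load: save the cleaned records into a SQL table."""
    conn.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
result = conn.execute("SELECT name, age FROM people ORDER BY name").fetchall()
print(result)
```

An ELT variant would load the raw rows into the database first and perform the cleaning afterwards, typically in SQL inside the target store.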
Data Platforms
Big Data, Map Reduce, Cloud-scale data, distributed data processing, Data pipelines, Parallel Computation Platforms, Scaling Issues/Concerns.
Module Content & Assessment
Assessment Breakdown (%)
Project: 100.00%

Assessments

Full Time

No Continuous Assessment
Project
Assessment Type: Project
% of Total Mark: 35
Timing: n/a
Learning Outcomes: 1
Non-marked: No
Assessment Description: TBD

Assessment Type: Project
% of Total Mark: 20
Timing: n/a
Learning Outcomes: 2
Non-marked: No
Assessment Description: TBD

Assessment Type: Project
% of Total Mark: 45
Timing: n/a
Learning Outcomes: 3
Non-marked: No
Assessment Description: TBD

No Practical
No End of Module Formal Examination
Reassessment Requirement
Coursework Only
This module is reassessed solely on the basis of re-submitted coursework. There is no repeat written examination.

SETU Carlow Campus reserves the right to alter the nature and timings of assessment.

 

Module Workload

Workload: Full Time
Workload Type | Workload Category / Contact Type | Workload Description | Frequency | Average Weekly Learner Workload | Hours
Laboratory | Contact | Interactive delivery of content. | 12 Weeks per Stage | 2.00 | 24
Estimated Learner Hours | Non Contact | Review of classroom-delivered material. | 15 Weeks per Stage | 6.73 | 101
Total Weekly Contact Hours: 2.00
 
Module Resources
Recommended Book Resources
  • Jesse Anderson. (2020), Data Teams, 1st. Apress, p.294, [ISBN: 1484262271].
  • Paul Crickard. Data Engineering with Python, [ISBN: 183921418X].
  • Steven L. Brunton, J. Nathan Kutz. (2019), Data-Driven Science and Engineering, Cambridge University Press, p.500, [ISBN: 1108422098].
This module does not have any article/paper resources
Other Resources
Discussion Note: First draft of one of the elective modules for final year undergrad offerings.