Module Details
Module Code: |
DATA |
Module Title:
|
Data Engineering
|
Module Level: |
8 |
Module Coordinator: |
Nigel Whyte
|
Module Author: |
Paul Barry
|
Module Description: |
To provide an overview of modern data engineering practices, tools, and methods.
|
Learning Outcomes |
On successful completion of this module the learner will be able to: |
# | Learning Outcome Description |
---|---|
LO1 | Clean and wrangle data from multiple sources into a usable state. |
LO2 | Organize the collection, processing, and storage of data from different data sources. |
LO3 | Design and build ETL and ELT processes and pipelines. |
Dependencies |
Module Recommendations
This is prior learning (or a practical skill) that is recommended before enrolment in this module.
|
No recommendations listed |
Co-requisite Modules
|
No Co-requisite modules listed |
Additional Requisite Information
|
No Co Requisites listed
|
Indicative Content |
Data Formats
Understanding internet data types: MIME, quoted-printable, Base64 (and others). Data sources: TXT, CSV, JSON, web data, APIs, ERP, CRM, databases. Structured, semi-structured, and unstructured data.
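As an illustration of the encodings named above, the following is a minimal sketch using only Python's standard `base64` and `quopri` modules. The helper names (`to_base64`, `to_quoted_printable`, and their inverses) are hypothetical and chosen for this example; they are not part of the module's prescribed material.

```python
import base64
import quopri


def to_base64(text: str) -> str:
    """Encode text as Base64 (UTF-8 bytes in, ASCII string out)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def from_base64(encoded: str) -> str:
    """Decode a Base64 string back to the original text."""
    return base64.b64decode(encoded).decode("utf-8")


def to_quoted_printable(text: str) -> str:
    """Encode text as quoted-printable; non-ASCII bytes become =XX escapes."""
    return quopri.encodestring(text.encode("utf-8")).decode("ascii")


def from_quoted_printable(encoded: str) -> str:
    """Decode a quoted-printable string back to the original text."""
    return quopri.decodestring(encoded.encode("ascii")).decode("utf-8")


if __name__ == "__main__":
    sample = "Café"  # illustrative input, not taken from the module
    print(to_base64(sample))
    print(to_quoted_printable(sample))
```

Both encodings are reversible, so a round trip through either pair of helpers should return the original text unchanged.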
|
Data Storage
SQL Databases, Document Databases, Graph Databases, Data Warehouses, Data Lakes, Dataframes.
|
ETL/ELT
Extract, Transform, and Load (ETL) and Extract, Load, and Transform (ELT): data cleaning, munging, parsing, converting, mining, and saving.
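The ETL pattern can be sketched in a few lines with Python's standard library. This is an illustrative example only: the sample CSV, the `sales` table, and the function names `extract`, `transform`, and `load` are assumptions made for the sketch, and the cleaning rules (trim whitespace, drop rows with a missing city or a non-numeric amount) are one possible choice among many.

```python
import csv
import io
import sqlite3

# Hypothetical raw input with the kinds of defects cleaning usually targets.
RAW_CSV = """name,city,amount
 Alice ,Carlow,10.50
Bob,,3
carol,Dublin,not-a-number
"""


def extract(source: str) -> list:
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(source)))


def transform(rows: list) -> list:
    """Transform: trim whitespace, convert types, drop unusable rows."""
    clean = []
    for row in rows:
        name = row["name"].strip().title()
        city = row["city"].strip()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is not numeric
        if not city:
            continue  # skip rows with a missing city
        clean.append((name, city, amount))
    return clean


def load(records: list, conn: sqlite3.Connection) -> None:
    """Load: write the cleaned records to a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (name TEXT, city TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load(transform(extract(RAW_CSV)), connection)
    print(connection.execute("SELECT * FROM sales").fetchall())
```

An ELT variant would reverse the last two steps: load the raw rows into the database first, then perform the cleaning there, for example with SQL.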
|
Data Platforms
Big data, MapReduce, cloud-scale data, distributed data processing, data pipelines, parallel computation platforms, scaling issues and concerns.
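To make the MapReduce idea concrete, here is a single-process sketch of its three phases applied to word counting. It runs on one machine and is not a distributed implementation; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and used only for illustration.

```python
from collections import defaultdict


def map_phase(document: str) -> list:
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: list) -> dict:
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: dict) -> dict:
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents: list) -> dict:
    """Run the map, shuffle, and reduce phases over a list of documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["big data", "Big pipelines"]))
```

On a real platform the map and reduce calls would run in parallel on different nodes, with the shuffle step moving data between them across the network.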
|
Module Content & Assessment
|
Assessment Breakdown | % |
Project | 100.00% |
Assessments: Full Time
No End of Module Formal Examination |
Reassessment Requirement |
Coursework Only
This module is reassessed solely on the basis of re-submitted coursework. There is no repeat written examination.
|
SETU Carlow Campus reserves the right to alter the nature and timings of assessment.
Module Workload
Workload: Full Time |
Workload Type | Workload Category | Contact Type | Workload Description | Frequency | Average Weekly Learner Workload | Hours |
---|---|---|---|---|---|---|
Laboratory | | Contact | Interactive delivery of content. | 12 Weeks per Stage | 2.00 | 24 |
Estimated Learner Hours | | Non Contact | Review of classroom-delivered material. | 15 Weeks per Stage | 6.73 | 101 |
Total Weekly Contact Hours | 2.00 |
Module Resources
|
Recommended Book Resources |
---|
- Jesse Anderson. (2020), Data Teams, 1st. Apress, p.294, [ISBN: 1484262271].
- Paul Crickard. Data Engineering with Python, [ISBN: 183921418X].
- Steven L. Brunton, J. Nathan Kutz. (2019), Data-Driven Science and Engineering, Cambridge University Press, p.500, [ISBN: 1108422098].
| This module does not have any article/paper resources |
---|
Other Resources |
---|
- Tobias Macey. The Data Engineering Podcast, Internet.
- The PyData Website.
- The Irish Government's Open Data Portal.
- Project Jupyter. Jupyter Notebook and associated tools.
|
Discussion Note: |
First draft of one of the elective modules for final year undergrad offerings. |
|