Microsoft Fabric Data Engineering

CORDIS Fabric Data Platform

Built a Microsoft Fabric Lakehouse solution for CORDIS European research data using a medallion architecture. The project covered source extraction, Landing, Bronze, Silver and Gold layer processing, audit logging, data validation and Power BI semantic modelling.

Microsoft Fabric · Lakehouse · OneLake · PySpark · Delta Tables · Power BI · Medallion Architecture
Figure: CORDIS Fabric medallion architecture preview
PROJECT DOCUMENT

CORDIS Fabric Data Platform Full Document

Document Preview

This document explains the medallion architecture design, Fabric Lakehouse layers, PySpark transformations, audit logging and Power BI-ready Gold model.

  • Landing, Bronze, Silver and Gold layer design
  • Metadata-driven ingestion and Delta table loading
  • Audit logs, validations and row-count checks
  • Power BI semantic model preparation
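A row-count check of the kind listed above could be sketched as follows. This is a minimal illustration in plain Python rather than PySpark, so it runs without a Spark session. The AuditRecord fields, the check_row_counts helper, the table names and the counts are all hypothetical and are not taken from the project document. In a Fabric notebook the counts would more likely come from DataFrame counts and the records would be appended to a Delta audit table.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class AuditRecord:
    """One audit-log row for a single layer-to-layer load (hypothetical schema)."""
    table_name: str
    source_layer: str
    target_layer: str
    source_rows: int
    target_rows: int
    rejected_rows: int
    status: str
    logged_at: str


def check_row_counts(table_name, source_layer, target_layer,
                     source_rows, target_rows, rejected_rows=0):
    """Pass only when every source row was either loaded or explicitly rejected."""
    reconciled = source_rows == target_rows + rejected_rows
    return AuditRecord(
        table_name=table_name,
        source_layer=source_layer,
        target_layer=target_layer,
        source_rows=source_rows,
        target_rows=target_rows,
        rejected_rows=rejected_rows,
        status="PASSED" if reconciled else "FAILED",
        logged_at=datetime.now(timezone.utc).isoformat(),
    )


# Illustrative counts only, not real CORDIS figures.
audit_log = [
    check_row_counts("project", "bronze", "silver", 1000, 990, rejected_rows=10),
    check_row_counts("organization", "bronze", "silver", 500, 480),
]

for record in audit_log:
    print(record.table_name, record.status)
```

The second load fails the check because 20 rows are unaccounted for. Recording rejected rows separately is one way to tell a deliberate validation drop apart from silent data loss.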
4 layers

Landing, Bronze, Silver and Gold lakehouse design.

82K+

Cleaned CORDIS projects prepared for reporting and search.

Gold model

Fact and dimension tables for Power BI analytics.

Problem

CORDIS data is spread across multiple programmes and file formats. The goal was to create an analytics-ready data platform that could standardise project, organisation, publication, deliverable, report, IPR and policy-priority data for reporting.

Architecture

Source (ZIP/CSV/JSON) → Landing → Bronze → Silver → Gold → Power BI
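The flow above can be illustrated with a small sketch of what each layer might be responsible for. It uses plain Python lists and dictionaries in place of PySpark DataFrames and Delta tables so that it runs anywhere. The function names, column names and sample rows are assumptions made for illustration. The actual transformations used in the project are not described on this page.

```python
def to_bronze(landing_rows, source_file):
    """Bronze: keep values as received, adding only lineage metadata."""
    return [{**row, "_source_file": source_file} for row in landing_rows]


def to_silver(bronze_rows):
    """Silver: trim text, cast types and drop duplicate project ids (first wins)."""
    seen, cleaned = set(), []
    for row in bronze_rows:
        project_id = row["id"].strip()
        if not project_id or project_id in seen:
            continue
        seen.add(project_id)
        cleaned.append({
            "project_id": project_id,
            "programme": row["programme"].strip().upper(),
            "total_cost": float(row["total_cost"]),
        })
    return cleaned


def to_gold(silver_rows):
    """Gold: aggregate to a reporting shape, here project count and cost per programme."""
    summary = {}
    for row in silver_rows:
        entry = summary.setdefault(row["programme"], {"projects": 0, "total_cost": 0.0})
        entry["projects"] += 1
        entry["total_cost"] += row["total_cost"]
    return summary


# Landing: rows exactly as parsed from a source file (made-up sample values).
landing = [
    {"id": " 101 ", "programme": "h2020", "total_cost": "1500.0"},
    {"id": "101", "programme": "H2020", "total_cost": "1500.0"},  # duplicate of the first row
    {"id": "202", "programme": "horizon", "total_cost": "2500.5"},
]

gold = to_gold(to_silver(to_bronze(landing, "projects.csv")))
print(gold)
```

Keeping Bronze free of cleaning logic means the raw values stay available for reprocessing if a Silver rule later turns out to be wrong.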

What I Built

Outcome

The project became the foundation for the later CORDIS Research Explorer web application and the portable CORDIS-to-Supabase ETL pipeline.