Projects
Things I'm building
Open source work, side projects, and infrastructure experiments.
Active
● Active
Kafka → Iceberg Streaming Pipeline
Real-time event streaming from Kafka topics into Apache Iceberg tables using Flink, with automatic compaction and schema evolution.
● Active GitHub ↗
Data Platform Modernisation
Migrating our batch pipelines to streaming with Kafka and dbt.
● Active
MLflow on Kubernetes
Self-hosted MLflow tracking server on EKS with S3 artifact storage, PostgreSQL backend, and Keycloak authentication. Helm-based deployment.
Shipped
● Shipped
Cloudera CDP Health Monitor
Internal Grafana dashboard stack for Cloudera CDP cluster health: HDFS, YARN, Impala, and Kafka metrics unified in one view with alerting.