Data Projects

Real-Time Stock Price Pipeline

In this project, I developed a real-time data pipeline using Kafka, Spark Streaming, and Airflow. Stock price data is fetched from a public API, streamed into MongoDB, and processed daily into summaries stored in PostgreSQL. Airflow schedules the pipeline so that it runs automatically every day.
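
As a rough illustration of the daily summary step, here is a minimal sketch in plain Python. The field names (`symbol`, `timestamp`, `price`) and the open/high/low/close shape of the summary are assumptions made for the example, not the project's actual schema, and the real pipeline would do this work in Spark rather than in memory.

```python
from collections import defaultdict
from datetime import datetime


def summarize_daily(ticks):
    """Group price ticks by (symbol, date) and compute open/high/low/close.

    Each tick is assumed to be a dict with hypothetical keys:
    "symbol" (str), "timestamp" (ISO 8601 str), and "price" (number).
    """
    grouped = defaultdict(list)
    for tick in ticks:
        ts = datetime.fromisoformat(tick["timestamp"])
        key = (tick["symbol"], ts.date().isoformat())
        grouped[key].append((ts, tick["price"]))

    summaries = []
    for (symbol, day), points in sorted(grouped.items()):
        # Order by time so the first and last prices are the open and close.
        points.sort(key=lambda point: point[0])
        prices = [price for _, price in points]
        summaries.append({
            "symbol": symbol,
            "date": day,
            "open": prices[0],
            "high": max(prices),
            "low": min(prices),
            "close": prices[-1],
        })
    return summaries
```

Each resulting dict corresponds to one row that could be written to a PostgreSQL summary table.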

Kafka Streaming Monitoring
API to MongoDB Pipeline with PySpark and Airflow
Real-Time Flight Data Pipeline

In this project, I built a real-time pipeline to collect and process flight data using Kafka, Spark Streaming, and Airflow. The system ingests live data from a flight tracking API, stores it in MongoDB, and writes each day's landed-flight records to PostgreSQL. Airflow handles the daily automation and reporting.

CI/CD-Focused Data Pipeline

In this end-to-end project, I built a data pipeline using Airflow, Spark, PostgreSQL, and Docker, with a strong focus on CI/CD workflows. The pipeline runs on AWS and demonstrates how to automate deployment, testing, and orchestration of data workflows in a containerized environment.

Kafka Streaming Monitoring

In this project, I set up real-time monitoring for a Kafka-based data pipeline using Prometheus and Grafana. The setup tracks key metrics like message throughput and consumer lag, helping to ensure system stability and performance visibility.
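
To make the consumer lag metric concrete, here is a minimal sketch of how it is defined: for each partition, lag is the latest offset in the log minus the offset the consumer group has committed. The function name and the plain-dict inputs are hypothetical, and treating a partition with no committed offset as starting from 0 is an assumption of this example. In a setup like this one, the values would normally come from a metrics exporter scraped by Prometheus rather than from hand-written code.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Return per-partition lag: log end offset minus committed offset.

    Both arguments map a partition number to an offset. A partition with
    no committed offset is treated as committed at 0 (an assumption made
    for this sketch).
    """
    lag = {}
    for partition, end_offset in end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        # Clamp at zero in case the two readings were taken at different times.
        lag[partition] = max(end_offset - committed, 0)
    return lag


def total_lag(end_offsets, committed_offsets):
    """Sum the lag across all partitions of a topic."""
    return sum(consumer_lag(end_offsets, committed_offsets).values())
```

A total that keeps growing over time suggests the consumers are not keeping up with the producers.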

Sakila Data Analysis with PostgreSQL

In this project, I explored the Sakila movie rental database using PostgreSQL. I wrote complex SQL queries to analyze customer behavior, film popularity, revenue patterns, and inventory performance. The project focuses on hands-on data analysis with a realistic schema.

User Data Pipeline to AWS S3

Face Detection with YOLOv8