Data Engineering Projects

Project 1: Relational Databases - Data Modeling with PostgreSQL

Developed a relational database in PostgreSQL to model user activity data for a music streaming app.

Proficiencies used: Python, PostgreSQL, Star Schema, ETL pipelines, Normalization
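A minimal sketch of what a star schema for song-play activity could look like. The table and column names (`users`, `songs`, `songplays`) and the sample rows are assumptions made up for illustration, not the project's actual schema, and an in-memory SQLite database stands in for PostgreSQL so the example runs without a server.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical dimension tables: descriptive attributes, one row per entity.
cur.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, first_name TEXT, level TEXT)"
)
cur.execute(
    "CREATE TABLE songs (song_id TEXT PRIMARY KEY, title TEXT, duration REAL)"
)

# Hypothetical fact table: one row per song play, pointing at the dimensions.
cur.execute(
    """
    CREATE TABLE songplays (
        songplay_id INTEGER PRIMARY KEY,
        start_time  TEXT NOT NULL,
        user_id     INTEGER NOT NULL REFERENCES users (user_id),
        song_id     TEXT REFERENCES songs (song_id)
    )
    """
)

# Made-up sample rows, only to exercise the schema.
cur.execute("INSERT INTO users VALUES (1, 'Ada', 'free')")
cur.execute("INSERT INTO songs VALUES ('S1', 'Example Song', 210.5)")
cur.executemany(
    "INSERT INTO songplays (start_time, user_id, song_id) VALUES (?, ?, ?)",
    [("2018-11-01T10:00:00", 1, "S1"), ("2018-11-01T10:05:00", 1, "S1")],
)
conn.commit()


def plays_per_user(connection):
    """Join the fact table to the user dimension and count plays per user."""
    return connection.execute(
        """
        SELECT u.first_name, COUNT(*)
        FROM songplays AS sp
        JOIN users AS u ON u.user_id = sp.user_id
        GROUP BY u.user_id, u.first_name
        ORDER BY u.first_name
        """
    ).fetchall()


print(plays_per_user(conn))
```

The fact table holds only keys and event-level values, so analytical queries reduce to a join from `songplays` to whichever dimension supplies the grouping attribute.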

Project 2: NoSQL Databases - Data Modeling with Apache Cassandra

Designed a NoSQL database in Apache Cassandra, based on the original schema outlined in Project 1.

Proficiencies used: Python, Apache Cassandra, Denormalization
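As a rough illustration of denormalization in the Cassandra style, the sketch below builds one table per query from the same events, duplicating data instead of joining at read time. It uses plain Python dictionaries as a stand-in for Cassandra tables; the event fields, sample rows, and function names are hypothetical and not taken from the project.

```python
from collections import defaultdict

# Made-up activity events; field names are assumptions for illustration.
events = [
    {"session_id": 1, "item_in_session": 2, "user": "Ada", "song": "Song B"},
    {"session_id": 1, "item_in_session": 1, "user": "Ada", "song": "Song A"},
    {"session_id": 2, "item_in_session": 1, "user": "Bob", "song": "Song A"},
]


def build_songs_by_session(rows):
    """Query 1: what was played in a session?

    The dict key plays the role of the partition key (session_id); rows
    inside a partition are sorted by item_in_session, like a clustering column.
    """
    table = defaultdict(list)
    for row in rows:
        table[row["session_id"]].append(
            (row["item_in_session"], row["user"], row["song"])
        )
    for partition in table.values():
        partition.sort()
    return dict(table)


def build_users_by_song(rows):
    """Query 2: who listened to a song? Same data, keyed by song instead."""
    table = defaultdict(list)
    for row in rows:
        table[row["song"]].append(row["user"])
    for partition in table.values():
        partition.sort()
    return dict(table)


songs_by_session = build_songs_by_session(events)
users_by_song = build_users_by_song(events)

print(songs_by_session[1])
print(users_by_song["Song A"])
```

Each query is answered by a single key lookup, at the cost of storing the same event in more than one table.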

Project 3: Data Warehouse - Amazon Redshift

Created a data warehouse using Amazon Redshift.

Proficiencies used: Python, Amazon Redshift, AWS CLI, AWS SDK, SQL, PostgreSQL

Project 4: Data Lake - Spark

Scaled up the existing ETL pipeline by moving from the data warehouse to a data lake.

Technologies used: Spark, S3, EMR, Athena, AWS Glue, Parquet.

Project 5: Data Pipelines - Airflow

Automated the ETL pipeline and the creation of the data warehouse using Apache Airflow.

Technologies used: Apache Airflow, S3, Amazon Redshift, Python.

Daniel Diamond
