David O'Malley

Data Engineer

Transforming raw data into actionable insights through innovative engineering solutions. Specializing in data pipelines, ETL processes, and cloud-based data architecture.

About Me

Hi, I'm David O'Malley

I'm a passionate Data Engineer with 5+ years of experience designing and implementing data infrastructure, ETL pipelines, and analytics solutions. I specialize in transforming complex data challenges into scalable, efficient systems that drive business value.

With a background in Computer Science and a deep understanding of both software engineering and data science principles, I bridge the gap between raw data and actionable insights.

Education

MSc in Data Science

Stanford University, 2018

Location

San Francisco, California

Available for remote work

Interests

Big Data, Cloud Architecture

Machine Learning, Open Source

Technical Skills

A comprehensive set of technical skills and expertise developed through years of hands-on experience in data engineering and analytics projects.

Data Engineering

  • ETL/ELT Pipelines
  • Data Warehousing
  • Data Modeling
  • Data Governance

Databases

  • PostgreSQL
  • MongoDB
  • Snowflake
  • Redis
  • Elasticsearch

Cloud Platforms

  • AWS
  • Google Cloud
  • Azure
  • Databricks

Programming

  • Python
  • SQL
  • Scala
  • Java
  • Bash

Big Data

  • Spark
  • Hadoop
  • Kafka
  • Airflow
  • dbt

Analytics

  • Tableau
  • Power BI
  • Looker
  • Data Visualization

DevOps

  • Docker
  • Kubernetes
  • CI/CD
  • Terraform
  • Git

Data Processing

  • Batch Processing
  • Stream Processing
  • Real-time Analytics

Machine Learning

  • ML Pipelines
  • Feature Engineering
  • Model Deployment

Data Science

  • Statistical Analysis
  • Pandas
  • NumPy
  • Jupyter

Proficiency Levels

  • Python: 95%
  • SQL: 90%
  • AWS: 85%
  • Spark: 80%
  • Airflow: 85%
  • Data Modeling: 90%
  • ETL/ELT: 95%
  • Docker/Kubernetes: 75%

Featured Projects

A selection of my most impactful data engineering projects, showcasing my expertise in building scalable data solutions that drive business value.

Real-time Data Pipeline
Designed and implemented a real-time data processing pipeline using Apache Kafka and Spark Streaming.
Kafka
Spark
AWS
Python
Data Warehouse Migration
Led the migration of a legacy data warehouse to Snowflake, improving query performance and reducing costs.
Snowflake
dbt
Python
SQL
ML Feature Store
Built a centralized feature store for machine learning models, enabling feature reuse and consistency.
Python
Redis
Feast
Kubernetes
Data Quality Framework
Developed an automated data quality monitoring and alerting system for critical data pipelines.
Great Expectations
Airflow
Python
Grafana
Customer 360 Platform
Created a unified customer data platform integrating data from multiple source systems.
Databricks
Delta Lake
Python
Spark
Streaming Analytics Dashboard
Built a real-time analytics dashboard for monitoring business KPIs and system performance.
Kafka
Elasticsearch
Kibana
Node.js

Professional Experience

My professional journey in data engineering, showcasing a progression of roles with increasing responsibility and technical expertise.

Jan 2021 - Present
Senior Data Engineer
TechCorp Inc. | San Francisco, CA
  • Lead a team of 5 data engineers in designing and implementing data pipelines processing 10TB+ daily
  • Architected and deployed a cloud-based data lake on AWS using S3, Glue, and Athena, reducing data processing costs by 40%
  • Implemented data quality monitoring framework using Great Expectations, reducing data incidents by 75%
  • Collaborated with data science team to build ML feature pipelines that improved model performance by 30%
AWS
Spark
Airflow
Python
Snowflake
Terraform
Mar 2019 - Dec 2020
Data Engineer
DataStream Solutions | Seattle, WA
  • Designed and implemented ETL pipelines using Apache Airflow and Spark for financial data processing
  • Migrated an on-premises data warehouse to Google BigQuery, improving query performance by 8x
  • Built real-time data processing system using Kafka and Spark Streaming for fraud detection
  • Developed data governance policies and implemented data lineage tracking
GCP
BigQuery
Kafka
Spark
Python
dbt
Jun 2017 - Feb 2019
Data Analyst
Analytics Innovations | Boston, MA
  • Performed data analysis and created dashboards using Tableau for business stakeholders
  • Developed SQL queries and stored procedures for data extraction and transformation
  • Automated reporting processes, saving 20+ hours of manual work weekly
  • Collaborated with product teams to define KPIs and implement tracking
SQL
Tableau
Python
Excel
PostgreSQL
Jan 2017 - May 2017
Data Science Intern
AI Research Lab | Stanford, CA
  • Assisted in developing machine learning models for predictive analytics
  • Performed data cleaning and feature engineering on large datasets
  • Implemented data visualization tools to communicate findings to stakeholders
  • Contributed to research papers on applied machine learning techniques
Python
Pandas
Scikit-learn
Jupyter
TensorFlow

Get In Touch

Have a project in mind or want to discuss potential opportunities? Feel free to reach out. I'm always open to discussing new projects, creative ideas, or opportunities to be part of your vision.

Contact Information

Location

San Francisco, California
