Hello, I'm Shiv Tomar

Data Engineer & Analyst

Transforming raw data into actionable insights and building robust data pipelines. I specialize in designing ETL processes, data modeling, and analytics solutions that drive business decisions.

About Me

I'm a passionate Data Engineer and Analyst with over 5 years of experience transforming complex data into meaningful insights. My journey in the data world started when I discovered how powerful information could be when properly structured and analyzed.

Specializing in building robust ETL pipelines, data models, and analytics solutions, I've helped organizations across healthcare, retail, and finance sectors make data-driven decisions that impact their bottom line.

My technical expertise includes Python, SQL, and various big data technologies, but what truly drives me is solving complex problems and helping businesses leverage their data assets effectively.

Data Engineering

Building scalable data infrastructure and ETL processes

Analytics

Transforming raw data into actionable business insights

Development

Creating data-driven applications and solutions

Looking for a data professional?

I'm always open to discussing new projects, challenges, and opportunities.

Work Experience

Over the years, I've gained valuable experience working with data in various contexts. Here's a summary of my professional journey.

  • Data Analyst, 2020
  • Business Data Analyst, 2020-2021
  • Data Engineer, 2021-2024
  • Lead, Fast Prototyping Team, 2024-Present

Lead, Fast Prototyping Team

Pycube Inc. – Sterling, VA

Nov 2024 – Present

Leading a fast-paced prototyping team focused on full-stack development for data and analytics solutions.

  • Managing a team of 3 engineers, coordinating deliverables, code reviews, and sprint planning.
  • Rapidly building MVPs using tools like Cursor AI, Python, AWS (Glue, Lambda, S3), and modern frontend frameworks.
  • Accelerating product delivery cycles by 60% through lean experimentation and modular architecture.
  • Collaborating closely with stakeholders to define project goals, test assumptions, and deliver high-impact prototypes.

Data Engineer

Capital One – McLean, VA

Nov 2021 – Nov 2024

Built and maintained scalable data pipelines using Python, SQL, Spark, Airflow, and AWS (EMR, Glue, S3, IAM).

  • Led the migration of EMR-based pipelines to AWS Glue, cutting costs by 40% and reducing processing time.
  • Developed Dockerized Airflow environments for local testing, slashing QA cycles by 60%.
  • Created a custom New Relic monitoring class and integrated feature metrics into dashboards used by DS and ML teams.
  • Automated EMR log retrieval and developed Glue job insights, boosting debugging efficiency by 70%.
  • Reduced deployment errors by 70% by shifting from manual to Jenkins-managed CI/CD.
  • Actively contributed to TREx failover simulations across AWS regions.

Business Data Analyst

Pycube Inc. – Sterling, VA

Jun 2020 – Nov 2021

Analyzed healthcare supply chain data and created actionable dashboards in Tableau for Montefiore Hospital.

  • Improved delivery efficiency, order tracking, and inventory management through custom analytics tools.
  • Helped identify cost-saving opportunities, resulting in over $1M in potential savings.
  • Conducted SQL-driven inventory audits, ensuring 100% accuracy and improving trust in the data.

Data Analyst

Sanford Health – Sioux Falls, SD

Jan 2020 – May 2020

Led data-driven recruitment initiatives and automated job posting processes.

  • Scraped and analyzed LinkedIn profiles to increase recruiting efficiency by 30%.
  • Matched candidate profiles with job requisitions to identify the top 100 potential candidates for hiring.
  • Automated job postings on career websites using Selenium and Python, posting one job per minute and making the process 6 times faster.
  • Developed a system to suggest jobs based on skills and keywords extracted from candidate resumes.
  • Automated requisition closure through Python scripts, closing 200 requisitions in 5 minutes, a 90% increase in speed.

Education

My educational background has provided me with a strong foundation in data science, computer science, and analytical techniques.

Master of Science in Information Systems

Stevens Institute of Technology – Hoboken, NJ

Aug 2018 – Dec 2019

Specialized in data strategy, enterprise systems, and business-focused analytics with a concentration in Business Intelligence & Analytics.

Key Courses:
  • Advanced Machine Learning
  • Big Data Analytics
  • Statistical Methods
  • Enterprise Systems
Achievements:
  • Graduated with Honors
  • Research Assistant

Bachelor of Technology (B.Tech) in Computer Science

NMIMS University – Mumbai, India

Aug 2014 – May 2018

Core focus on software engineering, algorithms, databases, and systems programming.

Key Courses:
  • Data Structures and Algorithms
  • Database Management
  • Programming Languages
  • Web Development
Achievements:
  • Dean's List
  • Hackathon Winner

Projects

Here are some of the key projects I've worked on, showcasing my skills in data engineering, analytics, and automation.

Dockerized Airflow QA Environment

Capital One

Built a containerized Airflow testing environment that accelerated development and reduced integration failures.

60% Testing cycle reduction
40% Fewer integration failures
Docker, Python, Airflow, Jenkins CI/CD, +2 more

Healthcare Asset Management System

Pycube Inc

Built a Master Data Management system for healthcare asset tracking across multiple hospitals.

35% Reduction in misplacements
70% Increased savings
Python, Flask, PostgreSQL, Redis, +4 more

Automated EMR Log Retrieval Pipeline

Capital One

Automated log analysis system that drastically reduced debugging time for ML features in production.

70% Less debugging time
90% Faster issue detection
Python, AWS EMR, AWS Lambda, Airflow, +3 more

ML Feature Observability Dashboard

Capital One

Developed a centralized monitoring system called OneDash that provides real-time visibility into ML feature pipeline performance, data quality, and failure patterns.

80% Reduced incident resolution time
30% Improved feature data accuracy
Python, AWS Lambda, CloudWatch, New Relic, +6 more

Data Quality & Submission Tracker

Capital One

Built a data validation pipeline that ensured regulatory compliance for healthcare data submissions.

99.8% Data accuracy achieved
45% Less manual verification
Python, Airflow, AWS S3, Snowflake, +3 more

Cloud Data Migration Framework

Capital One

Led the design and implementation of a framework to migrate petabyte-scale data environments to AWS.

$2.1M Annual cost savings
30% Performance improvement
Python, AWS Glue, S3, DynamoDB, +4 more

Hospital Supply Chain Dashboard

Pycube Inc

Created an interactive supply chain visualization system that optimized inventory management across multiple facilities.

32% Inventory cost reduction
98% Critical supply availability
Python, Flask, PostgreSQL, AWS EC2, +3 more

Rapid Prototyping: Full-Stack MVPs

Pycube Inc

Led a fast-paced team building proof-of-concept applications for healthcare data solutions.

60% Faster product validation
3 Successful products launched
React, Node.js, Python, Flask, +5 more

Skills

My technical toolkit includes a variety of programming languages, frameworks, and technologies that I've mastered over the years.

Programming

Python, SQL, Java, JavaScript, Shell Scripting

Data Engineering

ETL, Data Pipelines, Jenkins, Apache Spark, Hadoop, Workflow Orchestration, Airflow, Kafka, AWS Glue

Artificial Intelligence

Cursor AI, Machine Learning, Generative AI, NLP, Predictive Analytics, AI Prototyping

Data Analysis

Pandas, NumPy, Data Visualization, Machine Learning, Tableau, Looker Studio, Power BI

Databases

MySQL, PostgreSQL, MongoDB, Snowflake, DynamoDB, NRQL

Cloud Technologies

AWS, AWS EMR, AWS IAM, AWS S3, AWS Lambda, AWS EC2, AWS CloudWatch, Serverless, Docker, Kubernetes

Tools & Methodologies

Git, CI/CD, Agile, Jira, Data Modeling, New Relic, Datadog, Postman, Jupyter Notebooks, Lucidchart

Lines of Code: From ETL scripts and data pipelines to full-stack applications
Projects Completed: Includes end-to-end data pipelines, analytics dashboards, and cloud-native data workflows
Data Processed: Processing big data in healthcare, finance, and retail domains
Cups of Coffee: The fuel behind all successful data projects

Testimonials & Certifications

What people say about working with me and my professional achievements

Shiv's work on our healthcare data platform was transformative. His ETL pipelines decreased processing time by 40% while improving data quality significantly. A true professional who delivers beyond expectations.

John Smith

CTO at Healthcare Solutions

Working with Shiv on our data analytics project was a game-changer. His insights and technical expertise helped us identify patterns we'd missed for years, resulting in a 23% revenue increase.

Sarah Johnson

Data Analytics Manager

Shiv is exceptional at translating complex data problems into elegant solutions. His automated job matching system reduced our hiring time by 35% and significantly improved candidate quality.

Michael Chang

HR Director

Certifications & Achievements

AWS Certified Cloud Practitioner

Proficiency in AWS services and cloud computing fundamentals

2022

Google Analytics Essentials

Proficiency in Google Analytics and Google Ads

2020

Tableau Essential Training

Proficiency in creating interactive dashboards and visualizations

2020

Get in Touch

Interested in working together or have a question? I'd love to hear from you!

Contact Information

Feel free to reach out through any of these channels. I'm always open to discussing new projects, creative ideas, or opportunities to be part of your vision.

Location

United States