Interactive Resume
A comprehensive overview of my professional experience, education, and skills
Professional Summary
Data Engineer & Analyst with 5+ years of experience in designing and implementing scalable data pipelines, ML monitoring systems, and cloud-based analytics solutions. Proven track record in building observability tools that improve data quality and reduce incident resolution time. Expertise in Python, AWS (Lambda, EMR, Glue, S3), Airflow, Docker, and developing full-stack data applications that drive business decisions. Skilled in leading small teams to rapidly deliver high-impact MVPs through lean experimentation and modular architecture.
Professional Experience
Lead, Fast Prototyping Team
Nov 2024 – Present
Pycube Inc. – Sterling, VA
Leading a fast-paced prototyping team focused on full-stack development for data and analytics solutions.
- Managing a team of 3 engineers, coordinating deliverables, code reviews, and sprint planning.
- Rapidly building MVPs using tools like Cursor AI, Python, AWS (Glue, Lambda, S3), and modern frontend frameworks.
- Accelerated product delivery cycles by 60% through lean experimentation and modular architecture.
- Collaborating closely with stakeholders to define project goals, test assumptions, and deliver high-impact prototypes.
Data Engineer
Nov 2021 – Nov 2024
Capital One – McLean, VA
Built and maintained scalable data pipelines using Python, SQL, Spark, Airflow, and AWS (EMR, Glue, S3, IAM).
- Led the migration of EMR-based pipelines to AWS Glue, cutting costs by 40% and reducing processing time.
- Developed Dockerized Airflow environments for local testing, slashing QA cycles by 60%.
- Created a custom New Relic monitoring class and integrated feature metrics into dashboards used by DS and ML teams.
- Automated EMR log retrieval and developed Glue job insights, boosting debugging efficiency by 70%.
- Reduced deployment errors by 70% by shifting from manual to Jenkins-managed CI/CD.
- Actively contributed to TREx failover simulations across AWS regions.
Business Data Analyst
Jun 2020 – Nov 2021
Pycube Inc. – Sterling, VA
Analyzed healthcare supply chain data and created actionable dashboards in Tableau for Montefiore Hospital.
- Improved delivery efficiency, order tracking, and inventory management through custom analytics tools.
- Helped identify cost-saving opportunities, resulting in over $1M in potential savings.
- Conducted SQL-driven inventory audits, ensuring 100% accuracy and improving data trust.
Education
Master of Science in Information Systems
Stevens Institute of Technology – Hoboken, NJ
Aug 2018 – Dec 2019
Concentration: Business Intelligence & Analytics
Specialized in data strategy, enterprise systems, and business-focused analytics.
Bachelor of Technology (B.Tech) in Computer Science
NMIMS University – Mumbai, India
Aug 2014 – May 2018
Core focus on software engineering, algorithms, databases, and systems programming.
Featured Projects
ML Feature Observability Dashboard
Developed a centralized monitoring system called OneDash that provides real-time visibility into ML feature pipeline performance, reducing incident resolution time by 80% and improving data accuracy by 30%.
Cloud Data Migration Framework
Led the design and implementation of a framework to migrate petabyte-scale data environments to AWS, achieving $2.1M annual cost savings with zero data loss incidents.
Automated EMR Log Retrieval Pipeline
Developed an automated system that retrieves and parses EMR logs for failed ML features and sends Slack alerts, cutting debugging time by 70% and improving issue detection by 90%.
Dockerized Airflow QA Environment
Built a containerized Airflow testing environment that accelerated development, reducing integration failures by 40% and QA cycles by 60%.