Responsible for ensuring the stability of a shared non-production environment (QA and Regression) supporting 1,727 microservice applications through proactive monitoring, incident management, and reliability engineering initiatives.
- Serve as Major Incident Manager for application outages and deployment errors, leading detection, coordination, triage, communication, paging, and documentation while ensuring follow-up tasks are completed
- Reduced average time to mitigate impact from approximately 11 hours to under 1 hour for the majority of incidents over a 2-year period through improved incident response processes and coordination
- Built 3 dashboards to monitor inventory scope, unhealthy API endpoints, and application-specific health as pre-checks before automated testing
- Reduced alert noise by 61% through identification of appropriate alerting thresholds and creation of New Relic synthetic monitors for critical user journeys
- Analyzed Amazon Q Business for company adoption and researched frameworks for alert reduction to improve overall environment stability
- Managed lunch-and-learn sessions averaging 15 attendees, conducted weekly and monthly inventory scope refreshes, and published biweekly newsletters on environment stability metrics
- Utilized Splunk, Observe, CloudWatch, New Relic, PagerDuty, ServiceNow, Snowflake, and in-house tools for monitoring, incident management, and automated failover/rollback operations
Contracted to develop the new UI and API for disclosures, articles, and task guides used by call center agents and associates across Capital One's customer service operations.
- Developed single-page application components using Vue.js, enabling streamlined access to customer service resources for call center agents
- Designed and implemented an emergency caching strategy using ElastiCache for Redis with a key-field-value pair architecture, ensuring API availability during system disruptions
- Built comprehensive automated testing framework including unit, integration, and performance test suites to maintain code quality and system reliability
- Modernized legacy application by creating an API layer through Apache Solr configuration modifications and custom Python 3 scripts, extending application lifecycle
- Managed AWS infrastructure including EC2 servers, Security Groups, Lambda functions, and S3 buckets to support application deployment and operations
- Proactively maintained application security by regularly updating dependencies and Amazon Machine Images (AMIs) and implementing cybersecurity recommendations
- Rebuilt API application to eliminate dependency on a deprecated internal module, preventing future technical debt and system failures
- Established PagerDuty alerting integration with Splunk and Datadog to enable rapid incident identification and reduce consumer impact
- Led quarterly AWS regional failover exercises across 3 applications, ensuring disaster recovery readiness and business continuity
Engaged as a consulting resource for Capital One contract work, receiving comprehensive full-stack development training and continuous skill development support.
- Completed intensive training in the MERN stack (MongoDB, Express, React, Node.js) and Angular framework, building proficiency across modern web development technologies
- Implemented Firebase Realtime Database integration with full CRUD operations (POST, GET, PUT, DELETE), demonstrating practical API development skills
- Participated in weekly mock technical interviews to maintain and strengthen fundamental front-end development knowledge and communication skills
- Developed a full-featured social media single-page application as a capstone project, showcasing end-to-end understanding of software development principles and best practices
Hired to build automated testing infrastructure and eliminate dependency on an outsourced regression testing contract, reducing testing costs and improving release velocity.
- Developed comprehensive UI automation test suite for multiple platforms using Jasmine and WebdriverIO, enabling consistent quality across web and mobile experiences
- Conducted manual testing for development tickets and defects within agile sprint cycles, ensuring timely delivery of high-quality features
- Maintained and executed automation suite to continuously validate platform stability and catch regressions early in the development cycle
- Performed regression testing for monthly production releases, ensuring application stability and preventing customer-impacting issues post-deployment
Provided Level 3 technical support for retail store associates and developed custom Excel-based automation tools to streamline merchandising operations.
- Created comprehensive technical documentation for Carter's iPad applications, improving store associate efficiency and enhancing in-store customer experience
- Deployed applications to over 600 retail stores using AirWatch Mobile Device Management Console, ensuring consistent user experience across the organization
- Resolved technical support tickets related to iPad application functionality and connectivity issues, minimizing store operational disruptions
- Developed VBA automation program in Excel that improved daily merchandising operations by streamlining style generation workflows
- Created reusable VBA function to control user paste operations, which was implemented across multiple business-critical Excel tools