24 May | EPAM Systems | La Plata
Apply on Kit Empleo: kitempleo.com.ar/empleo/q0jx0
We are looking for a Lead Cloud Engineer to join our team.
You will lead operational excellence of the cloud platform by owning observability, incident response, resilience, and disaster recovery. This role ensures that "run" is as strong as "build," providing confidence that cloud workloads remain healthy, compliant, and performant.
Responsibilities
- Own operational health dashboards, alert thresholds, and incident response playbooks for the cloud platform
- Lead on-call rotations, coordinate major incident resolution, and drive post‑incident reviews
- Implement and maintain Disaster Recovery (DR) solutions for core applications, including DNS routing strategies and low‑RTO repositories
- Manage patching pipelines, golden images, container registries, backups, and automated resilience testing
- Partner with platform engineers to feed operational learnings into architecture improvements and the roadmap
- Use automation and AI‑assisted tools to correlate anomalies, reduce noise, and accelerate root‑cause discovery
- Educate product teams on DR patterns, operational best practices, and shared responsibilities
Requirements
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or equivalent professional experience
- At least 5 years of relevant professional experience
- A minimum of one year of experience in people management or team leadership, leading a team of 5+ FTEs
- Hands‑on experience in cloud operations or SRE roles with deep exposure to AWS or similar hyperscale platforms
- Advanced skills in monitoring, alerting, logging, and incident management tooling
- Proven track record executing disaster recovery strategies, backup regimes, and resilience testing
- Solid knowledge of patching processes, golden AMI and container image management, and change control governance
- Experience automating operational workflows to reduce MTTR and toil using tools such as Python, Lambda, and runbooks
- Familiarity with AI‑assisted observability and correlation tooling and how to operationalize it
- Strong communication skills for on‑call coordination and stakeholder updates
- Excellent oral and written communication skills in English (B2 level or higher)
We offer
- International projects with top brands
- Work with integral teams of highly skilled, diverse peers
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award‑winning culture recognized by Glassdoor, Newsweek and LinkedIn