May 25 | EPAM Systems | Argentina
Apply on Kit Empleo: kitempleo.com.ar/empleo/pzs18
We are seeking a Site Reliability Engineer with a strong programming background to join our Cloud Security and Infrastructure (CSI) team.
CSI provides a single point of entry to enable identity, branding and compliance, as well as a single point of management to support provisioning, monitoring, security and operational support. The ideal candidate will bring hands-on expertise in containerization, orchestration and observability to help build and maintain reliable, scalable systems.
Responsibilities
Create and manage applications, containerize them and run them using open-source container management tools such as Docker or Podman
Interpret container logs and trace specific events for troubleshooting purposes
Create and manage Kubernetes resource manifests for deployment to Kubernetes clusters (e.g., a local kind cluster, or GKE/AKS on a cloud provider)
Deploy Prometheus agents to monitor infrastructure and application behavior
Raise and manage alerts based on observability data
Support provisioning, monitoring, security and operational tasks across distributed systems
Implement and maintain CI/CD pipelines and GitOps-based continuous deployment workflows
Collaborate with cross-functional teams to ensure system reliability and performance
Requirements
At least 2 years of hands-on programming experience
Proficiency in at least one scripting language
Hands-on expertise in Kubernetes and Linux
Knowledge of at least one cloud provider, with experience in Microsoft Azure
Familiarity with Prometheus or a similar monitoring agent and strong fundamentals of observability
Skills in Azure DevOps CI/CD pipelines and/or GitOps packaging and continuous deployment tools such as Helm and ArgoCD
Capability to troubleshoot distributed systems
Background in Terraform for infrastructure as code
Fluent communication skills in English at a B2+ level
Nice to have
Familiarity with Azure DevOps
Knowledge of Google Cloud Platform
Expertise in Istio
Proficiency in Prometheus
We offer
International projects with top brands
Work with global teams of highly skilled, diverse peers
Healthcare benefits
Employee financial programs
Paid time off and sick leave
Upskilling, reskilling and certification courses
Unlimited access to the LinkedIn Learning library and 22,000+ courses
Global career opportunities
Volunteer and community involvement opportunities
EPAM Employee Groups
Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn
📌 Site Reliability Engineer (Argentina)