24 May | EPAM Systems | Burzaco
Apply on Kit Empleo: kitempleo.com.ar/empleo/pyg8o
We are strengthening a client-facing delivery team that operates Kubernetes and Linux compute stacks for advanced AI workloads, including GPU scheduling with Volcano. You will automate day-to-day operations with Python and UNIX Shell, manage namespaces, RBAC, and quotas, and partner with researchers to keep platforms fast and dependable. Apply now.
Responsibilities
- Deliver and support GPU-enabled Kubernetes clusters plus standalone Linux compute environments with strong scheduling behavior and throughput
- Run Volcano scheduling operations, including queue setup, Pod execution, GPU allocation, and enforcement of namespace quotas
- Own Kubernetes administration across namespaces, RBAC, resource quotas, and workload isolation strategies
- Create and evolve Python and Shell scripts that automate job submission, resource provisioning, and system reporting
- Partner with orchestration, optimization, and observability teams to improve scheduling efficiency, utilization, and researcher workflows
- Track infrastructure health and resource utilization and provide input for optimization and reporting requirements
- Propose and drive improvements to infrastructure, tooling, and automation workflows to raise performance, scalability, and usability
- Support operational processes that ensure researchers have an efficient experience across AI and computational workloads
Requirements
- At least 3 years of experience in DevOps or infrastructure engineering roles supporting complex, large-scale environments
- Expert proficiency in Kubernetes administration and orchestration, including management of namespaces, Pod scheduling and distribution, persistent volume claims (PVC), network file systems (NFS), and resource quotas
- Hands-on experience with the Volcano scheduler for GPU job execution, including queue configuration, workload prioritization, and integration with Kubernetes
- Proven experience managing GPU cluster environments, both within Kubernetes and on standalone
📌 Senior DevOps Engineer (Burzaco)