
This position focuses on maintaining high availability and performance of production services while optimizing cloud infrastructure and microservices. The ideal candidate has extensive experience in AWS, programming, and automation, and supports operational excellence through effective incident management.
The role involves ensuring the reliability and performance of production services across various regions. Responsibilities include defining service level objectives, leading incident responses, optimizing cloud infrastructure, and implementing observability systems. The engineer will also build CI/CD pipelines and participate in on-call rotations to enhance operational excellence.
4+ years of experience in Site Reliability Engineering, DevOps, or Production Engineering
Proficiency in AWS ecosystem management
Experience with container orchestration using AWS ECS
Strong programming skills in Go and Node.js
Familiarity with Infrastructure as Code tools like Terraform
Ability to manage high-traffic production systems
Experience with observability tools such as Prometheus and Grafana
Knowledge of microservices architecture and cloud networking
Company
ZUS COFFEE
Location
Selangor
Salary
Undisclosed
Skills Required
8 skills
AWS
Go
Node.js
Terraform
Prometheus
Grafana
Docker
Python