Senior Site Reliability Engineer

HiLabs

Software Engineering

Pune, Maharashtra, India

Posted on Jun 26, 2026

Job Title: Senior DevOps Engineer (SRE Focus)

Experience: 7 to 10 Years

Location: Pune Kharadi (5 Days Work from Office)


About the Role

We are looking for a highly skilled Senior DevOps Engineer with strong Site Reliability Engineering (SRE) experience to join our growing engineering team. The ideal candidate should have hands-on expertise in AWS cloud infrastructure, Kubernetes, Infrastructure as Code, CI/CD automation, and production reliability engineering.

This role requires someone who can design, automate, deploy, monitor, and optimize large-scale cloud-native environments while ensuring high availability, scalability, security, and operational excellence.

Key Responsibilities

  • Design, deploy, and manage highly available cloud infrastructure on AWS.
  • Build and maintain scalable Kubernetes environments.
  • Implement Infrastructure as Code using Terraform and CloudFormation.
  • Develop and manage CI/CD pipelines using Jenkins.
  • Drive automation across infrastructure provisioning, deployments, monitoring, and incident management.
  • Work closely with engineering teams to improve system reliability, performance, and scalability.
  • Manage production incidents, root cause analysis, and postmortems.
  • Implement monitoring, alerting, logging, and observability solutions.
  • Optimize cloud infrastructure for cost, security, and performance.
  • Support containerized applications and microservices architectures.
  • Maintain and improve deployment workflows using Git and Bitbucket.
  • Contribute to SRE best practices including SLIs, SLOs, and error budgets.

Mandatory Skills

  • 7 to 10 years of DevOps / SRE experience.
  • Strong hands-on experience with AWS.
  • Production-grade Kubernetes (K8s) experience.
  • Expertise in Terraform.
  • Experience with AWS CloudFormation.
  • Strong Jenkins pipeline creation and administration experience.
  • CI/CD implementation and automation expertise.
  • Linux administration and troubleshooting.
  • Shell Scripting, Bash, or Python.
  • Monitoring and logging tools such as ELK, Prometheus, Grafana, CloudWatch, etc.
  • Experience handling production environments and critical incidents.

Preferred Skills

  • Site Reliability Engineering (SRE) experience.
  • Bitbucket administration and pipeline integration.
  • Docker and containerization technologies.
  • Azure cloud exposure.
  • Security, compliance, and infrastructure governance experience.
  • Experience working in Agile environments.

Ideal Candidate Profile

  • Strong hands-on engineer, not a people manager.
  • Experience supporting large-scale production environments.
  • Proven expertise in reliability engineering and operational excellence.
  • Startup or product company experience preferred.
  • Excellent troubleshooting and problem-solving skills.
  • Strong communication and stakeholder management abilities.

Nice to Have

  • AWS Certifications.
  • Kubernetes Certifications.
  • Azure exposure.
  • Experience in healthcare, fintech, SaaS, or product-based organizations.