Key Responsibilities:
Kubernetes Administration: Deploy, configure, and manage Kubernetes clusters in cloud and on-prem environments.
Reliability & Performance: Implement best practices to ensure high availability, scalability, and performance of containerized applications.
Monitoring & Incident Response: Set up monitoring (Prometheus, Grafana, ELK, etc.), troubleshoot issues, and lead incident resolution.
Automation & Infrastructure as Code (IaC): Develop and maintain Terraform configurations, Helm charts, and Kubernetes manifests to automate provisioning and deployment.
CI/CD & DevOps Integration: Work with DevOps teams to optimize CI/CD pipelines for Kubernetes deployments (Jenkins, ArgoCD, FluxCD, etc.).
Security & Compliance: Implement security best practices for containerized workloads, including RBAC, network policies, and vulnerability scanning.
Capacity Planning & Optimization: Analyze resource usage and optimize infrastructure costs and performance.
Disaster Recovery & Backup: Implement backup and disaster recovery strategies for Kubernetes clusters.