Senior DevOps Engineer
07 May 2026
Job Description:
JOB SUMMARY:
We are looking for a hands-on Senior DevOps Engineer with experience in DevOps, systems administration, cloud infrastructure, and security. This role requires both strong technical depth and leadership capability to design, secure, automate, and improve our infrastructure, deployment pipelines, and platform operations.
The ideal candidate has solid experience in Linux server management, multi-cloud environments, CI/CD, GitLab, AWS, microservices deployment, cybersecurity, ISO 27001-aligned controls, infrastructure automation, and DevSecOps / MLSecOps practices. This is not a purely managerial role—the candidate must be able to lead by doing, mentor engineers, troubleshoot deeply, and drive operational excellence.
DUTIES AND RESPONSIBILITIES:
1. Infrastructure, Cloud, and Platform Management
- Design, build, secure, and optimize infrastructure across cloud, hybrid, and multi-cloud environments.
- Manage, harden, and troubleshoot Linux servers, especially RHEL-based systems such as CentOS and Rocky Linux.
- Tune, secure, and maintain LAMP (Apache) and LEMP (Nginx) stacks to support PHP/Laravel applications, production Python ASGI servers (Uvicorn, Hypercorn, and Granian), and other business-critical applications.
- Support infrastructure and hosting requirements for Laravel, Next.js, Node.js, Python and React applications across development, staging, and production environments.
- Provision and manage compute, network, and storage resources for application, platform, analytics, and data workloads.
- Implement and optimize auto-scaling, load balancing, high availability, backup, disaster recovery, and capacity planning strategies.
2. CI/CD, GitLab, and Release Engineering
- Manage and help maintain the self-managed GitLab environment, ensuring the reliability and availability of the CI/CD platform.
- Own, improve, and standardize CI/CD pipelines, including GitLab CI runners, build processes, release workflows, and deployment automation.
- Build and maintain pipelines using tools such as GitLab CI, Jenkins, or GitHub Actions to test, build, and deploy applications efficiently and securely.
- Ensure deployments are secure, repeatable, atomic, and rollback-capable.
- Troubleshoot and resolve build failures, pipeline issues, and deployment problems, bridging gaps between local development, staging, and production environments.
- Improve deployment speed, consistency, and release quality through automation and engineering best practices.
3. Containerization, Infrastructure as Code, and Modern Architecture
- Build and support containerized environments from the ground up using Docker and orchestration-ready deployment practices.
- Design, deploy, and manage containerized workloads using Kubernetes for scaling, orchestration, and platform management.
- Support microservices architecture deployment, platform modernization, and future-ready infrastructure patterns.
- Design and implement infrastructure using Infrastructure as Code (IaC) tools such as Terraform/OpenTofu, Ansible, CloudFormation, or equivalent.
- Automate provisioning, configuration management, and operational tasks using Bash scripting, Python, IaC, and platform automation tools.
- Support and implement serverless deployment models and event-driven architectures where appropriate.
4. Data, Machine Learning, and Scalable Workload Support
- Provide infrastructure support for data pipelines, ETL workflows, machine learning pipelines, and big data processing environments.
- Ensure platform readiness for scalable data and ML workloads, including compute allocation, storage provisioning, workload scheduling, and performance optimization.
- Support reliable and secure environments for data ingestion, processing, analytics, and model deployment.
- Help establish standards for scaling, monitoring, governance, and securing data-intensive and ML-enabled systems.
5. Monitoring, Reliability, and Performance
- Implement and maintain monitoring, logging, alerting, and observability solutions across infrastructure and applications.
- Set up and manage monitoring and alerting tools such as Prometheus, Grafana, and ELK Stack to ensure uptime, visibility, and incident response readiness.
- Improve system performance, scalability, reliability, and recovery readiness across all environments.
- Support web performance optimization, including caching, CDN integration, asset compression, and delivery tuning.
- Proactively detect, troubleshoot, and resolve system, OS-level, application, and infrastructure issues.
6. Security, Compliance, and DevSecOps
- Apply cybersecurity best practices across infrastructure, pipelines, and platform operations.
- Implement hardening standards, patching, access control, secrets management, secure configuration baselines, and vulnerability remediation.
- Support DevSecOps and MLSecOps practices by embedding security controls into infrastructure, CI/CD pipelines, and deployment workflows.
- Ensure environments align with ISO 27001 controls, governance requirements, audit readiness, and internal security standards.
- Help strengthen security posture across cloud, Linux, network, container, and application environments.
7. Collaboration, Documentation, and Leadership
- Work closely with the current DevOps engineer and the wider engineering team to document infrastructure knowledge and reduce single points of failure.
- Collaborate with developers, QA, security, data, and project teams to improve platform usability, deployment readiness, and operational efficiency.
- Maintain and improve Jira and Confluence workflows to ensure engineering work, documentation, and processes are properly tracked and managed.
- Mentor team members, promote best practices, and provide hands-on technical leadership across cloud, infrastructure, deployment, security, and platform operations.
- Lead by example through strong ownership, deep troubleshooting, practical decision-making, and continuous improvement.
JOB COMPETENCIES:
· Education Background: Bachelor’s degree in Computer Engineering, Information Technology, Computer Science, or related courses.
· Work Experience: Minimum of 8 years' experience in DevOps or systems administration.
· Linux Mastery: Deep hands-on experience managing RHEL-based Linux systems in production, including OS tuning, permissions, storage, patching, and troubleshooting.
· Python Proficiency: Strong ability to write Python scripts for automation, API integration, configuration management, and cloud operations, including use of libraries such as boto3, requests, pytest, logging, and PyYAML.
· Bash and CLI Skills: Strong command-line and shell scripting skills for automation, administration, and troubleshooting.
· Web Stack Fluency: Strong understanding of the infrastructure and build requirements of Laravel/PHP, Python/FastAPI, and Next.js / Node.js / React applications.
· CI/CD Expertise: Proven experience in GitLab CI/CD, including runner management, pipeline optimization, and troubleshooting complex build and deployment failures.
· Containerization and Orchestration: Strong experience with Docker and Kubernetes for packaging, deploying, scaling, and managing applications.
· Infrastructure as Code: Proven ability to provision and manage infrastructure using Terraform, Ansible, CloudFormation, or similar tools.
· Cloud Platform Experience: Strong experience with AWS and good working knowledge of Azure and/or GCP in multi-cloud or hybrid-cloud environments.
· AWS Fundamentals: Solid hands-on knowledge of core AWS services such as EC2, S3, RDS, IAM, Security Groups, and related operational best practices.
· Monitoring and Logging: Experience implementing observability solutions using Prometheus, Grafana, ELK Stack, and other monitoring tools.
· Networking Knowledge: Good understanding of networking fundamentals, including DNS, load balancing, reverse proxies, firewalls, VPNs, TLS, connectivity, and traffic flow troubleshooting.
· Security and Compliance: Strong knowledge of cybersecurity best practices, DevSecOps, ISO 27001-aligned controls, hardening, patching, secrets management, and access control.
· Modern Architecture Support: Experience supporting microservices, containerized platforms, serverless deployments, data pipelines, machine learning pipelines, big data workloads, and scalable storage/compute environments.
· Automation Mindset: Experience automating manual processes and operational tasks to improve consistency, reliability, and efficiency.
· Hindi-speaking capability is a plus.