Thoothukudi
Full-time
Skilled work
Global MNC Tech is seeking a highly skilled Site Reliability Engineering (SRE) Lead to oversee the reliability, performance, and scalability of our high-traffic news portals. The ideal candidate will be a proactive problem-solver with deep expertise in cloud infrastructure, monitoring, incident management, and automation. You will lead a talented SRE team, implement best practices for operational excellence, and ensure our platforms deliver a seamless experience to millions of users worldwide.
Lead and mentor the SRE team to maintain high availability, reliability, and scalability of high-traffic news portals.
Design, implement, and maintain robust monitoring, alerting, and incident response systems.
Collaborate with engineering teams to optimize application performance and improve system resilience.
Drive automation of operational tasks including deployments, scaling, and infrastructure management.
Conduct post-incident reviews and implement preventive measures for recurring issues.
Define and enforce SLOs, SLIs, and error budgets to ensure service reliability targets are met.
Evaluate and adopt new tools, technologies, and methodologies to enhance reliability engineering practices.
Lead cross-functional initiatives during high-impact incidents and production outages.
Strong expertise in cloud platforms (AWS, GCP, or Azure), containerization (Docker), and container orchestration (Kubernetes).
Deep understanding of CI/CD pipelines and Infrastructure as Code (Terraform, Ansible).
Hands-on experience with monitoring and observability tools (Prometheus, Grafana, ELK, Datadog).
Proficiency in scripting languages (Python, Go, Bash) for automation and operational tasks.
Solid knowledge of networking, caching, database performance, and distributed systems.
Experience managing high-traffic web applications with millions of concurrent users.
Strong analytical, problem-solving, and incident response capabilities.
7+ years in site reliability engineering, systems engineering, or related roles.
3+ years of leadership experience managing SRE teams or large-scale operational projects.
Proven track record in maintaining uptime and performance for high-traffic web applications.
Full-time role with flexible working hours.
On-call rotation may be required to support global news portal operations.
Remote-first position, with occasional cross-timezone collaboration.
Exceptional leadership and team management skills.
Excellent communication skills, with the ability to explain technical concepts to non-technical stakeholders.
Strong decision-making and prioritization abilities under pressure.
Continuous learning mindset and ability to stay current with industry trends.
Ability to plan and execute complex projects in fast-paced environments.
Competitive salary and performance-based bonuses.
Comprehensive health, dental, and vision insurance.
Generous paid time off and flexible remote work options.
Professional development budget and access to cutting-edge technologies.
Collaborative, inclusive, and innovative work culture.
Work with globally recognized high-traffic platforms reaching millions daily.
Lead initiatives that directly impact user experience and operational excellence.
Join a forward-thinking company that values innovation, reliability, and employee growth.
Be part of a supportive team that fosters creativity, learning, and professional advancement.
Interested candidates are invited to submit their resume and a cover letter detailing their SRE leadership experience and notable projects. Please apply through our career portal at Global MNC Tech Careers.