SRE (Site Reliability Engineer) - Remote Platform Ops

Location

Delavan

Job Type

Full-time

Experience

Skilled work

Job Description

Job Summary

Global MNC Tech is seeking a highly skilled and proactive Site Reliability Engineer (SRE) to join our Remote Platform Operations team. This role is critical to ensuring the reliability, scalability, security, and performance of our cloud-based platforms that support global enterprise clients. As an SRE, you will bridge the gap between software engineering and IT operations by designing resilient systems, automating operational tasks, and driving a culture of reliability across the organization.

You will work closely with software engineers, DevOps, security, and product teams to build highly available systems, reduce operational toil, and improve incident response. This is an excellent opportunity for a technically strong engineer who enjoys solving complex problems at scale in a fully remote, globally distributed environment.


Key Responsibilities

  • Design, build, and maintain highly reliable, scalable, and fault-tolerant production systems.

  • Develop and maintain monitoring, alerting, and observability frameworks for critical services.

  • Automate infrastructure provisioning, configuration management, and deployment pipelines.

  • Participate in on-call rotations and lead incident response, root cause analysis, and post-incident reviews.

  • Improve system performance, capacity planning, and disaster recovery strategies.

  • Collaborate with development teams to implement reliability best practices throughout the SDLC.

  • Define and track Service Level Objectives (SLOs), Service Level Agreements (SLAs), and error budgets.

  • Reduce operational toil by creating tools and scripts to streamline manual processes.

  • Ensure compliance with security, data protection, and operational standards.


Required Skills and Qualifications

  • Strong experience with Linux/Unix systems administration.

  • Proficiency in one or more programming languages (Python, Go, Java, or similar).

  • Hands-on experience with cloud platforms (AWS, Azure, or Google Cloud).

  • Expertise in containerization and orchestration tools (Docker, Kubernetes).

  • Solid knowledge of CI/CD pipelines and infrastructure as code (Terraform, Ansible, CloudFormation).

  • Experience with monitoring and observability tools (Prometheus, Grafana, ELK, Datadog, or similar).

  • Strong understanding of networking, system architecture, and distributed systems.

  • Excellent problem-solving and troubleshooting skills.

  • Strong communication and documentation abilities.


Education and Experience

  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).

  • 3–7 years of experience in Site Reliability Engineering, DevOps, or Systems Engineering roles.

  • Proven experience managing production environments at scale.

  • Experience working in agile or DevOps-driven organizations.


Working Hours

  • Fully remote position with flexible working hours.

  • Collaboration across global time zones, with availability for occasional meetings during overlapping hours.

  • Participation in an on-call rotation schedule for incident support.


Knowledge, Skills, and Abilities

  • Deep understanding of system reliability, scalability, and high-availability architectures.

  • Ability to design systems with resilience, redundancy, and automation in mind.

  • Strong analytical mindset with a passion for continuous improvement.

  • Ability to work independently in a remote environment.

  • Strong collaboration skills for cross-functional teamwork.

  • Adaptability to fast-changing technical environments.


Benefits

  • Competitive salary package based on experience and location.

  • Fully remote work with flexible scheduling.

  • Performance-based bonuses and annual reviews.

  • Health insurance and wellness programs.

  • Learning and development budget for certifications and training.

  • Access to cutting-edge technologies and large-scale systems.

  • Paid time off, holidays, and mental wellness days.


Why Join Global MNC Tech?

At Global MNC Tech, we believe reliability is the foundation of innovation. You will be part of a high-impact engineering team that builds and maintains platforms used by global clients across industries. We foster a culture of trust, ownership, continuous learning, and technical excellence. This role offers long-term career growth, exposure to large-scale systems, and the opportunity to shape the future of platform reliability in a world-class organization.


How to Apply

Interested candidates should submit their updated CV/resume along with a brief cover letter highlighting their relevant experience in Site Reliability Engineering and cloud platforms. Shortlisted candidates will be contacted for technical interviews and remote assessments.
