SRE Observability Engineer - Remote (Datadog Specialist)

Location

Sedona

Job Type

Full-time

Experience

Skilled work

Job Description

Job Summary

Global MNC Tech is seeking a highly skilled and proactive SRE Observability Engineer (Datadog Specialist) to join our growing remote engineering team. This role is critical in ensuring the reliability, performance, and scalability of our cloud-based platforms by building and maintaining world-class observability systems.

As an Observability Engineer, you will be responsible for designing, implementing, and optimizing monitoring, logging, and alerting solutions using Datadog. You will work closely with Site Reliability Engineers, DevOps, Platform, and Product teams to provide deep visibility into system health, reduce incident response times, and drive a culture of reliability and data-driven operations.

This is a fully remote role, offering the opportunity to work with global teams on high-impact, mission-critical systems.


Key Responsibilities

  • Design, implement, and manage end-to-end observability solutions using Datadog across distributed systems.

  • Develop and maintain dashboards, alerts, and monitors for infrastructure, applications, and services.

  • Build and optimize logging, tracing, and metrics pipelines.

  • Partner with SRE and DevOps teams to improve system reliability, availability, and performance.

  • Lead incident response analysis by providing actionable insights through observability data.

  • Define SLIs, SLOs, and SLAs and ensure compliance across platforms.

  • Continuously improve monitoring strategies to support scalability and resilience.

  • Automate observability processes using Infrastructure as Code (Terraform, CloudFormation, etc.).

  • Train engineering teams on best practices for observability and monitoring.

  • Contribute to post-incident reviews and implement preventive measures.


Required Skills and Qualifications

  • Strong hands-on experience with Datadog (core requirement).

  • Proficiency in cloud platforms such as AWS, Azure, or Google Cloud Platform.

  • Solid understanding of SRE principles and observability concepts (metrics, logs, traces).

  • Experience with containerized environments (Docker, Kubernetes).

  • Strong scripting skills in Python, Bash, or similar languages.

  • Knowledge of CI/CD pipelines and DevOps practices.

  • Experience with Infrastructure as Code tools (Terraform, Pulumi, CloudFormation).

  • Excellent problem-solving and analytical skills.

  • Strong communication skills with the ability to work in cross-functional teams.


Experience

  • 4+ years of experience in SRE, DevOps, Platform Engineering, or Observability roles.

  • 2+ years of hands-on experience with Datadog in production environments.

  • Experience working with large-scale, distributed systems.

  • Prior experience in SaaS or cloud-native environments is highly preferred.


Working Hours

  • Fully remote role.

  • Flexible working hours with core overlap aligned to global team collaboration.

  • Occasional participation in an on-call rotation for incident response (planned and compensated).


Knowledge, Skills, and Abilities

  • Deep understanding of system architecture and microservices.

  • Strong ability to interpret complex telemetry data and turn it into actionable insights.

  • Expertise in performance monitoring and capacity planning.

  • Ability to work independently in a remote environment.

  • Strong documentation and knowledge-sharing mindset.

  • High attention to detail and strong ownership of system reliability.

  • Commitment to continuous learning about emerging observability tools and practices.


Benefits

  • Competitive salary package.

  • 100% remote work flexibility.

  • Health insurance and wellness programs.

  • Paid time off, sick leave, and public holidays.

  • Learning and development budget for certifications and training.

  • Career growth opportunities in a global organization.

  • Performance bonuses and recognition programs.

  • Collaborative, inclusive, and innovation-driven culture.


Why Join Global MNC Tech?

At Global MNC Tech, we build scalable digital platforms used by thousands of customers worldwide. You will join a high-performing engineering team that values innovation, automation, and operational excellence.

We offer a truly global remote culture, modern technology stack, and the freedom to experiment, learn, and grow. Your work will directly impact system reliability and customer experience at scale.

This role is ideal for engineers who are passionate about reliability, data, and building robust systems that never sleep.


How to Apply

Interested candidates are invited to submit their updated resume along with a brief cover letter highlighting their experience with Datadog and observability systems.

Shortlisted candidates will go through a structured interview process including:

  • Technical assessment

  • System design discussion

  • Final leadership interview

Apply today and become a key contributor in shaping the reliability backbone of Global MNC Tech's digital infrastructure.
