Sedona
FULL_TIME
Skilled work
Global MNC Tech is seeking a highly skilled and proactive SRE Observability Engineer (Datadog Specialist) to join our growing remote engineering team. This role is critical in ensuring the reliability, performance, and scalability of our cloud-based platforms by building and maintaining world-class observability systems.
As an Observability Engineer, you will be responsible for designing, implementing, and optimizing monitoring, logging, and alerting solutions using Datadog. You will work closely with Site Reliability Engineers, DevOps, Platform, and Product teams to provide deep visibility into system health, reduce incident response times, and drive a culture of reliability and data-driven operations.
This is a fully remote role, offering the opportunity to work with global teams on high-impact, mission-critical systems.
Design, implement, and manage end-to-end observability solutions using Datadog across distributed systems.
Develop and maintain dashboards, alerts, and monitors for infrastructure, applications, and services.
Build and optimize logging, tracing, and metrics pipelines.
Partner with SRE and DevOps teams to improve system reliability, availability, and performance.
Lead incident response analysis by providing actionable insights through observability data.
Define SLIs, SLOs, and SLAs, and ensure compliance with them across platforms.
Continuously improve monitoring strategies to support scalability and resilience.
Automate observability processes using Infrastructure as Code (Terraform, CloudFormation, etc.).
Train engineering teams on best practices for observability and monitoring.
Contribute to post-incident reviews and implement preventive measures.
Strong hands-on experience with Datadog (core requirement).
Proficiency in cloud platforms such as AWS, Azure, or Google Cloud Platform.
Solid understanding of SRE principles and observability concepts (metrics, logs, traces).
Experience with containerized environments (Docker, Kubernetes).
Strong scripting skills in Python, Bash, or similar languages.
Knowledge of CI/CD pipelines and DevOps practices.
Experience with Infrastructure as Code tools (Terraform, Pulumi, CloudFormation).
Excellent problem-solving and analytical skills.
Strong communication skills with the ability to work in cross-functional teams.
4+ years of experience in SRE, DevOps, Platform Engineering, or Observability roles.
2+ years of hands-on experience with Datadog in production environments.
Experience working with large-scale, distributed systems.
Prior experience in SaaS or cloud-native environments is highly preferred.
Fully remote role.
Flexible working hours, with core overlap hours aligned to global team collaboration.
Occasional on-call rotation for incident response (planned and compensated).
Deep understanding of system architecture and microservices.
Strong ability to interpret complex telemetry data and turn it into actionable insights.
Expertise in performance monitoring and capacity planning.
Ability to work independently in a remote environment.
Strong documentation and knowledge-sharing mindset.
High attention to detail and strong ownership of system reliability.
Continuous learning attitude towards emerging observability tools and practices.
Competitive salary package.
100% remote work flexibility.
Health insurance and wellness programs.
Paid time off, sick leave, and public holidays.
Learning and development budget for certifications and training.
Career growth opportunities in a global organization.
Performance bonuses and recognition programs.
Collaborative, inclusive, and innovation-driven culture.
At Global MNC Tech, we build scalable digital platforms used by thousands of customers worldwide. You will join a high-performing engineering team that values innovation, automation, and operational excellence.
We offer a truly global remote culture, a modern technology stack, and the freedom to experiment, learn, and grow. Your work will directly impact system reliability and customer experience at scale.
This role is ideal for engineers who are passionate about reliability, data, and building robust systems that never sleep.
Interested candidates are invited to submit their updated resume along with a brief cover letter highlighting their experience with Datadog and observability systems.
Shortlisted candidates will go through a structured interview process including:
Technical assessment
System design discussion
Final leadership interview
Apply today and become a key contributor in shaping the reliability backbone of Global MNC Tech's digital infrastructure.