SRE Observability Lead - Remote Kubernetes Clusters

Location

Singapore

Job Type

Full-time

Experience

Skilled work

Job Description

Job Summary

Startup Inno is seeking an experienced SRE Observability Lead to drive the monitoring, reliability, and performance of our cloud-native infrastructure. This role will focus on building end-to-end observability across Kubernetes clusters, ensuring optimal system performance, scalability, and reliability. You will partner closely with engineering and DevOps teams to design, implement, and maintain robust monitoring, logging, and alerting systems, enabling proactive incident detection and response. This is an excellent opportunity for a proactive, technically skilled leader to shape the observability culture in a fast-paced, innovative startup environment.

Key Responsibilities

Lead the design, implementation, and management of observability solutions across multiple Kubernetes clusters.
Develop and maintain monitoring, logging, tracing, and alerting systems to ensure service reliability.
Collaborate with SRE, DevOps, and engineering teams to define SLIs, SLOs, and error budgets.
Proactively identify potential system performance bottlenecks and recommend scalable solutions.
Implement automated tools for system health checks, incident response, and postmortem analysis.
Mentor and guide junior SRE and engineering team members on observability best practices.
Work closely with product and engineering teams to provide operational insights that inform architecture and development decisions.
Continuously evaluate and recommend new technologies and tools to enhance observability capabilities.

Required Skills and Qualifications

Strong expertise in Kubernetes architecture and operations.
Proven experience with observability tools: Prometheus, Grafana, Jaeger, OpenTelemetry, ELK stack, or equivalent.
Solid understanding of cloud platforms (AWS, GCP, or Azure) and container orchestration.
Proficiency in scripting and automation (Python, Go, Bash, or similar).
Experience in monitoring distributed systems and microservices architectures.
Strong incident management and troubleshooting skills in complex production environments.
Excellent collaboration, leadership, and communication skills.

Experience

5–7 years of experience in Site Reliability Engineering, DevOps, or cloud infrastructure roles.
At least 3 years in observability-focused roles with hands-on experience in Kubernetes environments.
Experience leading or mentoring teams in observability, monitoring, and reliability practices.

Working Hours

Full-time, remote position.
Flexible hours, with occasional on-call rotation for incident management.
Overlap with global engineering teams may be required for collaboration.

Knowledge, Skills, and Abilities

Deep understanding of distributed systems, containerized workloads, and cloud-native architectures.
Ability to analyze metrics, logs, and traces to identify patterns, anomalies, and performance issues.
Strong problem-solving skills with the ability to make quick, data-driven decisions.
Skilled in designing scalable, highly available systems with a focus on operational excellence.
Exceptional interpersonal skills to communicate complex technical information to non-technical stakeholders.

Benefits

Competitive salary with performance-based bonuses.
Fully remote work with flexible schedules.
Professional development and training opportunities.
Access to cutting-edge observability and cloud-native tools.
Health, wellness, and insurance packages (where applicable by region).
Collaborative and innovative startup culture with opportunities for impact.

Why Join Startup Inno?

Be a part of a fast-growing, innovative startup shaping the future of cloud-native solutions.
Work with a highly skilled and collaborative team passionate about technology and innovation.
Take ownership of critical reliability and observability initiatives that directly impact product performance.
Gain access to continuous learning opportunities and career growth in a cutting-edge tech environment.

How to Apply

Interested candidates should submit a resume and a cover letter highlighting relevant observability and Kubernetes experience. Please include examples of prior work with monitoring systems, distributed architectures, or Kubernetes reliability projects.

Additional Details

Job Seeker Safety

We never ask for payment for job applications. If an employer requests money or sensitive bank details, report it to us immediately.

