Site Reliability Engineer - Global Infrastructure Ops

Location

Singapore

Job Type

Full-time

Experience

Skilled work

Job Description

Job Summary

Startup Inno is seeking a highly motivated and experienced Site Reliability Engineer (SRE) to join our Global Infrastructure Operations team. In this role, you will be responsible for ensuring the reliability, scalability, performance, and security of our distributed systems and cloud-based platforms. You will work at the intersection of software engineering and IT operations, applying engineering principles to automate processes, improve system resilience, and deliver highly available services to a global user base.

As an SRE at Startup Inno, you will play a critical role in designing, implementing, and maintaining infrastructure that supports business-critical applications across multiple regions. You will collaborate closely with development, DevOps, security, and product teams to proactively identify risks, resolve incidents, and continuously enhance system performance and operational excellence.


Key Responsibilities

  • Design, implement, and maintain highly reliable and scalable infrastructure for global production systems.

  • Monitor system performance, availability, and capacity using industry-standard observability tools.

  • Develop automation scripts and tools to reduce manual operations and improve system efficiency.

  • Lead incident response, root cause analysis, and post-incident reviews to prevent recurrence.

  • Build and maintain CI/CD pipelines to support rapid and reliable software deployments.

  • Enforce security and compliance requirements, and apply best practices across infrastructure environments.

  • Collaborate with engineering teams to design fault-tolerant architectures.

  • Optimize system performance, cost, and resource utilization across cloud platforms.

  • Document operational procedures, runbooks, and system architectures.

  • Participate in on-call rotations to ensure 24/7 system availability.


Required Skills and Qualifications

  • Bachelor's degree in Computer Science, Engineering, Information Technology, or a related field.

  • Strong experience with Linux/Unix system administration.

  • Proficiency in cloud platforms such as AWS, Azure, or Google Cloud Platform (GCP).

  • Hands-on experience with containerization and orchestration tools (Docker, Kubernetes).

  • Strong scripting skills in Python, Bash, Go, or similar languages.

  • Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, ELK).

  • Knowledge of infrastructure as code tools (Terraform, Ansible, CloudFormation).

  • Solid understanding of networking concepts, security protocols, and system architecture.

  • Familiarity with CI/CD tools (Jenkins, GitHub Actions, GitLab CI).

  • Excellent problem-solving and troubleshooting skills.


Experience

  • 3+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering roles.

  • Proven experience managing large-scale distributed systems in production environments.

  • Experience supporting high-availability and mission-critical systems.

  • Background in automating infrastructure and operational processes.

  • Experience working in Agile and DevOps-driven organizations is preferred.


Working Hours

  • Full-time position (40 hours per week).

  • Flexible working hours, with some overlap required for collaboration with global teams.

  • On-call rotation required for critical infrastructure support.

  • Remote or hybrid work options depending on location and team needs.


Knowledge, Skills and Abilities

  • Strong analytical mindset with the ability to diagnose complex system issues.

  • Ability to work under pressure in high-impact situations.

  • Excellent communication skills for cross-functional collaboration.

  • Proactive approach to identifying and mitigating operational risks.

  • Strong documentation and knowledge-sharing skills.

  • Passion for automation, system reliability, and continuous improvement.

  • Ability to manage multiple priorities in a fast-paced environment.


Benefits

  • Competitive salary and performance-based bonuses.

  • Flexible remote/hybrid working environment.

  • Health insurance and wellness programs.

  • Paid time off, holidays, and personal leave.

  • Learning and development budget for certifications and training.

  • Access to the latest tools and technologies.

  • Career growth opportunities within a rapidly scaling organization.

  • Inclusive and diverse work culture.


Why Join Startup Inno?

At Startup Inno, we are building the next generation of innovative digital solutions for global markets. Joining our team means working in a dynamic, high-growth environment where your contributions directly impact product performance and customer experience. You will be empowered to experiment, innovate, and lead infrastructure initiatives that shape the future of our technology platform.

We value creativity, ownership, and collaboration, and we are committed to creating an environment where talented professionals can thrive and grow.


How to Apply

Interested candidates are encouraged to apply by submitting their updated resume and a brief cover letter highlighting their relevant experience and technical expertise. Shortlisted candidates will be contacted for technical interviews and assessments.
