Lead Software Reliability Engineer - Business Ecosystem

Grab
Houston, Texas, US
Full-time

Life at Grab

At Grab, every Grabber is guided by The Grab Way, which spells out our mission, how we believe we can achieve it, and our operating principles - the 4Hs: Heart, Hunger, Honour and Humility.

These principles guide and help us make decisions as we work to create economic empowerment for the people of Southeast Asia.

Get to know the Team

The Business & Transaction Platform, SNP and DNA SRE team is a longstanding team responsible for the stable operation of the core Grab systems.

We make an impact by contributing to Business & Transaction Platform, Search & Personalization, Demand and Ads systems and the company's stability and operational excellence.

Our team is made up of a group of passionate Site Reliability Engineers. If you are looking for an opportunity to work in a large-scale cloud environment and use your sharp ideas to make engineers' lives better, then you should join our team!

Get to know the Role

We are looking for a Lead Software Reliability Engineer to provide better stability and operational excellence for Business & Transaction Platform, SNP and DNA tech families in Grab.

We believe a successful candidate has professional sysops / infrastructure knowledge and the ability to build comprehensive systems, but if you believe you have what it takes then we’d love to hear from you either way.

This role is required because stability and operational excellence are critical to our services. In return, you will get an opportunity to make an impact on Grab's core systems.

The Day-to-Day Activities

  • Engage in and improve the whole lifecycle of services - from design, through deployment, operation and refinement.
  • Work with engineering teams to design and write code to create systems which are highly available and able to scale seamlessly.
  • Help engineering teams solve reliability, stability and scalability challenges.
  • Get involved in deep diagnosis of incidents, and engage with multiple highly skilled engineering teams on resolutions.
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
  • Contribute to a culture of learning and responsibility by guiding teams to write detailed postmortem reports.
  • Identify and resolve problems relating to critical service operations, and prevent their recurrence using automation.
  • Be part of a cool team, responsible for one of the largest cloud-based services in Southeast Asia.
  • Mentor other engineers, define our technical culture, set high engineering bars and help build a fast-growing team.
  • Lead other engineers in delivering challenging projects to a high standard of quality.
  • Contribute initiatives to improve the tech families' stability and operational excellence.

The Must-Haves

  • Bachelor's or Master's degree in Computer Science, Software Engineering, Information Technology or related technical field involving coding.
  • Preferably with at least 5 years of relevant experience in this role.
  • Strong experience with algorithms, data structures, complexity analysis and software design.
  • Strong experience in one or more of the following: Go, Python, C, C++, Java, Perl or Ruby.
  • Strong experience in using service monitoring, logging and alerting environments and tools.
  • Strong experience in system troubleshooting in a Linux environment.
  • Solid experience in using Linux commands and shell scripting, and the ability to automate routine tasks.
  • Solid experience with automation & provisioning tools (e.g., Jenkins, Ansible / Chef / SaltStack / Puppet).
  • Analytical skills, mental resilience and the ability to think systematically under stressful conditions.
  • Highly accountable, taking ownership of your work. Outstanding work ethic, high integrity, a team player and a lifelong learner.
  • Proficiency in verbal and written English.

The Nice-to-Haves

  • Experience in Go.
  • Experience with cloud-based large-scale infrastructure from vendors such as Amazon Web Services, Azure or Google Cloud Platform.
  • Experience with containerization technologies (e.g., Docker) and container orchestration platforms (e.g., Kubernetes).
  • Experience building high-throughput streaming services, and knowledge of stream processing frameworks such as Flink.
  • Contributions to open source projects.
  • Experience with performance analysis and debugging tools.

Our Commitment

We recognize that with these individual attributes come different workplace challenges, and we will work with Grabbers to address them in our journey towards creating inclusion at Grab for all Grabbers.

