Senior Software Site Reliability Engineer

Credit Karma, Inc.
Charlotte, North Carolina, US
Full-time
The SiteOps Engineering team ensures reliability for the Credit Karma ecosystem, both Native and Web. The team combines excellent incident and problem management to facilitate troubleshooting and continuous improvement.

The team also develops automation and AI capabilities to ensure minimum toil across the engineering organization. You will be reporting to the Senior Manager of Engineering.

As an SRE, you will be a strong technical contributor to the team and to Credit Karma Platform Engineering. You will help evolve our technology through automation and reliable architecture, and help increase velocity by collaborating across engineering to facilitate adoption of best practices.

What you’ll do:

  • Contribute to overall change, incident, and problem management in our environment, with a focus on troubleshooting, fast restoration of our essential services, and prevention of future outages.
  • Participate in a once-a-month 24x7 on-call rotation and take the lead on severe incidents to help minimize impact.
  • Assist engineering teams by conducting truly blameless postmortems with focused action items to drive continuous improvement.
  • Provide insights on trends in issues affecting reliability and partner on cross-functional projects to provide scalable solutions.
  • Review and advise on high-risk platform changes to minimize impact to the site and maximize success for stakeholders.
  • Work within a large distributed system based on Cloud Native services.
  • Maintain an automation-centric vision and incorporate SRE methodologies to increase reliability and decrease toil.
  • Participate in technical design and architecture decisions and contribute to technical troubleshooting in various parts of the system.
  • Create operating standards to help drive reliability at CK.

What’s great about the role:

  • You will have the opportunity to contribute to an engineering-first organization.
  • Your contributions will have a noticeable impact on Credit Karma's members and your fellow Karmanauts (that's what we call ourselves).
  • You will be involved in organizational efforts of continuous improvement to increase and ensure the reliability of Credit Karma.
  • You will get broad exposure to our full stack, consisting of forward-looking technologies such as GenAI / LLM, Incident Automation, Automated Observability at Scale, etc.
  • You will grow, learn, and have fun doing it; it's part of our culture.
  • And, of course, all those awesome company perks that you have probably already read about.

Minimum Basic Requirements:

  • 5+ years of experience with Site Reliability Engineering with a focus on Infrastructure, Platform, and Application (Cloud, Containerization, Container Orchestration, Network, Application Reliability, Database Architecture) and an understanding of full stack and SDLC (Software Development Life Cycle) practices in a DevOps or continuous release environment.
  • Experience in running critical incidents in a global or company-wide context, engaging with executives and senior leadership, and leading root cause analysis sessions.
  • Experience running and monitoring applications at scale, using metrics and tracing tools like New Relic, Datadog, Stackdriver, Zipkin, Prometheus, etc.
  • Professional experience with Python, Go, or similar programming languages.
  • Experience developing production-quality tooling.
  • Familiarity with SRE methodologies; passionate about solving operational challenges by using automation and software.
  • Ability to communicate effectively, both vertically and horizontally within the organization, with demonstrated written and verbal communication skills.

Preferred Qualifications:

  • Ability to drive troubleshooting through a pragmatic and collaborative approach.
  • Can construct clear and concise insights from data to promote and champion measurable improvements.
  • Experience working with Cloud Native services in a Public Cloud, e.g. Google Cloud Platform, AWS, Azure.
  • Software / service development and / or full lifecycle ownership experience with additional languages (e.g., Scala, TypeScript, JS, Java, C++).

Benefits at Credit Karma include:

  • Medical and Dental Coverage
  • Retirement Plan
  • Commuter Benefits
  • Wellness perks
  • Paid Time Off (Vacation, Sick, Baby Bonding, Cultural Observance, & More)
  • Education Perks
  • Paid Gift Week in December
