This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Principal Site Reliability Engineer in the United States.
This role offers a high-impact opportunity to architect, optimize, and maintain a hybrid infrastructure connecting edge devices, on-premises systems, and cloud platforms. You will lead efforts to ensure system reliability, security, and cost-effectiveness while collaborating across engineering, data science, and product teams. The position requires deep technical expertise in cloud and edge technologies, infrastructure automation, and performance optimization. You will have ownership of platform stability, integration, and scalability, enabling diverse teams to deliver innovative, data-intensive applications. This role is ideal for a problem-solver who thrives in fast-paced, collaborative environments and is passionate about building systems that support complex, mission-critical operations.
Accountabilities
- Architect, implement, and maintain hybrid systems spanning cloud, on-premises, and edge environments.
- Integrate diverse systems and devices to create stable, high-performance platforms with robust monitoring and uptime.
- Optimize system performance at low levels (filesystem, networking, software) and high levels (cost, operational stability, supportability).
- Collaborate with cross-functional teams to develop scalable, user-friendly applications and data workflows.
- Build tools that enable seamless integration between applications and platforms within a cohesive ecosystem.
- Create and maintain comprehensive technical documentation, including architecture diagrams, data flows, and standard operating procedures.
- Evaluate and recommend emerging technologies, frameworks, and tools to enhance platform capabilities.
Requirements
- 8+ years of experience building and maintaining production infrastructure with Kubernetes, AWS, and bare metal.
- 8+ years of experience with Python and Go in production environments.
- Proficiency with infrastructure automation tools such as Terraform, Terragrunt, Pulumi, or CDK.
- Strong expertise in Linux-based systems, networking, and security.
- Experience with CI/CD pipelines (GitHub Actions, GitLab, Jenkins) and cloud services (AWS ECS, EKS, IAM, EC2, RDS).
- Deep knowledge of Kubernetes internals, cluster architecture, and operator development.
- Strong analytical and problem-solving skills for complex infrastructure and networking challenges.
- Excellent communication skills for collaboration with both technical and non-technical stakeholders.
- Attention to detail and commitment to producing high-quality, well-documented code.
Preferred:
- Experience with SQL, NoSQL, and MPP databases.
- Familiarity with orchestration frameworks such as Airflow or Kubeflow.
- Exposure to C++ or Rust, or experience coordinating with teams using those languages.
- Previous work in autonomy, robotics, or related high-tech fields.
Benefits
- Competitive salary range: $166,000 - $293,000 USD per year, plus bonus eligibility.
- Comprehensive benefits package and equity opportunities.
- Remote-first work flexibility with potential on-site collaboration.
- Opportunity to work on cutting-edge AI, robotics, and cloud-edge integration projects.
- Inclusive, diverse, and high-performance work culture with mentorship and career development.
- Significant impact on mission-critical infrastructure supporting innovative products and platforms.
Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.
When you apply, your profile goes through our AI-powered screening process designed to identify top talent efficiently and fairly.
🔍 Our AI evaluates your CV and LinkedIn profile thoroughly, analyzing your skills, experience, and achievements.
📊 It compares your profile to the job’s core requirements and past success factors to determine your match score.
🎯 Based on this analysis, we automatically shortlist the 3 candidates with the highest match to the role.
🧠 When necessary, our human team may perform an additional manual review to ensure no strong profile is missed.
The process is transparent, skills-based, and free of bias — focusing solely on your fit for the role.
Once the shortlist is completed, we share it directly with the company that owns the job opening. The final decision and next steps (such as interviews or additional assessments) are then made by their internal hiring team.
Thank you for your interest!