Site Reliability Engineer

US

RunwayExtenders

Take control of your future with the right tools and opportunities. Our platform connects you with top employers and offers personalized support to help you grow, succeed, and land your next great role.

Our Mission, Your Success:

We link professionals from Kosovo, Albania, and North Macedonia with top US start-ups, empowering challenge seekers like you to build successful careers. Our ultimate vision is to offer exclusive career opportunities with a focus on attracting, retaining, and developing the best talent.

Our Client, Your Impact:

Our well-funded US client builds AI-powered agents that automate complex workflows by combining autonomous systems with human-in-the-loop oversight. Its focus is on delivering practical, enterprise-ready AI solutions that improve efficiency in real-world operations.

Responsibilities

On-Prem Product:

  • Build, maintain, and support Helm-based on-prem deployments for enterprise customers.
  • Manage Helm charts, dependencies, upgrades, and configurations.
  • Work with delivery teams to ensure reliable and performant on-prem infrastructure.

Cloud Infrastructure:

  • Design and manage cloud API infrastructure using Kubernetes and Terraform.
  • Build scalable pipelines for AI workloads handling millions of documents and billions of tokens.
  • Create and maintain CI/CD pipelines for containerized services.
  • Implement observability stacks (Prometheus, Grafana, Loki) for monitoring data flow and system health.
  • Work closely with backend and AI teams to improve deployment performance and scalability.

Requirements:

  • Strong experience with Helm and Kubernetes (deploying, scaling, and operating workloads).
  • Hands-on experience with Terraform, Docker, CI/CD, and infrastructure automation.
  • Familiarity with monitoring and logging tools (Prometheus, Grafana, Loki).
  • Scripting skills in Bash, Python, or Golang for automation and tooling.
  • Solid understanding of networking, RBAC, and TLS/SSL in Kubernetes clusters.
  • Self-driven engineer with strong ownership, problem-solving, documentation, and collaboration skills.

Other Details:

  • Working hours: 3 PM to 11 PM (Prishtina, Tirana, Skopje time)
  • Private health insurance
  • Access to top-quality office equipment provided by the company

We are looking for individuals who:

  • Are available for full-time engagement.
  • Consider this role as their primary professional commitment (main role).

Join our innovative team and contribute to impactful solutions that drive progress for our client's team. If you are a motivated self-starter with a passion for operations, we invite you to apply!

