Big News: Kosli achieves Series A milestone with Deutsche Bank as an investor - Read the announcement
Site Reliability Engineer, UK

Do you want to shape the future of software delivery in the financial services industry?

Kosli is looking for a Site Reliability Engineer to join our growing team. As part of a fast-paced startup, you will build and maintain the large-scale data and compute cloud infrastructure that powers our SaaS platform.

About Kosli

Kosli’s mission is to change the way we deliver software in regulated industries. The world is built on mission-critical software. It calculates our bank balance. It drives our cars. It diagnoses our illnesses.

We want to empower the engineers who make this software.

When you’re regulated, every change needs to be controlled. This typically means manual paperwork, meetings, delays, and more risk.

We believe this should be automated, and we’re building technology to make this happen.

We are an ambitious group of people and we want you to join us.

We are funded by leading VC investors such as Heavybit (investors in Snyk, LaunchDarkly, CircleCI, Netlify, Tailscale).

About the Role

As a Site Reliability Engineer at Kosli, you will play a pivotal role in embedding reliability into the core of our applications and infrastructure. This position combines expertise in infrastructure management, data analysis, observability, and platform development to ensure our services are robust, secure, and scalable. You will work closely with development teams to integrate reliability into the application lifecycle, leveraging your skills in Python and other languages to build resilient systems.

As Kosli continues to grow, our focus on reliability and scalability becomes increasingly important, and we need talented engineers who can bridge the gap between infrastructure and application reliability.

Key Responsibilities

  • Design and Implement Reliability: Collaborate with development teams to integrate reliability into application design and development, focusing on building fault-tolerant systems.
  • Cloud Infrastructure Management: Manage and evolve Kosli’s cloud infrastructure using Terraform and AWS, ensuring it supports scalable and reliable application deployments.
  • Security and Compliance: Lead security implementation and compliance checks across our infrastructure, ensuring alignment with industry standards.
  • Observability and Monitoring: Own and improve our monitoring and observability stack (Prometheus, Grafana, and Rollbar, to name a few) to provide actionable insights that inform reliability improvements.
  • CI/CD Pipelines: Take ownership of build and deployment pipelines using GitHub Actions, ensuring smooth and reliable software delivery.
  • On-Premise Solutions: Lead the development of our on-premise solution for customers, focusing on reliability and scalability.
  • Shared Infrastructure Components: Develop and maintain shared infrastructure components for customers, including Terraform modules and shared GitHub Actions and GitLab Pipelines.
  • Service Level Management: Use your experience to assist in implementing and driving adoption of Service Level Agreements (SLAs), Service Level Objectives (SLOs), and error budgets to ensure alignment with business objectives and customer expectations.

You Might Be a Great Fit If You Have

  • Experience with Large-Scale Systems: A background in operating large-scale data platforms or applications with a focus on reliability.
  • Infrastructure-as-Code Expertise: Deep expertise with Terraform and AWS cloud platforms.
  • Monitoring and Observability Skills: Experience building and maintaining monitoring/observability stacks to drive reliability improvements.
  • CI/CD Proficiency: Proficiency with CI/CD pipelines, especially GitHub Actions and GitLab.
  • Reliability Track Record: A track record of improving system reliability and deployment processes.
  • Programming Skills: Familiarity with Python, Go, and shell scripting, with a focus on using these skills to enhance application reliability.
  • SRE Practices and Service Levels: Knowledge of modern SRE practices, including the implementation of SLAs, SLOs, and error budgets to manage service reliability and availability.
  • Passion for Quality Infrastructure: Passion for quality infrastructure code and modern SRE practices that prioritise reliability and scalability.
  • Problem-Solving and Collaboration: Strong problem-solving abilities, attention to detail, and clear communication skills to collaborate effectively in a distributed team.
  • Curiosity and Enthusiasm: Enthusiasm for being an early user of tools you help build and curiosity about regulated industries and compliance requirements.

What We Offer

  • Competitive salary and generous equity - we want you to own part of what you’re building
  • Remote-first environment with a focus on flexibility
  • Regular team meet-ups across Europe
  • Budget for learning and development
  • A voice in shaping both our product and our company
  • Real impact on how some of the world’s largest financial institutions deliver software

Location

Remote (UK)

Want to know how our engineers recently described our culture and working at Kosli?

Team Dynamics and Culture:

  • Strong collaboration and teamwork
  • Diverse perspectives and cultural backgrounds
  • Respectful, compassionate, and empathetic interactions
  • No-blame culture with low hierarchy
  • Consensus-driven decision making
  • Openness to challenge with psychological safety
  • Willingness to bring new ideas
  • Fun, adventurous, and positive attitudes

Professional Qualities:

  • Honesty and integrity are at the centre of how we operate
  • Passion for the product and work
  • Growth and improvement mindset
  • Resilience and adaptability
  • High-quality technical skills
  • Breadth and depth of knowledge and experience
  • Ability to pivot and make decisions for the right reasons

Work Environment:

  • Flexibility in work location and tasks
  • Freedom to investigate and contribute
  • Agile
  • Possibilities for close collaboration with customers
  • Supportive onboarding for new team members

Product and Achievement:

  • High-quality product despite challenges
  • Impressive achievements for a small team
  • Strong test coverage (92%)
  • Exciting scope and potential of the product

Personal Growth:

  • Opportunities for personal and professional development
  • Ability to contribute meaningfully early on
  • Valuable experience working with clients

If you are excited by the idea of transforming software delivery in financial services and thrive in a fast-paced startup environment, we would love to hear from you!

Feel free to read some of our articles about our remote culture:

Looking back on 2022: Kosli wrap-up

When you work in a startup, it’s easy to get so focussed on the day to day tasks and it can feel like nothing is really changing. It’s only when you take a step back that you can see the bigger …

10 books you need to read if you’re building a developer tool company

If you’re building developer tools in a startup, you’re always inundated by the items on your plate and the decisions you need to make. However, despite this growing mountain of tasks, one important …

“Did I break prod?😰” The day I realized Kosli would’ve known the answer

If you had the chance to read my first blogpost for Kosli, describing my first week at the company, you’ll know I wasn’t exactly a Kosli expert when I started. At the beginning I spent most of my time …

Ready to Automate Governance?

Book a consultation to see how Kosli eliminates compliance overhead and accelerates delivery.
Sounds like magic? Watch how it’s done.