Site Reliability Engineer II (SRE) @ Isaac
Would you like to be part of a brilliant team at a Fintech that is here to transform the future of Education in Brazil?
At isaac, we operate with a lean Site Reliability Engineering (SRE) team that leads our Platform Engineering strategy and prioritizes the autonomy of development teams. We focus on boosting people's productivity and service resilience, supporting dozens of developers, data engineers, and data scientists. The SRE team is critical to our success.
To strengthen our team, we are looking for a Senior SRE with expertise in infrastructure and engineering process automation, with a focus on Developer Experience.
Isaac
isaac is the largest financial solutions platform for schools. We empower institutions by offering guaranteed revenue and providing families with more payment flexibility.
Our goal is to facilitate financial management, eliminating headaches and allowing the school to focus on its mission of educating, dedicating more time and resources to putting plans into practice.
We have already helped more than 1,700 schools and 400,000 families across the country to make their dreams come true, ensuring that more than R$4 billion in monthly payments are received on time, all year round, without any hassle.
Isaac's culture encourages autonomy, hands-on work, teamwork, and the use of data to build innovative solutions.
We have a lot to build and transform in Brazilian Education, and you can be part of this revolution, experiencing the impact of your work on society!
Responsibilities:
- Create abstractions and provide cloud infrastructure services that help teams reduce cognitive load and friction in creating and maintaining products and services, increasing their autonomy and reducing toil.
- Advance our Cloud platform with Infrastructure as Code (IaC) using Terraform and Kubernetes as the foundation.
- Manage and optimize the components used by teams in their Continuous Integration pipelines.
- Manage Kubernetes clusters running on GKE.
- Collaborate with development teams to define the architecture through RFCs.
- Work closely with engineering teams to jointly develop solutions.
- Diagnose architectural problems and issues in services maintained by the SRE team.
- Contribute to the definition of monitoring and observability strategies for systems, SLIs and SLOs.
- Manage infrastructure incidents and support engineering teams during incidents in their applications.