About the company

At Jobtome (https://weare.jobtome.com/), we are building a modern, cloud-native recruitment and marketing platform used at scale across multiple countries and brands. Our systems power high-traffic job distribution, integrations with external partners, and real-time data pipelines, with a strong focus on reliability, observability, and automation. Engineering is a core function of the company: we value ownership, pragmatic decision-making, and long-term technical excellence over short-term fixes.

The role

As a Senior Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our production systems. You will work closely with Backend, Frontend, and Product teams to:
- design resilient architectures
- define reliability standards
- improve observability and incident response
- reduce operational toil through automation

This is not a pure ops role: you will contribute to codebases, collaborate on system design, and help evolve our engineering culture toward SRE best practices.

What you will do

- Design, implement, and maintain reliable and scalable cloud infrastructure
- Define and evolve SLIs, SLOs, and error budgets
- Improve monitoring, alerting, and observability across services
- Lead and participate in incident response, post-mortems, and root-cause analysis
- Automate repetitive operational tasks to reduce toil
- Collaborate with Backend engineers on service design, scalability, and failure modes
- Improve CI/CD pipelines, deployment strategies, and release safety
- Contribute to infrastructure as code and platform tooling
- Act as a reliability advocate across the engineering organization

Tech stack

- Cloud: Google Cloud Platform (preferred), AWS
- Containers & orchestration: Docker, Kubernetes (GKE)
- Infrastructure as Code: Terraform
- CI/CD: GitLab CI/CD
- Observability: Cloud Monitoring, Logging, Prometheus, Grafana
- Languages: Go, Python, Bash
- Networking & security: IAM, VPCs, service accounts, secrets management

What we expect from a senior SRE

- Strong experience running production systems at scale
- Solid understanding of distributed systems and failure modes
- Proven experience with SLO-driven reliability
- Strong coding skills
- Cloud infrastructure automation experience
- Ability to debug complex cross-system issues
- Ownership mindset and strong communication skills
- Pragmatic approach to reliability, speed, and cost trade-offs

Working model

- Flexible working hours
- Remote-friendly setup
- Small autonomous teams
- Direct collaboration with product and leadership