23

March

Senior Site Reliability Engineer

Private Company - Ultimo, NSW

IT
Source: uWorkin

JOB DESCRIPTION

MISSION “THE WHY”

As the Lead Site Reliability Engineer (SRE), you will be working closely with your peers, product, design and leadership to foster a culture of resilience across the Deputy platform, knowing that the work you do enables millions of users worldwide.

You have a knack for fostering a culture of good, knowing that what is seen isn’t everything. You’re able to give every team the tooling and capabilities they need to understand how their work contributes to all aspects of the Deputy platform, and how they can improve it.

RESPONSIBILITIES “THE WHAT”

    • Embed and improve the use of monitoring and logging tools for comprehensive observability.
    • Align with the Engineering and Product teams to improve their ability to manage their own platform components, and implement software-based solutions to increase observability and automatically react to adverse situations.
    • Contribute to the incident management and on-call rotations across the Deputy platform and infrastructure.
    • Engage with product engineering squads to embed a culture of resilience up front, ensuring that reliability is a first-class citizen during the software development process.
    • Own the production readiness, monitoring, and capacity planning across the Deputy platform.

WHO YOU ARE “THE HOW”

    • Pragmatic engineer who can combine software and systems engineering to build and run large-scale, distributed, fault-tolerant systems.
    • Have a solid understanding of the nature of and common problems with distributed systems, especially when it comes to architecting, maintaining, and debugging them.
    • Capable of working with multiple programming languages for multiple purposes. We primarily use Go, PHP, JS, and Python, with Terraform for our infrastructure as code. You have knowledge of an array of cloud native technologies, such as HashiCorp Vault, OpenTelemetry, OpenAPI, and gRPC.
    • Comfortable collaborating with both technical and non-technical stakeholders.
    • Strong grasp of, and comfort with, containers and various virtualisation technologies, primarily Docker, ECS/ECR, and Kubernetes (EKS).
    • Strong understanding of, and exposure to, cloud environments, primarily AWS.
    • Good understanding of, and experience with, various storage and caching technologies and their influence on reliability within a complex system. We use Aurora MySQL, Redis, and RDS MySQL.