Site Reliability Engineer

Job ID: 2747
City: Atlanta
State: Georgia
Remote Policy: 100% remote is fine

Reason this role is for you:
Our client has a data-driven analytics platform and is looking for a Site Reliability Engineer (SRE) to join their Operations team. The SRE will work closely with delivery teams to define SLOs and set up monitoring and alerting for new and existing services.

The Platform Operations team brings a software engineering perspective to delivering quality operations at scale and driving automation in every aspect of the job. We believe that Engineering teams should be able to deploy whatever they need, whenever they need, with confidence. This role can be remote.

Role Expectations:

  • Deliver incredible products. We’re a product-first company, and we aren’t satisfied until we’ve built a product so clearly superior that customers have no reason to consider anything else.

  • Shared responsibility. We collaborate with Product Managers and Engineers, as equals, to provide optimal production services to our customers.

  • Be prepared. Proactively identify and prevent incidents through monitoring, automation, self-healing and resiliency initiatives, destructive testing, and game day exercises.

  • Be a culture champion. As a DevOps evangelist, you’ll guide our application teams towards solid technical decisions, share knowledge and expertise, help colleagues improve their designs, and ensure the applications perform well in production.

  • Communicate well. You’ll explain your work clearly to team members, and seek feedback to build a codebase we all enjoy contributing to. You’ll keep engineering leadership apprised of important developments on your team, and areas needing attention when appropriate.

  • Be a good teammate. You’ll be helpful, open-minded, respectful, and collaborative. You’ll support your teammates and challenge them to do their best work.

  • Learn. Regardless of your level of experience or seniority, you’ll work to improve your skills and learn more about our customers and their needs.

Required Skills:

  • Designing effective monitoring, alerting, and log aggregation solutions using tools including AWS CloudWatch and APM tools (Datadog, New Relic, AppDynamics, etc.)

  • Experience defining and measuring internal and customer-facing SLOs/SLAs

  • Experience with patch management, CVE remediation, and security incident management

  • Provisioning and configuration of AWS services using AWS CLI / API and Terraform

  • Familiarity with configuration management tools (Ansible, Puppet, Chef, etc.)

  • Familiarity with software-as-a-service platforms, particularly for small and medium-sized businesses

  • Developing and maintaining software in environments subject to security and regulatory compliance (PCI-DSS, HIPAA, SOC 2, etc.)

  • Building scripts and tools using Ruby, Go, Bash, etc.

  • Refactoring systems to perform well at scale while still being readable and easy to maintain

  • Reviewing code in a way that empowers your teammates while improving our codebase

Technologies and tools we use:

  • Ansible, Terraform, SemaphoreCI

  • AWS EC2, S3, EKS, RDS, DynamoDB, Redshift

  • Python, Go, Ruby, Rails, Sidekiq, and RSpec

  • Kubernetes, Helm, and Docker

  • Datadog (APM, Monitoring), AWS CloudWatch, and PagerDuty

  • Angular, JavaScript, TypeScript, and Ionic

  • Postgres, Redis, RabbitMQ, and Elasticsearch

Benefits:


  • 100% medical coverage for employees

  • Competitive HSA with company matching

  • Generous dental and vision plans

  • Paid parental leave

  • Flexible vacation policy

  • 401(k) options with dollar-for-dollar company match

  • Employee stock options

  • $2,000 annual educational allowance

  • Catered lunch every Tuesday (an in-office perk)

  • MARTA transportation and office parking expenses covered

  • Employee charitable donation company match, up to $500 annually

  • Regular company outings and events (yes, even during COVID, though held virtually)

  • Remote work-from-home options with a $500 stipend to set up your home office
