Monday, October 18, 2021

Site Reliability Engineering Manager – Staples Job Application

Website: Staples

Job Description:

As Manager of the Ecommerce Site Reliability Engineering (SRE) team at Staples, you will help bridge the gap between development and operations by bringing a software engineering mindset to operational aspects such as monitoring, performance and capacity planning, and disaster response. Under the SRE Manager’s guidance, the SRE team will collaborate closely with product developers to ensure that applications include non-functional requirements such as availability, performance, security, and maintainability. The team will also partner with release engineers to ensure the efficiency of the software delivery pipeline.

Job Responsibilities:

  • Maintain a focus on engineering work as opposed to operations work
  • Manage critical production incidents and drive to resolution
  • Collaborate with stakeholders to define and implement service level objectives (SLOs) that help make effective decisions
  • Create and maintain robust knowledge documentation in Confluence
  • Measure and drive elimination of toil (manual, repetitive, and/or reactive work)
  • Build and sustain strong relationships with application development and testing teams
  • Conduct blameless post-mortems to understand failures or service outages
  • Use software engineering approaches to solve operations problems
  • Drive reductions in mean time to detect (MTTD) and mean time to repair (MTTR)

Qualification & Experience:

  • Bachelor’s degree in Computer Science or a related field with continuous and progressive experience
  • Solid understanding of networking concepts and technologies (DNS, load balancing, firewalls, etc.)
  • Experience supporting and troubleshooting infrastructure, operating systems (Linux, Windows, Unix, etc.), and applications
  • 7+ years of related experience working with some of these technologies:
  • Log analytics tools (e.g., Splunk, ELK/Elastic, Datadog)
  • APM tools (e.g., New Relic, AppDynamics, Datadog)
  • Infrastructure monitoring tools (e.g., Zabbix, Prometheus, Dynatrace, New Relic)
  • Databases (e.g., MongoDB, Oracle, Couchbase, Redis, MySQL)
  • Frameworks (e.g., Dust/Angular, Node.js, Spring Boot)
  • Public cloud experience preferred
  • Must have strong programming experience in one or more scripting languages or tools (Python, Azure CLI, PowerShell, etc.)
  • Experience with troubleshooting application and platform performance and stability issues
  • Experience with Agile methodology
  • Understanding of Content Distribution Networks (CDNs) preferred

Job Details:

Company: Staples

Vacancy Type: Full Time

Job Location: Framingham, Massachusetts, US

Application Deadline: N/A
