Date Posted: 7/20/2021
Remote Site Reliability Engineer (SRE)
Our client is looking for a Senior Site Reliability Engineer to join its Cloud Operations Team. Site Reliability Engineering (SRE), pioneered by Google, is an alternative to the traditional split between IT Operations and Product Development teams. SRE is driven by Software Engineers running Cloud Operations. Our SRE mission is to protect and improve the software and systems of Springbrook, with an emphasis on security, availability, performance, and capacity. It is DevOps with a software engineering focus.
As with traditional Operations groups, we keep important systems up and running despite all sources of disruption. As a member of the SRE team, you will have the opportunity to tackle the complex problems of scale and availability while using your expertise in coding, algorithms, complexity analysis, and system design.
* Design, develop, and implement software to improve the availability, scalability, latency, and efficiency of Springbrook's software systems. Scale solutions to the business need
* Lead problem solving for critical services and build automation to prevent future recurrence. Drive response automation for all non-critical service conditions to increase productivity through decreased operational load
* Drive the creation and adoption of new designs, architectures, standards, guidelines and approaches for software development
* Be responsible for ensuring that security is built into the design and development of services produced by the team
* Perform full system analysis of software performance, in addition to capacity planning and demand forecasting
* Conduct operations support, including the execution of software releases, production data updates, and OS patches, and use system expertise to answer user questions about system function
* Identify and implement KPIs to measure success of our services
* Serve as a leader and mentor to the other engineers
* Prioritize and manage work, adhering to critical project timelines in a fast-paced environment
* Participate in an on-call rotation for issues that occur after business hours
* Participate in incident response teams for service interruptions or security incidents
* Assist in compliance initiatives (PCI, SOC, NIST, etc.)
* Five (5) years' experience in supporting cloud native applications and infrastructure in Azure
* Degree in Computer Science/Information Systems or a related field; or an additional two (2) years of equivalent work experience
* Thorough knowledge of Azure services, including the security and compliance services, and experience using Azure resources such as VNets, Resource Groups, App Services, Functions, Azure VMs, NSGs, and RBAC
* Experience using Terraform, Packer, Desired State Configuration and Blueprints to deliver complex infrastructure across Azure
* Experience using PowerShell for scripting and automation
* Experience operating and maintaining Windows Server environments
* Understanding of testing principles in the context of IaC
* Experience with MS SQL Server and other Azure databases, SQL, stored procedures, backup and restore, and database design
* Experience with and understanding of software source control systems, preferably Git
* Knowledge of build systems and software integration systems
* Strong and demonstrable experience working in continuous integration and continuous deployment (CI/CD) systems such as TeamCity, Azure DevOps, GitHub Actions, and Travis CI
* Experience with DevOps tooling
* Strong attention to detail, meticulous documentation, and repeatable process design
* Self-starter who can collaborate with others in a cross-functional team or work independently
* Excellent verbal and written communication skills
To apply, please email your resume to email@example.com