Operations / Site Reliability Engineers

IT - Systems Integration
Toronto, ON
Contract to Perm
Jan 18, 2020

Contract to permanent role

Our client....

One of North America's most recognized brand names continues to pursue new development and transformational initiatives. Work as part of a highly collaborative team.


What’s in it for you

Take a lead role in a technologically advancing company. Learn new things in a continually expanding technical environment. Great work culture.


  • You are a Developer / Analyst with a mix of skills in software (i.e. programming, data structures and algorithms) and systems (i.e. operating software on internal and external infrastructure at scale).
  • Continuously evaluate products and services before and after production releases to prevent, identify and fix problems that affect service availability when deploying, configuring, releasing, monitoring, recovering and scaling.
  • Participate in on-call rotations to monitor and support our products and services, taking recovery actions prior to and after disruptions
  • Continuously improve automation – write code, write tests, analyze outcomes
  • Lead the recovery of incidents and run retrospectives
  • Ensure Unix/Linux standards, processes, procedures and security best practices are implemented, efficient, effective and documented


Must Have Skills:

    • 5+ years of experience writing clean code in at least one modern system or scripting language such as Java, C#, JavaScript, Python or Ruby (Java, Python and Angular preferred)
    • Build and support independent web-based tools, microservices and solutions
    • Write scripts and automation using Golang/Perl/Ruby/Python/Groovy/Java/Bash/etc.
    • Manage source control, including SVN and/or Git
  • Operations / System Administration
    • 5+ years of expertise in UNIX/Linux-based system architecture and administration
    • Expertise with databases (relational and NoSQL)
    • Experience with web servers (Apache, IIS)
    • Experience with networking and storage technologies.
    • Experience with Cloud technologies and platforms such as Google, AWS or Azure. 
    • 24/7 on-call support
    • Disaster recovery planning and experience building high availability systems
    • Working knowledge of complex web hosting configuration components, including firewalls, load balancers, web servers, application servers and databases
    • Understand how various systems work and their dependencies
    • Configure and manage data sources like Kafka, PostgreSQL, Memcache, Redis, Splunk, Elasticsearch, etc.
    • Cloud / CI/CD / DevOps
      • Experience automating software build, deployment and server configuration management, preferably using tools such as Puppet, Bamboo and Jenkins.
      • Experience with CI/CD systems and pipelines
      • Must have some experience with containers, scaling and managing a cloud platform
      • Install, secure, monitor, manage and maintain Unix/Linux servers at scale using Puppet, Ansible and Kubernetes
    • Experience with, and a preference for, working in an Agile environment
    • Experience with Jira, Confluence and other Atlassian products
