Requisition ID: 208518
Tangerine is Canada's leading direct bank. We offer flexible and accessible banking options, innovative products, and award-winning Client service. The reason why Tangerine employees come to work each day is to help Canadians live better lives. We focus on making a difference in our communities, and that includes our own internal community. It's important to us that our employees feel empowered and enthusiastic about belonging to our Orange culture.
SRE & Production Application Support is responsible for providing technical expertise to resolve application and infrastructure technology issues on medium to highly complex projects in compliance with service standards, policies, and procedures. The role troubleshoots complex issues for all levels, evaluates service requests for adherence to service standards and processes, and identifies inconsistencies to determine potential impact and take appropriate action. The role exercises independent judgement to analyze and resolve production application problems, identify root causes, and define actions to eliminate recurrence. The role coordinates and executes project plans using control procedures to implement application and maintenance updates. In addition, the role monitors and analyzes supported services and deployment methodologies to identify opportunities for improvement and recommend solutions. The role devises new methods and procedures using strong analytic and inductive thinking.
Is this role right for you? In this role, you will:
-
You'll be joining Tangerine's SRE & Production Support team.
-
You'll maintain production applications and day-to-day operational activities, manage escalations, and modify established procedures and approaches to suit specific situations, including 24x7 support and coordination of recovery efforts.
-
You will run the production environment by monitoring availability and taking a holistic view of system health.
-
Lead daily team huddles, assign incidents, and ensure timely closure of all customer escalations and problems.
-
Coach and monitor the team, and help them resolve complex or critical production incidents and problems.
-
You'll investigate and provide second-level support for client issues, technical issues, system and website outages, and questions about all internal and external applications by maintaining, prioritizing, and directing them to the respective Tangerine technology groups and vendors.
-
Lead on-call problem escalation and outage recovery efforts, including not only code fixes in the presentation and integration layers but also infrastructure-level investigation and support where necessary.
-
Lead post-incident technical retrospectives to discover and implement remediation actions.
-
You will improve the reliability, quality, and time-to-market of our suite of software solutions.
-
Measure and optimize system performance to push our capabilities forward, get ahead of customer needs, and innovate to improve continually.
-
Participate in defining SLIs, SLOs and SLAs for Enterprise Systems.
-
Gather and analyze metrics from both applications and infrastructure to assist in performance tuning and fault finding.
-
Partner with development teams to improve services through rigorous testing and release procedures.
-
Create sustainable systems and services through automation and process improvements.
-
Monitor the health of multiple applications and discover optimization opportunities in a large, complex, continuously growing hybrid environment.
Do you have the skills that will enable you to succeed in this role? We'd love to work with you if you have:
-
A self-motivated, autonomous approach and the ability to work as a team player in a fast-paced environment.
-
Good understanding of networking concepts: TCP/IP, DNS, HTTP, TLS, OSI Model.
-
Good understanding of multi-tier applications and microservices (Docker, Kubernetes, etc.).
-
Experience instrumenting and monitoring cloud-hosted software stacks (preferably GCP).
-
Working knowledge of one or more programming languages (Java, Node.js, Python, etc.).
-
Basic knowledge of one or more scripting or infrastructure-as-code languages (Terraform, Bash, etc.).
-
4-5 years of experience in developing and/or supporting complex, large-scale customer-facing platforms.
-
Strong working experience with incident management and setting up monitoring alerts.
-
Proficient understanding of code versioning tools such as Git and Bitbucket.
-
Knowledge of building a highly automated production monitoring and support model, with hands-on experience integrating Splunk, Ansible, Dynatrace, Sumo Logic, ServiceNow, PagerDuty, or equivalents.
-
Proven ability to translate ideas into technical and business realities and map technology to business problems.
-
Experience with private/public cloud services and platforms.
-
Superior verbal and written communication skills with the ability to influence decision-making with stakeholders.
-
A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
-
Exceptional written and verbal communication skills
-
Excellent problem-solving skills
-
Flexible approach to work and the ability to adapt to change
-
Prior production support or SRE experience.
-
Proficiency with the Microsoft (MS) suite.
What's in it for you?
-
Diversity, Equity, Inclusion & Allyship - We strive to create an inclusive culture where every employee is empowered to reach their fullest potential, respected for who they are, and embraced through bias-free practices and inclusive values across Scotiabank. We embrace diversity and provide opportunities for all employees to learn, grow & participate through our various Employee Resource Groups (ERGs) that span diverse gender identities, ethnicities, races, ages, abilities & veterans.
-
Accessibility and Workplace Accommodations - We value the unique skills and experiences each individual brings to the Bank, and are committed to creating and maintaining an inclusive and accessible environment for everyone. Scotiabank continues to locate, remove and prevent barriers so that we can build a diverse and inclusive environment while meeting accessibility requirements.
-
Upskilling through online courses, cross-functional development opportunities, and tuition assistance.
-
Competitive Rewards program including bonus, flexible vacation, personal and sick days, and benefits that start on day one.
-
Community Engagement - No matter where you choose to work from, we offer opportunities for community engagement & belonging through our various programs such as hackathons, contests, cooking with friends, Humans of Digital and much more!
Work arrangements: Hybrid
Location(s): Canada : Ontario : Toronto
At Tangerine we value the unique skills and experiences each individual brings to the team, and are committed to creating and maintaining an inclusive and accessible environment. If you require accommodation during the recruitment and selection process, please let our Recruitment team know.