Job Description
Site Reliability Engineer (L5) - Security Engineering
The Role
Netflix has changed how people watch shows and movies, enabling on-demand access to thousands of titles. Recently, Netflix has expanded its entertainment offering to include Live content, like the Tom Brady Roast, the SAG Awards ceremony, WWE Live events, and NFL Christmas games. Bringing stories in real time to 260+ million viewers around the world is a hard challenge, and it sets a high bar for operational observability into key health metrics across dozens of services and systems between camera and device screens.
The Security Engineering organization ensures that Netflix can address top security risks while maintaining overall business agility, velocity, and scale.
We are expanding our investments in site reliability engineering through an embedded model in each of the following three security functions:
Identity and Authentication Security: The Identity and Authentication Security team creates and operates identity and authentication software services for Netflix. Our workforce and partners are global, and they access our information with different patterns from a variety of locations and devices. Our IAM use cases continue to become more complex as we increase investment in Netflix Originals content production ecosystems, including Gaming and Live streaming events. If you are curious about some of the team's work, you can watch the "Building Identity for an Open Perimeter" conference talk by our IAM engineers.
Platform Security: The Platform Security Team designs, creates, deploys, operates, and maintains some of the most critical security services at Netflix. Our projects include cryptography and key management, establishing identities for microservices, protecting secrets in code, issuing and managing certificates, and more. Our team is highly technical with strong backgrounds in both security and distributed systems software engineering.
Trust & Safety: The Trust & Safety team is a central security team that sets a broad strategic vision in the trust and safety space, focused on various types of fraud and abuse. We execute that vision through both technical work on the team and cross-functional partnerships, addressing problems related to account takeover, service abuse, customer-targeted scams, DDoS, content theft, and more. Our team builds large-scale services fueled by data-driven insights for automated detection and mitigation of fraud at scale. With Netflix expanding to new product verticals such as Live, we have increased focus on ensuring high availability of Netflix infrastructure by mitigating availability risks, such as DDoS attacks, to provide a consistent viewing experience to our customers.
The Role
This role is a unique opportunity to be embedded in one of the above security teams to drive improvements in the reliability and resiliency of critical Netflix infrastructure, thus supporting Netflix business growth in areas including but not limited to Live streaming events, Gaming, and Ads! You will be responsible for owning and advancing the operational posture of these critical security systems at Netflix. In this role, you will participate in an on-call rotation to support security teams during Live events, and you will need to work flexible hours based on the Live events schedule. Collaborating with cross-functional teams, you will focus on implementing best practices, automation, and proactive measures to enhance the reliability of our systems. You will play a pivotal role in building robust service monitoring and observability and in reducing the burden of human effort with tooling and automation.
This role is rewarding for people who have a passion for leveraging the right technology to solve business problems. We are seeking individuals with exceptional technical skills and experience in analyzing and resolving failures in distributed systems. If this excites you, we invite you to bring your unique career and life experiences to enrich the culture and diversity of our team.
What you'll need to be successful:
• You are a pragmatic engineer with a proven track record of designing, implementing, and operating scalable and reliable infrastructure to support critical services.
• You think proactively, identify sources of instability in distributed systems, and analyze how complex systems fail from a reliability and resilience perspective.
• You have a proven track record of providing exceptional support through on-call and incident management for business-critical, time-sensitive operations.
• You are an expert at developing automation tools for monitoring, deployment, and incident response to ensure efficient and reliable operations.
• You have developed, documented and maintained robust disaster recovery and business continuity plans for complex services built using distributed architecture.
• You conduct or participate in capacity planning, performance analysis, and system tuning to optimize system reliability.
• You are proficient in modern programming and scripting languages such as Java, Python, or JavaScript/Node.js.
• You are an excellent collaborator who can partner with security and software engineers, product managers and technical program managers to integrate reliability considerations into the entire software development lifecycle.
• You possess excellent verbal and written communication skills and can influence and educate your peers through effective knowledge sharing, SLA documentation and leading blame-aware incident reviews.
Nice to have:
• Experience with one of the three security functions:
    • IAM fundamentals - AAA (Authentication, Authorization, Accountability), and Identity lifecycle.
    • Cryptographic key management, workload identity, and public key infrastructure.
    • DDoS attack analysis and remediation.
• Strong working knowledge of networking concepts and application protocols, especially TCP/IP, BGP, DNS, TLS, and HTTP/S.
Compensation:
Generally, our compensation structure consists solely of an annual salary; we do not have bonuses. You choose each year how much of your compensation you want in salary versus stock options. To determine your personal top of market compensation, we rely on market indicators and consider your specific job family, background, skills, and experience to determine your compensation in the market range. The range for this role is $100,000 - $720,000.
Benefits:
Netflix provides comprehensive benefits including Health Plans, Mental Health support, a 401(k) Retirement Plan with employer match, Stock Option Program, Disability Programs, Health Savings and Flexible Spending Accounts, Family-forming benefits, and Life and Serious Injury Benefits. We also offer paid leave of absence programs. Full-time hourly employees accrue 35 days of paid time off annually, to be used for vacation, holidays, and sick time. Full-time salaried employees are immediately entitled to flexible time off. See more detail about our Benefits here.
Culture:
Netflix has a unique culture and environment. Learn more here.
We are an equal-opportunity employer and celebrate diversity, recognizing that diversity of thought and background builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.
This job will remain open for no less than 7 days and will be removed when the position is filled.
Jobcode: Reference SBJ-rnqw51-18-117-75-218-42 in your application.