Job Description
• Process structured and unstructured data into a form suitable for analysis and reporting in analytics and machine learning environments.
• Leverage Databricks features to operationalize data science models and products in a cluster-computing environment, understanding requirements and developing appropriate feature engineering to deliver high-performing models.
• Enhance current Feature Store capabilities.
• Establish machine learning best practices for model management using MLflow, model assessment through experiment registration, and model promotion to production via Unity Catalog.
• Build data pipeline frameworks to automate high-volume and real-time data delivery.
• Create workflows and Delta Live Tables to maintain batch and streaming processes.
• Implement a dashboard suite dedicated to the internal model QA process.
• Containerize and deploy models at scale in an AWS environment to support third-party requests.
• Leverage Retool, a low-code, component-based building platform, to fast-track the deployment of fully scalable application support to Media Group and clients.
• Create Git version-controlled applications promoted through lower and production environments.
• Provide clear data engineering technical leadership, mentoring, and best practices for data management and quality within and across teams.
• Manage projects, vendors, partners, and contractors as needed.
• Partner with various NBCU Technology teams in the design and execution of an overall Corporate Data Syndication Strategy for the current Data Mart.
• Improve automation of various processes in a distributed computing environment, recommend schema improvements, and help adjust queries and job orchestration.
This position has been designated as hybrid, generally contributing from the office in New York, NY, a minimum of three days per week.
Qualifications
• Master's degree in Computer Science, Data Science, Data Analytics, Business Intelligence/Analytics, or related quantitative field (or foreign degree equivalent), plus three (3) years of experience in the job offered, in a Data Science/Data Engineering role focused on large amounts of structured and unstructured data processing, or in a related role.
• In the alternative, the employer will accept a Bachelor's degree in Computer Science, Data Science, Data Analytics, Business Intelligence/Analytics, or related quantitative field (or foreign degree equivalent), plus five (5) years of experience in the job offered, in a Data Science/Data Engineering role focused on large amounts of structured and unstructured data processing, or in a related role.
Employer will accept any suitable combination of education, training, or experience.
The position requires each of the following skills, which must have been gained through three (3) years of experience:
• Processing large amounts of structured and unstructured data in a cluster-computing environment;
• Python, R, MLflow, Spark ML, scikit-learn, SQL, JavaScript, and Linux commands;
• Writing reusable and efficient code to automate analyses and data processes;
• Leveraging big data technologies such as Spark, Airflow, and Docker;
• Accessing, maintaining, and supporting applications deployed in a Unix environment, as well as data warehouse environments;
• Building and maintaining production-grade data pipelines;
• Working with Databricks, deploying fully automated workflows with Feature Engineering and MLflow integration in Unity Catalog in a production environment;
• Creating Retool applications for ML model assessment and large-scale BI reporting tools.
This position is eligible for company sponsored benefits, including medical, dental and vision insurance, 401(k), paid leave, tuition reimbursement, and a variety of other discounts and perks. Learn more about the benefits offered by NBCUniversal by visiting the Benefits page of the Careers website.
Salary range: $170,373 - $171,000 per year
Full-time: 40 hours/week
Jobcode: Reference SBJ-ne6e1k-3-149-239-87-42 in your application.