Machine Learning Engineer III

ZoomInfo Technologies
On-site
Vancouver, Washington, United States
Software Engineer

About ZoomInfo

ZoomInfo is building the next-generation go-to-market platform using high-quality GTM data, agentic workflows, and a robust intelligence layer to give sales, marketing, and revenue operations teams a competitive advantage.

About the Applied AI Team

The Applied AI team builds the intelligence layer that sits between ZoomInfo's high-quality data and the application layer through which customers engage. Using a product-led growth model, this team leverages customer engagement as input to build better recommendations, scoring, classification, and generative models.

 

What you will do:

Foundation Data Quality Enhancement

  • Improve data quality for ZoomInfo's foundation datasets including firmographics, demographics, C-suite profiles, workforce information, titles, skill sets, scoops, intent signals, and web-extracted data
  • Design and implement data validation pipelines and quality metrics to ensure high-fidelity information across millions of records

Embedding and Model Development

  • Build and fine-tune embedding models using large language models (Llama) and small language models (BERT) for various text understanding tasks
  • Develop language-agnostic clustering and classification models using vector search technologies
  • Optimize embedding models for production deployment at petabyte scale

Named Entity Recognition & Data Extraction

  • Build high-recall NER models to extract people, organizations, locations, and industry-specific entities from web-extracted data
  • Develop robust data extraction pipelines that process diverse web content and structure unstructured information

Agentic Workflows & Evaluation

  • Design and implement agentic workflows focused on web extraction, NER, and entity resolution
  • Create comprehensive evaluation frameworks for agent performance and reliability
  • Collaborate on agent optimization and performance tuning

Scalable Production Systems

  • Deploy and maintain ML models serving millions of users daily with sub-second latency requirements
  • Work with engineering teams to ensure models integrate seamlessly into ZoomInfo's platform architecture
  • Monitor model performance, implement automated retraining pipelines, and design cost-aware training and inference workflows
  • Use integrated CI/CD and testing workflows for seamless deployment

Cross-Functional Collaboration & Prototyping

  • Partner with product managers and engineering teams to translate business requirements into ML solutions
  • Prototype and benchmark emerging AI/infra tech
  • Present findings and technical solutions to stakeholders across the organization

 

What you bring:

Experience & Education

  • 3 to 5 years (1+ years post-PhD) of hands-on ML/NLP experience with demonstrated impact on production systems. A master's degree and a background in Computer Science or an allied data science or engineering discipline are preferred.
  • Strong background in transformer architectures, embedding models, and vector search technologies
  • Experience with named entity recognition, summarization, and data extraction at scale is a plus

Technical Skills

  • Proficiency in PyTorch or TensorFlow for model development and fine-tuning
  • Experience with vector databases (Pinecone, Weaviate, FAISS, OpenSearch) and hybrid retrieval systems
  • Strong software engineering skills in Python; familiarity with Go/Java is a plus
  • Knowledge of MLOps tools: Docker, Kubernetes, GitOps, feature stores, model registries

Applied AI Expertise

  • Hands-on experience with LLM fine-tuning techniques (LoRA, quantization, distillation) is a plus
  • Understanding of agentic workflows and multi-agent systems
  • Experience building language-agnostic ML solutions and cross-lingual models
  • Knowledge of entity resolution and knowledge graph concepts

Collaboration & Communication

  • Ability to work effectively in cross-functional teams and communicate technical concepts to non-technical stakeholders
  • Experience mentoring junior team members and contributing to team knowledge sharing
  • Strong problem-solving skills and ability to work independently with guidance from team leads

Preferred Qualifications

  • Experience processing large-scale unstructured data
  • Background in information retrieval and search systems
  • Familiarity with MLOps concepts, A/B testing, and experimental design for ML systems
  • Knowledge of data quality frameworks and validation methodologies

 

#LI-SK

#LI-Hybrid