Senior Data Scientist, Identity Graph
About this role
About impact.com
impact.com is the world’s leading commerce partnership marketing platform, transforming the way businesses grow by enabling them to discover, manage, and scale partnerships across the entire customer journey. From affiliates and influencers to content publishers, brand ambassadors, and customer advocates, impact.com empowers brands to drive trusted, performance-based growth through authentic relationships. Its award-winning products - Performance (affiliate), Creator (influencer), and Advocate (customer referral) - unify every type of partner into one integrated platform. As consumers increasingly rely on recommendations from people and communities they trust, impact.com helps brands show up where it matters most. Today, over 5,000 global brands - including Walmart, Uber, Shopify, Lenovo, L’Oréal, and Fanatics - rely on impact.com to power more than 350,000 partnerships that deliver measurable business results.
Your Role at impact.com:
We're looking for a Senior Data Scientist with deep expertise in identity graph construction and resolution to lead a critical modernization of Impact's identity graph infrastructure. This is a technically demanding individual contributor role — you'll own the diagnostic, architectural, and implementation work required to understand our current identity graph end-to-end, identify its weaknesses, and drive meaningful, measurable improvements.
You'll start by mapping and deeply understanding the existing pipeline, then move quickly to identify and execute near-term wins while simultaneously building the research and testing infrastructure needed to evaluate next-generation identity resolution approaches. A significant part of the role involves detecting, filtering, and eliminating bad data that degrades graph quality — requiring both investigative rigor and strong engineering capability.
You'll also serve as Impact's technical counterpart with external identity and measurement partners — designing and running structured POCs, evaluating vendor solutions rigorously, and producing data-driven recommendations on build-vs-buy decisions that shape the long-term identity strategy.
What You'll Do:
Core Responsibilities
Identity Graph Audit & Documentation
• Conduct a thorough, end-to-end mapping of Impact's current identity graph construction pipeline — data sources, entity linking logic, resolution rules, graph schema, and downstream consumers.
• Document the architecture clearly and completely, creating a shared foundation of understanding for Data Science, Engineering, and Product stakeholders.
• Identify structural weaknesses, coverage gaps, resolution failures, and data quality issues that degrade graph accuracy and completeness.
• Produce a prioritized inventory of problems and opportunities, distinguishing quick wins from longer-horizon architectural improvements.
Quick Wins & Iterative Improvement
• Identify and execute high-leverage, near-term improvements to the identity graph, targeting resolution accuracy, coverage, and data freshness, without requiring a full architectural overhaul.
• Implement incremental enhancements to existing matching and linking logic; measure impact rigorously and communicate results to stakeholders.
• Build the habit of continuous improvement into the graph pipeline: monitoring, alerting, and iterative refinement as an ongoing practice rather than a one-time effort.
Bad Data Detection & Remediation
• Research, design, and implement methods to systematically identify, filter, and eliminate bad data in the identity graph — including corrupted identifiers, ghost entities, erroneous links, and stale or conflicting records.
• Build detection pipelines that surface data quality issues proactively; define quality thresholds and SLOs for graph health.
• Establish feedback loops that catch new bad data patterns as they emerge, preventing quality degradation over time.
Testing Environment & Research Infrastructure
• Develop a robust testing and experimentation environment for evaluating cutting-edge identity resolution techniques — including probabilistic matching, deterministic linking, graph-based entity resolution, and ML-based approaches.
• Design evaluation frameworks with clear, reproducible metrics for resolution precision, recall, coverage, and graph coherence.
• Research and prototype emerging methods from the academic and industry literature; validate results rigorously before recommending adoption.
• Maintain a living research agenda that tracks the state of the art in identity resolution and surfaces relevant advances to the team.
External Partner Evaluation & Build-vs-Buy
• Engage directly with external identity and measurement partners to scope and execute structured POCs — evaluating vendor technologies against Impact's specific identity graph requirements.
• Design rigorous, data-driven evaluation frameworks for vendor assessments; produce clear, evidence-based build-vs-buy recommendations for leadership.
• Serve as Impact's technical counterpart in partner conversations — understanding vendor architectures deeply, asking sharp questions, and stress-testing vendor claims against real data.
• Stay current on the external landscape for identity resolution, data enrichment, and measurement vendors; bring relevant developments to the team proactively.
Production Deployment & Engineering Collaboration
• Take improvements and new capabilities from research and POC to production, independently or in close partnership with Data Engineering — owning deployment, testing, monitoring, and iteration.
• Write clean, well-tested, production-grade code; build pipelines that are maintainable and observable by the broader team.
• Collaborate with MLOps and Platform Engineering to ensure production readiness: reliability, scalability, latency, and drift monitoring.
Insights & Stakeholder Communication
• Translate complex identity graph findings — audits, quality analyses, vendor evaluations — into clear, actionable narratives for technical and non-technical audiences.
• Present findings, tradeoffs, and recommendations in planning and leadership forums; communicate uncertainty and risk honestly.
• Contribute to documentation and knowledge sharing that makes the identity graph understandable and trustworthy across the organization.
What You Bring:
Required
• Experience: 5+ years in data science, ML engineering, or applied research, with significant hands-on experience in identity graph construction, entity resolution, or large-scale identity matching in a production environment.
• Identity graph expertise: Deep, firsthand knowledge of how identity graphs are built, maintained, and evaluated — including deterministic and probabilistic linking, entity deduplication, graph schema design, and resolution at scale.
• Data quality & bad data detection: Demonstrated experience diagnosing and remediating data quality issues in large, complex datasets — identifying corrupted records, erroneous links, and systemic pipeline failures.
• Engineering strength: Proven ability to build and deploy production-grade pipelines independently; strong Python and SQL; solid software engineering fundamentals (testing, version control, observability).
• Research rigor: Ability to design reproducible experiments, define meaningful evaluation metrics, and validate results before recommending adoption — in both internal and vendor evaluation contexts.
• Analytical communication: Ability to translate complex technical findings into clear strategic recommendations; experience presenting to cross-functional and leadership audiences.
• Education: Bachelor's in a quantitative field (CS, Statistics, Math, Engineering, or similar); Master's/PhD preferred.
Preferred / Nice to Have
• Experience with graph databases and graph analytics (Neo4j, NetworkX, or similar) applied to identity or entity data.
• Familiarity with ML-based entity resolution approaches: blocking strategies, embedding-based similarity, probabilistic matching, and hierarchical clustering.
• Experience evaluating third-party identity resolution or data enrichment vendors; familiarity with the external identity and measurement partner landscape.
• Exposure to affiliate marketing, ad tech, or performance marketing data environments and the identity challenges specific to those ecosystems.
• Experience with real-time or near-real-time identity resolution and low-latency graph querying.
• Familiarity with GCP tools (BigQuery, Vertex AI, Dataflow, Cloud Run) and/or Databricks/Spark for large-scale data processing.
• Knowledge of privacy-preserving identity techniques (differential privacy, hashing, clean rooms) and their implications for graph construction.
• Experience with device fingerprinting, cross-device identity, or cookie-less identity resolution approaches.
What Sets You Apart
• Graph intuition. You think naturally in entities, edges, and resolution logic. You understand how small errors in linking compound into large downstream failures — and you build systems that catch them early.
• Detective instincts. You love hunting bad data. You don't accept anomalies at face value — you dig until you understand the root cause, whether it's a pipeline bug, a vendor data issue, or a fundamental schema flaw.
• Vendor skepticism. When evaluating external solutions, you design rigorous tests, ask hard questions, and let data — not demos — drive your recommendation.
• Research-to-production fluency. You move comfortably between prototyping new resolution techniques and shipping them into a production pipeline. You know what "production ready" means and hold yourself to that standard.
• Clear under complexity. Identity graphs are messy and hard to explain. You can articulate what the graph represents, where it breaks down, and what fixing it is worth — to an engineer, a product manager, or a senior leader.
• Ownership mindset. You treat the graph like it's yours — because it will be. You don't wait to be asked to find the next problem; you're already looking.
Salary Range: $100,000 - $125,000 per year, plus an additional 5% variable annual bonus contingent on Company performance, and eligibility to receive a Restricted Stock Unit (RSU) grant.
*This is the pay range the Company believes is equitable for this position at the time of this posting. Consistent with applicable law, compensation will be determined based on the skills, qualifications, and experience of the applicant along with the requirements of the position, and the Company reserves the right to modify this pay range at any time.
Benefits and Perks:
At impact.com, we believe that when you’re happy and fulfilled, you do your best work. That’s why we’ve built a benefits package that supports your well-being, growth, and work-life balance.
• Medical, Dental, and Vision insurance
• Catered lunch every Thursday (in office only), a healthy snack bar, and great coffee to keep you fueled
• Flexible spending accounts and 401(k)
• Flexible Working: Our Responsible PTO policy means you can take the time off you need to rest and recharge. We're committed to a positive work-life balance and provide a flexible environment that allows you to be happy and fulfilled in both your career and your personal life.
• Health and Wellness: Your well-being is a priority. Our mental health and wellness benefit includes up to 12 fully covered therapy/coaching sessions per year, with additional dependent coverage. We also offer a monthly gym reimbursement policy to support your physical health.
• A Stake in Our Growth: We offer Restricted Stock Units (RSUs) as part of our total compensation, giving you a stake in the company's growth with a 3-year vesting schedule, pending Board approval.
• Investing in Your Growth: We’re committed to your continuous learning. Take advantage of our free Coursera subscription and our PXA courses.
• Parental Support: We offer a generous parental leave policy: 26 weeks of fully paid leave for the primary caregiver and 13 weeks of fully paid leave for the secondary caregiver.
• Technology Financial Support: We provide a technology stipend to help you set up your home office and a monthly allowance to cover your internet expenses.
impact.com is proud to be an equal-opportunity workplace. All employees and applicants for employment shall be given fair treatment and equal employment opportunity regardless of their race, ethnicity or ancestry, color or caste, religion or belief, age, sex (including gender identity, gender reassignment, sexual orientation, pregnancy/maternity), national origin, weight, neurodivergence, disability, marital and civil partnership status, caregiving status, veteran status, genetic information, political affiliation, or other prohibited non-merit factors.