David Jansen van Vuuren

Principal Product Manager · Data & ML Platforms

I build and scale machine learning products. 15+ years across e-commerce, adtech, and fintech — search, recommendations, customer data platforms. Most recently built Irrational Signals end-to-end — an ML trading-signals product — then made the call to retire it.

§ 01 Selected impact

Numbers do the work.

Across roles at Takealot Group, Vicinity Media, and most recently Irrational Signals.
  • +20%
    search customer satisfaction

    Multi-intent ranking framework. Takealot.

  • 3×
    recommendation impression share

    Flexible context-aware API. Takealot.

  • Petabyte
    customer data platform

    Composable CDP, multi-brand, identity resolution. Takealot Group.

  • 100+/hr
    trading signals

    Irrational Signals. US equities · 3 sectors · 6 batches per market day.

  • 80% → 30%
    GeoIP-to-WiFi conversion target

    Micro Networks model. Vicinity Media (proof-of-concept).

  • 15+
    years

    E-commerce, adtech, fintech. MBA. Cape Town.

§ 02 ML Product Framework

A project execution framework for machine learning.

Built at Takealot Group to make the gap between exploration and productionisation explicit. The diamond below is the core visual: ideation expands into multiple candidate paths, then contracts into a single productionised model.

The framework is about shape, not duration. Cadence varies with stack maturity and dataset readiness, and modern tooling (synthetic data generation, LLM-assisted prep, foundation-model baselines) often compresses several stages substantially.

[Diagram: the framework diamond. Phase labels: Expansion, Contraction, Exploration, Commit. Start: unsolved customer pain point. Shipped: alleviated customer pain point. Roadmap milestones: Ideation, Roadmap Consideration, Roadmap Commitment. Deliverables: Proof-of-concept; Customer-validated model. Stages: Generate training data, Candidate Model 1 to Candidate Model n, Iterative Training, Offline POC Evaluation, Finalise Model Architecture, Productionise, Stage Model, A/B Test or Internal Test, Retraining, Model Monitoring.]
Learnings — what changed after running the framework

Running the 6-month framework in practice surfaced two missing phases. The longer 15-month variant adds these as inline additions to the same diamond — not new structure, just acknowledgement that real ML work sometimes needs a dataset detour and a thinking pause.

  1. Learning 01
    Inserted before Generate Training Data

    Dataset Exploration Phase

    Often the bottleneck isn't the model — it's the dataset. Before committing to model training, generate multiple candidate datasets in parallel and pick the winner. Two evaluation paths: assess them via offline modelling against established metrics (when you'll need a model anyway), or — when the use case is simple enough — serve them directly as an A/B test. A caveat worth stating: if direct-serve datasets are sufficient on their own, you probably don't need a model at all.

  2. Learning 02
    Inserted after A/B Test, before retraining

    Investigation Buffer

    The A/B test doesn't always yield a clear answer. Reserve dedicated time after a test to investigate the result — understand the customer behaviour underneath the metric — instead of forcing immediate iteration. This stops cascading dependencies when an A/B test result requires deeper analysis (e.g. the Category Intent rollout, which needed investigation before promotion).
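One way to make the investigation buffer concrete is a decision rule with an explicit "investigate" outcome alongside "promote" and "roll back". The sketch below is an illustration only, not the process used at Takealot. It assumes a conversion-rate metric, a pooled two-proportion z-test, and a 1.96 cut-off. The function name `ab_decision` and all counts are hypothetical.

```python
import math
from typing import Tuple


def ab_decision(conv_a: int, n_a: int, conv_b: int, n_b: int,
                z_crit: float = 1.96) -> Tuple[str, float]:
    """Classify an A/B result as promote, roll back, or investigate.

    conv_a / n_a: conversions and users in control (hypothetical metric).
    conv_b / n_b: conversions and users in the variant.
    z_crit: cut-off for a two-sided test at roughly the 5% level (assumption).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    if z >= z_crit:
        return "promote", z
    if z <= -z_crit:
        return "roll back", z
    # Neither clearly better nor clearly worse: spend the buffer on analysis.
    return "investigate", z


# Hypothetical counts: 5.0% vs 5.3% conversion on 10,000 users per arm.
decision, z = ab_decision(500, 10_000, 530, 10_000)
print(decision, round(z, 2))
```

A statistically inconclusive result is only one trigger for the buffer. A significant result that conflicts with other signals can warrant the same pause, and that judgement sits outside a rule like this.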

Both additions extend the project shape — a dataset arc up front, a thinking pause at the end. In practice they can add real time to a programme, though modern tooling (synthetic data generation, LLM-assisted analysis) compresses both phases substantially.
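The dataset arc from Learning 01 can be sketched as a small offline comparison: fit the same simple model on each candidate dataset, score every fit against one shared holdout, and keep the best. This is a minimal sketch under stated assumptions, not the framework's actual tooling. The one-feature row schema, the threshold classifier, the accuracy metric, and the names `threshold_accuracy` and `pick_winner` are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[float, int]  # (feature value, binary label): toy schema, an assumption


def threshold_accuracy(train: List[Row], holdout: List[Row]) -> float:
    """Fit a one-feature threshold classifier on `train`; return holdout accuracy."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    # Threshold halfway between the class means; predict 1 above it.
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    correct = sum((x > threshold) == bool(y) for x, y in holdout)
    return correct / len(holdout)


def pick_winner(candidates: Dict[str, List[Row]],
                holdout: List[Row],
                score: Callable[[List[Row], List[Row]], float],
                ) -> Tuple[str, Dict[str, float]]:
    """Score every candidate dataset with the same metric; return the best."""
    scores = {name: score(rows, holdout) for name, rows in candidates.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


# Hypothetical data: candidate_b's features are on a shifted scale.
holdout = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]
candidates = {
    "candidate_a": [(0.0, 0), (0.3, 0), (0.7, 1), (1.0, 1)],
    "candidate_b": [(0.5, 0), (0.6, 0), (1.4, 1), (1.5, 1)],
}
winner, scores = pick_winner(candidates, holdout, threshold_accuracy)
print(winner, scores)
```

Holding the model and the holdout fixed is the design choice that matters here: any difference in score can then be attributed to the dataset rather than to the modelling.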

Adapting to agentic workflows

The diamond is calibrated for classical supervised machine learning — train a model against labelled data, evaluate offline, A/B test, productionise. The shape of the work still holds for agentic workflows: same expansion / contraction logic, same need for explicit checkpoints. Several stages just compress or change form.

Classical ML stage
Agentic equivalent
  • Generate training data
    Prompt design · RAG corpus · synthetic data
  • Candidate Model n
    Candidate prompts · tool sets · orchestration patterns
  • Iterative training
    Prompt iteration · retrieval tuning · tool-spec refinement
  • Offline POC evaluation
    LLM-as-judge · behavioural tests · red-team probes
  • A/B test or internal test
    A/B with completion rate · human eval · latency budgets
  • Productionise + stage
    Guardrails · rate limits · fallback prompts
  • Drift monitoring
    + hallucination rate · tool-call accuracy · latency

The two learnings (the dataset detour and the investigation buffer) carry over directly. The investigation buffer matters more in agentic work, not less — behavioural failures rarely have a clean numeric answer.
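The monitoring row of the mapping (hallucination rate, tool-call accuracy, latency) can be computed from a log of agent runs. The sketch below is illustrative and assumes a hypothetical log schema: the `AgentRun` fields, the `monitoring_summary` function, and the nearest-rank p95 are choices made for the example, not part of the framework.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AgentRun:
    """One logged agent interaction (hypothetical schema)."""
    expected_tool: str   # tool a reviewer says should have been called
    called_tool: str     # tool the agent actually called
    latency_ms: float
    hallucinated: bool   # flag set by a human reviewer or an LLM judge


def monitoring_summary(runs: List[AgentRun]) -> Dict[str, float]:
    """Aggregate tool-call accuracy, hallucination rate, and p95 latency."""
    n = len(runs)
    ordered = sorted(run.latency_ms for run in runs)
    # Nearest-rank p95: smallest latency with at least 95% of runs at or below it.
    p95_index = max(0, math.ceil(0.95 * n) - 1)
    return {
        "tool_call_accuracy": sum(r.called_tool == r.expected_tool for r in runs) / n,
        "hallucination_rate": sum(r.hallucinated for r in runs) / n,
        "p95_latency_ms": ordered[p95_index],
    }


# Hypothetical log of four runs.
runs = [
    AgentRun("search", "search", 120.0, False),
    AgentRun("search", "search", 150.0, False),
    AgentRun("checkout", "search", 900.0, True),
    AgentRun("search", "search", 200.0, False),
]
summary = monitoring_summary(runs)
print(summary)
```

Numbers like these show that something changed, not why. Explaining the change is the job of the investigation buffer.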

Companion artefacts

These are the artefacts that typically accompany a project on this framework. Names and roles only — the actual templates and specifications aren't included on this site; they live with the team that adopts the framework.

§ 03 Experience

Career

MBA, University of Stellenbosch Business School (2013). BCom Management Sciences, University of Stellenbosch (2007).
  1. 2025 — archived

    Founder & Principal Product Manager

    Irrational Signals

    Built and ran an API-first, ML-driven trading-signals product for US equities end-to-end — data pipeline, LightGBM model, REST API, Python SDK, and billing. Validated live, then deliberately wound down.

    Fintech · ML · API products
  2. 2024 — 2025

    Group Product Lead — Customer Data & ML Platforms

    Takealot Group

    End-to-end product strategy for the Customer Data Platform, CRM, fraud, and insights across the multi-brand portfolio. Architected petabyte-scale data platform and scalable analytics architecture for cross-brand activation.

    Customer Data Platform · Identity resolution · Multi-brand
  3. 2022 — 2024

    Group Product Lead — AI/ML (Search & Recommendations)

    Takealot Group

    Machine-learning product strategy for search and recommendations across platforms. Multi-intent search framework (+20% customer satisfaction). Flexible recommendations API (3× impression share).

    Search · Recommendations · ML platforms
  4. 2020 — 2022

    Product Manager — AI/ML (Search & Recommendations)

    Takealot.com

    Delivered the machine-learning product roadmap for core search and recommendation performance. Defined and tracked North Star metrics. Productionised models with data scientists and ML engineers.

    Search · Recommendations
  5. 2017 — 2020

    Product Lead — AdTech Platform

    Vicinity Media

    Owned the advertising-tech platform strategy: audience data, campaign systems, attribution. Built the data science function. Out-of-home (OOH) attribution linking offline ad exposure to in-store behaviour. WiFi-clustering and point-of-interest visit models.

    Adtech · Attribution · Audience
  6. 2008 — 2015

    Consulting Manager

    Cape Value Group

    Data analysis and market modelling across financial services and the public sector. Business strategy, forecasting, and operational planning.

    Consulting · Modelling
§ 04 Case studies

Selected work, in depth.

Five projects across e-commerce, adtech, and fintech. Each has its own page covering the problem, the approach, a custom diagram, and what shipped.
§ 05 Contact

Get in touch