Incrementality Testing for Paid Social in 2025: Geo vs PSAs vs Ghost Ads

August 29, 2025, by WarpDriven

Incrementality is the single most defensible way to prove whether paid social actually drives outcomes beyond what would have happened anyway. In 2025—amid ATT, cookie deprecation, and walled garden opacity—three practical methods dominate: Geo experiments, PSA-based holdouts, and ghost ads. This comparison explains how each works, what it costs, where it’s available (Meta, TikTok, Snapchat, Pinterest; Google geo for context), and how to choose for your budget, scale, and privacy constraints.

Key takeaways:

  • Ghost ads: Fastest read, minimal direct control cost, but locked to a single platform’s black box.
  • PSA holdouts: Strong user-level control where ghost ads aren’t available, with direct PSA spend and careful creative neutrality required.
  • Geo experiments: Platform-agnostic and privacy-robust, ideal for market-level impact and MMM calibration if you have scale and patience.

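How each method works (in plain English)

  • Ghost ads: The platform randomizes users into test and control inside the auction. Control users are not served your ad, but the platform records the auctions it would have won for them, so the comparison is between exposed users and would-have-been-exposed users.
  • PSA holdouts: Users are randomized into test and control, and the control cell is served a neutral public service announcement (PSA) instead of your ad. You pay for those PSA impressions.
  • Geo experiments: Comparable regions are split into treated and control (dark or throttled) markets, and lift is estimated from aggregated market-level outcomes.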

Side-by-side comparison (2025)

  • Causal strength

    • Ghost ads: High internal validity within a platform; auction-aware control.
    • PSA holdouts: Solid user-level RCT if creative neutrality and exclusions are tight.
    • Geo experiments: Strong at market level; sensitive to regional shocks/seasonality.
  • Cost to run

    • Ghost ads: Minimal direct control cost; standard media only.
    • PSA holdouts: Direct PSA spend for control (industry practice often 10–30% of test budget; verify via power needs). See ranges discussed in Criteo’s incrementality overview.
    • Geo experiments: No control media cost but opportunity cost from dark/control geos and analysis overhead. GeoLift’s documentation emphasizes planning and pre-periods to avoid underpowered tests: GeoLift walkthrough.
  • Speed to signal

    • Ghost ads: Often 2–4 weeks with sufficient conversions (heuristic; confirm via platform power tools).
    • PSA holdouts: Typically a few weeks; depends on conversion volume and control split.
    • Geo experiments: Usually 4–8+ weeks including pre/post.
  • Granularity and actionability

    • Ghost ads: Strong for audience/creative/campaign cuts.
    • PSA holdouts: Similarly granular when designed well.
    • Geo experiments: Coarser, better for channel/market-level budget calibration.
  • Privacy resilience

    • Geo experiments: Fully aggregated; highly resilient to ATT/cookies. See GeoLift framework.
    • Ghost/PSA: Run inside walled gardens with aggregate reporting; compatible with server-to-server signals and SKAN where relevant. TikTok outlines privacy approaches in its 2024–2025 PETs update.
  • Portability across platforms/channels

    • Geo experiments: Platform-agnostic; good for cross-channel lift and MMM calibration.
    • Ghost/PSA: Results live inside the garden; limited portability.
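
For the two user-level methods compared above (ghost ads and PSA holdouts), the read-out comes down to comparing conversion rates between a test cell and a control cell. The sketch below is one minimal way to do that with a normal-approximation confidence interval. The function name, the field names, and the cell sizes in the example are illustrative assumptions, not any platform's reporting API.

```python
from math import sqrt
from statistics import NormalDist


def lift_readout(test_users, test_conv, control_users, control_conv, confidence=0.95):
    """Absolute and relative lift between two randomized cells.

    Uses a normal approximation for the difference of two proportions,
    which is reasonable when both cells have many conversions.
    """
    p_t = test_conv / test_users          # conversion rate in the test cell
    p_c = control_conv / control_users    # conversion rate in the control cell
    diff = p_t - p_c
    se = sqrt(p_t * (1 - p_t) / test_users + p_c * (1 - p_c) / control_users)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return {
        "abs_lift": diff,
        "rel_lift": diff / p_c,
        "ci_low": diff - z * se,
        "ci_high": diff + z * se,
        # Conversions in the test cell that would not have happened anyway
        "incremental_conversions": diff * test_users,
    }


# Hypothetical 80/20 study: numbers are made up for illustration only.
result = lift_readout(800_000, 9_600, 200_000, 2_000)
print(result)
```

If the interval from `ci_low` to `ci_high` excludes zero, the study detected lift at the chosen confidence level. If it straddles zero, the read is inconclusive rather than proof of no effect.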

Platform availability and fit (Meta, TikTok, Snapchat, Pinterest)

  • Meta

    • Geo: Supported externally via the open-source Meta GeoLift repo.
    • Ghost ads: Meta “Conversion Lift” uses randomized control with auction awareness; setup typically via Measurement tools with Pixel/SDK/Conversions API. While thresholds vary, the method aligns with ghost bidding concepts described by Xandr’s toolkit.
    • PSA holdout: Feasible manually by serving neutral PSAs to randomized control segments; requires strict exclusions to prevent leakage.
  • TikTok

    • Ghost-style lift: TikTok’s randomized Conversion Lift Study (CLS) offers conversion-based incrementality. Overview and adoption context appear in the TikTok CLS blog (2024–2025).
    • PSA: Possible to implement manually when CLS isn’t available, with the same neutrality caveats.
    • Geo: Can be run externally with geo tooling; not a native TikTok UI feature.
  • Snapchat

    • Ghost/Platform lift: Available via managed programs; Snap emphasizes experimentation in its measurement strategy overview. Public, step-by-step docs are limited, but case work is visible in Snap’s inspiration hub, e.g., MECCA multi-week results.
    • PSA: Feasible manually with caution around audience overlap and frequency.
    • Geo: External/method-driven rather than native UI.
  • Pinterest

    • Geo and sales lift: Pinterest showcases geo-style and lift work (GeoX) via partners; see the Kvik GeoX case.
    • Ghost/Conversion Lift: Managed incrementality/sales lift programs exist; Pinterest stresses server-to-server Conversions API.
    • PSA: Possible manually; ensure neutral creative and exclusions.
  • Google (context)

Costs, timelines, and sample size heuristics

  • Geo experiments

    • Costs: No PSA/control media spend, but you pay in opportunity cost by throttling or darkening control geos. Expect analysis overhead and a longer runway.
    • Timeline: Commonly 4–8+ weeks including pre/post, with 2–5+ weeks of treatment. See the GeoLift walkthrough and blog.
    • Power: More markets and longer lookbacks improve power; run simulations with GeoLift’s tools per the GeoLift blog on lookback windows.
  • PSA-based holdouts

    • Costs: Direct PSA spend for control. Industry practice often budgets 10–30% of test media to the control cell depending on split and expected lift—treat as a heuristic from practitioner literature like Criteo’s overview.
    • Timeline: Often 2–4+ weeks if you have steady conversion volume; extend when variance is high.
    • Power: Ensure enough conversions per cell; split (e.g., 80/20) is a trade-off between power and waste.
  • Ghost ads

    • Costs: Minimal direct control spend; platform runs the randomization and counterfactual within the auction. Mechanism described in Xandr’s ghost bidding guide.
    • Timeline: Frequently the fastest (2–4 weeks) if you can generate 50–100+ conversions per cell per week (heuristic; confirm with platform power calculators when available).
    • Power: Healthy pixel/SDK/CAPI events and stable campaign settings materially improve power and stability.
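
The power and timeline heuristics above can be sanity-checked with a standard two-proportion sample-size calculation before committing budget. The sketch below assumes a user-level test (ghost ads or PSA holdout), a two-sided test, and a hypothetical `weekly_reach` figure for how many eligible users enter the study each week. It is a planning aid under those assumptions, not a replacement for a platform's own power tools.

```python
from math import ceil
from statistics import NormalDist


def users_per_cell(base_rate, rel_lift, test_share=0.8, alpha=0.05, power=0.8):
    """Users needed in (test, control) to detect a relative lift.

    base_rate : expected control conversion rate (e.g. 0.01)
    rel_lift  : smallest relative lift worth detecting (e.g. 0.10)
    test_share: fraction of users assigned to the test cell
    """
    p_c = base_rate
    p_t = base_rate * (1 + rel_lift)
    k = test_share / (1 - test_share)  # test users per control user
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Variance of the rate difference, expressed per control user
    var_term = p_t * (1 - p_t) / k + p_c * (1 - p_c)
    n_control = ceil(z ** 2 * var_term / (p_t - p_c) ** 2)
    n_test = ceil(k * n_control)
    return n_test, n_control


def weeks_to_signal(n_test, n_control, weekly_reach):
    """Weeks needed if `weekly_reach` new eligible users enter the study per week."""
    return ceil((n_test + n_control) / weekly_reach)


n_test, n_control = users_per_cell(base_rate=0.01, rel_lift=0.10, test_share=0.8)
print(n_test, n_control, weeks_to_signal(n_test, n_control, weekly_reach=250_000))
```

Shrinking the control share saves PSA spend but raises the total users required, which is the power-versus-waste trade-off noted for 80/20 splits.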

Practical setup checklists and pitfalls

  • Geo experiments (GeoLift-style)

    • Checklist: Select 20–60 comparable regions; confirm geographic dispersion to minimize spillover; secure a 4–8+ week pre-period; run GeoLiftPower simulations; align a 2–5+ week treatment; monitor exogenous shocks; use synthetic control and inspect imbalance metrics. Start with the GeoLift docs and walkthrough.
    • Pitfalls: Seasonality and regional shocks; mismatched markets; spillover between adjacent geos; underpowered designs.
  • PSA holdouts

    • Checklist: Randomize users into test/control (e.g., 70/30 or 80/20); build multiple truly neutral PSA creatives that match format and length; set strict audience exclusions to prevent leakage; cap frequency; monitor overlap reports; pre-commit control budget.
    • Pitfalls: Creative inadvertently affecting intent; leakage from lookalike/retargeting; auction dynamics diverging if PSA quality scores differ.
  • Ghost ads (auction-based control)

    • Checklist: Ensure pixel/SDK/Conversions API are healthy; define conversion window and primary KPI; freeze targeting/creative changes during the lift; prevent audience overlap with other campaigns; document exclusions, timing, and any algorithmic constraints.
    • Pitfalls: Black-box reliance; contamination from overlapping or algorithmically expanding campaigns; insufficient conversions leading to inconclusive reads.
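
For a manually built PSA holdout, the "randomize users into test/control" step in the checklist can be sketched as a deterministic hash-based split. This assumes you control the audience list (for example, hashed customer IDs uploaded as separate audiences). The function name, the experiment key, and the cell labels are hypothetical; where a platform offers native randomization, that is usually the safer choice.

```python
import hashlib


def assign_cell(user_id: str, experiment: str, control_share: float = 0.2) -> str:
    """Deterministically assign a user to 'test' or 'control_psa'.

    Hashing the experiment name together with the user ID gives a stable,
    roughly uniform bucket in [0, 1), so the same user always lands in the
    same cell and different experiments get independent splits.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x1_0000_0000  # first 32 bits scaled to [0, 1)
    return "control_psa" if bucket < control_share else "test"


# Build mutually exclusive audiences for an 80/20 split (IDs are made up).
user_ids = [f"user-{i}" for i in range(10)]
audiences = {"test": [], "control_psa": []}
for uid in user_ids:
    audiences[assign_cell(uid, "q3-psa-holdout", control_share=0.2)].append(uid)
print(audiences)
```

Because assignment depends only on the ID and the experiment key, the two lists never overlap and can be reused as exclusion lists for each other, which supports the strict-exclusion item in the checklist.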

Decision framework: which method when?

  • Low budget/low volume

    • Prefer ghost ads where available due to minimal control cost and faster reads. If unavailable, rotate small PSA holdouts on your highest-volume ad sets to accumulate evidence over time. Avoid geo until scale increases.
  • High-scale national advertisers

    • Run quarterly geo experiments to calibrate channel/portfolio ROAS and inform MMM. Layer always-on ghost-ad lift on Meta/TikTok/Snap/Pinterest for creative/audience optimization.
  • App-first (post-ATT/SKAN)

    • Favor platform lift (ghost ads) on Meta/TikTok/Snap due to user-level randomization and SKAN/server integrations. Use periodic geo experiments to capture cross-channel effects missed by single-garden tests.
  • New market or brand launch

    • Use geo experiments to measure market entry impact where baselines are unclear. Follow with ghost/PSA studies for creative and audience tuning once volume stabilizes.
  • Multi-region/multi-brand portfolios

    • Employ matched-market geo designs at staggered intervals; triangulate with platform lift studies to maintain a rolling calibration of ROAS and saturation effects.
  • Highly seasonal businesses

    • Avoid kicking off tests near major peaks. If unavoidable, extend pre-periods, use synthetic control, and plan multiple waves to smooth volatility. Google’s experiment playbooks provide guardrails for seasonality-aware design: see the Think with Google Experiments Playbook (2023).

Privacy and measurement ecosystem, 2025

  • All three methods are resilient to third‑party cookie loss and ATT. Geo works purely on aggregated geo data. Platform lift (ghost/PSA) runs entirely within walled gardens with aggregated outputs. TikTok documents privacy-enhancing technologies for measurement in its PETs announcement, and Pinterest encourages server-to-server measurement via its Conversions API guidance.
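
To illustrate why a geo read needs nothing beyond aggregated market totals, here is a ratio-scaled difference-in-differences sketch. It is a deliberately simpler stand-in for the synthetic-control approach that GeoLift uses, and every number and name in it is an illustrative assumption.

```python
def did_lift(pre_treat, post_treat, pre_control, post_control):
    """Ratio-scaled difference-in-differences on aggregated geo data.

    Each argument is a list of per-period conversion totals (periods of
    equal length), summed across the treated or control markets. No
    user-level identifiers are involved.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    # How much control markets moved between the pre and post periods
    control_ratio = mean(post_control) / mean(pre_control)
    # Expected treated level if treated markets had followed the control trend
    counterfactual = mean(pre_treat) * control_ratio
    incremental = mean(post_treat) - counterfactual
    return incremental, incremental / counterfactual


# Hypothetical weekly totals for treated and control markets.
incremental, rel = did_lift(
    pre_treat=[1000, 1000],
    post_treat=[1210, 1210],
    pre_control=[500, 500],
    post_control=[550, 550],
)
print(incremental, rel)
```

This estimator assumes treated and control markets would have trended in parallel without the campaign, which is the assumption that matched-market selection and long pre-periods are meant to make plausible.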

How to combine results with MMM and modeling

  • Use geo results to calibrate MMM priors and constrain channel coefficients, since geo captures cross-channel effects. GeoLift materials emphasize power simulation and diagnostics suitable for this role: see the GeoLift repo.
  • Use platform lift (ghost/PSA) to set platform- and audience-level ROAS multipliers and to validate creative hypotheses. Feed these into Bayesian budget allocators and conversion modeling.
  • Reconcile discrepancies by timing tests so their windows overlap with MMM training periods and by documenting exogenous events.
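
One simple way to turn a lift result into the ROAS multiplier mentioned above is to divide incremental conversions by the conversions the platform attributed over the same window, then scale reported ROAS by that factor. The function names and figures below are hypothetical, and the sketch assumes the lift study and the attribution report cover the same campaigns, dates, and conversion event.

```python
def incrementality_factor(incremental_conversions, attributed_conversions):
    """Share of platform-attributed conversions that the lift study found to be incremental."""
    if attributed_conversions <= 0:
        raise ValueError("attributed_conversions must be positive")
    return incremental_conversions / attributed_conversions


def calibrated_roas(reported_roas, factor):
    """Scale platform-reported ROAS down (or up) to an incremental ROAS estimate."""
    return reported_roas * factor


# Hypothetical: the study measured 1,600 incremental conversions while the
# platform's attribution report claimed 4,000 for the same window.
factor = incrementality_factor(1_600, 4_000)
print(factor, calibrated_roas(5.0, factor))
```

The factor is only as fresh as the study behind it, so it is worth re-estimating after major creative, audience, or budget changes rather than treating it as a constant.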

FAQs (2025)

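  • What’s the fastest method to get a read?

    • Ghost ads, where available: often 2–4 weeks with sufficient conversion volume. PSA holdouts typically take a few weeks, and geo experiments usually need 4–8+ weeks including pre/post periods.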
  • How expensive are PSA holdouts versus ghost ads?

    • PSA holdouts require direct control spend (often 10–30% of test media in practice per industry explainers like Criteo’s overview). Ghost ads typically don’t require separate control spend but are limited to a single platform’s walled garden.
  • When should I prefer geo experiments?

    • When you need cross-channel, market-level truth and can afford longer timelines and some opportunity cost. Start with the GeoLift walkthrough for design and power sims.
  • Are these methods ATT- and cookie-proof?

    • They’re far more resilient than last-click attribution. Geo uses aggregated markets; platform lift uses in-garden randomization with aggregate outputs. TikTok’s privacy tech update outlines how platforms adapt measurement to privacy constraints.
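  • Can I run lift on TikTok/Snap/Pinterest today?

    • Yes, with caveats. TikTok offers a randomized Conversion Lift Study (CLS); Snapchat and Pinterest offer lift through managed programs. Where native lift isn’t available, manual PSA holdouts or externally run geo experiments are the fallback.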

Bottom line: In 2025 there’s no single winner. If speed and cost matter and you operate mainly in one garden, ghost ads are your workhorse. If you need portable, market-level truth, run geo experiments. If your platform lacks ghost support or you want user-level causality with full control, PSA holdouts work—just budget for the control spend and police leakage. Then triangulate results with MMM and server-side modeling for a measurement system you can defend to finance.
