Experiment Design

Experiment Design#

Experiments work best when the question is narrow, the product surface is clear, and the expected effect is large enough to detect with the traffic available. Treat the design as an operating plan: define the decision, choose a primary signal, protect key guardrails, and keep monitoring after launch.

Product matrix#

Use the matrix to connect each product surface to a decision and a measurable signal before implementation starts.

Product surface	Decision to support	Primary signal	Guardrail
Landing page	Is the offer clear enough to continue?	Primary action click-through	Bounce rate and page speed
Product information	Can visitors find the right support path?	Privacy or open-link engagement	Contact load and broken-link rate
Scientific notes	Do study pages support exploration?	Note opens and model interaction	Time to interaction and math-render health
Analytics view	Are measurements trustworthy enough to act on?	Visitor and route-count freshness	Missing events and database write errors

Sample strategy#

Define the unit of assignment before launch: visitor, account, session, or device.
Estimate baseline rate and minimum detectable lift before choosing the experiment window.
Keep the primary segment broad; reserve narrow slices for diagnosis after the main readout.
Use guardrails for performance, accessibility, form errors, and privacy-sensitive flows.
Stop only after the preplanned sample window unless a safety guardrail fails.

Sample size calculator#

Use the product matrix as the target matrix for the experiment: choose the product surface, decision, primary signal, and guardrail before estimating sample needs. Baseline rate is the current percentage of eligible visitors, accounts, or sessions that complete the chosen primary signal in the control experience.

The calculator below calls /api/sample-size and estimates per-variant sample size for a two-arm conversion experiment.

Continuous quality monitoring#

Confirm that event names, route names, and timestamps remain stable across deploys.
Watch daily traffic, conversion, error rate, and response time for sudden shifts.
Compare new results against historical baselines before treating a lift as product truth.
Review accessibility and performance checks with every visible page change.
Keep a short decision log: hypothesis, sample window, result, action, and follow-up owner.