A/B testing in the age of AI agents — pick which sales skill writes better email.

Traditional A/B testing pits two subject lines against each other. In the age of AI agents, you can pit two entire authorial philosophies — two installable Claude Skills — each composing a full sequence on the same product. You couldn't run this test without an agent in the loop. With Loopi, it's one prompt.

You can hand this page straight to your AI agent. Copy the URL — https://loopi.social/playbooks/ab-test-with-claude — paste it into Claude Code with Loopi connected, name your two skills and your list, and say "Read this Loopi playbook and run the test." The agent composes both sequences, schedules them, and reports the winner.

What it does for you

You stop guessing which way of writing actually lands with your audience and get a measured answer — decided by who clicks, on your own list, not by taste. By the end you'll have:

A real answer to "which voice converts." Two authorial styles, scored by primary-CTA click-rate on the same list — the winner is the one your readers actually respond to.
Two complete sequences, composed for you. One per skill, three emails each, same product and landing page — written so the only thing that differs is the voice.
Clean, separate scoring. Each sequence's clicks are attributed to its own steps, so the two never blur even though they share one list.
A defending champion. The winning skill becomes the voice you keep — and the one to beat in your next round.

Is it worth the effort? Your part is one prompt: name two skills, a product, and a list. The test then runs itself — sequence A sends, then B, and the winner report arrives once both have run (~3-4 weeks of calendar time, near-zero of your time). The payoff is a voice you keep using on every email after, chosen by data instead of gut.

How it works

A/B testing used to mean experiments like Google's test of 41 shades of blue, reported in 2009, to pick the best link color: one CSS variable, millions of impressions, statistical certainty about which shade won. That model still works for links and buttons. For an email sequence, where the meaningful variable is an entire authorial voice rather than a hex code, classical A/B leaves most of the lift on the table. Skills are the bigger lever.

Pick two skills. Community, Loopi's, or your own; different voices.
Compose two sequences. Skill A → sequence A. Skill B → sequence B.
Run sequentially. A first, then B; same cadence, same list.
Compare + fork the winner. Winning skill becomes the next round's defender.

The variant axis is the whole skill — not a subject line, not an opener. This is what A/B testing looks like in the age of AI agents.

Why this is new. Classical A/B testing burns out — you run out of meaningful subject-line variations long before you exhaust the space of authorial voices. Skills are entire frameworks: a sales-trainer's playbook, a founder-voice, a growth-marketer's funnel. Pitting them against each other on your own audience tells you which framework actually maps to your readers. You couldn't do this before AI agents — no human could write three Skill-A emails AND three Skill-B emails for the same product without their own voice contaminating both sides.

Picking the two skills

The Claude Skills community has growing libraries of sales, marketing, and copywriting skills. Pick two that diverge in voice — a sales-trained framework against a founder-voice essay style, an objection-handling skill against a curiosity-led one. Every test gives you a defending champion; the next round pits it against a fresh challenger.

louisblythe/Sales-Skills — sales-trained voice: framing, objection handling, sequence pacing.
karanb192/awesome-claude-skills — curated index across categories. Browse the marketing / sales / writing sections for candidates.
Varnan-Tech/opendirectory — open directory of skills you can drop into a project.
Your own A vs your own B — write two versions of how you want emails to sound, save as separate SKILL.md files, let your audience pick.

Set it up and run it

The prerequisites are a Loopi account with a mail list, and the two skills installed in your agent. Then wire Loopi to your agent: in Claude Code it's one line, and Claude Desktop, Cursor, Codex, and Gemini take the same https://api.loopi.social/mcp URL in their connector settings.

connect Loopi MCP

# Claude Code — add Loopi in one line
claude mcp add --transport http loopi https://api.loopi.social/mcp

# then tell your agent:
> connect to loopi mcp        # opens a browser to authorize, once

The prompt itself is the orchestration — name the two skills, the product, the list, the cadence. A capable agent with Loopi MCP connected will load each skill in turn, compose a sequence with it, schedule sequentially, and report. No separate orchestration skill to install.

prompt to your agent

Run a skill-vs-skill A/B test on my codingai list.

Sequence A: compose a 3-email welcome sequence using the
sales-skills-louisblythe skill, pointing at https://your-product.com.

Sequence B: compose a 3-email welcome sequence using my
founder-voice skill, same product, same landing page.

Schedule sequence A starting this Thursday, and sequence B starting
two weeks later (after A has fully run). Aggregate primary-CTA
click-rate per sequence and report which skill produced the winner.
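
To sanity-check the dates the agent proposes, the schedule in the prompt can be worked out by hand. The sketch below is an illustration, not part of Loopi: the function names and the 4-day gap between emails are assumptions (the playbook fixes only three emails per sequence and a two-week offset for B).

```python
from datetime import date, timedelta

THURSDAY = 3  # date.weekday() counts Monday as 0, so Thursday is 3


def next_weekday(today: date, weekday: int) -> date:
    """Return the first date on or after `today` that falls on `weekday`."""
    return today + timedelta(days=(weekday - today.weekday()) % 7)


def send_dates(start: date, emails: int, gap_days: int) -> list[date]:
    """Send dates for one sequence: `emails` sends spaced `gap_days` apart."""
    return [start + timedelta(days=i * gap_days) for i in range(emails)]


def plan(today: date, emails: int = 3, gap_days: int = 4,
         offset_days: int = 14) -> tuple[list[date], list[date]]:
    """Plan sequence A from the coming Thursday and B `offset_days` later.

    `gap_days` is an assumed cadence; the playbook does not specify one.
    """
    a = send_dates(next_weekday(today, THURSDAY), emails, gap_days)
    b = send_dates(a[0] + timedelta(days=offset_days), emails, gap_days)
    if b[0] <= a[-1]:
        raise ValueError("sequence B would start before sequence A has finished")
    return a, b
```

With a 7-day gap, A's third email goes out on day 14, the same day B would start, so `plan` raises; widen the offset or tighten the cadence in that case.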

You provide

Two skills. Each one a different authorial voice, the variable being tested.
Landing page + list. Same target, same audience for both sequences.
AI agent. Claude, GPT, etc. via Loopi MCP.

You get back

Sequence A composed. Voice of skill A, scheduled first.
Sequence B composed. Voice of skill B, scheduled after A.
Winner report, after both have run. Aggregate primary click-rate per skill. Arrives ~3-4 weeks in, once both sequences have fully sent.

The agent loads each skill in turn, composes a sequence with it, schedules sequentially, and reports which skill produced the higher-converting result once both have run.
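
Before crowning a winner on a small list, it can help to check whether the gap between the two click rates is larger than chance would explain. The sketch below is one common way to do that, a two-proportion z-test; it is not something the playbook or Loopi prescribes, and the function name and arguments are hypothetical. It treats the two sends as independent samples, which a sequential test on one shared list only approximates, so read the result as a rough guide.

```python
import math


def two_proportion_z(clicks_a: int, sent_a: int,
                     clicks_b: int, sent_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference between two click rates."""
    p_a = clicks_a / sent_a
    p_b = clicks_b / sent_b
    # Pooled rate under the null hypothesis that both voices convert equally.
    pooled = (clicks_a + clicks_b) / (sent_a + sent_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    if se == 0:
        # Nobody clicked, or everybody did: no evidence either way.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A positive `z` favors sequence A and a negative one favors B; a small p-value (0.05 is a conventional cutoff) suggests the difference is unlikely to be noise.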

If a preview lands in the Promotions tab for one skill but in Primary for the other, that's already a signal: the skill that consistently lands in Primary is the one you'll want to defend in future rounds.

What Loopi handles for you

Your job is picking the two skills. Loopi handles the bookkeeping.

Per-step click attribution. Every click is bound to the specific sequence step that delivered the email, so the two sequences stay cleanly separated in analytics even though they share a list.
The primary CTA gets its own score. The button you mark as the main call-to-action is counted separately — mail.getMailContentAnalytics returns primaryClicked and a pre-computed primaryClickRate per email, so "did they click the thing that matters" is already calculated.
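Per-step rollup, ready to sum. mail.getSequenceStepHistory lists every step of a sequence; mail.getMailContentAnalytics returns each step's deliveries and primary-click rate. For the sequence total, add up deliveries and primary clicks across the steps and divide; averaging the per-step rates would weight a small step the same as a large one.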
Previews to your own inbox. mail.previewContent sends any step to your own inbox first — so you can eyeball the message before either sequence ships.
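
The per-sequence rollup described above can be sketched as follows. This is a minimal illustration over plain dictionaries, not Loopi's implementation: `primaryClicked` is the field named in this playbook and is assumed here to be a click count, while `delivered` and both function names are hypothetical.

```python
def sequence_click_rate(steps: list[dict]) -> float:
    """Pooled primary-CTA click rate across the steps of one sequence.

    Each dict stands for one step's analytics. `delivered` is an assumed
    field name for that step's delivery count.
    """
    delivered = sum(step["delivered"] for step in steps)
    clicked = sum(step["primaryClicked"] for step in steps)
    # Divide totals rather than averaging per-step rates, so larger sends
    # carry proportionally more weight.
    return clicked / delivered if delivered else 0.0


def pick_winner(steps_a: list[dict], steps_b: list[dict]) -> str:
    """Return "A", "B", or "tie" by comparing pooled click rates."""
    rate_a = sequence_click_rate(steps_a)
    rate_b = sequence_click_rate(steps_b)
    if rate_a == rate_b:
        return "tie"
    return "A" if rate_a > rate_b else "B"
```

Pooling totals keeps a final step with few remaining recipients from swinging the result the way a straight average of three rates would.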

Find out which skill your audience reads.

This kind of test only works on an AI-agent-driven mail platform. Connect Loopi, name two skills, point at your list — the agent does the rest.
