B2B Agency Vetting Frameworks
Six sequenced frameworks for vetting and onboarding a B2B marketing agency that can build AI-augmented pipeline without breaking the fundamentals.
Most agency searches fail in the first 30 minutes of the first call. In our experience, the CMO asks about case studies. The agency walks through three logo slides. Nobody talks about ICP math, attribution wiring, or what the agency will actually do in week 2. Six months later, the contract gets killed and everyone blames "fit."
Fit is not the problem. Process is. Here are the six frameworks we use to stop that from happening.
This hub is The Starr Conspiracy's methodology catalog for vetting and onboarding a B2B marketing agency. Six named, sequenced frameworks move a CMO from "we need help with pipeline" to a signed partner with measurable pipeline impact inside 90 days. The sequence runs ICP diagnosis, discovery pressure test, AI readiness, attribution readiness, RFP discipline, then 90-day onboarding. Each framework targets a specific failure mode (discovery theater, AI tourism, attribution cosplay, RFP-as-checklist, week-2 drift) and ties back to forecast credibility your board will actually defend.
In order, the six frameworks are:
- Framework 1: ICP-to-Demand-State Diagnosis
- Framework 2: Discovery Call Pressure Test
- Framework 3: AI and Automation Readiness Audit
- Framework 4: Pipeline Attribution Readiness Model
- Framework 5: Operating Discipline RFP
- Framework 6: 90-Day Onboarding Alignment
Here is what you will get:
- Decision logic for your next discovery call.
- Artifact-based evaluation signals for your shortlist.
- A 90-day onboarding spine you can hand to procurement without sanitizing it.
We don't sell AI experiments. You are selecting a system-builder.
The sequence matters
Diagnosing your own ICP and demand state mix has to come first: who actually converts, at what cost, and through which buying motion. Only then can you evaluate AI and automation readiness. Pressure-testing attribution requires that you already understand the agency's operating discipline, and a useful 90-day onboarding is impossible if the RFP never surfaced the right commitments in the first place.
Skip a step and you will pay for it next quarter. Misattributed spend. A wasted quarter. A sales team that stops trusting marketing's numbers. Each framework is designed to flush out one failure mode before you are emotionally committed.
Three red flags to know before you start. If an agency says they can start with tactics and "figure out ICP later," walk away. If they claim credit for everything they can measure but nothing they cannot, attribution will be theater. If they cannot commit to named owners by week 2, they have already told you how week 12 ends.
Watch for three archetypes the frameworks are designed to expose:
- Luddites dress up 2018 playbooks as "fundamentals" and call AI a distraction.
- Tourists demo ChatGPT in the pitch and have no operating model behind it.
- Zealots want to rip out brand, message, and strategy in favor of automation theater.
This process is an audit of an operating system: people, process, data, tools. Score each framework from 0 to 2 and require a minimum threshold to advance. Walk away from anyone who cannot survive the second framework, because they will not survive your second quarter.
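The gate logic above can be sketched in a few lines. This is a minimal illustration, not a prescribed tool: it assumes the scale runs 0 to 2, and the minimum score of 1 per framework is a placeholder, since no threshold is specified here. The function name `first_failed_gate` is hypothetical.

```python
# Framework names in the sequence listed above.
FRAMEWORKS = [
    "ICP-to-Demand-State Diagnosis",
    "Discovery Call Pressure Test",
    "AI and Automation Readiness Audit",
    "Pipeline Attribution Readiness Model",
    "Operating Discipline RFP",
    "90-Day Onboarding Alignment",
]


def first_failed_gate(scores, minimum=1):
    """Return the first framework, in sequence, that blocks advancement.

    `scores` maps framework name -> 0, 1, or 2. `minimum` is an assumed
    threshold. Returns None when every framework clears the gate.
    """
    for name in FRAMEWORKS:
        score = scores.get(name)
        if score is None:
            return name  # an unscored framework blocks everything after it
        if score not in (0, 1, 2):
            raise ValueError(f"{name}: score must be 0, 1, or 2")
        if score < minimum:
            return name
    return None


# Example: an agency that nodded through discovery stops at Framework 2.
agency = {name: 2 for name in FRAMEWORKS}
agency["Discovery Call Pressure Test"] = 0
blocked_at = first_failed_gate(agency)
```

Because the check walks the frameworks in order, a strong onboarding plan cannot rescue a failed discovery call, which is the point of sequencing.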
"We don't have time for this" is the wrong instinct. You don't have time not to. A bad selection costs a year and torches forecast credibility with your CFO. These six frameworks cost a week.
The six frameworks, in sequence
1. The ICP-to-Demand-State Diagnosis Framework
The ICP-to-Demand-State Diagnosis Framework is The Starr Conspiracy's diagnostic for establishing what you actually need an agency to do before you talk to one. It prevents discovery theater by making your own house honest first. Output becomes input to every framework that follows.
- ICP convergence map. Where your closed-won, expansion, and pipeline-quality ICPs actually overlap, not where marketing wishes they did.
- Demand state mix. The ratio of out-of-market, in-market, and active-evaluation buyers driving your pipeline today.
- Capability gap inventory. The specific work your team cannot do internally and why.
- What good looks like. A one-page brief a finalist agency could quote back to you in their own words. If your team cannot agree on the demand state mix in one room, you are not ready to brief an agency.
When to use: Before any outreach to agencies, and before any internal RFP drafting begins.
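The demand state mix is simple arithmetic once every open opportunity carries a demand state label. A minimal sketch follows, assuming a hypothetical `demand_state` field on each record and a mix weighted by opportunity count rather than pipeline value; your CRM fields and weighting may differ.

```python
from collections import Counter

# The three buyer states named in the demand state mix above.
DEMAND_STATES = ("out-of-market", "in-market", "active-evaluation")


def demand_state_mix(opportunities):
    """Return each demand state's share of opportunities as a fraction of 1."""
    counts = Counter(opp["demand_state"] for opp in opportunities)
    unknown = set(counts) - set(DEMAND_STATES)
    if unknown:
        raise ValueError(f"unrecognized demand states: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no opportunities to classify")
    return {state: counts[state] / total for state in DEMAND_STATES}


# Illustrative records only, not real pipeline data.
sample = (
    [{"demand_state": "out-of-market"}] * 2
    + [{"demand_state": "in-market"}] * 5
    + [{"demand_state": "active-evaluation"}] * 3
)
mix = demand_state_mix(sample)
```

If two people on your team produce different mixes from the same export, the disagreement is in the labeling, and that is worth resolving before any agency sees the brief.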
2. The Discovery Call Pressure Test
The Discovery Call Pressure Test is The Starr Conspiracy's interrogation framework for separating consultative partners from order-takers in the first call. It prevents discovery theater on the agency side and surfaces operating posture early.
- ICP challenge questions. Does the agency push back on your ICP brief or just nod?
- Demand state questions. Can they name the motion mix they would run and why?
- Operating questions. Who staffs the account, who owns CRM hygiene, who runs the weekly?
- AI posture questions. Augmentation, replacement, or theater?
- Attribution questions. What do they refuse to claim credit for?
- What good looks like. A follow-up memo that reframes your problem unprompted. Green flag if they reframe; red flag if they recap.
When to use: First and second calls with any agency on your longlist.
Three discovery calls booked next week? Bring us in to run Framework 2 live and capture the evidence with you.
3. The AI and Automation Readiness Audit
The AI and Automation Readiness Audit is The Starr Conspiracy's scoring model for grading an agency's AI-native capability on five operational dimensions, not marketing claims. AI augments the system; it does not replace strategy. This framework prevents AI tourism. In very early-stage agencies you can compress this audit, but only if they can still produce a live workflow walkthrough.
- Workflow integration. Where AI sits in the agency's actual delivery process.
- Human-in-the-loop discipline. What gets reviewed, by whom, against what standard.
- Data posture. What they will and will not touch in your CRM and data warehouse.
- Brand and message guardrails. How they protect strategic fundamentals through AI augmentation.
- Pass/fail criterion. Green flag if they can walk a live, anonymized AI-augmented workflow on screenshare with QA gates named. Red flag if the demo is a prompt library.
- Artifact signal. A live walkthrough of an AI-augmented workflow using anonymized inputs, with QA checkpoints documented.
When to use: After discovery, before pricing conversations.
4. The Pipeline Attribution Readiness Model
The Pipeline Attribution Readiness Model is The Starr Conspiracy's pressure test on whether an agency can actually measure what they promise to deliver. Attribution model selection depends on sales cycle and channel mix, but the requirements below are non-negotiable regardless of model. This framework prevents attribution cosplay.
- Source-of-truth alignment. Which system holds the number they will be measured on.
- Multi-touch posture. What model they recommend and what they refuse to use.
- SLA on reporting. Cadence, owner, and escalation path when numbers slip.
- Sales handoff definition. What counts as a qualified handoff and who arbitrates disputes.
- Pass/fail criterion. Green flag if they bring an anonymized reporting template with definitions for every field. Red flag if they show a dashboard screenshot without a data dictionary.
- Artifact signal. An anonymized reporting template built for training, or a live walkthrough of a reporting pack with sensitive fields blurred on screen.
When to use: Before shortlisting finalists, and before any RFP scoring rubric is built.
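The data-dictionary check in the pass/fail criterion is easy to make mechanical. A minimal sketch, assuming you have typed the agency's reporting template fields into a list and their definitions into a dictionary; the field names below are hypothetical examples.

```python
def undefined_fields(template_fields, data_dictionary):
    """Return template fields with no usable definition, in template order.

    A field counts as undefined when it is missing from the dictionary or
    its definition is empty or whitespace.
    """
    return [
        field
        for field in template_fields
        if not str(data_dictionary.get(field) or "").strip()
    ]


# Hypothetical fields from an anonymized reporting template.
template = ["sourced_pipeline", "influenced_pipeline", "qualified_handoffs"]
definitions = {
    "sourced_pipeline": "Opportunity value where the first touch was an agency-run program.",
    "influenced_pipeline": "   ",  # present but blank, so it still fails
}
gaps = undefined_fields(template, definitions)
```

An empty result is the green flag. Anything else is a list of questions for the next call.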
5. The Operating Discipline RFP Framework
The Operating Discipline RFP Framework is The Starr Conspiracy's RFP architecture that restructures the request around how the work gets done, not what gets delivered. It prevents RFP-as-checklist.
- Staffing commitment. Named humans, named hours, named accountability.
- System integration plan. CRM, marketing automation platform, data warehouse, business intelligence. Who touches what and with what permissions.
- KPI alignment. Leading and lagging indicators tied to your demand state mix.
- Change management posture. What the agency does when your strategy shifts mid-quarter.
- Pass/fail criterion. Green flag if they will commit named owners in writing before signature. Red flag if staffing is "to be determined at kickoff."
- Artifact signal. A 30-day mobilization plan attached to the proposal, not promised in kickoff.
When to use: After AI and attribution readiness are verified, before issuing the formal RFP. Bring us in before you lock the rubric.
6. The 90-Day Onboarding Alignment Framework
The 90-Day Onboarding Alignment Framework is The Starr Conspiracy's onboarding spine that phases the first quarter into commitments with named owners, named systems, and named KPIs. It prevents week-2 drift, the failure mode where access, naming conventions, and campaign QA all slip in the first 14 days and never recover.
- Weeks 1-2: Instrumentation. Attribution wiring, CRM access, naming conventions, campaign QA gates, and reporting cadence locked. Most relationships quietly die right here: the agency cannot get SSO access by Friday of week 1, and nobody escalates.
- Weeks 3-6: Activation. First campaigns shipped against the demand state mix.
- Weeks 7-10: Optimization. First measurable revenue signal, with attribution defended.
- Weeks 11-13: Operating rhythm. Weekly, monthly, and quarterly reviews on the calendar with owners.
- Pass/fail criterion. Green flag if owners are filled in on the calendar before signature. Red flag if onboarding starts with a "discovery week."
- Artifact signal. A first-30-days calendar with owners filled in before the contract is signed.
When to use: Between contract signature and kickoff, and as the scorecard for the first quarterly business review.
Run the sequence
Run them in order. The output of each is the input to the next. By the end of Framework 6, you will have either a partner producing pipeline or evidence, in writing, that the relationship needs to end before it costs you a year.
Three things change when you run this: faster decisions, fewer surprises in onboarding, and measurement your CFO will defend on forecast accuracy, CAC payback visibility, pipeline coverage, and sales trust.
Here is what you will have in hand when the sequence is finished:
- ICP and demand state brief
- Discovery memo with agency-by-agency scoring
- AI readiness score
- Attribution requirements document
- RFP rubric with pass/fail gates
- 90-day onboarding plan with named owners
Agency calls booked this month mean Framework 1 starts this week. Do not run this alone if procurement is already in motion. Bring us in before you lock the rubric and we will run Frameworks 1 through 6 with your team and deliver a scored shortlist, attribution requirements, and a 90-day onboarding plan. We don't sell AI experiments. We build marketing systems that work.
Steps
The ICP-to-Demand-State Diagnosis Framework
Before you contact a single agency, diagnose where your pipeline is actually broken. Map your ICP segments against the Ten Demand States to identify which states you are currently serving well, which you are ignoring, and which need agency capability you do not have in-house. This becomes the brief no agency gets to write for you.
- Document your top three ICP segments with revenue contribution and CAC by segment
- Map current marketing activity against the Ten Demand States for each ICP
- Identify two or three demand states where pipeline is leaking and internal capability is absent
- Write a one-page diagnosis that becomes the anchor document for every agency conversation
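The first step above, revenue contribution and CAC by segment, can be sketched as follows. It assumes CAC means acquisition spend divided by new customers won in the same period, and the record keys (`revenue`, `acquisition_spend`, `new_customers`) are hypothetical names, not fields from any particular CRM.

```python
def top_segments(segments, limit=3):
    """Rank segments by share of total revenue and attach CAC to each."""
    total_revenue = sum(s["revenue"] for s in segments)
    if total_revenue <= 0:
        raise ValueError("total revenue must be positive")
    rows = []
    for s in segments:
        # CAC is undefined for a segment that closed no new customers.
        if s["new_customers"]:
            cac = s["acquisition_spend"] / s["new_customers"]
        else:
            cac = None
        rows.append(
            {
                "segment": s["name"],
                "revenue_share": s["revenue"] / total_revenue,
                "cac": cac,
            }
        )
    rows.sort(key=lambda r: r["revenue_share"], reverse=True)
    return rows[:limit]


# Illustrative figures only.
segments = [
    {"name": "A", "revenue": 600_000, "acquisition_spend": 120_000, "new_customers": 10},
    {"name": "B", "revenue": 300_000, "acquisition_spend": 90_000, "new_customers": 0},
    {"name": "C", "revenue": 80_000, "acquisition_spend": 20_000, "new_customers": 4},
    {"name": "D", "revenue": 20_000, "acquisition_spend": 5_000, "new_customers": 1},
]
top = top_segments(segments)
```

A segment with spend and no new customers, like B here, is exactly the kind of leak the diagnosis should name.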
The Discovery Call Pressure Test
Discovery calls are where most agencies reveal themselves and most buyers fail to notice. Replace your standard intro call with six structured question sets that force the agency to demonstrate consultative depth, ICP fluency, AI literacy, attribution thinking, operating cadence, and intellectual honesty. The Starr Conspiracy developed this pressure test from sitting in on hundreds of agency selection calls.
- Send the ICP diagnosis from step one in advance and ask them to react, not pitch
- Ask how they would deprioritize work, not just what they would do
- Require a named example of a client engagement they ended early and why
- Score each agency on the six question sets within two hours of the call ending
The AI and Automation Readiness Audit
Every agency now claims to be AI-native. Almost none are. This audit scores agencies on five operational dimensions of AI capability: workflow integration, model governance, data pipeline maturity, human-in-the-loop discipline, and measurable production lift. The Starr Conspiracy built this audit because the territory needed a way to separate AI theater from AI infrastructure.
- Ask for a screen share of an actual AI-augmented workflow in production, not a slide
- Request their model governance documentation and prompt-engineering standards
- Verify what percentage of client deliverables touch an AI pipeline and how lift is measured
- Score readiness from zero to four across all five dimensions and require a minimum total of fourteen
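The scoring rule in the last step works out to a maximum of 20 points with a pass line at 14. A minimal sketch of that arithmetic, using the five dimensions named in this step; the function name and the choice to reject incomplete scorecards are assumptions.

```python
# The five operational dimensions named in this step.
DIMENSIONS = (
    "workflow integration",
    "model governance",
    "data pipeline maturity",
    "human-in-the-loop discipline",
    "measurable production lift",
)
MIN_TOTAL = 14  # minimum total stated above; the maximum is 5 * 4 = 20


def audit_result(scores):
    """Return (total, passed) for a scorecard of 0-4 per dimension."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"unscored dimensions: {missing}")
    for d in DIMENSIONS:
        if scores[d] not in (0, 1, 2, 3, 4):
            raise ValueError(f"{d}: score must be an integer from 0 to 4")
    total = sum(scores[d] for d in DIMENSIONS)
    return total, total >= MIN_TOTAL


# Example scorecard: four 3s and one 2 lands exactly on the pass line.
scorecard = dict.fromkeys(DIMENSIONS, 3)
scorecard["measurable production lift"] = 2
total, passed = audit_result(scorecard)
```

Note what the threshold implies: an agency cannot pass with a 0 on any two dimensions, because three perfect scores only reach 12.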
The Pipeline Attribution Readiness Model
An agency that cannot measure pipeline cannot be held accountable for pipeline. This model evaluates whether a partner can wire multi-touch attribution into your existing CRM and marketing automation stack, and whether they will commit to scored, dimensional reporting from day one. Score the agency on data integration depth, attribution model sophistication, reporting cadence, source-of-truth ownership, and willingness to be measured against revenue, not MQLs.
- Require a sample attribution dashboard from a real client, with permission and redactions
- Confirm which CRM and MAP platforms they have wired attribution into in the last 18 months
- Ask what they would refuse to be measured on, and why
- Get written commitment on the first reporting deliverable date and its contents
The Operating Discipline RFP Framework
Most RFPs ask about deliverables. Deliverables tell you almost nothing about whether an agency can actually run a quarter. This framework restructures the RFP into four operating disciplines: planning cadence, decision rights, escalation paths, and change management. The Starr Conspiracy uses this structure because the work is never the problem, the operating model is.
- Replace the deliverables section of your RFP with the four operating disciplines
- Require named owners on the agency side for each discipline, not titles
- Ask for their standard weekly, monthly, and quarterly cadence artifacts as attachments
- Score responses on specificity and walk away from anything that reads like a template
The 90-Day Onboarding Alignment Framework
The first 90 days decide whether the relationship will produce pipeline or produce excuses. This framework breaks onboarding into three named phases with explicit success criteria: Days 1 to 30 (Calibration), Days 31 to 60 (Wiring), and Days 61 to 90 (Production). Each phase has named KPIs, SLA commitments, and CRM integration milestones that get signed before the contract does.
- Define Calibration phase exit criteria around ICP alignment and brand and message audit signoff
- Define Wiring phase exit criteria around attribution stack integration and reporting go-live
- Define Production phase exit criteria around first measurable pipeline contribution and unit economics review
- Tie a meaningful percentage of fees in quarter one to phase exit criteria being met
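One way to picture the fee tie-in from the last step: hold back a share of the quarter-one fee and release it as each phase exits. This is a sketch under stated assumptions, not a contract structure from the framework. The at-risk share is whatever you negotiate, and splitting it equally across the three phases is an assumption made here for simplicity.

```python
# The three onboarding phases named above.
PHASES = ("Calibration", "Wiring", "Production")


def fees_earned(quarter_fee, at_risk_share, phases_met):
    """Return the fee earned given which phase exit criteria were met.

    `at_risk_share` is the fraction of the quarter fee tied to exit
    criteria (a negotiated value, assumed here). The at-risk amount is
    split equally across phases, which is also an assumption.
    """
    if not 0 <= at_risk_share <= 1:
        raise ValueError("at_risk_share must be between 0 and 1")
    unknown = set(phases_met) - set(PHASES)
    if unknown:
        raise ValueError(f"unknown phases: {sorted(unknown)}")
    at_risk = quarter_fee * at_risk_share
    per_phase = at_risk / len(PHASES)
    released = per_phase * len(set(phases_met))
    return (quarter_fee - at_risk) + released


# Illustrative numbers: 20% of a 90,000 quarter at risk, Production missed.
earned = fees_earned(90_000, 0.20, {"Calibration", "Wiring"})
```

With these example numbers the agency earns the 72,000 base plus two of three 6,000 releases. The specific split matters less than having it written down before signature.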
When to Use This Framework
Use this framework when you are a CMO, VP of Marketing, or CEO at a growth-stage or enterprise B2B technology company actively evaluating an outside agency partnership to build or rebuild a pipeline engine. It fits best when you are under board or revenue pressure to deliver predictable pipeline within two to four quarters, when your internal team lacks specific capability in AI-augmented demand generation or modern attribution, and when you have been burned by a prior agency relationship that overpromised and underdelivered. Prerequisites include executive sponsorship for the selection process, willingness to share real ICP and pipeline data with finalists under NDA, and authority to commit at least 90 days of paid engagement for a fair production test. The framework is less useful for pure brand or creative projects with no pipeline accountability, for sub-twenty-thousand-dollar tactical scopes where the overhead exceeds the spend, or for situations where the real problem is product-market fit rather than marketing execution. If you are not yet sure whether you need an agency at all, run framework one in isolation first. The ICP-to-Demand-State diagnosis will tell you whether the gap is capability, capacity, or strategy, and only the first two are agency problems.
Explore this territory
Every published piece in this topical cluster, grouped by format.
Related Insights
How to Vet and Onboard a B2B Marketing Agency
Five practitioner procedures for vetting and onboarding a B2B marketing agency that builds an AI-augmented pipeline engine without sacrificing fundamentals.
Guide: How to Use Gen AI in B2B Marketing Without Wasting Budget
A practitioner's framework for gen AI in B2B marketing from The Starr Conspiracy. Strategy, sequencing, tools, and the failure modes that waste budget.
Guide: Marketing Messaging Framework Template for B2B
Build a marketing messaging framework that drives alignment and converts. The Starr Conspiracy's step-by-step template for B2B teams, with examples.
Guide: Demand Generation KPIs That Actually Predict Revenue
A practitioner-grade framework for demand generation KPIs: which metrics predict revenue, which are vanity, and how to report them to executives.
Framework: B2B Agency Vetting Frameworks
Six structured frameworks for vetting B2B marketing agencies on CRM-backed pipeline proof, industry fit, and revenue attribution across complex buying cycles.
Framework: B2B Value Proposition Frameworks
Six sequenced methodologies for building differentiated B2B value propositions that convert enterprise buying committees under GTM pressure.