How to Vet and Onboard a B2B Marketing Agency

JJ La Pata | May 26, 2026 | Last updated: May 26, 2026

How to Vet and Onboard a B2B Marketing Agency: 5 Procedures for Pipeline-Focused Marketing Leaders

To vet and onboard a B2B marketing agency that builds a predictable, AI-augmented pipeline engine, follow these 5 procedures. You will need a current ICP definition, four quarters of pipeline data, CRM admin access, and CFO-aligned revenue targets. This process takes 45 to 60 days. The Starr Conspiracy recommends running all 5 procedures in sequence, not in parallel.

Step Summary Block

1. Audit your ICP and demand states before any agency call.
2. Pressure-test agencies with a structured discovery call protocol.
3. Evaluate agency AI capability against a live working stack.
4. Establish attribution and CRM integration in the first 30 days.
5. Align sales SLAs and 90-day pipeline goals before launch.

This is how to vet and onboard a B2B marketing agency without ending up with 18 months of activity and no pipeline movement. Most agency vetting guides stop at the question list. This one sequences the work that surrounds the pick, which is where partnerships actually break. This is not a procurement checklist. It is how you stand up the operating system that runs your next 18 months of pipeline. For category context, see our demand states glossary entry.

How to Sequence These Procedures

Order matters because each step produces inputs the next step consumes. Step 1 produces the ICP segments Step 2 stress-tests in discovery. Step 2 produces the agency ranking that determines whose AI stack you audit in Step 3. Step 3 produces the operational maturity read that tells you how much governance to build into Step 4. Step 4 produces the attribution wiring that makes the Step 5 SLA enforceable. Skip a step and the next one runs blind, meaning the agency picks targeting without segments, demos AI without governance, or launches campaigns without an attribution path. Run them in parallel and you negotiate scope before you know what good looks like. If your CEO is pushing you to compress the timeline, the right answer is to start Step 1 this week, not to collapse the sequence. Every week you delay is a week of pipeline you cannot recover by Q4.

Prerequisites / What You Need Before Starting

Skip any of these and the rest of the sequence breaks. Expect sales resistance and CFO scrutiny on at least two of these. Build the coalition before you build the shortlist.

Pipeline data, four quarters minimum. Source, stage, velocity, win rate by segment, exported from your CRM, not from a deck.
A written ICP, even a rough one. Industry, company size, buying committee roles, trigger events. If it lives in someone's head, write it down first. See our B2B marketing operations guide if you need a starting structure.
CFO-aligned revenue targets for the next 12 months. Not a marketing number. The number sales is carrying.
CRM and marketing automation admin access. You will need to grant the agency scoped permissions in Step 4. Confirm now.
A named executive sponsor. CMO or VP Marketing, not a director. Agency selection cannot be delegated below the budget owner.

If you do not have these, stop and build them. Engaging an agency without this foundation is how you buy activity instead of pipeline, and miss the quarter you were trying to save.

Step 1: Audit Your ICP and Demand States

Audit your own house before the first discovery call. Pull your closed-won deals from the last four quarters. Cluster them by industry, employee count, tech stack, and the trigger event that started the cycle. Map each cluster to a demand state (the buying posture a segment is in, from unaware to actively evaluating). In our engagements, buyers routinely ask agencies to fix targeting problems they have not diagnosed, which is how you pay agency rates for your own homework.

Decision criteria: Pass if you can name 3 to 5 concrete segments and the demand state each enters your pipeline at. Fail if any segment is defined only by industry, or if you cannot point to which segment produced the most closed-won revenue last year.
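The clustering and the pass/fail check above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the `Deal` fields, the `size_band` buckets, and the function names are hypothetical, tech stack is left out for brevity, and a real CRM export would need its own column mapping.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Deal:
    """One closed-won deal. Field names are illustrative, not a CRM schema."""
    industry: str
    size_band: str      # hypothetical employee-count bucket, e.g. "200-999"
    trigger_event: str  # what started the buying cycle
    demand_state: str   # buying posture when the deal entered the pipeline
    revenue: float


def summarize_segments(deals):
    """Group deals into segments, ranked by closed-won revenue (highest first)."""
    revenue = defaultdict(float)
    states = defaultdict(Counter)
    for deal in deals:
        key = (deal.industry, deal.size_band, deal.trigger_event)
        revenue[key] += deal.revenue
        states[key][deal.demand_state] += 1
    ranked = sorted(revenue, key=revenue.get, reverse=True)
    return [
        {
            "segment": key,
            "revenue": revenue[key],
            # Most common demand state at pipeline entry for this segment.
            "entry_state": states[key].most_common(1)[0][0],
        }
        for key in ranked
    ]


def passes_step1(segments):
    """Pass with 3 to 5 segments, none of them defined by industry alone."""
    if not 3 <= len(segments) <= 5:
        return False
    # An industry-only segment has an empty size band and an empty trigger event.
    return all(any(s["segment"][1:]) for s in segments)
```

The first entry in the ranked list answers the question of which segment produced the most closed-won revenue. In practice you would likely merge small clusters by hand before applying the 3-to-5 check, since raw exports tend to produce more clusters than usable segments.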

Output and expected outcome: An ICP segment list with demand-state distribution, tight enough to reframe every agency conversation that follows because the segments doing the work are now visible.

Check: The top 3 ICP segments are named, each mapped to a demand state, and the list exists in writing.

The segment list feeds directly into Step 2 as the data you will hand agencies during discovery.

Step 2: Pressure-Test Agencies With a Structured Discovery Call Protocol

A discovery call is a diagnostic, not a sales meeting. You are testing whether the agency thinks in systems or in tactics. If they can't show it live, it's not a system, it's theater.

How to run it: Prep the rubric and ICP brief 48 hours before. Run the same 10-question protocol with every agency on your shortlist. Score answers live. Debrief within 24 hours while signal is fresh.

The 10 questions, in order:

1. How would you segment our pipeline data if we sent it tomorrow?
2. What would you refuse to do in our first 90 days, and why?
3. Name a client where you killed a channel because it was not working. What replaced it?
4. How do you price, what will you not price, and what triggers a scope change?
5. Walk us through a live AI workflow from a real engagement.
6. What does your attribution setup look like in month one?
7. How do you handle a sales team that misses SLA?
8. What is your point of view on RFP processes?
9. Who on your team will we actually work with, and what is their tenure?
10. What would cause you to fire us as a client?

Rubric: Score each answer 1 to 5 on three dimensions: systems thinking, operational specificity, and willingness to disagree. Total out of 150 (10 questions, 3 dimensions, 5 points each).

Decision criteria: Pass at 110 or higher with no dimension averaging below 3. Auto-fail if the agency cannot answer Question 2 or Question 10 without a case study deflection. RFP-only shops that refuse the diagnostic format are a fail, not a pause. See our B2B agency comparison guide for the full scoring template.
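The rubric arithmetic and pass rule can be written down as a small scoring function so every agency is scored the same way. This is a sketch under stated assumptions: the dimension keys, the `deflected` set (questions answered only with a case study deflection), and the `refused_diagnostic` flag are hypothetical names for the conditions described above.

```python
DIMENSIONS = (
    "systems_thinking",
    "operational_specificity",
    "willingness_to_disagree",
)
NUM_QUESTIONS = 10
PASS_TOTAL = 110             # out of 10 questions x 3 dimensions x 5 points = 150
MIN_DIMENSION_AVG = 3.0
AUTO_FAIL_QUESTIONS = (2, 10)


def score_agency(scores, deflected=(), refused_diagnostic=False):
    """Score one agency.

    scores: {question_number: {dimension: score from 1 to 5}} for questions 1..10.
    deflected: question numbers the agency dodged with a case study.
    refused_diagnostic: True for RFP-only shops that reject the format.
    """
    if set(scores) != set(range(1, NUM_QUESTIONS + 1)):
        raise ValueError("expected scores for questions 1 through 10")
    for question, dims in scores.items():
        for dim in DIMENSIONS:
            if not 1 <= dims[dim] <= 5:
                raise ValueError(f"question {question}, {dim}: score must be 1 to 5")

    total = sum(dims[dim] for dims in scores.values() for dim in DIMENSIONS)
    averages = {
        dim: sum(scores[q][dim] for q in scores) / NUM_QUESTIONS
        for dim in DIMENSIONS
    }
    auto_fail = refused_diagnostic or any(q in deflected for q in AUTO_FAIL_QUESTIONS)
    passed = (
        not auto_fail
        and total >= PASS_TOTAL
        and min(averages.values()) >= MIN_DIMENSION_AVG
    )
    return {"total": total, "averages": averages, "auto_fail": auto_fail, "passed": passed}
```

The per-dimension floor matters: an agency scoring 5, 5, and 2 on every question totals 120, which clears the 110 bar, but still fails because willingness to disagree averages 2.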

Output: Agency Fit Scorecard with at least 3 agencies ranked, producing a defensible shortlist of 2 to 3 finalists you can present to the CEO without referring to vibes.

Check: Rubric scores documented per agency, at least 1 sales leader attended or reviewed the recordings, and decisions written before any second meeting.

The ranked shortlist becomes the input for the Step 3 AI audit.

Step 3: Evaluate Agency AI Capability Against a Live Working Stack

Demand a live audit, not a deck. Every agency now claims AI capability. Most have one prompt library and a Zapier account. If their "AI stack" is a prompt doc and vibes, you already have your answer.

Ask each finalist to walk through, screen-shared, a live workflow from a real engagement (sensitive details removed) where AI changes a revenue-consequential decision. Evaluate against four criteria:

Tools. Which named platforms are open during the demo. Not slideware.
Models. Which models for which task, and the reasoning behind the choice.
Governance. How they handle prompt drift (outputs degrade as inputs change), model deprecation, and client data handling.
Measurement. What changes in the pipeline number when the AI workflow runs versus when it does not, for example, SQL conversion rate on a single segment moving 15 to 25 percent after the workflow ships.
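When an agency quotes a number for the measurement criterion, ask whether it is an absolute or a relative change. A phrase like "moving 15 to 25 percent" can be read either way, and the two readings are far apart. The sketch below shows the arithmetic; the function names and the lead counts in the usage note are hypothetical.

```python
def conversion_rate(sqls, mqls):
    """Share of MQLs that became SQLs, as a fraction between 0 and 1."""
    if mqls <= 0:
        raise ValueError("mqls must be positive")
    return sqls / mqls


def lift(baseline_rate, treated_rate):
    """Return (absolute change in percentage points, relative change in percent)."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    absolute_points = (treated_rate - baseline_rate) * 100
    relative_pct = (treated_rate - baseline_rate) / baseline_rate * 100
    return absolute_points, relative_pct
```

With made-up counts of 30 SQLs from 200 MQLs before the workflow and 50 from 200 after, the rate moves from 15 percent to 25 percent: roughly 10 percentage points absolute, but roughly a 67 percent relative lift. Get the agency to state which one they mean, and on which segment.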

The Starr Conspiracy does not sell AI experiments. We build marketing systems that actually work, and the standard for AI capability is governance, repeatability, and measurement, not novelty. AI is augmentation, an operating-model change that protects what makes your company great, not a replacement headcount play. See our AI-native marketing operations guide for the maturity model.

Decision criteria: Pass if you watched the workflow run live, you can name the tools, and the agency can articulate the governance layer without prompting. Fail on any deck-only demo or any answer that conflates a prompt library with a system.

Output: AI stack audit notes per finalist, sufficient to decide how much governance you need to wire into Step 4.

Check: A live demo was seen per finalist; tools, models, and governance controls are written down; and the agency named what they will not let AI do unsupervised.

Those governance notes become the constraints for the Step 4 integration plan.

Step 4: Establish Attribution and CRM Integration in the First 30 Days

Complete this step by Day 30. This is the step almost every buyer skips and every successful partnership executes. Attribution is the flight recorder. Without it, your QBR (quarterly business review) has no read on which segment moved the pipeline number, which channel underperformed, or whether an AI workflow like the one you audited in Step 3 made a difference.

In the first 30 days after signing, you and the agency jointly configure it. Not later. Not after the first campaign. First.

The configuration checklist:

CRM fields. Standardize Lead Source and Original Source on the Lead and Contact objects, Campaign on the Campaign object, and Opportunity Source on the Opportunity object. Decide once where source-of-truth lives, and which fields are read-only after creation.
Taxonomy. UTM parameters (source, medium, campaign, content, term), applied consistently across channels. Document the naming convention in a single sheet both teams can edit.
Definitions. What counts as marketing-sourced (first touch is a marketing touch) versus influenced (any marketing touch). Pick one, write it down.
Model. If multi-touch, agree on weighting before any campaign produces a touch.
Test. Run a test lead end to end and confirm it appears correctly in every system within 24 to 48 hours, depending on your platform sync cadence.
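The taxonomy item in the checklist above is easier to enforce when the naming convention is checked by a script rather than by memory. The sketch below assumes a convention of lowercase letters and digits separated by hyphens or underscores, and assumes source, medium, and campaign are required while content and term are optional. Both are illustrative choices; substitute whatever your shared sheet documents.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumed split between required and optional parameters.
REQUIRED = ("utm_source", "utm_medium", "utm_campaign")
OPTIONAL = ("utm_content", "utm_term")

# Assumed convention: lowercase alphanumerics joined by single "-" or "_".
VALUE_PATTERN = re.compile(r"[a-z0-9]+(?:[-_][a-z0-9]+)*")


def check_utm(url):
    """Return a list of taxonomy problems for one landing-page URL."""
    params = parse_qs(urlparse(url).query)  # blank values are dropped by default
    problems = []
    for key in REQUIRED:
        if key not in params:
            problems.append(f"missing {key}")
    for key in REQUIRED + OPTIONAL:
        for value in params.get(key, []):
            if not VALUE_PATTERN.fullmatch(value):
                problems.append(f"{key}={value!r} breaks naming convention")
    return problems
```

Running every campaign URL through a check like this before launch catches the usual drift, such as "LinkedIn" versus "linkedin", before it fragments your source reporting.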

Grant the agency scoped read access to pipeline data, not just MQL counts. If you cannot grant CRM access by Day 14, define a read-only reporting export cadence with a named owner on both sides, weekly minimum. The objection "we cannot share data" is a governance conversation, not a stop sign. Unless legal or security constraints genuinely block external access, solve it before kickoff or every QBR becomes an argument about whose number is right.

Decision criteria: Pass when a test lead has traveled the full path, both teams agree on what the dashboard shows, and source-of-truth ownership is named per field. Fail if attribution definitions are still in debate at Day 30.
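One way to take attribution definitions out of debate is to write them as rules both teams can run. The sketch below encodes the sourced and influenced definitions, plus a position-based weighting for multi-touch. The 40/20/40 split is a commonly used default shown only as an illustration, not a recommendation from this guide, and the owner labels and function names are hypothetical.

```python
def classify_opportunity(touch_owners):
    """Classify an opportunity from its touches, ordered first to last.

    touch_owners: list of strings such as "marketing" or "sales".
    """
    if not touch_owners:
        return "unattributed"
    if touch_owners[0] == "marketing":
        return "marketing-sourced"      # first touch is a marketing touch
    if "marketing" in touch_owners:
        return "marketing-influenced"   # any marketing touch
    return "not-marketing"


def position_weights(num_touches, first=0.4, last=0.4):
    """Position-based credit: fixed share to first and last touch, rest split evenly."""
    if num_touches <= 0:
        raise ValueError("num_touches must be positive")
    if not (0 <= first and 0 <= last and first + last <= 1):
        raise ValueError("first and last must be non-negative and sum to at most 1")
    if num_touches == 1:
        return [1.0]
    if num_touches == 2:
        total = first + last
        return [first / total, last / total]
    middle = (1.0 - first - last) / (num_touches - 2)
    return [first] + [middle] * (num_touches - 2) + [last]
```

Whatever weights you agree on, fix them before the first campaign produces a touch, and store the agreed values in the attribution specification rather than in a dashboard setting only one side controls.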

Output: Attribution specification document and verified test lead path, producing an agency accountable to revenue, not activity, from the first campaign forward.

Check: Spec signed by marketing ops and the agency lead, test lead documented with timestamps, and scoped CRM access or export cadence live.

The signed spec is the precondition for the Step 5 SLA session.

Step 5: Align Sales SLAs and 90-Day Pipeline Goals Before Launch

Complete this step before any campaign goes live. Run a joint working session with sales, marketing, and the agency in one room.

Define the MQL. Define the SQL. Define the response-time SLA (service level agreement, the contracted response window) from sales when marketing hands over a lead, in minutes, with consequences when missed. Our operating rule: 15 minutes for inbound demo requests, 4 business hours for content-sourced MQLs, with the owner named per segment. Set the 90-day pipeline goal as a single number, not a range, broken out by segment from Step 1. Agree on weekly cadence for pipeline review and monthly cadence for strategy adjustment.
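The response-time rule above can be checked automatically once handoff and first-response timestamps are in the CRM. The sketch below assumes the 15-minute window is wall-clock time, that business hours are 09:00 to 17:00 Monday to Friday, and that timestamps are naive datetimes in one timezone. All three are assumptions to replace with whatever your signed SLA specifies, and the lead-type keys are hypothetical.

```python
from datetime import datetime, timedelta

# Limits in minutes, taken from the operating rule above.
SLA_MINUTES = {
    "inbound_demo_request": 15,   # assumed to be wall-clock minutes
    "content_mql": 4 * 60,        # 4 business hours
}


def business_minutes_between(start, end, open_hour=9, close_hour=17):
    """Minutes between start and end that fall inside Mon-Fri business hours."""
    if end <= start:
        return 0.0
    total = 0.0
    day = start.replace(hour=0, minute=0, second=0, microsecond=0)
    while day <= end:
        if day.weekday() < 5:  # Monday is 0, Friday is 4
            window_open = day.replace(hour=open_hour)
            window_close = day.replace(hour=close_hour)
            overlap_start = max(start, window_open)
            overlap_end = min(end, window_close)
            if overlap_end > overlap_start:
                total += (overlap_end - overlap_start).total_seconds() / 60
        day += timedelta(days=1)
    return total


def sla_met(lead_type, handed_over, first_response):
    """True if sales responded inside the window for this lead type."""
    limit = SLA_MINUTES[lead_type]
    if lead_type == "inbound_demo_request":
        elapsed = (first_response - handed_over).total_seconds() / 60
    else:
        elapsed = business_minutes_between(handed_over, first_response)
    return elapsed <= limit
```

For example, a content-sourced MQL handed over at 16:00 on a Friday and answered at 11:00 the following Monday has used 3 business hours under these assumptions, so it is inside the 4-hour window. The consequences for a miss still have to be agreed in the room; the script only tells you when one happened.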

The Starr Conspiracy runs this as a working meeting, not a kickoff celebration, because the artifacts produced here govern the next 12 months. The signed SLA document lives in the shared workspace, is owned jointly by the VP Sales and CMO, and is reviewed at every QBR. Expect internal politics. If sales will not show up, escalate to the CEO before signing. "Sales won't join" is the diagnosis, not the obstacle, and an agency cannot fix a handoff the organization has not committed to fixing.

Decision criteria: Pass when MQL, SQL, SLA, and the 90-day number are signed by both function heads and visible on a shared dashboard. Fail if a deputy signs on the sales side, or if the 90-day number is expressed as a range.

Output: Signed SLA document and 90-day pipeline plan with segment-level numbers, producing a handoff system the agency can be held to and sales has agreed to.

Check: Signatures from VP Sales and CMO, live dashboard, and review cadence on calendars through the first QBR.

Common Mistakes to Avoid

Treating Step 1 as optional. Buyers skip the ICP audit because it feels like work the agency should do. An agency cannot diagnose what you have not measured. Skipping this means the agency spends the first 60 days doing your homework on your dime.
Running discovery calls without a rubric. In Step 2, scoring after the call is scoring by recency bias. The agency you talked to last week always sounds best. Build the rubric first, score during the call, decide after all candidates are done.
Accepting AI capability claims without a live demo. In Step 3, decks are not evidence. If an agency cannot show the working system in 20 minutes of screen share, the system does not exist at the maturity they are claiming.
Deferring attribution to month three. In Step 4, both sides want to start campaigns. Resist. 30 days of attribution setup saves 12 months of arguing about results.
Letting sales skip the SLA session. In Step 5, if the VP of Sales sends a deputy, the SLA will not hold. Reschedule until the budget owner on the sales side is in the room. The Starr Conspiracy has watched more partnerships die at this step than any other.

The Bottom Line

Vetting a B2B marketing agency is not a procurement exercise. It is the design phase of your next 18 months of pipeline. Run these 5 procedures in order: audit ICP and demand states, pressure-test with a discovery protocol, audit AI against a live stack, wire attribution in the first 30 days, and align sales SLAs before launch. Do not collapse the sequence, and do not start campaigns before Step 5 is signed. The agencies that survive this process are worth hiring. The ones that resist it are telling you what working with them will feel like.

If you need to shortlist agencies fast without gambling your quarter, book a 30-minute agency vetting diagnostic with The Starr Conspiracy before you sign anything. You leave with a ranked shortlist rubric, a flagged-risk list per finalist, and a 30-day onboarding plan you can defend to your CEO.

Related Questions

How long should the full vetting and onboarding process take?

Plan for 45 to 60 days from first discovery call to a signed agreement, then another 30 days for attribution setup before campaigns launch. The Step 5 SLA and 90-day plan are signed once attribution is in place and before the first campaign goes live. Our operating rule is that any sequence compressed below 30 days from first call to launch loses the attribution and SLA work, which is where first-year partnerships fail. If you cannot run the full sequence before quarter-end, start it now and stage campaign launch into next quarter. See our B2B marketing operations guide for the full timeline.

What B2B agency discovery call questions actually predict fit?

Questions that predict fit are diagnostic, not promotional. Ask what the agency would refuse to do in the first 90 days, where they killed a channel for a prior client, and how they handle scope creep. Decision rule: if every answer ends in a case study reference, they are selling, not diagnosing. You want the agency that diagnoses.

How do I assess agency AI and automation capabilities without being technical?

You do not need to evaluate the models. You need to evaluate the workflow. Ask for a live screen share of a workflow where AI changes a revenue-consequential decision. Decision rule: if the demo is a slide deck or a prompt library, the capability is not operational. If the demo is a working system with named tools, governance controls, and observable outputs, it is. See our demand generation framework for how AI maps to campaign structure.

What belongs in an agency onboarding first 90 days plan?

Days 1 to 30 are attribution and CRM integration. Days 31 to 60 are SLA alignment, baseline measurement, and first-campaign launch in a single segment from the ICP audit. Days 61 to 90 are measurement, iteration, and the first quarterly business review against the pipeline number set in Step 5.

Should we run an RFP or a diagnostic selection?

For pipeline-pressured buyers, a diagnostic selection (the 5 procedures above) outperforms an RFP because the RFP measures writing quality, not operating capability. Run the diagnostic, then formalize the chosen partner with a tight MSA. If procurement requires an RFP, attach the Step 2 rubric as the evaluation criteria and the Step 3 live demo as a mandatory stage. Want help running it? Book a vetting diagnostic with The Starr Conspiracy.

Should I hire one agency or stack specialists?

For most growth-stage B2B companies under pipeline pressure, one agency that owns the integrated system outperforms three specialists you have to coordinate. Specialist stacks work when you have a mature in-house marketing ops team to orchestrate them. Decision rule: if you are reading this guide, you probably do not have that team yet.

About the Author

JJ La Pata, Chief Strategy Officer

Drives go-to-market strategy and demand generation for TSC clients. Expert in building B2B growth engines.
