Why most lead scoring models are wrong
Every scoring model we audit that was built from scratch is wrong in the same ways:
- Weights are guessed, not calibrated. Someone decided a demo request should be worth 50 points and an eBook download 5. They're not wrong directionally, but the 10:1 ratio probably isn't right, and you can't know without data.
- No decay on behavioral signals. A whitepaper download 18 months ago is worth the same as one yesterday. It isn't.
- No negative scoring. Competitor domains, unsubscribes, and explicit non-buying signals don't reduce the score; they just fail to add to it.
- MQL threshold set arbitrarily. “100 points equals MQL” — but at 100 points, what's the actual SQL conversion rate? If you don't know, the threshold is wrong.
- No periodic recalibration. The model was built once and hasn't been touched in 3 years. Buying behavior changed; the model didn't.
The 5-step framework
Step 1: Get historical data
Pull 12–24 months of lead-to-SQL conversion data with all behavioral, demographic, and firmographic dimensions you'll consider scoring on. This data lives across your MAP, CRM, and (often) a data warehouse — pull it once, properly.
Minimum data needed:
- Lead created date
- Lead → SQL conversion date (if applicable)
- Behavioral history (page visits, downloads, email engagement, form fills)
- Demographic data at conversion time (title, function, seniority)
- Firmographic data (company size, industry, tech stack)
- Source / channel
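As a sketch of what "pull it once, properly" can look like, assuming the export lands as a flat file (all column names here are illustrative and should be mapped to your own MAP/CRM fields):

```python
import pandas as pd

# Hypothetical column names -- map these to your own MAP/CRM/warehouse export.
REQUIRED_COLUMNS = [
    "lead_created_date",
    "sql_converted_date",   # null if the lead never converted
    "behavioral_events",    # page visits, downloads, email engagement, form fills
    "title", "function", "seniority",
    "company_size", "industry", "tech_stack",
    "source_channel",
]

leads = pd.read_csv(
    "lead_history.csv",
    parse_dates=["lead_created_date", "sql_converted_date"],
)

# Fail fast if the export is missing a dimension you plan to score on.
missing = [c for c in REQUIRED_COLUMNS if c not in leads.columns]
if missing:
    raise ValueError(f"Export is missing columns: {missing}")

# The binary outcome every later step keys off.
leads["converted_to_sql"] = leads["sql_converted_date"].notna()
```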
Step 2: Identify predictive signals
Run conversion-rate analysis across each signal candidate. The question to answer for each: “What's the SQL conversion rate of leads with this signal vs. without it?”
Keep signals where the high-value group's conversion rate is at least 2x the average. Drop signals where the lift is below 1.5x; they're not worth the modeling complexity. Signals in the 1.5–2x band are judgment calls: keep them only at low weight.
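A minimal sketch of that analysis, assuming each candidate signal exists as a boolean column alongside the `converted_to_sql` flag from Step 1 (signal names are illustrative):

```python
import pandas as pd

def signal_lift(leads: pd.DataFrame, signal_col: str) -> float:
    """Conversion rate of leads WITH the signal, divided by the overall rate."""
    overall_rate = leads["converted_to_sql"].mean()
    with_signal_rate = leads.loc[leads[signal_col], "converted_to_sql"].mean()
    return with_signal_rate / overall_rate

# Candidate signals as boolean columns (illustrative names).
candidates = ["visited_pricing_page", "downloaded_whitepaper", "director_plus_title"]
lifts = {s: signal_lift(leads, s) for s in candidates}

keep = {s: lift for s, lift in lifts.items() if lift >= 2.0}  # clearly predictive
drop = {s: lift for s, lift in lifts.items() if lift < 1.5}   # not worth the complexity
```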
Step 3: Set weights based on actual lift
Weight each signal proportionally to its conversion lift, not based on intuition. If “Director+ title” converts at 4x the average and “visited pricing page” at 8x, the pricing-page signal should be weighted roughly 2x the title signal (a sketch of this scaling follows the table below).
Sample weighting structure for a typical B2B SaaS:
| Signal | Type | Sample weight | Decay window |
|---|---|---|---|
| Demo request | Behavioral | +50 | 30 days |
| Pricing page visit | Behavioral | +25 | 60 days |
| Whitepaper download | Behavioral | +10 | 90 days |
| Email click | Behavioral | +5 | 90 days |
| Director+ title | Demographic | +20 | None |
| Marketing/RevOps function | Demographic | +15 | None |
| 500+ employees | Firmographic | +15 | None |
| Has Marketo/HubSpot in stack | Firmographic | +20 | None |
| Competitor domain | Negative | -50 | None |
| Student/researcher title | Negative | -50 | None |
| Unsubscribed | Negative | -100 | None |
These are illustrative — your specific weights depend on your historical data.
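One way to turn measured lifts into point values, sketched below: anchor the strongest signal at a familiar weight and scale everything else proportionally. The function and the 50-point anchor are illustrative choices, not a standard.

```python
# Scale weights so the strongest signal lands at a familiar anchor (e.g. 50 points).
ANCHOR_POINTS = 50

def weights_from_lift(lifts: dict[str, float], anchor: int = ANCHOR_POINTS) -> dict[str, int]:
    """Assign points proportional to each signal's conversion lift."""
    max_lift = max(lifts.values())
    return {signal: round(anchor * lift / max_lift) for signal, lift in lifts.items()}

# Illustrative lifts: pricing page at 8x, Director+ title at 4x.
weights = weights_from_lift({"visited_pricing_page": 8.0, "director_plus_title": 4.0})
# -> {"visited_pricing_page": 50, "director_plus_title": 25}, the ~2:1 ratio from the text
```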
Step 4: Calibrate the MQL threshold
Run scored leads through the model retroactively and plot score against SQL conversion rate. Set the MQL threshold at the score where:
- SQL conversion rate plateaus or peaks (the top 10–25% of leads)
- The volume above that threshold is what your sales team can handle
- SQL conversion rate at the threshold is acceptable to Sales
The score number itself is meaningless. What matters: at this score, X% of leads convert to SQL. That's what Sales is buying when they accept your MQLs.
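A sketch of the retroactive calibration, assuming each historical lead already carries a `score` computed with the Step 3 weights. It buckets leads by score band and reports conversion rate and volume per band, which is what you need to test the three criteria above (the 0–200 range is illustrative):

```python
import pandas as pd

# Bucket historical leads into 20-point score bands (adjust the range to your model).
buckets = pd.cut(leads["score"], bins=range(0, 201, 20))
calibration = leads.groupby(buckets, observed=True).agg(
    leads=("converted_to_sql", "size"),
    sql_rate=("converted_to_sql", "mean"),
)

# Pick the threshold where sql_rate plateaus or peaks and the lead
# volume above it matches what Sales can actually work.
print(calibration)
```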
Step 5: Add decay and review quarterly
Behavioral signals decay over 30–90 days. Demographic and firmographic scores generally don't decay — but should be re-checked annually for changes (job changes, company size changes, tech stack changes).
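A sketch of the step-decay schedule the FAQ below recommends (25% off after 30 days, 50% after 60, 75% after 90). Keeping 25% of the value past 90 days is one reasonable choice; zeroing it out instead is equally defensible.

```python
def decayed_points(base_points: int, age_days: int) -> float:
    """Step decay for behavioral signals: full value under 30 days, then 75% / 50% / 25%."""
    if age_days < 30:
        return base_points
    if age_days < 60:
        return base_points * 0.75
    if age_days < 90:
        return base_points * 0.50
    return base_points * 0.25

# A 10-point whitepaper download that is 45 days old now contributes 7.5 points.
print(decayed_points(10, 45))
```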
Quarterly review: pull the last 90 days of scored leads, check whether MQL→SQL conversion rate at threshold is still acceptable. Annual recalibration: full re-run of the model against rolling 12-month data.
Need help calibrating your scoring?
Most B2B teams' scoring models are 60% built and 40% guessed. The 30-min scoping call covers what a calibration project would look like for your instance.
Common scoring mistakes (and how to avoid them)
Mistake 1: Scoring inflation
If 60% of contacts sit above the MQL threshold, scoring is broken, usually because the threshold is too low or behavioral decay is missing. Sales starts ignoring MQLs because too many are unqualified, and the whole program loses credibility. Fix: recalibrate the threshold, add decay.
Mistake 2: Scoring under-coverage
If only 5% of contacts sit above the MQL threshold, scoring is too conservative. Leads that eventually converted were never flagged as MQLs, so Sales worked them late or not at all. Fix: lower the threshold, broaden the positive signal set.
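Both failure modes reduce to one number: the share of the database above the MQL threshold. A quick health check, sketched with the 60% and 5% rules of thumb from above (your healthy range may differ):

```python
MQL_THRESHOLD = 100  # substitute your calibrated threshold

share_above = (leads["score"] >= MQL_THRESHOLD).mean()

if share_above > 0.60:
    print(f"{share_above:.0%} above threshold: inflation. Recalibrate the threshold, add decay.")
elif share_above < 0.05:
    print(f"{share_above:.0%} above threshold: under-coverage. Lower the threshold, broaden signals.")
else:
    print(f"{share_above:.0%} above threshold: within a plausible range.")
```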
Mistake 3: No model documentation
Scoring rules live in Marketo or HubSpot but no one knows why they exist. When the original builder leaves, the model becomes untouchable — too risky to change without understanding the original logic. Fix: write down the model, the rationale, the calibration data, and the review schedule.
Mistake 4: Treating account scoring as contact scoring
Summing contact scores at the account level doesn't produce a meaningful account score: companies with more contacts always score higher, regardless of actual intent. Use a weighted average, max-of-key-roles, or proper account-level scoring tools (6sense, Demandbase) for ABM scoring.
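Two of the aggregations named above, sketched in pandas, assuming contact-level scores carry an `account_id` and a Director+ flag (both column names illustrative):

```python
# Naive summing rewards headcount; these aggregations don't.

# Average contact score per account.
account_avg = leads.groupby("account_id")["score"].mean()

# Max score among key buying roles only (Director+ as the illustrative filter).
key_roles = leads[leads["director_plus_title"]]
account_key_max = key_roles.groupby("account_id")["score"].max()
```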
Industry benchmarks
Per Salesforce's State of Marketing report, high-performing B2B teams report:
- MQL → SQL conversion rate: 25–40% (vs. 10–15% for typical mid-market)
- SQL → SAO (Sales Accepted Opportunity): 50–70%
- SAO → Won: 15–25% (varies wildly by deal size and industry)
If your MQL→SQL rate is below 15%, scoring is the most likely culprit — followed by lead routing and SLA adherence. A MOPs audit separates the causes.
Frequently Asked Questions
How many points should an MQL threshold be?
No universal answer. Set the threshold where the top 15–25% of leads sit, then adjust based on sales capacity and SQL conversion rate. The number is meaningless; the SQL conversion rate at that threshold is what matters.
Should we use behavioral, demographic, or firmographic scoring?
All three. Typical mid-market B2B model: ~40% behavioral (intent), ~30% demographic (contact fit), ~30% firmographic (account fit). Proportions vary by company.
How fast should scores decay?
Behavioral: 25% decay after 30 days, 50% after 60 days, 75% after 90 days. Demographic/firmographic: no decay, but re-check annually for changes.
Should we score in Marketo/HubSpot or in a separate tool?
Native scoring works for most B2B teams. Add 6sense / Demandbase / MadKudu when you need predictive AI, third-party intent data, or sophisticated multi-touch attribution scoring. Start native; layer in only when needed.
How often should we recalibrate the scoring model?
Review quarterly, full recalibration annually. Sooner if business model changes, sales process changes, or MQL→SQL conversion drops 20%+ from baseline.
What about negative scoring?
Essential. Common rules: -50 for competitor domains, -25 for non-business email domains, -50 for student/researcher titles, -100 for unsubscribe events.
Should we score by individual contact or by account?
Both. Contact-level for individual intent. Account-level for ABM and buying-readiness. Most B2B ABM teams need both. Native tooling (Marketo account-based smart lists, HubSpot company scoring) supports this.
What's the SLA between Marketing and Sales for MQL follow-up?
High-priority leads (demo requests): 5-minute SLA. Standard MQLs: 24-hour SLA. Per InsideSales research, conversion drops 80% past 5 minutes. SLA enforcement (Slack alerts, escalation) is high leverage.
Want help building or rebuilding your scoring model?
The 30-min scoping call covers what a calibration would look like for your team — timeline, cost, and expected lift in MQL→SQL conversion.
Related: MOPs Audit service, Marketo vs HubSpot vs Pardot, What is a MOPs audit?.