Our goal is simple: give contractors honest, useful information so they pick the right tools. Here's exactly how we score products, where the numbers come from, what can move them, and how to flag a mistake.
How Editorial Scores Work
We don't grade every product on the same checklist. A CRM and an AI call-answering service do completely different jobs, so scoring them on the same five dimensions would be lazy and misleading. Instead, every product is scored on category-specific dimensions — the things that actually matter for that type of tool — and each dimension is weighted by how important it is to contractors specifically, not to software buyers in general.
Inside a single category, the rating is a weighted average of the dimension scores. "Contractor Fit" or "Trade Specialization" usually carries the heaviest weight, because a polished tool that doesn't understand how trades work is a tool that wastes your time.
The Categories We Score
Each of these has its own published dimension framework — click any category to see the products scored against those dimensions, plus the full weighted breakdown.
Inside a Category: AI Call Answering
Here's a real one. AI call-answering services are evaluated on these 7 dimensions, with weights that add up to 100%. Contractor Fit carries the heaviest weight because a service built for dentists that also markets to plumbers has no idea what an HVAC emergency call actually sounds like.
AI Call Answering Dimensions
7 weighted criteria · totaling 100%
Contractor Fit
How well the service understands contractor workflows, trade terminology, seasonal call patterns, and job types
20% weight
Voice Quality
How natural and human-like the AI voice sounds to callers — response latency, tone, and conversational flow
15% weight
Integrations & CRM
Native connections to contractor CRMs and field service tools like ServiceTitan, Housecall Pro, Jobber, and JobNimbus
15% weight
Value for Money
Pricing transparency, cost-per-call economics, overage charges, and ROI for a typical contractor call volume
15% weight
Agentic AI Compatibility
Public API access, webhook support, and ability to plug into custom AI agent workflows, MCP servers, and automation platforms
15% weight
Emergency Handling
Ability to detect urgent calls — burst pipes, gas leaks, no heat — and route them to the right person immediately
10% weight
Lead Capture
Quality of intake forms, caller information capture, lead scoring, and how well data flows into your systems
10% weight
Where the weights come from: we set them based on operational experience running contractor businesses, patterns in customer-review data across G2, Capterra, and Reddit, and the actual cost of getting each dimension wrong. Vendor input does not influence weights. Affiliate revenue does not influence weights. Weights are reviewed quarterly and published openly — if we change them, we say so and recalculate every affected rating.
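As a rough illustration of the weighted average described above, here is a minimal Python sketch. The weights are the published AI call-answering weights; the function name `category_score` and the example dimension scores are hypothetical and invented for illustration only, not taken from any real review.

```python
# Dimension weights for AI call answering, as published in the list above.
AI_CALL_ANSWERING_WEIGHTS = {
    "Contractor Fit": 0.20,
    "Voice Quality": 0.15,
    "Integrations & CRM": 0.15,
    "Value for Money": 0.15,
    "Agentic AI Compatibility": 0.15,
    "Emergency Handling": 0.10,
    "Lead Capture": 0.10,
}


def category_score(dimension_scores, weights):
    """Weighted average of 1-5 dimension scores for one category."""
    if set(dimension_scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must total 100%")
    return sum(dimension_scores[name] * weights[name] for name in weights)


# Hypothetical dimension scores, made up purely for illustration.
example_scores = {
    "Contractor Fit": 4.0,
    "Voice Quality": 4.5,
    "Integrations & CRM": 3.0,
    "Value for Money": 3.5,
    "Agentic AI Compatibility": 2.0,
    "Emergency Handling": 4.0,
    "Lead Capture": 3.0,
}
example_result = category_score(example_scores, AI_CALL_ANSWERING_WEIGHTS)
```

With these made-up inputs the weighted score works out to roughly 3.45, and the low Agentic AI Compatibility score visibly pulls the total down even though Voice Quality is strong.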
When a Product Spans Multiple Categories
Some products serve more than one category. HubSpot is a CRM and a marketing automation platform. Thryv is a CRM and a reputation tool. JobNimbus is a CRM, a project management platform, an estimating tool, and a scheduler. These products get scored separately on each category's dimensions, and the full breakdown is visible on the review page.
The top-line rating is built from those per-category weighted scores using a 70/30 primary-vs-secondary formula, plus a flat +0.20 calibration constant, with the result capped at 5.0. The math is public, the formula is reproducible, and every component score shows up on the review page.
The Multi-Category Formula
Top-line rating, in three parts
1. Primary category, weighted at 70%
The first category in the product's category list — the lane the product positions itself around and competes hardest in. For HubSpot, that's CRM. For Thryv, that's CRM. For JobNimbus, that's CRM.
2. Average of secondary categories, weighted at 30%
Every other scored category gets averaged together first, then that average contributes 30% to the top-line. A great CRM doesn't suddenly become a worse product just because we also score it as a marketing tool — but the secondary categories still pull the rating in their direction.
3. Calibration constant: add 0.20, cap at 5.0
After the weighted score is computed, we add a flat +0.20 so our top-line ratings sit in the same 4.4–4.5 band buyers already see on Capterra and G2. We score conservatively against published feature claims and editorial benchmarks rather than vendor self-reports — calibration brings the absolute number into alignment without changing the relative ordering of products.
Single-category products skip step 2: their weighted score plus 0.20, capped at 5.0, is the top-line rating.
Worked Example · HubSpot
A multi-category product with one weak score and one strong one
HubSpot is scored in two categories. As a contractor CRM it earns a 3.03 weighted score — no job scheduling, no dispatch, no trade workflows. As a marketing automation platform it earns a 3.81 — genuinely strong as an MA tool. CRM is its primary category, because that's how HubSpot positions itself and where most contractor buyers land when they compare it.
(0.70 × 3.03) + (0.30 × 3.81) + 0.20 = 3.46 → rounds to 3.5
The CRM weakness drives the result because that's the lane HubSpot competes in for contractor buyers. The stronger marketing-automation score still pulls the rating up — secondary categories contribute, just less than the primary. The top-line you see on every page that mentions HubSpot is 3.5, the same number the formula returns.
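For readers who want to reproduce the number, the three-part formula can be sketched in a few lines of Python. The function and constant names are hypothetical. Capping before rounding and using Python's built-in `round` are assumptions, since the text above only says that ratings round to one decimal place.

```python
PRIMARY_WEIGHT = 0.70
SECONDARY_WEIGHT = 0.30
CALIBRATION = 0.20
MAX_RATING = 5.0


def top_line_rating(primary_score, secondary_scores=()):
    """Blend per-category weighted scores into one top-line rating."""
    if secondary_scores:
        secondary_avg = sum(secondary_scores) / len(secondary_scores)
        blended = PRIMARY_WEIGHT * primary_score + SECONDARY_WEIGHT * secondary_avg
    else:
        # Single-category products skip the secondary step.
        blended = primary_score
    # Assumption: the cap is applied before rounding to one decimal place.
    calibrated = min(blended + CALIBRATION, MAX_RATING)
    return round(calibrated, 1)


# HubSpot figures from the worked example: CRM 3.03 (primary), marketing automation 3.81.
hubspot = top_line_rating(3.03, [3.81])
```

Calling `top_line_rating(3.03, [3.81])` returns 3.5, matching the worked example. Passing only a primary score models a single-category product.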
Why this formula instead of an average?
We used to take a straight average across categories. That punished strong products dual-listed into adjacent categories where they're naturally weaker, and it rewarded marketing positioning over buyer accuracy. The 70/30 split keeps the primary category dominant — that's where the product earns its market position — while letting secondary scores meaningfully contribute. Updated April 26, 2026; ratings on every multi-category product were recomputed and re-published when the change shipped, and the same release added the +0.20 calibration constant.
One rating, everywhere. The number you see on a product card on a category hub, on the review page, on a comparison, in a roundup — it's always the same top-line. The per-category breakdown is still visible on every review page, so you can see exactly where a multi-category product is strong and where it's weak. We don't show different numbers in different contexts because that confuses readers and breaks trust.
The editorial score is one tradesman's evaluation against published criteria. The community score is the rest of the field weighing in — and we deliberately give more weight to the contractors who can prove they're contractors.
Two voter tiers
Anyone visiting the site can vote. Contractors who complete our verification flow have their votes weighted at the verified tier the moment they confirm their email address. Past anonymous votes from the same browser are promoted automatically when they verify — both at email confirmation and at license approval — so a contractor's earlier opinions still count once they prove who they are.
The two-gate verification process (why "verified" actually means something)
Verification has two separate gates, and they unlock different things. We split them deliberately because they answer different trust questions.
How Verification Works
Email gate → vote weight. License gate → public quote attribution.
1. Email verification (automatic, ~60 seconds)
A contractor submits the verification form, gets a magic link in their email, and clicks it. Once that magic link is clicked, their votes count at full verified weight (70% of the combined score) immediately, and any prior anonymous votes from the same browser are promoted automatically.
2. License verification (manual, usually within 24 hours)
A real person — Steven, the editor — looks up their license number against their state's contractor licensing board. Once that's confirmed, any quotes they've left on votes appear publicly with their name, business, state, trade, and license attribution. Until license verification is complete, quotes are held privately on the vote.
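The two gates can be pictured as two independent flags on a contractor record. The sketch below is illustrative only: the class names, field names, and the `browser_id` field are hypothetical and do not describe the site's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Contractor:
    email_verified: bool = False    # gate 1: magic link clicked
    license_verified: bool = False  # gate 2: manual license lookup confirmed


@dataclass
class Vote:
    browser_id: str  # hypothetical identifier for the voter's browser
    product: str
    category: str
    rating: int
    tier: str = "anonymous"


def vote_tier(contractor):
    """Email verification alone unlocks verified vote weight."""
    return "verified" if contractor.email_verified else "anonymous"


def quote_is_public(contractor):
    """Quotes are held privately until the license gate is also passed."""
    return contractor.email_verified and contractor.license_verified


def promote_votes(votes, browser_id):
    """Promote earlier anonymous votes from one browser to the verified tier."""
    promoted = 0
    for vote in votes:
        if vote.browser_id == browser_id and vote.tier == "anonymous":
            vote.tier = "verified"
            promoted += 1
    return promoted
```

Keeping the gates as separate flags mirrors the split described above: vote weight depends only on the email gate, while public quote attribution waits on the license gate.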
Fastest paths through license verification
Two things speed up the manual license review meaningfully:
- Verify with a public business email that matches the license. If you submit with an email address that appears on your public business profile (Google Business, state board listing, your own contracting website) — and that profile shares the license number you entered — we can confirm the match in minutes rather than chasing down a phone confirmation. This is the fastest path.
- Make sure your public business phone number is reachable. When the email-to-license match isn't clean, we'll call the number listed for your business on your state board record or Google Business Profile. A quick "yes, I submitted that verification" is all it takes.
We chose the dual-gate model on purpose: publishing a license number publicly next to a quote is a real trust claim, and an email-only verification floor doesn't carry that weight. The faster paths above let real contractors move through quickly while still gating the part that matters for everyone reading the site.
How Community Votes Are Weighted
Verified contractors carry more weight — about 2.3× an anonymous vote
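Verified contractors (70% of the combined score): license-verified through state contractor board records. These are the people running the tool every day.
Anonymous visitors (30% of the combined score): anyone with a browser. One vote per product per category, cookie-deduped so the same person can't stuff a ballot.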
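A minimal sketch of the one-vote-per-product-per-category rule is shown here, keyed on a cookie identifier. The names are hypothetical. Ignoring a repeat vote rather than overwriting the earlier one is an assumption, because the text does not say which happens.

```python
VERIFIED_SHARE = 0.70
ANONYMOUS_SHARE = 0.30

# Ratio of the two tier shares, which is where the "about 2.3x" figure comes from.
RELATIVE_WEIGHT = VERIFIED_SHARE / ANONYMOUS_SHARE


def record_vote(ballot_box, cookie_id, product, category, rating):
    """Store a vote unless this cookie already voted on this product and category."""
    key = (cookie_id, product, category)
    if key in ballot_box:
        # Assumption: repeat votes are ignored rather than overwriting the first.
        return False
    ballot_box[key] = rating
    return True


ballot_box = {}
first = record_vote(ballot_box, "cookie-abc", "JobNimbus", "CRM", 5)
repeat = record_vote(ballot_box, "cookie-abc", "JobNimbus", "CRM", 1)
other_category = record_vote(ballot_box, "cookie-abc", "JobNimbus", "Estimating", 3)
```

Here the repeat vote is rejected, while the same cookie voting on a different category of the same product is accepted.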
What you see on a verified contractor's vote
Every verified vote carries full attribution: the contractor's first name and last initial, business name, state, trade, and license number. We show enough to make the vote meaningful and verifiable — and we never expose anything beyond what the contractor consented to during verification.
What if no contractor has voted yet?
Cold-start handling is honest: if a product has no verified votes, we show the anonymous score directly and label it that way, with no fake confidence and no padding. Once verified votes start rolling in, the score blends to the 70/30 weighting. In the rare case where only verified votes exist and no anonymous ones, we show the verified score directly.
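The fallback rules above can be sketched as a small function. It assumes each tier is averaged first and the two averages are then blended 70/30. The function name and the label strings are hypothetical.

```python
def community_score(verified_ratings, anonymous_ratings):
    """Return (score, label) for one product in one category."""

    def mean(values):
        return sum(values) / len(values)

    if not verified_ratings and not anonymous_ratings:
        return None, "no votes"
    if not verified_ratings:
        # Cold start: show the anonymous score directly and label it.
        return mean(anonymous_ratings), "anonymous only"
    if not anonymous_ratings:
        # Rare case: only verified votes exist.
        return mean(verified_ratings), "verified only"
    # Assumption: tier averages are blended 70/30.
    blended = 0.70 * mean(verified_ratings) + 0.30 * mean(anonymous_ratings)
    return blended, "blended"
```

Returning a label alongside the score keeps the displayed number honest about which votes produced it.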
Vote scoping (per dimension, per category)
Every vote is tied to a specific product and a specific category. A contractor voting on JobNimbus as a CRM and a contractor voting on JobNimbus as an estimating tool aren't getting averaged into the same number — they're scoring two different jobs the platform does. When we roll up to a top-line community score for a multi-category product, we use the same 70/30 primary-vs-secondary blend the editorial cascade uses. Community scores do not get the +0.20 calibration constant; they're already grounded in real-world signal.
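To make the scoping concrete, the sketch below groups votes by category for one product and then rolls them up with the 70/30 blend and no calibration constant. It simplifies by taking a plain average per category and ignoring voter tiers. The vote tuples and function names are hypothetical.

```python
from collections import defaultdict


def scores_by_category(votes, product):
    """Average rating per category for one product.

    Each vote is a (product, category, rating) tuple.
    """
    buckets = defaultdict(list)
    for voted_product, category, rating in votes:
        if voted_product == product:
            buckets[category].append(rating)
    return {category: sum(r) / len(r) for category, r in buckets.items()}


def community_top_line(category_scores, primary_category):
    """70/30 primary-vs-secondary blend, with no +0.20 calibration."""
    primary = category_scores[primary_category]
    secondary = [s for c, s in category_scores.items() if c != primary_category]
    if not secondary:
        return primary
    return 0.70 * primary + 0.30 * (sum(secondary) / len(secondary))


# Made-up votes for illustration only.
votes = [
    ("JobNimbus", "CRM", 4),
    ("JobNimbus", "CRM", 5),
    ("JobNimbus", "Estimating", 3),
    ("Thryv", "CRM", 2),
]
jobnimbus_scores = scores_by_category(votes, "JobNimbus")
jobnimbus_top_line = community_top_line(jobnimbus_scores, "CRM")
```

The CRM and estimating votes never land in the same bucket, so each category keeps its own number until the final roll-up.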
Why We Keep the Two Apart
It would be easier to mash the editorial score and the community score into one number. We don't.
The editorial score is a structured evaluation against published criteria. The community score is real-world signal from the people running the tool. When the two agree, that's a strong indicator. When they disagree, that's the most useful information on the page — and a blended average would erase it.
Real Examples · What the Gap Tells You
Editorial and community ratings rarely match exactly — that's the point
Editorial high · Community low
A polished platform with strong feature depth that real users find painful in production. The editorial review caught what's on the box; the community caught what's in the boxes you don't open until month two.
Editorial low · Community high
A scrappy tool that misses dimensions in our framework but absolutely nails the trade workflow it's built for. Worth paying attention to — that's a product the spec sheet underrates and the contractors using it understand.
Both high · or both low
Strongest signal you'll see. The editorial review and the contractor consensus are pointing in the same direction — pick accordingly.
Why Our Scores Often Run Lower Than Capterra and G2
Compare a Contractor ToolStack rating to the same product on Capterra or G2 and ours will frequently land 0.5 to 1.0 points lower. This is structural, not random — and it's deliberate.
Reason 1
Aggregator rating inflation
On Capterra and G2, 4.5 is the median in B2B SaaS. Almost every mature product clusters between 4.3 and 4.8. Reviewers self-select toward enthusiasts, vendors solicit reviews from happy customers (sometimes with gift cards), and unhappy customers churn instead of writing reviews. We use the full 1–5 scale honestly.
Reason 2
More granular methodology
Capterra reports four sub-scores that all cluster at 4.5+. We score 7 to 9 category-specific weighted dimensions. Weakness in any one of them drags the total down meaningfully — even when overall sentiment is high. A platform with strong ease of use but missing AI lands lower in our framework than in Capterra's, because AI is weighted in every category we score.
Reason 3
We score against where the category is heading
Capterra reviewers rate present sufficiency for their existing workflow. We rate against forward-looking dimensions like AI capabilities and integration depth. A platform with thin AI scores low in our framework even if its current customers don't notice the gap yet — we want our ratings to age well into 2027.
The pattern matches how independent reviewers like Wirecutter, NerdWallet, and Consumer Reports score products relative to vendor-influenced platforms. A 4.0 from us means more than a 4.5 elsewhere, by design.
Hands-On vs. Research-Based Reviews
Not every review is created equal, and you deserve to know the difference. We use a two-tier system, and every review is clearly labeled in its frontmatter so you know what you're getting.
Hands-On: Products I use daily across my businesses. First-person experience: real screenshots, actual workflow examples, opinions formed from months or years of daily use. When we say JobNimbus handles insurance restoration well, it's because we've processed hundreds of claims through it.
Research-Based: Products evaluated through official documentation, user reviews on G2, Capterra, and Reddit, video demos, and industry research. Thorough, but we haven't logged in and used the software day-to-day. Clearly labeled so you know the source.
We're working toward making every review hands-on. In the meantime, we'd rather give you a well-researched review now than make you wait six months for first-hand experience with every tool.
A note on dimension scoring
The hands-on label changes the depth of color in the prose, the inclusion of personal workflow examples, and the confidence in describing daily-use friction. The dimension-by-dimension scoring uses the same framework either way — same dimensions, same weights, same source rules. Hands-on doesn't earn a rating bump, and research-based isn't penalized. The difference is editorial texture, not numerical weighting.
The 1–5 Scale and What Each Tier Means
Ratings round to one decimal place. We don't inflate scores. A 4.0 from us means something.
Rating Tiers
From "best in category" to "we'd steer you elsewhere"
Exceptional
Best in category. We'd recommend it to almost any contractor in the right trade.
Very Good
Recommended for most contractors. Strong product with minor trade-offs.
Good
Solid choice with some limitations. Works well for specific use cases.
3.2
Meaningful Gaps
Works but has meaningful gaps. There are probably better options.
<3
Not Recommended
Significant issues. We'd steer you toward alternatives.
Where Our Data Comes From
Every rating, pricing claim, feature description, and quoted review traces back to a verifiable source. We cite sources inline in every review and we'll happily walk through any individual claim with a vendor or a reader.
Official Vendor Sources
- Vendor pricing pages (verified at publication)
- Vendor product / feature pages
- Vendor blog posts and press releases
- Vendor YouTube channels and demo videos
- Knowledge bases and documentation
- Earnings calls and SEC filings (where applicable)
Independent Review Platforms
- G2 (review counts, ratings, verbatim quotes)
- Capterra and Software Advice
- TrustRadius
- GetApp
- Apple App Store and Google Play Store ratings
- Better Business Bureau profiles
Community Sources
- Reddit (r/Construction, r/Roofing, r/Contractor, r/HVAC, trade subs)
- ContractorTalk and other industry forums
- Trade publications (Roofing Contractor, Pro Builder, Construction Dive, JLC Online)
- YouTube reviews by named contractors
- LinkedIn posts from leadership and named industry experts
Verification Rules
- Pricing verified directly on the vendor's pricing page (third-party aggregator pricing is treated as stale until confirmed)
- Every quoted customer review is attributed by username and source link
- Marketing claims are flagged as marketing claims when used at all
- Unverified data is labeled "unverified" rather than presented as fact
What Can and Can't Change a Rating
We update ratings during quarterly content reviews and whenever a product ships a meaningful change. Vendors and readers can submit information that triggers a re-evaluation — but the rules for what does and doesn't move a number are public and consistent.
What can change a rating:
- Pricing changes (tier additions, increases, transparency improvements)
- New features shipping (AI launches, integrations added, modules released)
- Features being removed or sunset (QuickBooks Desktop discontinuation, deprecated tiers)
- Support quality changes documented in 5+ recent reviews
- Ownership, leadership, or roadmap changes that affect product direction
- Major bug patterns or platform stability issues documented across reviews
- Score recalibration during quarterly content reviews
- Dimension-weight adjustments (with all affected ratings recalculated and the change documented publicly)
What can't change a rating:
- Vendor displeasure with the rating or with editorial framing
- Threats of legal action
- Advertising spend, sponsorship offers, or affiliate commission rates
- Partnership offers or co-marketing proposals
- Vendor-supplied "corrections" to editorial judgment (we own that)
- Marketing collateral, sales decks, or claim sheets
- Press releases announcing future features that haven't shipped
- Comparisons to competitors as a basis for adjusting our score
Correction Policy
Mistakes happen. Pricing pages change. Features ship that we missed. Integrations get added or removed. Factual corrections are a normal part of running the site — here's the process.
How to Submit a Correction
- Email the correction to info@contractortoolstack.com with the URL of the page in question, the specific claim or number you're disputing, and a verifiable source link supporting the correct information.
- We acknowledge within 5 business days with a yes/no on whether the correction qualifies as factual (per the rules above). If we need more information, we'll ask.
- Factual corrections are applied within one quarterly review cycle (so within ~90 days at the latest, often faster for time-sensitive items like pricing changes that already shipped).
- Editorial-judgment disputes are not corrections. If you disagree with a rating, a section title, an opinion, or a comparative framing, that's editorial judgment and we don't change it based on vendor preference. We'll read the message and consider whether it raises a factual issue underneath the disagreement — if it does, that part gets the factual-correction treatment.
- We publish a changelog when reviews update with material rating changes. The original publish date stays the same; the "Updated" date reflects the most recent revision.
Readers (not just vendors) can submit corrections through the same email if they spot stale pricing, missing features, or misattributed quotes — we're equally responsive to either source.
Our Affiliate Disclosure
Some links on this site are affiliate links. If you sign up through our link, we may earn a commission at no extra cost to you. This never influences our ratings or recommendations. We review software whether or not there's an affiliate program.
Plenty of products we recommend have no affiliate program at all. We still review them because the point of this site is helping contractors find the right tools — not maximizing our commissions.
Every page with affiliate links includes a disclosure at the top. No hiding it in the footer. No fine print.