When "ABC Company LLC" Doesn't Match "ABC Company, LLC": Solving Fuzzy Matching in Loan Underwriting
The applicant writes "ABC Company LLC" on their application. The Secretary of State database returns "A.B.C. Company, L.L.C." Same business. Different strings. Your verification fails.
This is the fuzzy matching problem, and it affects every lender processing applications at scale. A business name matching API solves this by applying intelligent algorithms that recognize variations, normalize formatting, and return confidence scores indicating match quality. Without fuzzy matching, legitimate businesses fail verification due to punctuation, abbreviation, and formatting differences that have nothing to do with fraud or business legitimacy.
Why Business Names Don't Match
Business name variations occur for predictable reasons. Understanding these patterns helps configure matching logic appropriately.
Common variation types
Punctuation differences:
• "A.B.C. Company" vs. "ABC Company"
• "Smith & Sons" vs. "Smith and Sons"
• "O'Brien Consulting" vs. "OBrien Consulting"

Suffix variations:
• "LLC" vs. "L.L.C." vs. "Limited Liability Company"
• "Inc." vs. "Inc" vs. "Incorporated"
• "Corp." vs. "Corporation"

Abbreviations:
• "International" vs. "Intl" vs. "Int'l"
• "Manufacturing" vs. "Mfg"
• "Services" vs. "Svcs"

Word order and spacing:
• "First National Bank" vs. "National Bank, First"
• "NewYork" vs. "New York"

DBA vs. legal name:
• Applicant uses trade name; state records show legal entity name
• Parent company vs. subsidiary naming
According to research on company name standardization, automated fuzzy matching approaches can reduce manual harmonization effort "to less than 15% of that when done entirely manually."¹
How Fuzzy Matching Works
Fuzzy matching algorithms compare strings and return similarity scores indicating how closely two values match, even when they're not identical.
The matching process
Step 1: Preprocessing
Before comparison, both strings are normalized:
• Convert to consistent case (uppercase or lowercase)
• Strip punctuation (periods, commas, apostrophes)
• Normalize suffixes (L.L.C. → LLC, Limited Liability Company → LLC)
• Remove common business words that add noise ("The," "Company," "Group")
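The preprocessing step can be sketched as a small Python routine. The suffix map and noise-word list below are illustrative stand-ins, not production-complete tables:

```python
import re

# Illustrative tables; production systems maintain much larger lists.
SUFFIXES = {
    "llc": "llc", "l.l.c.": "llc", "limited liability company": "llc",
    "inc": "inc", "inc.": "inc", "incorporated": "inc",
    "corp": "corp", "corp.": "corp", "corporation": "corp",
}
NOISE_WORDS = {"the", "company", "group"}

def normalize(name: str) -> str:
    s = name.lower()
    # Normalize the suffix before stripping punctuation, so forms
    # like "l.l.c." are still recognizable.
    for raw, canon in SUFFIXES.items():
        if s.endswith(raw):
            s = s[: -len(raw)] + canon
            break
    s = re.sub(r"[^\w\s]", "", s)  # strip punctuation
    tokens = [t for t in s.split() if t not in NOISE_WORDS]
    return " ".join(tokens).upper()
```

After normalization, the article's opening example collapses to the same string: `normalize("ABC Company LLC")` and `normalize("A.B.C. Company, L.L.C.")` both yield `"ABC LLC"`.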
Step 2: Algorithm comparison
Multiple algorithms evaluate similarity:
• Levenshtein distance: Counts the minimum edits (insertions, deletions, substitutions) needed to transform one string into another
• Token matching: Compares individual words regardless of order
• Phonetic matching: Identifies names that sound similar but are spelled differently
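The first two approaches can be illustrated with Python's standard library. `difflib.SequenceMatcher` is not Levenshtein distance, but it returns a comparable character-level similarity ratio; the token comparison is a simple Jaccard overlap. Phonetic matching (e.g., Soundex) is omitted for brevity:

```python
from difflib import SequenceMatcher

def edit_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for
    normalized Levenshtein distance."""
    return SequenceMatcher(None, a, b).ratio()

def token_similarity(a: str, b: str) -> float:
    """Order-insensitive word overlap (Jaccard index on token sets)."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

# Word order differs, so token matching scores higher than the
# character-level comparison:
a, b = "FIRST NATIONAL BANK", "NATIONAL BANK FIRST"
# token_similarity(a, b) == 1.0; edit_similarity(a, b) is lower
```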
Step 3: Confidence scoring
Results return as a confidence score between 0.0 and 1.0:
• 1.0: Exact match after normalization
• 0.90-0.99: Minor variations (punctuation, spacing)
• 0.70-0.89: Moderate variations (abbreviations, word order)
• Below 0.70: Significant differences requiring review
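Mapping a raw score to these bands is a simple cascade; the band labels here are illustrative:

```python
def score_band(score: float) -> str:
    """Map a 0.0-1.0 confidence score to a descriptive band."""
    if score >= 1.0:
        return "exact match"
    if score >= 0.90:
        return "minor variations"
    if score >= 0.70:
        return "moderate variations"
    return "significant differences"
```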
AWS Entity Resolution documentation describes how advanced fuzzy matching "bridges the gap between rule-based and ML-based approaches" by providing "the ability to set similarity thresholds on string fields using fuzzy algorithms."²
Configuring Confidence Thresholds
The right threshold settings balance automation efficiency against false positive risk. Settings too strict create unnecessary manual reviews; settings too loose pass mismatched entities.
Recommended threshold framework
High confidence (≥0.80): Auto-accept
These matches proceed without manual review:
• Score indicates strong similarity after normalization
• Differences are cosmetic (punctuation, suffix formatting)
• Risk of false positive is minimal

Medium confidence (0.60-0.79): Queue for review
These matches require human judgment:
• Significant but potentially legitimate variations
• Word order differences or abbreviation mismatches
• Could be the same entity with naming inconsistencies, or could be a different entity

Low confidence (<0.60): Auto-reject or flag
These matches fail verification:
• Names differ substantially
• Likely different entities or data entry errors
• Requires applicant clarification before proceeding
Threshold calibration
Your optimal thresholds depend on:
• Application volume: Higher volume favors a lower auto-accept threshold to keep the review queue manageable
• Risk tolerance: Conservative lenders set higher auto-accept thresholds, sending more matches to review
• Deal size: Larger deals warrant stricter matching requirements
• Historical data: Analyze past false positives and negatives to tune thresholds
Start conservative (0.85 auto-accept threshold) and adjust based on exception review outcomes.
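Calibration can be sketched as a threshold sweep over labeled historical outcomes. The data below is hypothetical; in practice you would use past verification decisions with confirmed ground truth:

```python
# Hypothetical labeled history: (confidence score, truly the same entity?)
history = [
    (0.97, True), (0.91, True), (0.88, True), (0.83, False),
    (0.79, True), (0.71, False), (0.64, True), (0.42, False),
]

def rates(threshold: float) -> tuple:
    """False-positive and false-negative rates if `threshold`
    were the auto-accept cutoff."""
    fp = sum(1 for s, same in history if s >= threshold and not same)
    fn = sum(1 for s, same in history if s < threshold and same)
    return fp / len(history), fn / len(history)

# Sweep candidate thresholds and inspect the trade-off:
for t in (0.90, 0.85, 0.80):
    fp, fn = rates(t)
    print(f"threshold {t:.2f}: false positives {fp:.1%}, false negatives {fn:.1%}")
```

Dropping the cutoff from 0.85 to 0.80 in this toy dataset admits a false positive without reducing false negatives, which is exactly the trade-off the sweep is meant to surface.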
Handling Edge Cases
Some matching scenarios require special handling beyond standard fuzzy logic.
The DBA problem
Applicants often apply using their trade name while state records show the legal entity:
• Application: "Joe's Pizza"
• State record: "Joseph's Italian Restaurant LLC dba Joe's Pizza"

Solutions:
• Search both primary name and DBA/trade name fields
• Flag DBA mismatches for review rather than auto-reject
• Require applicants to provide legal entity name on application
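The first two solutions can be sketched as follows, assuming records expose both a legal name and a DBA field. The record shape and exact-string comparison are simplifications; a real implementation would apply the fuzzy scoring described earlier:

```python
# Hypothetical state-record shape; real registry APIs differ.
records = [
    {"legal_name": "JOSEPHS ITALIAN RESTAURANT LLC", "dba": "JOES PIZZA"},
    {"legal_name": "ACME WIDGETS LLC", "dba": None},
]

def search(name: str) -> list:
    """Match against both the legal name and any registered DBA,
    flagging DBA hits for review instead of auto-rejecting."""
    hits = []
    for rec in records:
        if name == rec["legal_name"]:
            hits.append((rec, "legal_name_match"))
        elif rec["dba"] and name == rec["dba"]:
            hits.append((rec, "dba_match_review"))
    return hits
```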
Parent/subsidiary confusion
Large organizations have multiple related entities:
• Application: "ABC Holdings"
• State records: "ABC Holdings Inc.", "ABC Operating Company LLC", and "ABC Ventures LP"

Solutions:
• Return all matching entities for underwriter selection
• Display registered agent and address information to help identify the correct entity
• Use EIN/TIN matching as secondary verification
Common name collisions
Generic business names match multiple unrelated entities:
• Search: "First Choice Services"
• Results: 47 entities in California alone

Solutions:
• Require state of formation to narrow results
• Use address matching as a secondary filter
• Return top matches ranked by confidence score
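The state filter and confidence ranking might look like this sketch, using `difflib` as a stand-in scorer and hypothetical candidate data:

```python
from difflib import SequenceMatcher

# Hypothetical candidates; a generic name can return dozens of entities.
candidates = [
    {"name": "FIRST CHOICE SERVICES INC", "state": "CA"},
    {"name": "FIRST CHOICE SERVICE CO",   "state": "CA"},
    {"name": "FIRST CHOICE SERVICES LLC", "state": "TX"},
]

def top_matches(query: str, state: str, limit: int = 5) -> list:
    """Filter by state of formation, then rank by similarity score."""
    scored = [
        (SequenceMatcher(None, query, c["name"]).ratio(), c)
        for c in candidates
        if c["state"] == state
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:limit]
```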
Interpreting Results
A confidence score without context creates confusion. Understanding what scores mean helps underwriters make better decisions.
What high scores indicate
A score of 0.95 means the names are nearly identical after normalization. Differences are minor:
• Punctuation only
• Suffix formatting
• Single character variations
These should auto-accept in most workflows.
What medium scores indicate
A score of 0.72 means meaningful differences exist:
• Missing or extra words
• Abbreviation vs. full word
• Possible but uncertain match
These require human review to confirm same entity.
What low scores indicate
A score of 0.45 means the names are substantially different:
• Different core words
• Possible data entry error
• Possible fraud attempt (applying as a similar-sounding legitimate business)
These fail verification and require applicant clarification.
When "no match" is the answer
Sometimes the correct result is no match at all. A business that doesn't appear in state records after thorough searching is valuable fraud intelligence. For more on using negative verification results in underwriting decisions, see our guide on interpreting negative results.
Integration Best Practices
Implementing fuzzy matching effectively requires workflow design, not just API integration.
Application design
Collect clean data from applicants:
• Require legal entity name (not just DBA)
• Specify state of formation
• Request EIN for secondary matching
• Validate format before submission
Decision logic
Build clear rules for verification outcomes:
IF confidence >= 0.80:
    status = "VERIFIED"
    proceed to underwriting
ELIF confidence >= 0.60:
    status = "REVIEW_REQUIRED"
    queue for manual verification
ELSE:
    status = "NOT_VERIFIED"
    request applicant clarification

Exception handling
Design workflows for common exceptions:
• Multiple high-confidence matches → Present options to underwriter
• No matches found → Trigger additional search with variations
• State website timeout → Retry with exponential backoff
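The timeout case can be handled with a small retry helper; `fetch` here is a placeholder for whatever function performs the state-registry request:

```python
import time

def with_backoff(fetch, max_attempts: int = 4, base_delay: float = 1.0):
    """Retry a flaky call with exponential backoff (1s, 2s, 4s, ...)."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted; surface the timeout
            time.sleep(base_delay * 2 ** attempt)
```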
Audit trail
Log matching results for compliance:
• Input search term
• Returned matches with confidence scores
• Selected match (if multiple)
• Verification decision and timestamp
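A minimal, JSON-serializable audit record covering these fields might look like this; the field names are illustrative, not a compliance standard:

```python
import json
from datetime import datetime, timezone

def audit_record(search_term, matches, selected_match, decision) -> str:
    """Serialize one verification event for the audit log."""
    return json.dumps({
        "search_term": search_term,
        "matches": matches,              # e.g. [["ABC, LLC", 0.97]]
        "selected_match": selected_match,
        "decision": decision,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
```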
Measuring Match Quality
Track these metrics to optimize fuzzy matching configuration:
Accuracy metrics:
• False positive rate: Matches accepted that shouldn't have been
• False negative rate: Legitimate businesses rejected or queued unnecessarily
• Auto-accept rate: Percentage of verifications completing without manual review

Operational metrics:
• Review queue volume: Applications requiring manual verification
• Average review time: Time spent on medium-confidence matches
• Threshold adjustment frequency: How often you're tuning settings

Quality targets:
• Auto-accept rate: 70-85% (depending on lead quality)
• False positive rate: <1%
• Review queue: Manageable with current staff
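Given a log of verification decisions, the operational metrics reduce to simple counting. The decision log below is hypothetical:

```python
from collections import Counter

# Hypothetical decision log for one reporting period (100 applications).
decisions = ["VERIFIED"] * 78 + ["REVIEW_REQUIRED"] * 15 + ["NOT_VERIFIED"] * 7

def operational_metrics(decisions: list) -> dict:
    """Summarize decision outcomes into rate and volume metrics."""
    counts = Counter(decisions)
    total = len(decisions)
    return {
        "auto_accept_rate": counts["VERIFIED"] / total,
        "review_queue_volume": counts["REVIEW_REQUIRED"],
        "reject_rate": counts["NOT_VERIFIED"] / total,
    }
```

An auto-accept rate of 0.78 on this sample would fall inside the 70-85% target band above.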
From Matching to Decision
Fuzzy matching solves the "same business, different strings" problem that causes legitimate applications to fail verification. Proper configuration—preprocessing, confidence thresholds, and exception handling—turns name variations from a manual review burden into an automated workflow.
The key is treating confidence scores as decision inputs, not final answers. High scores proceed. Low scores fail. Medium scores get human attention. This framework processes volume efficiently while maintaining verification quality.
Sources
1. Analytics Insight, "Company Name Standardization using a Fuzzy NLP Approach"
2. AWS, "Resolve Imperfect Data with Advanced Rule-Based Fuzzy Matching in AWS Entity Resolution"
3. Match Data Pro, "Complete Guide to Fuzzy/Probabilistic Data Matching and Entity Resolution"
4. WinPure, "Identity Resolution With Data Matching"
5. Sumsub, "Matching Techniques in AML Screening"