Public Data Sources for PVP Research

This is a catalog of publicly available data sources useful for building PVPs. It is organized by category, with notes on what each source reveals and which verticals benefit most.

The goal is not to use every source. The goal is to identify 2 to 5 sources per campaign that, when combined, reveal something the prospect cannot easily see on their own.


Government and Regulatory Data

These are the highest-value sources for PVP research. Government data is authoritative, publicly accessible, and updated on a predictable schedule. Most prospects have never looked at it.

OSHA (Occupational Safety and Health Administration)

  • URL: osha.gov/ords/imis
  • What it contains: Workplace inspection results, violations, penalties, abatement dates
  • Update frequency: Weekly
  • Best for: Manufacturing, construction, warehousing, food processing
  • PVP angle: Facility-level violation history, penalty trends, comparison to industry benchmarks, repeat violations that indicate systemic problems
  • Cross-reference with: BLS injury rates (industry benchmark), CMS citations (healthcare overlap), state workers comp data

FDA (Food and Drug Administration)

  • URL: fda.gov/inspections-compliance-enforcement-and-criminal-investigations
  • What it contains: Warning Letters, inspection observations (483s), recalls, enforcement actions, facility registration data
  • Update frequency: Weekly for Warning Letters, varies for other datasets
  • Best for: Pharmaceuticals, medical devices, food manufacturing, cosmetics (MoCRA compliance)
  • PVP angle: Specific observations cited during inspection, repeat findings across inspections, comparison to peer facilities, upcoming compliance deadlines
  • Cross-reference with: SEC filings (if public company, financial impact), patent data (product pipeline context), competitor Warning Letters

SEC EDGAR

  • URL: sec.gov/edgar
  • What it contains: Public company filings including 10-K, 10-Q, 8-K, proxy statements, insider transaction reports (Forms 3, 4, and 5)
  • Update frequency: As filed (continuous)
  • Best for: Any public company, or private companies with public competitors
  • PVP angle: Risk factors disclosed in filings, executive compensation tied to metrics, strategic priorities stated in MD&A sections and earnings releases (earnings call transcripts are not filed on EDGAR), competitor financial benchmarks
  • Cross-reference with: Industry-specific regulatory data, job postings (confirm stated priorities are being resourced), technology signals

HMDA (Home Mortgage Disclosure Act)

  • URL: ffiec.cfpb.gov/data-browser
  • What it contains: Mortgage application and origination data by institution, including denial rates, loan types, geographic distribution
  • Update frequency: Annual (with quarterly updates for some data)
  • Best for: Mortgage lending, banking, real estate, fair lending compliance
  • PVP angle: Denial rates by census tract or county vs peers (public HMDA data reports geography at the tract and county level, not by ZIP code), geographic lending gaps, fair lending risk indicators, market share shifts
  • Cross-reference with: Census data (demographic context), FDIC call reports (financial health), competitor HMDA data

BLS (Bureau of Labor Statistics)

  • URL: bls.gov
  • What it contains: Industry employment data, wage statistics, injury/illness rates, productivity measures, price indices
  • Update frequency: Monthly to annually depending on dataset
  • Best for: Any vertical (provides benchmarks), especially manufacturing, healthcare, construction
  • PVP angle: Industry benchmark comparisons ("your injury rate is 2x the industry average"), wage competitiveness analysis, productivity trends
  • Cross-reference with: Company-specific data from OSHA or other sources, job posting data (wage competitiveness)
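The benchmark comparison in the PVP angle above comes down to simple arithmetic. Here is a minimal sketch, assuming you already have a facility's recordable case count and total hours worked, plus an industry rate from BLS. The 200,000 multiplier is the standard base of 100 full-time workers at 40 hours a week for 50 weeks. The function names and sample numbers are illustrative, not real data.

```python
def incidence_rate(recordable_cases: int, hours_worked: float) -> float:
    """Cases per 100 full-time workers: (cases * 200,000) / hours worked."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return recordable_cases * 200_000 / hours_worked


def benchmark_multiple(facility_rate: float, industry_rate: float) -> float:
    """How many times the industry rate the facility rate is."""
    if industry_rate <= 0:
        raise ValueError("industry_rate must be positive")
    return facility_rate / industry_rate


# Illustrative numbers only (assumed, not taken from any real facility).
facility = incidence_rate(recordable_cases=6, hours_worked=300_000)
multiple = benchmark_multiple(facility, industry_rate=2.0)
print(f"Facility rate {facility:.1f} is {multiple:.1f}x the industry rate")
```

Compare like with like: the facility rate and the industry rate should cover the same case type and the same year before you quote a multiple.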

CMS (Centers for Medicare and Medicaid Services)

  • URL: data.cms.gov
  • What it contains: Hospital and facility quality ratings, inspection results, deficiency citations, Medicare payment data, provider enrollment
  • Update frequency: Monthly to quarterly
  • Best for: Healthcare (hospitals, nursing homes, home health, hospice, ASCs)
  • PVP angle: Facility-level deficiency citations, star ratings vs peers, payment trends, staffing ratios, readmission rates
  • Cross-reference with: OSHA data (overlapping safety issues), state licensing data, NPPES provider data

NPPES (National Plan and Provider Enumeration System)

  • URL: npiregistry.cms.hhs.gov
  • What it contains: NPI records for every healthcare provider and organization, including practice location, taxonomy codes, authorized official
  • Update frequency: Weekly
  • Best for: Healthcare provider outreach, facility mapping
  • PVP angle: New provider registrations (growth signal), multi-location mapping, specialty distribution, organizational structure
  • Cross-reference with: CMS quality data, state licensing boards, facility permit data

DOL (Department of Labor)

  • URL: dol.gov/agencies
  • What it contains: Wage and hour enforcement data, FMLA compliance, union election data, apprenticeship program data, foreign labor certifications
  • Update frequency: Varies by dataset
  • Best for: Any employer-focused outreach, staffing, HR technology, legal services
  • PVP angle: Wage and hour violations (back wages owed), FMLA complaint patterns, union activity as a change signal
  • Cross-reference with: Job posting volume (hiring pressure), Glassdoor reviews (employee sentiment), OSHA data (systemic workplace issues)

EIA (Energy Information Administration)

  • URL: eia.gov
  • What it contains: Energy consumption data by sector and region, utility rates, generation capacity, fuel prices, emissions data
  • Update frequency: Monthly to annually
  • Best for: Energy-intensive industries (manufacturing, data centers, cannabis cultivation, real estate), energy services
  • PVP angle: Facility energy costs vs regional benchmarks, rate change impacts, efficiency opportunity sizing
  • Cross-reference with: Building permit data (new facility energy needs), utility rate filings, EPA emissions data

SAM.gov (System for Award Management)

  • URL: sam.gov
  • What it contains: Federal contract awards, entity registrations, exclusions, wage determinations, assistance listings
  • Update frequency: Daily
  • Best for: Government contractors, defense, IT services, construction (federal projects)
  • PVP angle: Contract expiration dates (re-compete opportunities), award amounts, subcontracting opportunities, exclusion risk
  • Cross-reference with: Job postings (staffing up for contract delivery), SEC filings (if public), FPDS data (detailed contract history)

EPA (Environmental Protection Agency)

  • URL: echo.epa.gov
  • What it contains: Facility compliance history, violations, inspections, enforcement actions, emissions data, permit information
  • Update frequency: Quarterly
  • Best for: Manufacturing, chemical, oil and gas, waste management
  • PVP angle: Non-compliance history, penalty exposure, permit renewal timing, comparison to peer facilities
  • Cross-reference with: OSHA data (compounding safety issues), state environmental data, EIA energy data

Commercial and Market Signals

These sources require more interpretation than government data, but they reveal buying intent, competitive dynamics, and organizational change.

Job Postings (LinkedIn, Indeed, company career pages)

  • What it reveals: Hiring for a role means they have a gap. The job description tells you what problems they are trying to solve.
  • Update frequency: Daily
  • PVP angle: A company posting for a "Director of Compliance" probably does not have compliance figured out. A company posting 5 SDR roles is scaling outbound. The posting IS the pain signal.
  • Best practices: Track postings over time, not just snapshots. A posting that has been open for 90 days signals a harder problem.
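Tracking postings over time can be as simple as recording when each posting first appeared and checking its age on every pass. A minimal sketch, assuming you store postings as dictionaries with hypothetical `title`, `posted_on`, and `closed_on` fields (no job board exposes exactly this shape):

```python
from datetime import date


def days_open(posted_on: date, as_of: date) -> int:
    """Number of days between first sighting and the check date."""
    return (as_of - posted_on).days


def long_open_postings(postings, as_of, threshold_days=90):
    """Titles of still-open postings that have been up at least threshold_days."""
    return [
        p["title"]
        for p in postings
        if p["closed_on"] is None
        and days_open(p["posted_on"], as_of) >= threshold_days
    ]


# Hypothetical tracked postings for one company.
postings = [
    {"title": "Director of Compliance", "posted_on": date(2024, 1, 2), "closed_on": None},
    {"title": "SDR", "posted_on": date(2024, 3, 20), "closed_on": None},
    {"title": "Safety Manager", "posted_on": date(2023, 11, 1), "closed_on": date(2024, 1, 15)},
]

print(long_open_postings(postings, as_of=date(2024, 4, 15)))
```

Here only the compliance role qualifies: it has been open 104 days, the SDR role 26 days, and the safety role was already filled.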

Technology Stack Data (BuiltWith, Wappalyzer, SimilarTech)

  • What it reveals: What tools a company uses, when they adopted or dropped technologies, technology spend signals
  • Update frequency: Weekly to monthly
  • PVP angle: Technology changes are decision signals. Dropping a tool often means they were unhappy with it. Adding a tool means they are investing in that area. Running competing tools side by side suggests a fragmented stack or a migration in progress.
  • Best practices: Look for recent changes (last 90 days). Static tech stacks are less interesting than moving ones.
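Detecting a moving stack is a set comparison between two snapshots. A minimal sketch, assuming you have exported two tool lists for the same company roughly 90 days apart and maintain your own tool-to-category mapping. The tool names and the mapping are placeholders, not output from BuiltWith, Wappalyzer, or SimilarTech.

```python
from collections import defaultdict


def stack_changes(previous: set, current: set) -> dict:
    """Diff two technology snapshots of the same company."""
    return {
        "added": sorted(current - previous),
        "dropped": sorted(previous - current),
    }


def overlapping_categories(current: set, category_of: dict) -> dict:
    """Categories where the company currently runs more than one tool."""
    by_category = defaultdict(list)
    for tool in sorted(current):
        if tool in category_of:
            by_category[category_of[tool]].append(tool)
    return {cat: tools for cat, tools in by_category.items() if len(tools) > 1}


# Hypothetical snapshots; tool names are placeholders.
snapshot_old = {"CRM-A", "Chat-A", "Analytics-A"}
snapshot_new = {"CRM-A", "Chat-B", "Analytics-A", "Analytics-B"}
categories = {
    "CRM-A": "crm",
    "Chat-A": "chat",
    "Chat-B": "chat",
    "Analytics-A": "analytics",
    "Analytics-B": "analytics",
}

changes = stack_changes(snapshot_old, snapshot_new)
overlap = overlapping_categories(snapshot_new, categories)
print(changes)
print(overlap)
```

An empty diff is the "static stack" case; a non-empty `overlap` is the competing-tools case.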

Patent Filings (USPTO, Google Patents)

  • What it reveals: R&D direction, competitive positioning, product pipeline, investment areas
  • Update frequency: Weekly (new publications)
  • PVP angle: A competitor filing patents in a new area signals a strategic shift the prospect should know about. Useful for competitive intelligence PVPs.
  • Cross-reference with: Job postings (are they hiring for the patented area?), press releases, SEC filings (R&D spend)

Funding Data (Crunchbase, PitchBook)

  • What it reveals: Fundraising events, investor profiles, valuation changes, growth trajectory
  • Update frequency: As events occur
  • PVP angle: A recent raise means growth pressure and spending capacity. The round size and investor profile tell you what kind of growth they are pursuing.
  • Best practices: More useful as a timing signal than a data source for the PVP itself. Combine with other signals.

Building Permits (Local government, Shovels.ai)

  • What it reveals: Construction projects, renovation plans, new facilities, project timelines, general contractor assignments
  • Update frequency: Weekly to monthly depending on jurisdiction
  • PVP angle: New construction creates equipment needs, service needs, and compliance requirements. A permit filed without corresponding vendor activity is an opportunity.
  • Cross-reference with: DOT permits (equipment movement), contractor licensing databases, utility connection applications

Social and Content Signals

Lower reliability than government or commercial data, but useful for understanding the prospect's mindset and priorities.

LinkedIn Activity

  • What it reveals: What topics the prospect cares about, who they engage with, what content they share, job changes
  • PVP angle: A prospect posting about a challenge gives you permission to address that challenge directly. "I saw your post about [topic]. We pulled some data that might be relevant."

Conference Attendance and Speaking

  • What it reveals: Strategic priorities, industry positioning, peer network
  • PVP angle: Conference topics indicate current focus areas. Speaking engagements signal expertise areas where they would value new data.

Content Publishing (Blog posts, whitepapers, podcasts)

  • What it reveals: Thought leadership positioning, strategic priorities, market perspective
  • PVP angle: Their published content tells you what they think is important. Delivering data that supports or challenges their published position gets attention.

Review Platforms (G2, Glassdoor, industry-specific)

  • What it reveals: Customer satisfaction patterns, employee sentiment, product gaps, competitive positioning
  • PVP angle: Patterns in negative reviews reveal real problems. "Your G2 reviews mention [specific theme] in 8 of your last 12 reviews" is concrete.
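A statement like "8 of your last 12 reviews" should come from an actual count. A minimal sketch, assuming you have already collected review texts ordered newest first. The keyword matching is deliberately crude (plain substring search), and the sample reviews are invented for illustration.

```python
def theme_mentions(reviews, keywords, last_n=12):
    """Count how many of the most recent reviews mention any keyword.

    Assumes `reviews` is a list of review texts ordered newest first.
    """
    recent = reviews[:last_n]
    lowered = [k.lower() for k in keywords]
    hits = sum(1 for text in recent if any(k in text.lower() for k in lowered))
    return hits, len(recent)


# Invented sample reviews, newest first.
reviews = [
    "Onboarding took three months and support was slow.",
    "Great reporting, but setup was painful.",
    "Solid product overall.",
    "The onboarding docs are out of date.",
]

hits, total = theme_mentions(reviews, ["onboarding", "setup"])
print(f"{hits} of your last {total} reviews mention onboarding or setup")
```

Read the matched reviews before quoting the number, since substring matching will also count reviews that praise the theme.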

Cross-Referencing for Convergence

Single-source PVPs are weak. The value comes from combining sources. Here is how to think about it:

The Convergence Principle

| Number of Sources | Strength | Example |
| --- | --- | --- |
| 1 source | Trivia | "You had an OSHA violation" |
| 2 sources | Interesting | "You had an OSHA violation AND your injury rate is 2x industry average" |
| 3+ sources | Intelligence | "You had an OSHA violation, your injury rate is 2x average, AND you just posted for a Safety Director, suggesting you know it's a problem but haven't solved it yet" |
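The strength ladder can be applied mechanically once each finding is tagged with where it came from. A minimal sketch, assuming each signal is a dictionary with a hypothetical `source` field. Counting distinct sources rather than raw findings is a choice made here so that two records from the same source do not count as convergence.

```python
def convergence_strength(signals) -> str:
    """Map the number of distinct sources behind a PVP to a strength label."""
    distinct_sources = {s["source"] for s in signals}
    count = len(distinct_sources)
    if count == 0:
        return "None"
    if count == 1:
        return "Trivia"
    if count == 2:
        return "Interesting"
    return "Intelligence"


# Illustrative findings for one prospect.
signals = [
    {"source": "OSHA", "finding": "violation on record"},
    {"source": "BLS", "finding": "injury rate is 2x industry average"},
    {"source": "Job postings", "finding": "open Safety Director role"},
]

print(convergence_strength(signals))       # Intelligence
print(convergence_strength(signals[:1]))   # Trivia
```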

Cross-Reference Patterns

Company data + Industry benchmark + Regulatory data: The most common and reliable pattern. Tells the prospect where they stand relative to their industry and what regulators see.

Company data + Competitor data + Market timing: Useful for competitive intelligence PVPs. Shows the prospect what competitors are doing and why the timing matters.

Regulatory data + Job postings + Technology signals: Reveals the gap between what they need to comply with and what they are currently doing about it.
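Every pattern above depends on linking records from different sources to the same company. A minimal sketch, assuming name-based matching on records with hypothetical `company` and `source` fields. Where a source provides a stable identifier, matching on that is more reliable than names; the suffix list here is a small illustrative sample.

```python
import re
from collections import defaultdict

# Legal suffixes stripped before matching (illustrative, not exhaustive).
_SUFFIXES = {"inc", "llc", "corp", "co", "ltd", "company", "corporation", "incorporated"}


def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, and strip trailing legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while words and words[-1] in _SUFFIXES:
        words.pop()
    return " ".join(words)


def group_by_company(records) -> dict:
    """Group records as {normalized company name: {source: [records]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for rec in records:
        grouped[normalize_name(rec["company"])][rec["source"]].append(rec)
    return grouped


# Invented records from three sources.
records = [
    {"company": "Example Foods, Inc.", "source": "OSHA", "detail": "violation"},
    {"company": "EXAMPLE FOODS LLC", "source": "Job postings", "detail": "Safety Director"},
    {"company": "Other Packaging Co.", "source": "EPA", "detail": "inspection"},
]

grouped = group_by_company(records)
print(sorted(grouped["example foods"]))
```

Name matching produces false positives for common names, so confirm a match against an address or location before using it in outreach.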

Data Source Selection by Vertical

| Vertical | Primary Sources | Secondary Sources |
| --- | --- | --- |
| Healthcare | CMS, OSHA, NPPES | BLS, state licensing, job postings |
| Manufacturing | OSHA, EPA, BLS | DOL, patent data, BuiltWith |
| Financial services | SEC EDGAR, HMDA, FDIC | BLS, job postings, technology data |
| Construction | Building permits, OSHA, DOT | Contractor licensing, BLS, equipment registrations |
| Pharmaceuticals | FDA, SEC EDGAR, patent data | BLS, CMS (for drug pricing), clinical trial registries |
| Energy | EIA, EPA, state utility commissions | OSHA, building permits, technology data |
| Government contracting | SAM.gov, FPDS, USAspending | Job postings, SEC (for public primes), subcontracting data |
| Real estate | Building permits, Census, HMDA | EIA (energy costs), zoning records, tax assessments |

Data Quality Checklist

Before using any data source in a PVP, verify:

  • The data is less than 90 days old (or the source updates infrequently by design)
  • You can link it to the specific prospect, not just their industry
  • The source is authoritative (government, established platform, or verifiable)
  • You understand what the data means in context (not just the raw number)
  • You can explain where you got it if the prospect asks
  • Combining it with your other sources produces a non-obvious insight
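The checklist can be recorded per source so that nothing is skipped. A minimal sketch, assuming a hypothetical `SourceReview` record: only the freshness check is computed, and the other items are human judgments stored as booleans.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SourceReview:
    """One reviewer's answers to the checklist for a single data source."""
    name: str
    data_as_of: date
    infrequent_by_design: bool = False
    linked_to_prospect: bool = False
    authoritative: bool = False
    understood_in_context: bool = False
    provenance_explainable: bool = False
    adds_non_obvious_insight: bool = False


def failed_checks(review: SourceReview, today: date, max_age_days: int = 90) -> list:
    """Return a description of every checklist item the source fails."""
    failures = []
    age_days = (today - review.data_as_of).days
    if age_days >= max_age_days and not review.infrequent_by_design:
        failures.append(f"data is {age_days} days old")
    judgments = {
        "not linked to the specific prospect": review.linked_to_prospect,
        "source is not authoritative": review.authoritative,
        "meaning in context not understood": review.understood_in_context,
        "cannot explain where the data came from": review.provenance_explainable,
        "adds no non-obvious insight": review.adds_non_obvious_insight,
    }
    failures.extend(label for label, passed in judgments.items() if not passed)
    return failures


# Illustrative review: everything passes except freshness.
review = SourceReview(
    name="OSHA inspections",
    data_as_of=date(2024, 1, 1),
    linked_to_prospect=True,
    authoritative=True,
    understood_in_context=True,
    provenance_explainable=True,
    adds_non_obvious_insight=True,
)

print(failed_checks(review, today=date(2024, 6, 1)))
```

A source passes when the returned list is empty. Setting `infrequent_by_design=True` covers annual datasets such as HMDA, which would otherwise always fail the 90-day test.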