Transparency

Data Sources

Every feed on BSI includes timestamps and source attribution. This page documents exactly where the data comes from, how often it refreshes, and what to expect during edge-case windows like Spring Training and early-season coverage.

Providers

Highlightly Pro

Primary pipeline — live scores, box scores, team data, player profiles

College Baseball, MLB

Refresh

Live scores every 30s; box scores on completion

Notes

Serves match/score data. Standings and rankings come from ESPN. Authenticated via RapidAPI.

SportsDataIO

Scores, standings, rosters, player statistics, schedules

MLB, NFL, NBA, CFB, CBB

Refresh

Live scores every 30–60s; rosters daily

Notes

Primary for all professional leagues. Authenticated via Ocp-Apim-Subscription-Key header.

ESPN Site API

Scores, standings, rankings, and schedules for college baseball

College Baseball

Refresh

Live scores every 60s; standings/rankings daily

Notes

Primary for standings and rankings. Timestamps that ESPN labels as UTC are actually Eastern Time (ET); BSI normalizes them to America/Chicago. No API key required.
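
The ET-to-Central normalization mentioned in the note above could look roughly like the sketch below. It is an illustration under assumptions, not BSI's actual code: it assumes a mislabeled value such as `2025-07-04T19:05:00Z` carries Eastern wall-clock digits despite the `Z` suffix, and the function names (`offsetMinutes`, `easternWallClockToInstant`, `formatCentral`) are hypothetical.

```typescript
// Hypothetical helpers. Assumes the "Z"-suffixed input is really Eastern wall-clock time.

// Offset (in minutes) of a time zone from UTC at a given instant.
function offsetMinutes(utcMs: number, timeZone: string): number {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hourCycle: "h23",
    year: "numeric", month: "2-digit", day: "2-digit",
    hour: "2-digit", minute: "2-digit", second: "2-digit",
  }).formatToParts(new Date(utcMs));
  const get = (type: string): number =>
    Number(parts.find((p) => p.type === type)?.value ?? "0");
  const wallAsUtc = Date.UTC(
    get("year"), get("month") - 1, get("day"),
    get("hour"), get("minute"), get("second"),
  );
  return (wallAsUtc - utcMs) / 60000;
}

// Reinterpret the mislabeled timestamp as Eastern time and return the true instant.
function easternWallClockToInstant(mislabeledUtc: string): Date {
  const wallMs = Date.parse(mislabeledUtc); // wall-clock digits read as if they were UTC
  const firstGuess = wallMs - offsetMinutes(wallMs, "America/New_York") * 60000;
  // Re-evaluate the offset at the guessed instant so DST boundaries are handled.
  const offset = offsetMinutes(firstGuess, "America/New_York");
  return new Date(wallMs - offset * 60000);
}

// Render an instant as Central wall-clock time.
function formatCentral(instant: Date): string {
  return new Intl.DateTimeFormat("en-US", {
    timeZone: "America/Chicago",
    hourCycle: "h23",
    year: "numeric", month: "2-digit", day: "2-digit",
    hour: "2-digit", minute: "2-digit",
  }).format(instant);
}
```

Under these assumptions, a first pitch listed as 19:05 on July 4 is treated as 7:05 PM Eastern, which is 23:05 true UTC and 6:05 PM Central. Going through a real instant, rather than subtracting a fixed hour, keeps the result correct across daylight-saving changes.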

Internal Systems

BSI Savant

College Baseball

Park-adjusted sabermetrics engine — wOBA, wRC+, FIP, expected stats, HAV-F scouting grades

Refresh

Every 6 hours (bsi-savant-compute cron) + daily full recompute (bsi-cbb-analytics)

NotebookLM

College Baseball

AI-powered podcast audio generation from curated source documents

Refresh

Weekly — new Audio Overviews generated from fresh sources

Storage Layers

Layer | Purpose | TTL / Lifecycle
KV (Cloudflare) | Hot cache for scores, standings, rankings | Scores: 60s; Standings: 30 min; Rankings: 30 min; Teams/Players: 24h
D1 (Cloudflare) | Structured relational data: game records, player stats, editorial metadata | Persistent (no TTL); data written by ingest workers
R2 (Cloudflare) | Static assets, media, archives, embeddings | Permanent storage with lifecycle rules for archival
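
As a rough illustration, the KV TTLs listed above could be expressed as a small lookup. The durations come from this page; the key names and the `isFresh` helper are hypothetical and do not describe BSI's actual configuration.

```typescript
// TTLs in seconds, taken from the KV entry above.
// The kind labels ("scores", "standings", ...) are hypothetical.
type CacheKind = "scores" | "standings" | "rankings" | "teams" | "players";

const KV_TTL_SECONDS: Record<CacheKind, number> = {
  scores: 60,
  standings: 30 * 60,
  rankings: 30 * 60,
  teams: 24 * 60 * 60,
  players: 24 * 60 * 60,
};

// True while an entry written at `writtenAtMs` is still inside its TTL.
function isFresh(kind: CacheKind, writtenAtMs: number, nowMs: number): boolean {
  return nowMs - writtenAtMs < KV_TTL_SECONDS[kind] * 1000;
}
```

With these values, a score cached 59 seconds ago still counts as fresh, while one cached 60 seconds ago does not.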

Seasonal Caveats

MLB

Spring Training (Feb 15 – Mar 25): limited SportsDataIO coverage; some games unavailable until first pitch. Finalization delays of 5–10 minutes are expected.

College Baseball

Preseason (Feb 14 – Feb 20): opening weekend coverage may be patchy until conferences begin full play. Rankings update weekly during the regular season.

NFL

Off-season (Feb – Aug): no live scores. Preseason games begin in August with limited statistical depth.

NBA

Off-season (Jun – Oct): no live scores. Summer League coverage is not included.

How It Works

External APIs are never called from your browser. A Cloudflare Worker sits between you and every data provider — it fetches, transforms, caches, and serves the result. A cron job pre-warms the cache every minute for in-season sports so client requests read from KV in under 10ms.
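
The read path described above can be sketched as a read-through cache. This is a simplified illustration, not the production Worker: `KVLike` is a minimal stand-in whose `get`/`put` shape follows Cloudflare KV, while `readThrough`, `MemoryKV`, and the `fetchUpstream` callback are hypothetical names. The cron pre-warm step is not shown.

```typescript
// Minimal stand-in for a KV namespace (get/put shape modeled on Cloudflare KV).
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// Read from the cache first; on a miss, fetch upstream, store the result, and return it.
async function readThrough(
  kv: KVLike,
  key: string,
  ttlSeconds: number,
  fetchUpstream: () => Promise<string>,
): Promise<{ body: string; cacheHit: boolean }> {
  const cached = await kv.get(key);
  if (cached !== null) {
    return { body: cached, cacheHit: true };
  }
  const body = await fetchUpstream();
  await kv.put(key, body, { expirationTtl: ttlSeconds });
  return { body, cacheHit: false };
}

// In-memory KVLike for local experiments. It ignores expiration for brevity.
class MemoryKV implements KVLike {
  private store = new Map<string, string>();

  async get(key: string): Promise<string | null> {
    return this.store.get(key) ?? null;
  }

  async put(key: string, value: string): Promise<void> {
    this.store.set(key, value);
  }
}
```

In this sketch the browser only ever talks to the function that calls `readThrough`; the upstream provider is contacted only on a cache miss.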

Every API response carries a meta object with source, fetched_at, and timezone. The UI always shows when data was last updated and where it came from.
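
One way to picture the meta object is the typed envelope below. The three field names come from the text above; everything else is an assumption made for illustration: the `data` wrapper, the ISO 8601 format for `fetched_at`, the `America/Chicago` value for `timezone`, and the `withMeta` helper.

```typescript
// Field names are from the text; their formats and the envelope shape are assumed.
interface ResponseMeta {
  source: string;      // e.g. a provider name
  fetched_at: string;  // assumed to be an ISO 8601 timestamp
  timezone: string;    // assumed to be an IANA zone name
}

interface ApiEnvelope<T> {
  data: T;
  meta: ResponseMeta;
}

// Hypothetical helper that attaches provenance to a payload.
function withMeta<T>(data: T, source: string, fetchedAt: Date): ApiEnvelope<T> {
  return {
    data,
    meta: {
      source,
      fetched_at: fetchedAt.toISOString(),
      timezone: "America/Chicago",
    },
  };
}
```

A client can then render the "last updated" label directly from `meta.fetched_at` and `meta.source` without inspecting the payload.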

When a primary source fails, the system falls back to the next provider in the chain, then to the last-known-good KV snapshot. You'll always see data; the source label and last-updated timestamp tell you where it came from and how fresh it is.
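
The fallback order described above can be sketched as follows. This is a minimal illustration under assumptions, not the actual implementation: `Provider`, `Resolved`, `resolveWithFallback`, and the `stale` flag are hypothetical names, and the snapshot lookup is passed in as a callback instead of reading from KV directly.

```typescript
// Hypothetical types: a named provider and a resolved result with provenance.
type Provider<T> = { name: string; fetch: () => Promise<T> };
type Resolved<T> = { data: T; source: string; stale: boolean };

// Try each provider in order; if all fail, fall back to the last-known-good snapshot.
async function resolveWithFallback<T>(
  chain: Provider<T>[],
  lastKnownGood: () => Promise<Resolved<T> | null>,
): Promise<Resolved<T>> {
  for (const provider of chain) {
    try {
      const data = await provider.fetch();
      return { data, source: provider.name, stale: false };
    } catch {
      // This provider failed; move on to the next one in the chain.
    }
  }
  const snapshot = await lastKnownGood();
  if (snapshot !== null) {
    // Mark snapshot data as stale so the UI can label it accordingly.
    return { ...snapshot, stale: true };
  }
  throw new Error("no provider responded and no snapshot is available");
}
```

Returning the provider name alongside the data is what lets the caller show an honest source label even when the primary provider was skipped.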

For cross-reference methodology, API response times, and freshness guarantees, see the expanded Data Quality & Sources page in the Models hub.