Data Sources

Every feed on BSI includes timestamps and source attribution. This page documents exactly where the data comes from, how often it refreshes, and what to expect during edge-case windows like Spring Training and early-season coverage.

Providers

Highlightly Pro

Primary pipeline — scores, rankings, team stats, player profiles

College Baseball, College Football
Refresh

Live scores every 30s; standings/rankings every 30 min

Notes

Canonical source. All new integrations wire here first.

SportsDataIO

Scores, standings, rosters, player statistics, schedules

MLB, NFL, NBA, CFB, CBB
Refresh

Live scores every 30–60s; rosters daily

Notes

Primary for all professional leagues. Authenticated via Ocp-Apim-Subscription-Key header.
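
As a rough illustration of the header-based authentication described above, the sketch below builds the outbound request options. Only the header name comes from this page; the function name, the OutboundRequest shape, and the placeholder URL are hypothetical.

```typescript
// Hypothetical shape for an outbound provider request.
interface OutboundRequest {
  url: string;
  headers: Record<string, string>;
}

// Attach the subscription key header named on this page.
// The URL is supplied by the caller; no real endpoint is assumed here.
function buildSportsDataIoRequest(url: string, apiKey: string): OutboundRequest {
  if (apiKey.length === 0) {
    throw new Error("Missing SportsDataIO subscription key");
  }
  return {
    url,
    headers: { "Ocp-Apim-Subscription-Key": apiKey },
  };
}

// Example with a placeholder URL and key.
const exampleRequest = buildSportsDataIoRequest("https://example.invalid/scores", "test-key");
```

The resulting headers object can be passed to fetch(url, { headers }) inside the Worker, so the key never reaches the browser.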

ESPN Site API

Scores, rankings, and schedules for college baseball

College Baseball
Refresh

Live scores every 60s; rankings weekly

Notes

Fallback source. ESPN dates labeled UTC are actually ET — BSI normalizes to America/Chicago. No API key required.
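
One way the timezone correction could work is sketched below. It assumes ESPN timestamps look like "2025-07-04T19:05Z" and that the clock fields are really Eastern wall-clock time; the function names are hypothetical and this is not necessarily how BSI implements it.

```typescript
type ClockFields = [number, number, number, number, number, number];

// Wall-clock fields (y, m, d, h, min, s) of an instant in a given IANA zone.
function wallClock(instant: Date, timeZone: string): ClockFields {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour12: false,
    year: "numeric", month: "2-digit", day: "2-digit",
    hour: "2-digit", minute: "2-digit", second: "2-digit",
  }).formatToParts(instant);
  const field = (type: string): number => {
    const part = parts.find((p) => p.type === type);
    return part ? Number(part.value) : 0;
  };
  // Some engines print midnight as "24" when hour12 is false, hence the modulo.
  return [field("year"), field("month"), field("day"),
          field("hour") % 24, field("minute"), field("second")];
}

// Offset of the zone from UTC at the given instant, in milliseconds.
function zoneOffsetMs(instant: Date, timeZone: string): number {
  const [y, mo, d, h, mi, s] = wallClock(instant, timeZone);
  const wholeSecondMs = instant.getTime() - (instant.getTime() % 1000);
  return Date.UTC(y, mo - 1, d, h, mi, s) - wholeSecondMs;
}

// Treat the fields of a "UTC" string as Eastern wall-clock time
// and return the true instant.
function espnInstant(raw: string): Date {
  const naiveMs = new Date(raw).getTime(); // fields read as if they were UTC
  if (isNaN(naiveMs)) throw new Error("Unparseable date: " + raw);
  // Two passes so the offset is taken at the corrected instant near DST changes.
  const firstGuess = naiveMs - zoneOffsetMs(new Date(naiveMs), "America/New_York");
  const offset = zoneOffsetMs(new Date(firstGuess), "America/New_York");
  return new Date(naiveMs - offset);
}

// Render an instant as America/Chicago local time, "YYYY-MM-DDTHH:mm:ss".
function toChicagoLocal(instant: Date): string {
  const [y, mo, d, h, mi, s] = wallClock(instant, "America/Chicago");
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return y + "-" + pad(mo) + "-" + pad(d) + "T" + pad(h) + ":" + pad(mi) + ":" + pad(s);
}
```

Under these assumptions, a 7:05 PM game mislabeled as "2025-07-04T19:05Z" resolves to 6:05 PM Central on the same day.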

Storage Layers

KV (Cloudflare)
Purpose: Hot cache for scores, standings, rankings
TTL / Lifecycle: Scores 60s; Standings 30 min; Rankings 30 min; Teams/Players 24h

D1 (Cloudflare)
Purpose: Structured relational data, including game records, player stats, and editorial metadata
TTL / Lifecycle: Persistent; no TTL. Data is written by ingest workers.

R2 (Cloudflare)
Purpose: Static assets, media, archives, embeddings
TTL / Lifecycle: Permanent storage with lifecycle rules for archival
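
The KV lifetimes listed above can be expressed as a small lookup, sketched here. The TTL values come from this page; the constant name, key names, and the isExpired helper are hypothetical.

```typescript
// KV TTLs from the storage table, in seconds.
const KV_TTL_SECONDS = {
  scores: 60,
  standings: 30 * 60,
  rankings: 30 * 60,
  teams: 24 * 60 * 60,
  players: 24 * 60 * 60,
} as const;

type CachedKind = keyof typeof KV_TTL_SECONDS;

// True once an entry written at writtenAtMs has outlived its TTL.
function isExpired(kind: CachedKind, writtenAtMs: number, nowMs: number): boolean {
  return nowMs - writtenAtMs >= KV_TTL_SECONDS[kind] * 1000;
}
```

In Workers KV itself, the same number would typically be passed as the expirationTtl option on put, so expiry is handled by the store rather than by application code.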

Seasonal Caveats

MLB

Spring Training (Feb 15 – Mar 25): limited SportsDataIO coverage; some games unavailable until first pitch. Finalization delays of 5–10 minutes are expected.

College Baseball

Preseason (Feb 14 – Feb 20): opening weekend coverage may be patchy until conferences begin full play. Rankings update weekly during the regular season.

NFL

Off-season (Feb – Aug): no live scores. Preseason games begin in August with limited statistical depth.

NBA

Off-season (Jun – Oct): no live scores. Summer League coverage is not included.

How It Works

External APIs are never called from your browser. A Cloudflare Worker sits between you and every data provider — it fetches, transforms, caches, and serves the result. A cron job pre-warms the cache every minute for in-season sports so client requests read from KV in under 10ms.
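
The split between the cron path and the request path might look roughly like the sketch below. It uses an in-memory stand-in for KV so it runs anywhere; HotCache, prewarmScores, readScores, and the key format are all hypothetical, and the 60-second TTL is the scores TTL listed under Storage Layers.

```typescript
// In-memory stand-in for the KV hot cache (the real store is Cloudflare KV).
interface CacheEntry {
  value: string;
  expiresAtMs: number;
}

class HotCache {
  private entries = new Map<string, CacheEntry>();

  put(key: string, value: string, ttlSeconds: number, nowMs: number): void {
    this.entries.set(key, { value, expiresAtMs: nowMs + ttlSeconds * 1000 });
  }

  get(key: string, nowMs: number): string | null {
    const entry = this.entries.get(key);
    if (!entry || nowMs >= entry.expiresAtMs) return null;
    return entry.value;
  }
}

type ProviderFetch = (sport: string) => Promise<unknown>;

// Cron side: fetch each in-season sport upstream and store the serialized result.
async function prewarmScores(
  sports: string[],
  fetchScores: ProviderFetch,
  cache: HotCache,
  nowMs: number,
): Promise<void> {
  for (const sport of sports) {
    const payload = await fetchScores(sport);
    cache.put("scores:" + sport, JSON.stringify(payload), 60, nowMs);
  }
}

// Request side: the client path only reads the cache and never calls a provider.
function readScores(sport: string, cache: HotCache, nowMs: number): string | null {
  return cache.get("scores:" + sport, nowMs);
}
```

Keeping the provider call out of the request path is what lets reads stay fast: the client only ever pays for a cache lookup.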

Every API response carries a meta object with source, fetched_at, and timezone. The UI always shows when data was last updated and where it came from.
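
A minimal sketch of that envelope follows. The three field names come from this page; the value formats (an ISO 8601 timestamp, an IANA zone name defaulting to America/Chicago) and the withMeta helper are assumptions.

```typescript
// Metadata attached to every response; field names as documented above.
interface ResponseMeta {
  source: string;     // provider that supplied the data
  fetched_at: string; // assumed to be an ISO 8601 timestamp
  timezone: string;   // assumed to be an IANA zone name
}

interface ApiResponse<T> {
  data: T;
  meta: ResponseMeta;
}

// Wrap a payload with its provenance.
function withMeta<T>(
  data: T,
  source: string,
  fetchedAt: Date,
  timezone: string = "America/Chicago",
): ApiResponse<T> {
  return {
    data,
    meta: { source, fetched_at: fetchedAt.toISOString(), timezone },
  };
}
```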

When a primary source fails, the system falls back to the next provider in the chain, then to the last-known-good KV snapshot. You'll always see data; the source label and fetched_at timestamp tell you where it came from and how fresh it is.
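
The fallback order described above can be sketched as a loop over providers with the cached snapshot as the last resort. All names here are hypothetical, and throwing when no snapshot exists is an assumption about an edge case this page does not cover.

```typescript
interface Snapshot<T> {
  data: T;
  source: string;
  fetched_at: string;
}

interface Provider<T> {
  name: string;
  fetch: () => Promise<T>;
}

// Try each provider in order; if all fail, serve the last-known-good snapshot.
async function resolveWithFallback<T>(
  providers: Provider<T>[],
  lastKnownGood: Snapshot<T> | null,
  now: () => Date = () => new Date(),
): Promise<Snapshot<T>> {
  for (const provider of providers) {
    try {
      const data = await provider.fetch();
      return { data, source: provider.name, fetched_at: now().toISOString() };
    } catch {
      // This provider failed; move on to the next one in the chain.
    }
  }
  if (lastKnownGood !== null) {
    // The snapshot keeps its original source and timestamp, so staleness stays visible.
    return lastKnownGood;
  }
  throw new Error("No provider succeeded and no cached snapshot is available");
}
```

Because the snapshot is returned unchanged, its fetched_at still reflects when the data was actually retrieved rather than when it was served.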

For cross-reference methodology, API response times, and freshness guarantees, see the expanded Data Quality & Sources page in the Models hub.