The Pickle Nest

A case study in geospatial search and complex schema design

The Problem

Pickleball players need to find nearby games quickly. The naive solution—show all games sorted by distance—doesn't scale. With hundreds of games and complex filters (skill level, player preferences, availability), queries become slow. A player opening the app shouldn't wait 2 seconds for results.

The backend needs to support: geospatial search (find games within 10km), filtering by player skill, real-time availability status, player ratings from previous games, and complex matching logic. Simple pagination won't work—users need results in milliseconds.

Additional constraint: player ratings and historical match data inform matchmaking. This means joins across multiple tables—games, players, ratings, match history. Bad schema design would make these queries prohibitively slow.

Schema & Architecture

I built the schema around normalized design: 22 distinct models, each focused on a single concern. Players, games, ratings, match results, and availability tracking are separate entities with explicit foreign keys.

Why normalized?

Avoids data duplication. When a player's rating changes, it updates once. No stale ratings scattered across game records. Tradeoff: more joins. But controlled joins are better than data inconsistency.

Geospatial indexing

Used PostgreSQL's PostGIS extension with GIST indexes on geometry columns. "Find games within 10km" becomes a spatial query, not a full table scan. Critical for performance at scale.

Strategic denormalization

Cache computed values (average player rating, player count per game) in Redis. Source of truth is PostgreSQL. Cache is regenerated on updates. Tradeoff: eventual consistency in the cache, but strong consistency in the source.

The API layer (Prisma ORM) provides type safety. Complex queries use explicit joins and SELECT statements, not lazy loading. Every query is intentional.

Technical Decisions & Tradeoffs

REST API over GraphQL

49 REST endpoints, each with a single responsibility.

Why: GraphQL feels powerful for flexibility, but REST kept things predictable. Each endpoint is cacheable by URL. Clients know exactly what they're getting. Easier to version and deprecate.

Tradeoff: More endpoints to maintain. But with good naming conventions and clear semantics, it's simpler than debugging GraphQL N+1 problems.

Prisma ORM for type safety

TypeScript-first ORM that generates types from the schema.

Why: 22 models is a lot. Raw SQL mistakes are easy. Prisma catches them at compile time. Type checking prevents entire classes of bugs.

Tradeoff: Complex queries are harder to optimize. For simple CRUD, Prisma is perfect. For analytical queries over multiple joins, raw SQL with prepared statements might be necessary.

Simple cache invalidation strategy

Cache everything that's read frequently. Invalidate when source changes.

Why: No TTLs or complex expiry logic. When a game is created, clear nearby-games cache. When a player joins, invalidate that game's cache. Prevents stale data.

Tradeoff: More cache operations (clears), but guarantee of freshness. Better than hoping a TTL will catch your update.

Batch loading to prevent N+1 queries

When fetching games, also fetch creator details, player ratings, and related messages in one query using JOINs.

Why: Naively loading a game list would query the database 1 + N times (1 for games, N for player details). Batch loading makes it 1 query total. Massive performance difference.

Tradeoff: Requires upfront planning of what to load. Can't lazily load things later. Worth it for predictable performance.

Challenges & Solutions

Geospatial queries on large datasets

Early version had GIST index but poor query plans. "Find games within 10km" was still taking 500ms for large cities.

Solution: Used EXPLAIN ANALYZE to see the query plan. The issue: PostGIS index wasn't being used. Fixed with explicit index hints and query rewrites. Also partitioned games by region to limit index size.

Concurrent rating updates causing inconsistency

After a game, multiple players update ratings. Concurrent updates could overwrite each other or cause lost updates.

Solution: Database transactions with row-level locking. Each rating update uses SELECT FOR UPDATE to claim the row, then updates it atomically. No two updates race.

Cache invalidation cascades

One update could trigger invalidation of dozens of cache entries (nearby games, player profiles, leaderboards, etc.), creating a storm of cache clears.

Solution: Batch cache invalidations. Instead of clearing immediately, queue them and flush every 5 seconds. Prevents cache thrashing during concurrent updates.

Schema migrations without downtime

Adding a new column to a frequently-read table could lock it for seconds, causing timeouts for users.

Solution: Expand-contract pattern. First, deploy code that ignores the new column. Then add the column with a default value (non-blocking). Finally, backfill existing rows. Last, deploy code that uses the column. Each step is independent.

Performance & Outcomes

Production APIs

Each endpoint focused, cacheable, and independently deployable.

Database Models

Normalized schema prevents data duplication and inconsistency.

<200ms

Geospatial queries

With PostGIS indexing and batch loading, even complex queries complete instantly.

70%

Cache hit rate

Player profiles and recent games are served from Redis most of the time.

What this demonstrates: A well-designed schema and strategic use of caching make complex queries fast. The normalized design ensures data consistency. The separation of cache from source of truth prevents data loss. The result is a system that scales from 100 games to 100,000 without architectural changes.