Performance

Practical tuning advice for getting the most out of YokedCache in production.


Backend latency

Understanding your baseline latency helps you set expectations and diagnose problems:

| Backend | Typical GET latency | Notes |
|---|---|---|
| Memory | < 1 µs | Limited to one process |
| Redis (same host) | 0.1–0.5 ms | Loopback network |
| Redis (local network) | 1–3 ms | LAN / same DC |
| Redis (cross-region) | 10–100 ms | WAN — cache at the edge instead |
| Memcached (local) | 0.5–2 ms | Similar to Redis |

If your Redis GET p95 is > 10 ms on a local network, investigate:
- Connection pool exhaustion (increase max_connections)
- Large payloads (compress or paginate)
- Redis memory pressure / evictions
- Network congestion
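To establish your own baseline, time repeated operations and look at percentiles rather than averages. A minimal sketch using only the standard library (`percentile` and `measure` are illustrative helpers, not YokedCache APIs; swap the lambda for a real cache GET):

```python
import time

def percentile(samples, p):
    """Return the p-th percentile (0-100) of latency samples, nearest-rank method."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

def measure(op, n=100):
    """Time `op` n times and return latencies in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        op()
        samples.append((time.perf_counter() - start) * 1000)
    return samples

# Replace the lambda with the call you want to profile, e.g. a cache GET
latencies = measure(lambda: None)
print(f"p50={percentile(latencies, 50):.3f}ms  p95={percentile(latencies, 95):.3f}ms")
```

Percentiles matter because a healthy mean can hide a slow tail caused by pool exhaustion or large payloads.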


Connection pool sizing

Each YokedCache instance has a connection pool. The right size depends on your app's concurrency:

config = CacheConfig(
    redis_url="...",
    max_connections=50,  # adjust based on concurrency
)

Rule of thumb: max_connections ≈ expected concurrent requests per worker. For a FastAPI app with 4 Uvicorn workers handling 50 concurrent requests each, 50 connections per worker is a good starting point.

Symptoms of undersized pool:
- Requests queuing to acquire a connection
- Elevated GET/SET latency under load
- ConnectionPool exhausted errors in logs

Symptoms of oversized pool:
- High Redis memory usage from idle connections
- Too many open file descriptors on the Redis server


TTL strategy

| Data type | Suggested TTL | Reasoning |
|---|---|---|
| Static config | 1–24 hours | Almost never changes |
| Product catalog | 1–6 hours | Changes infrequently |
| User profiles | 5–60 min | Changes occasionally |
| Session data | 15–60 min | Per-user, moderate churn |
| Search results | 1–5 min | Freshness matters |
| Real-time aggregations | 10–60 sec | Tolerate slight staleness |
| Rate limit counters | Exact TTL | Accuracy critical |

Jitter: Keep TTL jitter enabled (default ±10%). It prevents synchronized expirations that would flood your DB simultaneously.

# Custom jitter range
config = CacheConfig(ttl_jitter_percent=15.0)  # ±15%

# Disable (not recommended for high-traffic)
config = CacheConfig(ttl_jitter_percent=0)
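The effect of jitter is easy to see in isolation. A standalone sketch (assumed to mirror how the library applies `ttl_jitter_percent`; `jittered_ttl` itself is not a YokedCache API):

```python
import random

def jittered_ttl(ttl, jitter_percent=10.0):
    # Spread the TTL by ±jitter_percent so keys written in the same
    # instant expire at different times instead of all at once.
    spread = ttl * jitter_percent / 100.0
    return ttl + random.uniform(-spread, spread)

# 1000 keys cached in the same second now expire across a ~60 s window
ttls = [jittered_ttl(300) for _ in range(1000)]
print(round(min(ttls)), round(max(ttls)))  # values land in roughly 270-330
```

Without the spread, all 1000 keys would miss in the same instant and every miss would hit the database together.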

Key design

Smaller keys save memory and reduce network payload:

# Good: compact but readable
"u:42"
"p:electronics:99"
"s:abc123"

# Fine: slightly longer but clearer
"user:42"
"product:99"
"session:abc123"

# Avoid: very long keys with redundant info
"myapp_production_user_data_user_id_42_full_profile"
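A small helper keeps key construction consistent across a codebase and avoids the redundant-prefix pattern above. This is a hypothetical convention helper, not part of YokedCache's API:

```python
def make_key(*parts):
    """Join typed parts into a compact, readable cache key."""
    return ":".join(str(p) for p in parts)

print(make_key("u", 42))                   # u:42
print(make_key("p", "electronics", 99))    # p:electronics:99
```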

Avoid giant values. Storing a 5 MB blob in a single cache entry means:
- 5 MB transferred on every cache miss
- 5 MB serialized/deserialized on every hit
- Other keys evicted earlier if Redis is near maxmemory

Instead, cache only what you need or paginate:

# Cache individual items, not the whole list
await cache.set(f"product:{id}", product, ttl=3600)

# For lists, cache the IDs + fetch items individually
await cache.set("product_ids:electronics", [1, 2, 3, ...], ttl=300)

Serialization speed

| Method | Speed | Size | Use when |
|---|---|---|---|
| JSON | Medium | Largest | Default; interoperable; debuggable |
| MessagePack | Fast | Smaller | Binary data, cross-language |
| Pickle | Varies | Medium | Complex Python objects |

Benchmark with your actual data—the difference is often smaller than expected for typical payloads (< 10 KB). For payloads > 100 KB, consider enabling compression:

config = CacheConfig(
    enable_compression=True,
    compression_threshold=1024,  # compress values > 1 KB
)
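A quick way to run that benchmark with your own payloads, using only the standard library (substitute a representative value for `payload`; `msgpack` can be slotted in the same way if installed):

```python
import json
import pickle
import time

payload = {"id": 42, "name": "widget", "tags": ["a", "b", "c"], "price": 9.99}

def bench(name, dumps, loads, n=10_000):
    blob = dumps(payload)
    assert loads(blob) == payload          # sanity-check the round trip
    start = time.perf_counter()
    for _ in range(n):
        loads(dumps(payload))
    per_call = (time.perf_counter() - start) / n * 1e6
    print(f"{name:>6}: {len(blob):4d} bytes, {per_call:.1f} µs/round-trip")

bench("json", lambda o: json.dumps(o).encode(), lambda b: json.loads(b))
bench("pickle", pickle.dumps, pickle.loads)
```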

Async vs sync

| Context | Recommended |
|---|---|
| FastAPI / Starlette / Django async | await cache.get() |
| asyncio scripts | await cache.get() |
| Sync scripts, CLI tools | cache.get_sync() |
| Tight loops in sync code | Batch with get_many_sync() |

The *_sync methods run asyncio.run() per call—each creates a new event loop. This is fine for occasional use but has overhead in tight loops. If you need sync in a hot path, batch operations:

# Instead of many individual sync calls:
for uid in user_ids:
    users[uid] = cache.get_sync(f"user:{uid}")  # overhead × N

# Use batch:
results = cache.get_many_sync([f"user:{uid}" for uid in user_ids])

Batch operations

Batch operations use pipelining internally, making them much faster than looping:

# Single round trip instead of N round trips
results = await cache.get_many(["user:1", "user:2", "user:3"])

await cache.set_many({
    "user:1": u1,
    "user:2": u2,
    "user:3": u3,
}, ttl=300)

await cache.delete_many(["old:1", "old:2", "old:3"])

Redis server tuning

Add to redis.conf:

# Eviction policy for a cache (evict LRU keys when maxmemory is reached)
maxmemory 4gb
maxmemory-policy allkeys-lru

# Network
tcp-nodelay yes          # reduce latency
tcp-keepalive 300        # keep idle connections alive

# For pure cache (no persistence needed):
save ""
appendonly no

# Lazy freeing: free memory asynchronously (reduces blocking)
lazyfree-lazy-eviction yes
lazyfree-lazy-expire yes
lazyfree-lazy-server-del yes
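To verify these settings are working in production, read the counters Redis exposes via INFO. A sketch that summarizes the dict returned by redis-py's `Redis.info()` (the field names `keyspace_hits`, `keyspace_misses`, `evicted_keys`, and `used_memory` are standard INFO fields; `cache_health` itself is a hypothetical helper):

```python
def cache_health(info):
    """Summarize hit rate and memory pressure from a Redis INFO dict."""
    hits = info.get("keyspace_hits", 0)
    misses = info.get("keyspace_misses", 0)
    total = hits + misses
    return {
        "hit_rate": hits / total if total else None,
        "evicted_keys": info.get("evicted_keys", 0),  # nonzero => maxmemory pressure
        "used_memory_mb": info.get("used_memory", 0) / 1024 / 1024,
    }

sample = {"keyspace_hits": 900, "keyspace_misses": 100,
          "evicted_keys": 0, "used_memory": 512 * 1024 * 1024}
print(cache_health(sample))
```

A steadily climbing `evicted_keys` means maxmemory is too small for your working set or TTLs are too long.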

Profiling cache impact

Compare response times with and without cache to measure impact:

import time

# Time a cache miss
await cache.delete("user:42")
start = time.perf_counter()
await get_user(42)
miss_time = time.perf_counter() - start

# Time a cache hit
start = time.perf_counter()
await get_user(42)
hit_time = time.perf_counter() - start

print(f"Miss: {miss_time*1000:.1f}ms, Hit: {hit_time*1000:.1f}ms")
print(f"Speedup: {miss_time/hit_time:.0f}x")

Common performance issues

| Symptom | Likely cause | Fix |
|---|---|---|
| Hit rate drops suddenly | Aggressive invalidation or TTL too short | Review invalidation logic; increase TTL |
| GET latency spikes | Pool exhaustion or Redis memory pressure | Increase max_connections; add memory; enable eviction |
| High memory on Redis | Too many keys or large values | Enable LRU eviction; paginate large values; reduce TTL |
| Slow cold start | No cache warming | Warm critical data on startup |
| Frequent thundering herds | No jitter; no single-flight | Enable TTL jitter; use single_flight=True |
