Performance

Practical tuning advice for getting the most out of YokedCache in production.

Backend latency

Understanding your baseline latency helps you set expectations and diagnose problems:

Backend	Typical GET latency	Notes
Memory	< 1 µs	Limited to one process
Redis (same host)	0.1–0.5 ms	Loopback network
Redis (local network)	1–3 ms	LAN / same DC
Redis (cross-region)	10–100 ms	WAN — cache at the edge instead
Memcached (local)	0.5–2 ms	Similar to Redis

If your Redis GET p95 is > 10ms on a local network, investigate:
- Connection pool exhaustion (increase max_connections)
- Large payloads (compress or paginate)
- Redis memory pressure / evictions
- Network congestion

Connection pool sizing

Each YokedCache instance has a connection pool. The right size depends on your app's concurrency:

config = CacheConfig(
    redis_url="...",
    max_connections=50,  # adjust based on concurrency
)

Rule of thumb: max_connections ≈ expected concurrent requests per worker. For a FastAPI app with 4 Uvicorn workers handling 50 concurrent requests each, 50 connections per worker is a good starting point.

Symptoms of undersized pool:
- Requests queuing to acquire a connection
- Elevated GET/SET latency under load
- ConnectionPool exhausted errors in logs

Symptoms of oversized pool:
- High Redis memory usage from idle connections
- Too many open file descriptors on the Redis server

TTL strategy

Data type	Suggested TTL	Reasoning
Static config	1–24 hours	Almost never changes
Product catalog	1–6 hours	Changes infrequently
User profiles	5–60 min	Changes occasionally
Session data	15–60 min	Per-user, moderate churn
Search results	1–5 min	Freshness matters
Real-time aggregations	10–60 sec	Tolerate slight staleness
Rate limit counters	Exact TTL	Accuracy critical

Jitter: Keep TTL jitter enabled (default ±10%). It prevents synchronized expirations that would flood your DB simultaneously.

# Custom jitter range
config = CacheConfig(ttl_jitter_percent=15.0)  # ±15%

# Disable (not recommended for high-traffic)
config = CacheConfig(ttl_jitter_percent=0)

Key design

Smaller keys save memory and reduce network payload:

# Good: compact but readable
"u:42"
"p:electronics:99"
"s:abc123"

# Fine: slightly longer but clearer
"user:42"
"product:99"
"session:abc123"

# Avoid: very long keys with redundant info
"myapp_production_user_data_user_id_42_full_profile"

Avoid giant values. Storing a 5 MB blob in a single cache entry means:
- 5 MB transferred on every cache miss
- 5 MB serialized/deserialized on every hit
- Other keys evicted earlier if Redis is near maxmemory

Instead, cache only what you need or paginate:

# Cache individual items, not the whole list
await cache.set(f"product:{id}", product, ttl=3600)

# For lists, cache the IDs + fetch items individually
await cache.set("product_ids:electronics", [1, 2, 3, ...], ttl=300)

Serialization speed

Method	Speed	Size	Use when
JSON	Medium	Largest	Default; interoperable; debuggable
MessagePack	Fast	Smaller	Binary data, cross-language
Pickle	Varies	Medium	Complex Python objects

Benchmark with your actual data—the difference is often smaller than expected for typical payloads (< 10 KB). For payloads > 100 KB, consider enabling compression:

config = CacheConfig(
    enable_compression=True,
    compression_threshold=1024,  # compress values > 1 KB
)

Async vs sync

Context	Recommended
FastAPI / Starlette / Django async	`await cache.get()`
asyncio scripts	`await cache.get()`
Sync scripts, CLI tools	`cache.get_sync()`
Tight loops in sync code	Batch with `get_many_sync()`

The *_sync methods run asyncio.run() per call—each creates a new event loop. This is fine for occasional use but has overhead in tight loops. If you need sync in a hot path, batch operations:

# Instead of many individual sync calls:
for uid in user_ids:
    users[uid] = cache.get_sync(f"user:{uid}")  # overhead × N

# Use batch:
results = cache.get_many_sync([f"user:{uid}" for uid in user_ids])

Batch operations

Batch operations use pipelining internally, making them much faster than looping:

# Single round trip instead of N round trips
results = await cache.get_many(["user:1", "user:2", "user:3"])

await cache.set_many({
    "user:1": u1,
    "user:2": u2,
    "user:3": u3,
}, ttl=300)

await cache.delete_many(["old:1", "old:2", "old:3"])

Redis server tuning

Add to redis.conf:

# Eviction policy for a cache (evict LRU keys when maxmemory is reached)
maxmemory 4gb
maxmemory-policy allkeys-lru

# Network
tcp-nodelay yes          # reduce latency
tcp-keepalive 300        # keep idle connections alive

# For pure cache (no persistence needed):
save ""
appendonly no

# Lazy freeing: free memory asynchronously (reduces blocking)
lazyfree-lazy-eviction yes
lazyfree-lazy-expire yes
lazyfree-lazy-server-del yes

Profiling cache impact

Compare response times with and without cache to measure impact:

import time

# Time a cache miss
await cache.delete("user:42")
start = time.perf_counter()
await get_user(42)
miss_time = time.perf_counter() - start

# Time a cache hit
start = time.perf_counter()
await get_user(42)
hit_time = time.perf_counter() - start

print(f"Miss: {miss_time*1000:.1f}ms, Hit: {hit_time*1000:.1f}ms")
print(f"Speedup: {miss_time/hit_time:.0f}x")

Common performance issues

Symptom	Likely cause	Fix
Hit rate drops suddenly	Aggressive invalidation or TTL too short	Review invalidation logic; increase TTL
GET latency spikes	Pool exhaustion or Redis memory pressure	Increase `max_connections`; add memory; enable eviction
High memory on Redis	Too many keys or large values	Enable LRU eviction; paginate large values; reduce TTL
Slow cold start	No cache warming	Warm critical data on startup
Frequent thundering herds	No jitter; no single-flight	Enable TTL jitter; use `single_flight=True`