Caching and Performance¶
Comprehensive guide to caching strategies and performance optimization for data operations.
Cache System Architecture¶
RustyBT provides multi-level caching:
- Memory Cache: Fast in-memory LRU cache
- Disk Cache: Persistent on-disk cache
- Bundle Cache: Cached bundle metadata
- History Cache: Historical data windows
Memory Caching¶
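The memory layer evicts the least-recently-used entries once it reaches its size limit. The snippet below is a minimal, generic illustration of that LRU behavior using Python's functools.lru_cache to memoize an expensive fetch; load_daily_bars and its sleep-based body are placeholders, and this is not the CacheManager API (shown later under Cache Statistics), only the pattern it implements internally.

import time
from functools import lru_cache

@lru_cache(maxsize=256)  # keep the 256 most recently used results in memory
def load_daily_bars(symbol: str, year: int) -> list:
    # Stand-in for an expensive fetch (network call, disk read, etc.).
    time.sleep(0.1)
    return [f"{symbol}-{year}-bar-{i}" for i in range(3)]

load_daily_bars("AAPL", 2023)   # miss: computed and cached
load_daily_bars("AAPL", 2023)   # hit: served from memory
print(load_daily_bars.cache_info())  # CacheInfo(hits=1, misses=1, maxsize=256, currsize=1)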
Monitoring Performance¶
Timing Operations¶
import time
from functools import wraps

def time_operation(func):
    """Decorator to time operations."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        duration = time.time() - start
        print(f"{func.__name__} took {duration:.2f}s")
        return result
    return wrapper

@time_operation
def fetch_large_dataset():
    # Placeholder for an expensive data-loading call.
    return fetch_data()
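Decorator timing tells you that an operation is slow; the standard library's cProfile and pstats show where the time goes. A minimal sketch, reusing the placeholder fetch_large_dataset from the example above:

import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()
fetch_large_dataset()  # the operation under investigation
profiler.disable()

# Print the 10 most expensive calls by cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)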
Cache Statistics¶
from rustybt.data.polars.cache_manager import CacheManager
cache = CacheManager(max_memory_mb=1024)
# Get cache stats
stats = cache.get_stats()
print(f"Hit rate: {stats['hit_rate']:.2%}")
print(f"Total hits: {stats['hits']}")
print(f"Total misses: {stats['misses']}")
print(f"Cache size: {stats['size_mb']:.2f} MB")
Best Practices¶
- Cache Hot Data: Cache frequently accessed data
- Set Appropriate TTL: Balance freshness vs performance
- Monitor Hit Rates: Track cache effectiveness
- Use Lazy Evaluation: Minimize memory usage
- Batch Operations: Reduce API calls
- Compress Storage: Use efficient formats such as Parquet with zstd compression (see the sketch after this list)
- Profile First: Identify bottlenecks before optimizing
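The lazy-evaluation and compression points mostly come down to using Polars' standard APIs. A minimal sketch, where the DataFrame contents and file path are illustrative:

import polars as pl

# Write once with columnar compression; zstd typically shrinks OHLCV data well.
df = pl.DataFrame({"symbol": ["AAPL", "MSFT"], "close": [189.5, 402.1]})
df.write_parquet("daily_bars.parquet", compression="zstd")

# Read lazily: the filter and column selection are pushed down, so only the
# needed rows and columns are materialized in memory.
lazy = (
    pl.scan_parquet("daily_bars.parquet")
    .filter(pl.col("symbol") == "AAPL")
    .select(["symbol", "close"])
)
result = lazy.collect()
print(result)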
See Also¶
- Optimization Guide - Advanced optimization techniques
- Troubleshooting - Performance debugging
- Data Catalog - Bundle caching