Part 2 of 2 in the Database Performance Optimization series
The Problem: A Trading System That Couldn’t Trade
Atlas is a quantitative trading system designed for 0-3 DTE (days to expiration) options trading. The system needs to process market data, calculate features, and execute trades in real time. But it couldn’t start.
Database initialization was taking 6.6 seconds, blocking 37 critical trading features from loading. The system would hang during startup, never reaching the point where it could process market data or execute trades.
For a trading system where milliseconds matter, a 6-second startup delay wasn’t just slow—it was catastrophic.
Initial Investigation
Profiling the Bottleneck
Adding timing instrumentation to the database manager revealed the issue:
```python
start_time = time.time()
self._primary_connection = duckdb.connect(database=db_path, read_only=False)
logger.debug(f"Connection created in {(time.time() - start_time)*1000:.1f}ms")
# Output: Connection created in 12.3ms ✅

schema_time = time.time()
self._initialize_schemas()
logger.debug(f"Schema check completed in {(time.time() - schema_time)*1000:.1f}ms")
# Output: Schema check completed in 6588.4ms ❌
```
The database connection itself was fast (12ms). The problem was schema initialization—checking if tables existed before creating them.
The Anti-Pattern
The original schema initialization used `information_schema` queries:
```python
def _initialize_schemas(self):
    """Check if tables exist, create if missing"""
    # Check trades table
    result = self._primary_connection.execute("""
        SELECT COUNT(*) FROM information_schema.tables
        WHERE table_name = 'trades'
    """).fetchone()

    if result[0] == 0:
        self._primary_connection.execute("""
            CREATE TABLE trades (...)
        """)

    # Repeat for quotes table...
    # Repeat for each index...
```
This pattern issues a metadata query for every table and index. In DuckDB, `information_schema` queries scan catalog structures, which is slow when done repeatedly during startup.
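To see the difference concretely, here is a minimal standalone sketch (not Atlas code; it assumes only the `duckdb` package) that times both approaches against a throwaway in-memory database. Absolute numbers will differ from an on-disk production file, but it shows how the two paths can be compared:

```python
# Minimal standalone sketch: time repeated information_schema checks
# against repeated idempotent CREATEs. Not Atlas code; in-memory DB.
import time
import duckdb

con = duckdb.connect()  # throwaway in-memory database
con.execute("CREATE TABLE IF NOT EXISTS trades (symbol VARCHAR, price DOUBLE)")

t0 = time.time()
for _ in range(100):
    con.execute("""
        SELECT COUNT(*) FROM information_schema.tables
        WHERE table_name = 'trades'
    """).fetchone()
print(f"information_schema checks: {(time.time() - t0) * 1000:.1f}ms")

t0 = time.time()
for _ in range(100):
    con.execute("CREATE TABLE IF NOT EXISTS trades (symbol VARCHAR, price DOUBLE)")
print(f"CREATE IF NOT EXISTS:      {(time.time() - t0) * 1000:.1f}ms")
```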
Solution Architecture
The fix required three coordinated optimizations:
- Eliminate metadata queries → use `CREATE ... IF NOT EXISTS`
- Add critical indexes → composite indexes for trading queries
- Optimize connection setup → batch configuration calls
1. Fast Schema Initialization
Replace slow metadata queries with fast DDL:
```python
def _initialize_schemas_fast(self):
    """PERFORMANCE OPTIMIZED: Initialize schemas using CREATE TABLE IF NOT EXISTS"""
    start_time = time.time()

    # Create tables - fast if they exist
    self._primary_connection.execute("""
        CREATE TABLE IF NOT EXISTS trades (
            timestamp_ns BIGINT,
            symbol VARCHAR,
            price DOUBLE,
            size INTEGER,
            side VARCHAR,
            exchange VARCHAR,
            conditions VARCHAR,
            notional DOUBLE,
            date DATE
        )
    """)

    self._primary_connection.execute("""
        CREATE TABLE IF NOT EXISTS quotes (
            timestamp_ns BIGINT,
            symbol VARCHAR,
            bid DOUBLE,
            ask DOUBLE,
            bid_size INTEGER,
            ask_size INTEGER,
            exchange VARCHAR,
            conditions VARCHAR,
            date DATE
        )
    """)

    schema_time = (time.time() - start_time) * 1000
    logger.debug(f"Fast schema initialization: {schema_time:.1f}ms")
```
Key insight: `CREATE TABLE IF NOT EXISTS` is idempotent and fast. DuckDB can check existence at the catalog level without scanning `information_schema`.
Performance: 6644ms → 8.2ms (810X faster)
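Because the DDL is idempotent, the same initialization path can run unconditionally on every startup. A small sketch (file name assumed) simulating repeated restarts:

```python
# Sketch: the same DDL is safe on a fresh file and on an existing one,
# so startup never needs an existence check. File name is assumed.
import duckdb

DDL = "CREATE TABLE IF NOT EXISTS trades (timestamp_ns BIGINT, symbol VARCHAR, price DOUBLE)"

for restart in range(3):                    # simulate three process restarts
    con = duckdb.connect("market_data.db")  # hypothetical on-disk path
    con.execute(DDL)                        # no-op after the first run
    con.close()
```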
2. Critical Trading Indexes
Add composite indexes for common query patterns:
# Index for "get all trades for symbol in time range"
self._primary_connection.execute("""
CREATE INDEX IF NOT EXISTS idx_trades_symbol_timestamp
ON trades(symbol, timestamp_ns)
""")
self._primary_connection.execute("""
CREATE INDEX IF NOT EXISTS idx_quotes_symbol_timestamp
ON quotes(symbol, timestamp_ns)
""")
# Index for daily aggregations
self._primary_connection.execute("""
CREATE INDEX IF NOT EXISTS idx_trades_date
ON trades(date)
""")
self._primary_connection.execute("""
CREATE INDEX IF NOT EXISTS idx_quotes_date
ON quotes(date)
""")
These indexes enable millisecond queries for:
- Real-time market data lookups by symbol
- Historical backtesting across time ranges
- End-of-day analytics and reporting
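For illustration, a symbol/time-range lookup of this shape (the symbol and timestamps are made up) is exactly what the composite index is built for:

```python
# Illustrative query served by idx_trades_symbol_timestamp.
# Symbol and time range are made up for the example.
import duckdb

con = duckdb.connect("market_data.db")  # hypothetical on-disk path
start_ns, end_ns = 1_700_000_000_000_000_000, 1_700_000_060_000_000_000

rows = con.execute("""
    SELECT timestamp_ns, price, size
    FROM trades
    WHERE symbol = ? AND timestamp_ns BETWEEN ? AND ?
    ORDER BY timestamp_ns
""", ["SPY", start_ns, end_ns]).fetchall()
```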
3. Optimized Connection Setup
Batch configuration instead of sequential calls:
```python
def connect(self):
    """Create primary database connection - PERFORMANCE OPTIMIZED"""
    start_time = time.time()

    with self._connection_lock:
        # Configuration
        db_path = self._config.get('db_path', 'market_data.db')
        threads = self._config.get('threads', 4)
        memory_limit = self._config.get('memory_limit', '2GB')

        # Create connection
        self._primary_connection = duckdb.connect(
            database=db_path,
            read_only=False
        )

        # Batch configure
        self._primary_connection.execute(f"SET threads TO {threads}")
        self._primary_connection.execute(f"SET memory_limit = '{memory_limit}'")

        # Fast schema init
        self._initialize_schemas_fast()

    total_time = (time.time() - start_time) * 1000
    logger.info(f"✅ DatabaseManager connected in {total_time:.1f}ms")

    # Production warning
    if total_time > 50:
        logger.warning(f"⚠️ Connection took {total_time:.1f}ms (target: <50ms)")
```
Target: <50ms for production trading system
Achieved: 8.2ms
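For context, wiring this up might look like the following sketch; the `DatabaseManager` constructor is an assumption (only `connect()` is shown in the post), but the config keys match the snippet above:

```python
# Hypothetical wiring; config keys match the connect() defaults above.
config = {
    "db_path": "market_data.db",
    "threads": 4,
    "memory_limit": "2GB",
}
db = DatabaseManager(config)  # assumed constructor, not shown in the post
db.connect()                  # logs "✅ DatabaseManager connected in X.Xms"
```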
Thread-Safe Cursor Management
Trading systems are multi-threaded—market data processing, feature calculation, and trade execution all run concurrently. The database manager needed thread-safe cursor access without lock contention.
Fast-Path Optimization
```python
def get_cursor(self, component_name: str):
    """Get thread-local cursor with fast-path for existing cursors"""
    thread_id = threading.current_thread().ident
    cursor_key = f"{component_name}_{thread_id}"

    # FAST PATH: No lock needed for existing cursors
    if cursor_key in self._thread_cursors:
        self._cursor_last_used[cursor_key] = time.time()
        return self._thread_cursors[cursor_key]

    # SLOW PATH: Create new cursor (requires lock)
    with self._cursor_lock:
        # Double-check pattern for thread safety
        if cursor_key not in self._thread_cursors:
            self._thread_cursors[cursor_key] = self._primary_connection.cursor()
        self._cursor_last_used[cursor_key] = time.time()
        return self._thread_cursors[cursor_key]
```
Performance:
- First cursor access: 0.01ms (with lock)
- Subsequent access: 0.00ms (lock-free fast path)
- Throughput: 200,000 cursors/sec
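A sketch of the access pattern under load (reusing the assumed `db` instance from above): each thread pays the cursor-creation lock once, then stays on the lock-free fast path.

```python
# Sketch: each thread takes the slow path once, then every subsequent
# get_cursor call is a lock-free dictionary hit. `db` is assumed above.
import threading

def worker():
    for _ in range(10_000):
        cur = db.get_cursor("market_data")  # cached after the first call
        cur.execute("SELECT 1").fetchone()

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```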
Automatic Cleanup
Cursors for inactive threads are cleaned up automatically:
```python
def _cleanup_unused_cursors(self):
    """Remove cursors not used in last 60 seconds"""
    current_time = time.time()

    with self._cursor_lock:
        cursors_to_remove = []
        for cursor_key, last_used in self._cursor_last_used.items():
            if current_time - last_used > 60:
                cursors_to_remove.append(cursor_key)

        for cursor_key in cursors_to_remove:
            try:
                self._thread_cursors[cursor_key].close()
            except Exception:
                pass  # cursor may already be invalid; removal still proceeds
            del self._thread_cursors[cursor_key]
            del self._cursor_last_used[cursor_key]
```
Background cleanup thread runs every 30 seconds, preventing cursor leaks.
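The sweeper thread itself isn’t shown in the post; a plausible sketch, assuming the manager holds a `_shutdown_event` (a `threading.Event`) for clean shutdown:

```python
# Sketch of the background sweeper. The _shutdown_event attribute is an
# assumption; wait(30) sleeps 30s but wakes immediately on shutdown.
import threading

def _start_cursor_cleanup(self):
    def loop():
        while not self._shutdown_event.wait(30):
            self._cleanup_unused_cursors()
    threading.Thread(target=loop, name="cursor-cleanup", daemon=True).start()
```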
Error Handling: Exponential Backoff
Database locks can occur under high concurrent load. The solution: intelligent retry with exponential backoff.
```python
def execute_with_retry(self, cursor, query: str, params=None):
    """Execute query with automatic retry on lock conflicts"""
    delay = 0.1  # Start with 100ms

    for attempt in range(10):
        try:
            if params:
                return cursor.execute(query, params)
            else:
                return cursor.execute(query)
        except duckdb.IOException as e:
            if "database is locked" in str(e).lower():
                if attempt < 9:
                    logger.warning(f"Database locked, retry {attempt + 1}/10 after {delay:.1f}s")
                    time.sleep(delay)
                    delay *= 2.0  # Exponential backoff
                    continue
            raise

    raise RuntimeError("Failed to execute query after 10 attempts")
```
Retry strategy:
- After attempt 1: wait 100ms
- After attempt 2: wait 200ms
- After attempt 3: wait 400ms
- …doubling each time, up to 10 attempts
This prevents transient lock conflicts from causing failures while avoiding infinite retries.
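A usage sketch (values made up; `db` and the `trades` schema as above). Worst case, if every attempt hits a lock, the sleeps sum to 0.1 × (2⁹ − 1) ≈ 51.1 seconds before the error surfaces, which is why the attempt cap matters:

```python
# Usage sketch: route a write through the retry wrapper. Values are
# made up; worst-case total sleep is 0.1 * (2**9 - 1) ≈ 51.1 seconds.
from datetime import date

cursor = db.get_cursor("trade_writer")  # `db` as assumed above
db.execute_with_retry(
    cursor,
    "INSERT INTO trades VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [1_700_000_000_000_000_000, "SPY", 452.31, 100, "BUY", "ARCA", "",
     45_231.0, date.today()],
)
```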
Results
| Metric | Before | After | Improvement |
|---|---|---|---|
| Connection Time | 6644ms | 8.2ms | 810X faster |
| Health Check | N/A | 0.1ms | Production ready |
| Cursor Access (first) | N/A | 0.01ms | Thread-safe |
| Cursor Access (cached) | N/A | 0.00ms | Lock-free |
| Simple Query | N/A | 0.05ms | Millisecond trading |
System Impact
37 blocked features restored:
- L2 market data storage (high-frequency tick data)
- Trade analytics (historical analysis)
- Position tracking (real-time P&L)
- Risk management (portfolio metrics)
- Strategy backtesting (historical queries)
- Market surveillance (pattern detection)
- Compliance reporting (trade logs)
- Performance analytics (strategy evaluation)
- …and 29 more critical features
Trading system operational:
- Database initialization: 8.2ms (target: <50ms) ✅
- Health checks: Sub-millisecond ✅
- Thread-safe concurrent access ✅
- High-frequency data ingestion ✅
- Real-time feature calculation ✅
Production Monitoring
The optimized database manager includes built-in health checks:
```python
def health_check(self) -> Dict[str, Any]:
    """Fast health check for production monitoring"""
    start_time = time.time()

    try:
        if self._lifecycle_phase != LifecyclePhase.CONNECTED:
            return {
                'healthy': False,
                'phase': self._lifecycle_phase.value,
                'error': 'Not connected',
                'check_time_ms': 0
            }

        # Fast connectivity test
        cursor = self.get_cursor('health_check')
        cursor.execute('SELECT 1').fetchone()

        check_time = (time.time() - start_time) * 1000
        return {
            'healthy': True,
            'phase': self._lifecycle_phase.value,
            'check_time_ms': round(check_time, 2),
            'active_cursors': len(self._thread_cursors),
            'max_connections': self._max_connections,
            'config': self._config
        }
    except Exception as e:
        check_time = (time.time() - start_time) * 1000
        return {
            'healthy': False,
            'error': str(e),
            'check_time_ms': round(check_time, 2)
        }
```
Production alerts:
- 🚨 Critical: Connection time >50ms
- ⚠️ Warning: Health check >1ms
- 📊 Info: >100 active cursors
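A sketch of a poller that maps `health_check()` output onto these levels (the loop and its interval are assumptions; the thresholds come from the list above):

```python
# Sketch: poll health_check() and emit the alert levels listed above.
# The polling loop and interval are assumptions, not Atlas code.
import time

def monitor(db, logger):
    while True:
        report = db.health_check()
        if not report["healthy"]:
            logger.critical(f"🚨 Database unhealthy: {report.get('error')}")
        elif report["check_time_ms"] > 1:
            logger.warning(f"⚠️ Slow health check: {report['check_time_ms']}ms")
        elif report.get("active_cursors", 0) > 100:
            logger.info(f"📊 High cursor count: {report['active_cursors']}")
        time.sleep(10)
```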
Current system metrics:
- Health checks: 0.1ms (10X under warning threshold)
- Connection time: 8.2ms (6X under critical threshold)
- Active cursors: Efficient management with auto-cleanup
Key Lessons
1. Metadata Queries Aren’t Free
`SELECT * FROM information_schema.tables` scans catalog structures. For schema checks:
- ❌ Query `information_schema`, then conditionally create
- ✅ Use `CREATE TABLE IF NOT EXISTS` (idempotent and fast)
2. Index Early, Index Smart
Composite indexes on common query patterns:
```sql
CREATE INDEX idx_symbol_timestamp ON trades(symbol, timestamp_ns);
```
This single index serves multiple queries:
```sql
WHERE symbol = 'SPY' AND timestamp_ns > X
WHERE symbol = 'SPY' AND timestamp_ns BETWEEN X AND Y
WHERE symbol = 'SPY' ORDER BY timestamp_ns DESC LIMIT 100
```
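To check whether a given query can actually use the index, DuckDB’s `EXPLAIN` prints the chosen plan; a quick sketch:

```python
# Sketch: EXPLAIN shows the physical plan, so you can check whether an
# index scan is chosen instead of a full table scan. Path is hypothetical.
import duckdb

con = duckdb.connect("market_data.db")
for _, plan in con.execute("""
    EXPLAIN
    SELECT * FROM trades
    WHERE symbol = 'SPY' AND timestamp_ns > 1700000000000000000
""").fetchall():
    print(plan)
```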
3. Fast-Path Optimizations Matter
For frequently accessed resources (cursors, connections, caches):
- Check existence without locks (fast path)
- Only acquire locks when creating new resources (slow path)
- Result: 0.01ms → 0.00ms; the cached path costs effectively nothing
4. Production Systems Need Monitoring
Built-in health checks enable:
- Automated production monitoring
- Performance regression detection
- Capacity planning (cursor count, connection usage)
Cost: 0.1ms overhead
Value: Production visibility and reliability
5. The Fastest Code Is Code That Doesn’t Run
Schema optimization eliminated:
- 6644ms of `information_schema` queries
- Unnecessary table existence checks
- Redundant index creation attempts
Result: 810X performance improvement with zero feature loss.
Conclusion
Optimizing the Atlas trading system’s database initialization from 6.6 seconds to 8.2ms required understanding the actual bottleneck (metadata queries), applying the right solution (`CREATE ... IF NOT EXISTS`), and building in production-grade features (health checks, exponential backoff).
The 810X improvement wasn’t from algorithmic cleverness—it was from eliminating unnecessary work and using database features correctly.
Core principles applied:
- Profile before optimizing (found 99% of time in schema checks)
- Use database features correctly (`CREATE ... IF NOT EXISTS` vs `information_schema`)
- Design for production (health checks, monitoring, error handling)
- Optimize the critical path (schema init was blocking everything)
For production systems where performance matters, measure everything, optimize what’s slow, and build in monitoring from day one.
Part of the Database Performance Optimization series.