diff --git a/tech_docs/database/sql_getting_started.md b/tech_docs/database/sql_getting_started.md
index 9013f24..78e6503 100644
--- a/tech_docs/database/sql_getting_started.md
+++ b/tech_docs/database/sql_getting_started.md
@@ -1,3 +1,165 @@
+# **SQL for Forex Automation: The Pragmatic 20%**
+
+## **Why Syntax Matters in Automated Forex Analysis**
+
+### **1. Filtering & Time Bucketing (The Foundation)**
+**Why you need this:**
+- Your cron jobs should process only the relevant data (specific pairs and timeframes)
+- Raw ticks are too noisy to analyze directly; aggregate them into candlesticks first
+- Poor filtering means wasted resources and slow queries
+
+**Real-world syntax you'll actually use:**
+```sql
+-- EUR/USD ticks from the London session (hours 07-15, assuming timestamps are stored in UTC)
+SELECT * FROM ticks
+WHERE symbol = 'EUR/USD'
+  AND EXTRACT(HOUR FROM timestamp) BETWEEN 7 AND 15
+  AND timestamp > NOW() - INTERVAL '1 day';
+
+-- Build 5-minute candles from mid prices (date_bin requires PostgreSQL 14+)
+INSERT INTO candles_5min
+SELECT
+    symbol,
+    date_bin('5 minutes', timestamp, TIMESTAMP '2000-01-01') AS candle_time,
+    (ARRAY_AGG((bid+ask)/2 ORDER BY timestamp))[1] AS open,
+    MAX(ask) AS high,
+    MIN(bid) AS low,
+    (ARRAY_AGG((bid+ask)/2 ORDER BY timestamp DESC))[1] AS close
+FROM ticks
+WHERE timestamp > NOW() - INTERVAL '1 hour'
+GROUP BY symbol, candle_time;
+```
+
+### **2. Aggregates & Session Analysis (Your Edge)**
+**Why you need this:**
+- Spot liquidity patterns for optimal execution
+- Detect when spreads widen so you can avoid trading then
+- Automate session-based trading strategies
+
+**Script-ready examples:**
+```sql
+-- Daily spread report, broken down by hour (cron job at market close)
+SELECT
+    symbol,
+    AVG(ask-bid) AS avg_spread,
+    MAX(ask-bid) AS max_spread,
+    EXTRACT(HOUR FROM timestamp) AS hour
+FROM ticks
+WHERE timestamp >= CURRENT_DATE
+GROUP BY symbol, hour
+ORDER BY symbol, hour;
+
+-- London/NY overlap volatility (alert trigger)
+SELECT
+    STDDEV((bid+ask)/2) AS rolling_volatility
+FROM ticks
+WHERE symbol = 'GBP/USD'
+  AND EXTRACT(HOUR FROM timestamp) BETWEEN 13 AND 15 -- 13:00-15:59 UTC, i.e. 08:00-10:59 New York (EST)
+  AND timestamp > NOW() - INTERVAL '30 minutes';
+```
+
+### **3. Rolling Calculations (Real-Time Edge)**
+**Why you need this:**
+- Compute moving averages and support/resistance levels in pure SQL
+- Detect breakouts without Python overhead
+- Calculate volatility for position sizing
+
+**Automation-ready window functions:**
+```sql
+-- 50-period (50-tick) SMA per symbol over the last 6 hours (run hourly)
+-- DISTINCT ON keeps only the newest row for each symbol
+SELECT DISTINCT ON (symbol)
+    symbol,
+    AVG((bid+ask)/2) OVER (
+        PARTITION BY symbol
+        ORDER BY timestamp
+        ROWS 49 PRECEDING
+    ) AS sma_50,
+    timestamp
+FROM ticks
+WHERE timestamp > NOW() - INTERVAL '6 hours'
+ORDER BY symbol, timestamp DESC;
+
+-- Real-time correlation alert (EUR/USD vs. USD/CHF); mids are paired in 1-second buckets
+WITH last_5_min AS (
+    SELECT
+        DATE_TRUNC('second', timestamp) AS bucket,
+        AVG(CASE WHEN symbol = 'EUR/USD' THEN (bid+ask)/2 END) AS eurusd,
+        AVG(CASE WHEN symbol = 'USD/CHF' THEN (bid+ask)/2 END) AS usdchf
+    FROM ticks
+    WHERE timestamp > NOW() - INTERVAL '5 minutes'
+    GROUP BY bucket
+)
+SELECT
+    CORR(eurusd, usdchf) AS live_correlation
+FROM last_5_min;
+```
+
+### **4. Optimization (Because Cron Jobs Can't Hang)**
+**Why you need this:**
+- Slow queries miss trading opportunities
+- Unoptimized jobs pile up and can crash the system
+- Bad SQL wastes server resources
+
+**Must-implement optimizations:**
+```sql
+-- Essential indexes for forex data
+CREATE INDEX idx_forex_symbol_time ON ticks(symbol, timestamp);
+CREATE INDEX idx_forex_time ON ticks(timestamp);
+
+-- Range-partition tick data by timestamp (add one partition per month)
+CREATE TABLE ticks (
+    -- schema
+) PARTITION BY RANGE (timestamp);
+
+-- CTEs for complex jobs (easier to debug)
+WITH
+london_ticks AS (
+    SELECT * FROM ticks
+    WHERE EXTRACT(HOUR FROM timestamp) BETWEEN 7 AND 15
+),
+usd_pairs AS (
+    SELECT * FROM london_ticks
+    WHERE symbol LIKE '%USD%'
+)
+SELECT * FROM usd_pairs; -- Your analysis continues...
+```
+
+## **The Automation Workflow**
+1. **Data Collection Script** (cron: every minute)
+   - Filters raw ticks by symbol/time
+   - Generates candlesticks
+   - Stores aggregates in reporting tables
+
+2. **Monitoring Script** (cron: every 5 minutes)
+   - Checks volatility/spreads
+   - Alerts on abnormal conditions
+   - Updates rolling metrics
+
+3. **End-of-Day Report** (cron: 23:59)
+   - Calculates daily highs/lows
+   - Tracks session volatility
+   - Logs correlation matrices
+
+## **What You Can Ignore (For Automation)**
+- Fancy visualizations (just output CSV)
+- Interactive queries (your scripts aren't interactive)
+- ORM layers (raw SQL is better for cron)
+- Database administration (unless you're processing 1B+ ticks/day)
+
+## **Next Steps for Your Automation**
+1. Start with simple filtering → candle generation
+2. Add session-based aggregations
+3. Implement rolling calculations for signals
+4. Set up alert thresholds
+
+Good candidates for a first complete cron-ready SQL script:
+- A volatility breakout detector
+- An end-of-day report generator
+- A spread monitor for execution quality
+
+---
+
 # **SQL for Forex Data Analysis: The 20% That Delivers 80% Results**
 
 ## **Focused Learning Roadmap**