diff --git a/tech_docs/database/sql_roadmap.md b/tech_docs/database/sql_roadmap.md
new file mode 100644
index 0000000..eaea5fa
--- /dev/null
+++ b/tech_docs/database/sql_roadmap.md
@@ -0,0 +1,170 @@
+Here's a streamlined 8-week roadmap focused purely on practical SQL skills for forex bid/ask analysis, structured for immediate application in cron jobs:
+
+### **Week 1-2: Core Foundations for Tick Data**
+**Goal:** Process raw ticks into usable formats
+**Key Skills:**
+1. **Basic filtering**
+   ```sql
+   -- Isolate specific currency pairs/time windows (half-open range keeps the final minute of the day)
+   SELECT * FROM ticks
+   WHERE symbol = 'EUR/USD'
+     AND timestamp >= '2024-01-01' AND timestamp < '2024-01-02'
+   ```
+
+2. **Candlestick generation**
+   ```sql
+   -- 1-minute OHLC candles (PostgreSQL has no built-in FIRST/LAST, so ARRAY_AGG picks the first and last bid)
+   SELECT
+     symbol,
+     DATE_TRUNC('minute', timestamp) AS minute,
+     (ARRAY_AGG(bid ORDER BY timestamp))[1] AS open,
+     MAX(bid) AS high,
+     MIN(bid) AS low,
+     (ARRAY_AGG(bid ORDER BY timestamp DESC))[1] AS close
+   FROM ticks
+   GROUP BY symbol, minute
+   ```
+
+3. **Spread metrics**
+   ```sql
+   -- Average spread by hour
+   SELECT
+     symbol,
+     EXTRACT(HOUR FROM timestamp) AS hour,
+     AVG(ask - bid) AS avg_spread
+   FROM ticks
+   GROUP BY symbol, hour
+   ```
+
+### **Week 3-4: Session Analysis & Basic Signals**
+**Goal:** Identify trading opportunities
+**Key Skills:**
+1. **Session volatility**
+   ```sql
+   -- London vs. NY session comparison per symbol (CASE takes the first match, so the 13-15 overlap counts as London)
+   SELECT symbol,
+     CASE
+       WHEN EXTRACT(HOUR FROM timestamp) BETWEEN 7 AND 15 THEN 'London'
+       WHEN EXTRACT(HOUR FROM timestamp) BETWEEN 13 AND 21 THEN 'NY'
+       ELSE 'Other'
+     END AS session,
+     STDDEV((bid+ask)/2) AS volatility
+   FROM ticks
+   GROUP BY symbol, session
+   ```
+
+2. **Rolling spreads**
+   ```sql
+   -- 30-minute moving spread (RANGE frames by time; ROWS would count ticks instead; needs PostgreSQL 11+)
+   SELECT
+     timestamp,
+     AVG(ask - bid) OVER (
+       ORDER BY timestamp
+       RANGE BETWEEN INTERVAL '30 minutes' PRECEDING AND CURRENT ROW
+     ) AS rolling_spread
+   FROM ticks
+   WHERE symbol = 'GBP/USD'
+   ```
+
+3. **Basic alerts**
+   ```sql
+   -- Spread widening alert: ticks whose spread exceeds 3x the same symbol's 1-day average
+   SELECT t.symbol, t.timestamp, (t.ask - t.bid) AS spread
+   FROM ticks AS t
+   WHERE (t.ask - t.bid) > 3 * (
+     SELECT AVG(r.ask - r.bid)
+     FROM ticks AS r
+     WHERE r.symbol = t.symbol AND r.timestamp > NOW() - INTERVAL '1 day'
+   )
+   ```
+
+### **Week 5-6: Advanced Pattern Detection**
+**Goal:** Build automated signal detectors
+**Key Skills:**
+1. **Microprice calculation**
+   ```sql
+   -- Size-weighted mid-price (NULLIF avoids division by zero when both sizes are 0)
+   SELECT
+     timestamp,
+     (bid*ask_size + ask*bid_size) / NULLIF(bid_size + ask_size, 0) AS microprice
+   FROM ticks
+   ```
+
+2. **Order flow imbalance**
+   ```sql
+   -- Normalized bid/ask size imbalance in [-1, 1] (cast avoids integer division if sizes are integers)
+   SELECT
+     timestamp,
+     (bid_size - ask_size)::numeric / NULLIF(bid_size + ask_size, 0) AS imbalance
+   FROM ticks
+   WHERE ABS((bid_size - ask_size)::numeric / NULLIF(bid_size + ask_size, 0)) > 0.7
+   ```
+
+3. **Consecutive moves**
+   ```sql
+   -- 5+ consecutive bid increases: ticks that complete a run of 5 up-ticks for their symbol
+   WITH changes AS (
+     SELECT *,
+       CASE WHEN bid > LAG(bid) OVER (PARTITION BY symbol ORDER BY timestamp) THEN 1 ELSE 0 END AS is_up
+     FROM ticks
+   ), runs AS (
+     SELECT *, SUM(is_up) OVER (PARTITION BY symbol ORDER BY timestamp ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS ups FROM changes
+   )
+   SELECT symbol, timestamp, bid FROM runs
+   WHERE ups = 5
+   ORDER BY symbol, timestamp
+   ```
+
+### **Week 7-8: Optimization & Productionization**
+**Goal:** Make scripts robust and efficient
+**Key Skills:**
+1. **Indexing for time-series**
+   ```sql
+   CREATE INDEX idx_symbol_time ON ticks(symbol, timestamp);
+   ```
+
+2. **CTEs for complex logic**
+   ```sql
+   WITH
+   london_ticks AS (
+     SELECT * FROM ticks
+     WHERE EXTRACT(HOUR FROM timestamp) BETWEEN 7 AND 15
+   ),
+   spreads AS (
+     SELECT symbol, AVG(ask - bid) AS avg_spread
+     FROM london_ticks
+     GROUP BY symbol
+   )
+   SELECT * FROM spreads WHERE avg_spread > 0.0005;
+   ```
+
+3. **Partitioning large tables**
+   ```sql
+   CREATE TABLE ticks_partitioned (
+     LIKE ticks INCLUDING DEFAULTS  -- reuse the column definitions of the existing ticks table
+   ) PARTITION BY RANGE (timestamp);
+   ```
+
+### **Daily Practice Structure**
+1. **Morning (5 min):** Run basic monitoring query
+   ```sql
+   -- Current spread status
+   SELECT symbol, AVG(ask - bid) AS spread
+   FROM ticks
+   WHERE timestamp > NOW() - INTERVAL '15 minutes'
+   GROUP BY symbol;
+   ```
+
+2. **Evening (15 min):** Build one new analysis query
+   - Monday: Session comparisons
+   - Tuesday: Rolling metrics
+   - Wednesday: Alert conditions
+   - Thursday: Optimization tweaks
+   - Friday: Backtest old queries
+
+### **Key Mindset Shifts**
+1. **Think in ticks, not hours:** Your queries should handle millisecond-resolution data efficiently
+2. **Pre-compute everything:** Generate candlesticks/aggregates in SQL, not Python
+3. **Log everything:** Every cron job should write results to a logging table
+
+Want the condensed 1-page cheat sheet version of this roadmap? Or should we focus next on building your first complete cron-ready SQL script?
\ No newline at end of file