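Here's a streamlined 8-week roadmap focused purely on practical SQL skills for forex bid/ask analysis, structured for immediate application in cron jobs. The queries use PostgreSQL-style syntax and assume a ticks table with symbol, timestamp, bid, ask, bid_size, and ask_size columns.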

Week 1-2: Core Foundations for Tick Data

Goal: Process raw ticks into usable formats
Key Skills:

  1. Basic filtering

    -- Isolate one currency pair and one day
    -- A half-open range keeps ticks that arrive after 23:59:00
    SELECT * FROM ticks
    WHERE symbol = 'EUR/USD'
      AND timestamp >= '2024-01-01 00:00'
      AND timestamp <  '2024-01-02 00:00';
    
  2. Candlestick generation

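    -- 1-minute OHLC candles built from bid prices
    -- FIRST()/LAST() are not built-in PostgreSQL aggregates, so open and
    -- close are taken from time-ordered arrays instead
    SELECT
      symbol,
      DATE_TRUNC('minute', timestamp) AS minute,
      (ARRAY_AGG(bid ORDER BY timestamp))[1]      AS open,
      MAX(bid)                                    AS high,
      MIN(bid)                                    AS low,
      (ARRAY_AGG(bid ORDER BY timestamp DESC))[1] AS close
    FROM ticks
    GROUP BY symbol, minute;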
    
  3. Spread metrics

    -- Average spread by hour
    SELECT 
      symbol,
      EXTRACT(HOUR FROM timestamp) AS hour,
      AVG(ask - bid) AS avg_spread
    FROM ticks
    GROUP BY symbol, hour
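
Before pointing candle logic at real data, it can help to run it on a handful of hand-written ticks where the expected open, high, low, and close are obvious. The sketch below does that in Python with an in-memory SQLite database as a stand-in for the production store. The `ts` column name, the sample prices, and the `one_minute_candles` helper are illustrative assumptions, and SQLite's `strftime` plus window functions take the place of `DATE_TRUNC` and the open/close aggregates.

```python
import sqlite3

# Hand-written sample ticks (hypothetical values): symbol, timestamp, bid, ask
SAMPLE_TICKS = [
    ("EUR/USD", "2024-01-01 09:00:01", 1.1000, 1.1002),
    ("EUR/USD", "2024-01-01 09:00:20", 1.1005, 1.1007),
    ("EUR/USD", "2024-01-01 09:00:45", 1.0998, 1.1000),
    ("EUR/USD", "2024-01-01 09:00:59", 1.1001, 1.1003),
    ("EUR/USD", "2024-01-01 09:01:05", 1.1003, 1.1005),
]

# Each window spans one symbol and one minute, ordered by time, so the first
# and last bid in the window are the candle's open and close.
CANDLE_SQL = """
SELECT DISTINCT
  symbol,
  strftime('%Y-%m-%d %H:%M', ts) AS minute,
  FIRST_VALUE(bid) OVER w AS open,
  MAX(bid)         OVER w AS high,
  MIN(bid)         OVER w AS low,
  LAST_VALUE(bid)  OVER w AS close
FROM ticks
WINDOW w AS (
  PARTITION BY symbol, strftime('%Y-%m-%d %H:%M', ts)
  ORDER BY ts
  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY symbol, minute
"""


def one_minute_candles(ticks):
    """Load ticks into a throwaway SQLite table and return one row per candle."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ticks (symbol TEXT, ts TEXT, bid REAL, ask REAL)")
    conn.executemany("INSERT INTO ticks VALUES (?, ?, ?, ?)", ticks)
    rows = conn.execute(CANDLE_SQL).fetchall()
    conn.close()
    return rows


candles = one_minute_candles(SAMPLE_TICKS)
for candle in candles:
    print(candle)
```

The 09:00 candle should open at the first bid of the minute and close at the last one, regardless of where the high and low fall in between. That ordering is the part an unordered aggregate can get wrong.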
    

Week 3-4: Session Analysis & Basic Signals

Goal: Identify trading opportunities
Key Skills:

  1. Session volatility

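    -- London vs. NY session comparison (hours assumed to be UTC)
    -- Hours 13-15 belong to both sessions, so they get their own bucket;
    -- otherwise CASE would assign them to London, the first match
    SELECT
      symbol,
      CASE
        WHEN EXTRACT(HOUR FROM timestamp) BETWEEN 13 AND 15 THEN 'London/NY overlap'
        WHEN EXTRACT(HOUR FROM timestamp) BETWEEN 7 AND 12 THEN 'London'
        WHEN EXTRACT(HOUR FROM timestamp) BETWEEN 16 AND 21 THEN 'NY'
        ELSE 'Other'
      END AS session,
      STDDEV((bid + ask) / 2) AS volatility  -- dispersion of the mid price
    FROM ticks
    GROUP BY symbol, session;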
    
  2. Rolling spreads

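    -- 30-minute moving average spread
    -- RANGE with an interval covers 30 minutes of ticks (PostgreSQL 11+);
    -- ROWS BETWEEN 29 PRECEDING would cover the last 30 ticks instead
    SELECT
      timestamp,
      AVG(ask - bid) OVER (
        ORDER BY timestamp
        RANGE BETWEEN INTERVAL '30 minutes' PRECEDING AND CURRENT ROW
      ) AS rolling_spread
    FROM ticks
    WHERE symbol = 'GBP/USD';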
    
  3. Basic alerts

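    -- Spread widening alert: ticks whose spread exceeds 3x the average
    -- spread of the same symbol over the past day
    SELECT t.symbol, t.timestamp, (t.ask - t.bid) AS spread
    FROM ticks t
    WHERE (t.ask - t.bid) > 3 * (
      SELECT AVG(d.ask - d.bid)
      FROM ticks d
      WHERE d.symbol = t.symbol
        AND d.timestamp > NOW() - INTERVAL '1 day'
    );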
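
A rolling window framed with ROWS counts ticks, while one framed with RANGE over the time column covers a fixed time span. The two only agree when ticks arrive at a steady rate. The sketch below shows the difference on four hypothetical ticks, using Python with in-memory SQLite as a stand-in. The `spreads` table, the integer `ts` column (seconds since the first tick), and the `rolling_spreads` helper are assumptions for illustration, and numeric RANGE offsets need SQLite 3.28 or newer.

```python
import sqlite3

# Hypothetical ticks: (seconds since start, spread in points).
# The last tick arrives 50 minutes after the first three.
SAMPLE = [(0, 1.0), (10, 2.0), (20, 3.0), (3000, 10.0)]

ROLLING_SQL = """
SELECT
  ts,
  AVG(spread) OVER (ORDER BY ts ROWS  BETWEEN 2    PRECEDING AND CURRENT ROW) AS last_3_ticks,
  AVG(spread) OVER (ORDER BY ts RANGE BETWEEN 1800 PRECEDING AND CURRENT ROW) AS last_30_min
FROM spreads
ORDER BY ts
"""


def rolling_spreads(sample):
    """Return (ts, tick-count average, time-window average) for each tick."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE spreads (ts INTEGER, spread REAL)")
    conn.executemany("INSERT INTO spreads VALUES (?, ?)", sample)
    rows = conn.execute(ROLLING_SQL).fetchall()
    conn.close()
    return rows


rolling = rolling_spreads(SAMPLE)
for row in rolling:
    print(row)
```

On the last tick the tick-count window still averages in the two stale spreads from 50 minutes earlier, while the 30-minute window sees only the current tick. For quiet periods or overnight gaps, the time-based frame is usually the one that matches the intent.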
    

Week 5-6: Advanced Pattern Detection

Goal: Build automated signal detectors
Key Skills:

  1. Microprice calculation

    -- Microprice: mid-price weighted by the opposite side's size
    SELECT
      timestamp,
      (bid * ask_size + ask * bid_size) / NULLIF(bid_size + ask_size, 0) AS microprice
    FROM ticks
    
  2. Order flow imbalance

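    -- Normalized bid/ask size imbalance in [-1, 1]; keep only strong readings
    -- The numeric cast avoids integer division if sizes are stored as integers
    WITH sized AS (
      SELECT
        timestamp,
        (bid_size - ask_size)::numeric / NULLIF(bid_size + ask_size, 0) AS imbalance
      FROM ticks
    )
    SELECT timestamp, imbalance
    FROM sized
    WHERE ABS(imbalance) > 0.7;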
    
  3. Consecutive moves

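    -- Runs of 5+ consecutive bid increases per symbol (gaps-and-islands)
    WITH changes AS (
      SELECT symbol, timestamp, bid,
        CASE WHEN bid > LAG(bid) OVER (PARTITION BY symbol ORDER BY timestamp)
             THEN 1 ELSE 0 END AS is_up
      FROM ticks
    ),
    runs AS (
      -- Each non-increase starts a new run, so consecutive increases share a run_id
      SELECT *,
        SUM(1 - is_up) OVER (PARTITION BY symbol ORDER BY timestamp) AS run_id
      FROM changes
    )
    SELECT symbol, MIN(timestamp) AS run_start, MAX(timestamp) AS run_end, COUNT(*) AS up_moves
    FROM runs
    WHERE is_up = 1
    GROUP BY symbol, run_id
    HAVING COUNT(*) >= 5
    ORDER BY run_start;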
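
Finding runs of consecutive increases is a gaps-and-islands problem: filtering for up-ticks and taking the first five rows returns any five increases, not five in a row. One common approach is to give every non-increase a new run number and then count the increases inside each run. The sketch below tries that on a short hypothetical bid series, using Python with in-memory SQLite as a stand-in. The integer `ts` and `bid` columns and the `up_runs` helper are assumptions for illustration.

```python
import sqlite3

# Hypothetical bids in points: five increases, one drop, then two increases
BIDS = [100, 101, 102, 103, 104, 105, 104, 105, 106]

RUNS_SQL = """
WITH changes AS (
  SELECT ts, bid,
    CASE WHEN bid > LAG(bid) OVER (ORDER BY ts) THEN 1 ELSE 0 END AS is_up
  FROM ticks
),
runs AS (
  -- The running count of non-increases is constant across a streak of increases
  SELECT ts, bid, is_up,
    SUM(1 - is_up) OVER (ORDER BY ts) AS run_id
  FROM changes
)
SELECT MIN(ts) AS run_start, MAX(ts) AS run_end, COUNT(*) AS up_moves
FROM runs
WHERE is_up = 1
GROUP BY run_id
HAVING COUNT(*) >= ?
ORDER BY run_start
"""


def up_runs(bids, min_moves=5):
    """Return (run_start, run_end, up_moves) for each streak of at least min_moves increases."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ticks (ts INTEGER, bid INTEGER)")
    conn.executemany("INSERT INTO ticks VALUES (?, ?)", list(enumerate(bids, start=1)))
    rows = conn.execute(RUNS_SQL, (min_moves,)).fetchall()
    conn.close()
    return rows


long_runs = up_runs(BIDS)
print(long_runs)
```

Lowering `min_moves` to 2 should also report the shorter streak after the drop, which is a quick way to confirm that the drop really does split the series into two runs.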
    

Week 7-8: Optimization & Productionization

Goal: Make scripts robust and efficient
Key Skills:

  1. Indexing for time-series

    CREATE INDEX idx_symbol_time ON ticks(symbol, timestamp);
    
  2. CTEs for complex logic

    WITH 
    london_ticks AS (
      SELECT * FROM ticks 
      WHERE EXTRACT(HOUR FROM timestamp) BETWEEN 7 AND 15
    ),
    spreads AS (
      SELECT symbol, AVG(ask - bid) AS avg_spread
      FROM london_ticks
      GROUP BY symbol
    )
    SELECT * FROM spreads WHERE avg_spread > 0.0005;
    
  3. Partitioning large tables

    CREATE TABLE ticks_partitioned (
      -- schema
    ) PARTITION BY RANGE (timestamp);
    

Daily Practice Structure

  1. Morning (5 min): Run basic monitoring query

    -- Current spread status
    SELECT symbol, AVG(ask - bid) AS spread 
    FROM ticks 
    WHERE timestamp > NOW() - INTERVAL '15 minutes'
    GROUP BY symbol;
    
  2. Evening (15 min): Build one new analysis query

    • Monday: Session comparisons
    • Tuesday: Rolling metrics
    • Wednesday: Alert conditions
    • Thursday: Optimization tweaks
    • Friday: Backtest old queries

Key Mindset Shifts

  1. Think in ticks, not hours: Your queries should handle millisecond-resolution data efficiently
  2. Pre-compute everything: Generate candlesticks/aggregates in SQL, not Python
  3. Log everything: Every cron job should write results to a logging table
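
The "log everything" habit can be sketched as a small wrapper that a cron entry would call: run the monitoring query, then write one row per result into a logging table. The example below uses Python with in-memory SQLite as a stand-in. The `job_log` table, its columns, the `run_logged_job` helper, and the sample ticks are all hypothetical, and the time filter from the monitoring query is left out because the sample timestamps are fixed.

```python
import sqlite3
from datetime import datetime, timezone

MONITOR_SQL = """
SELECT symbol, AVG(ask - bid) AS avg_spread, COUNT(*) AS tick_count
FROM ticks
GROUP BY symbol
ORDER BY symbol
"""


def run_logged_job(conn, job_name="spread_monitor"):
    """Run the monitoring query and record one job_log row per symbol."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS job_log (
             run_at TEXT, job_name TEXT, symbol TEXT,
             avg_spread REAL, tick_count INTEGER)"""
    )
    run_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    results = conn.execute(MONITOR_SQL).fetchall()
    conn.executemany(
        "INSERT INTO job_log VALUES (?, ?, ?, ?, ?)",
        [(run_at, job_name, sym, spread, n) for sym, spread, n in results],
    )
    conn.commit()
    return len(results)


# Demo data: prices are unrealistic on purpose so the float arithmetic is exact
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticks (symbol TEXT, ts TEXT, bid REAL, ask REAL)")
conn.executemany("INSERT INTO ticks VALUES (?, ?, ?, ?)", [
    ("EUR/USD", "2024-01-01 09:00:01", 1.0, 1.5),
    ("EUR/USD", "2024-01-01 09:00:02", 1.0, 1.25),
    ("GBP/USD", "2024-01-01 09:00:01", 2.0, 2.5),
    ("GBP/USD", "2024-01-01 09:00:03", 2.0, 2.5),
])
logged = run_logged_job(conn)
log_rows = conn.execute(
    "SELECT job_name, symbol, avg_spread, tick_count FROM job_log ORDER BY symbol"
).fetchall()
print(logged, log_rows)
```

Keeping the run timestamp and job name on every row makes it possible to tell later which cron run produced which numbers, and a run that logs zero rows is itself a useful signal that the feed may have stalled.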
