Add tech_docs/database/sql_roadmap.md

2025-06-18 05:06:59 +00:00
parent 7c62f52dc7
commit 9a562700e2

@@ -0,0 +1,170 @@
Here's a streamlined 8-week roadmap focused purely on practical SQL skills for forex bid/ask analysis, structured for immediate application in cron jobs:
### **Week 1-2: Core Foundations for Tick Data**
**Goal:** Process raw ticks into usable formats
**Key Skills:**
1. **Basic filtering**
```sql
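-- Isolate one currency pair and one day of ticks.
-- A half-open range keeps ticks from 23:59:00 onward,
-- which BETWEEN ... AND '2024-01-01 23:59' would drop.
SELECT * FROM ticks
WHERE symbol = 'EUR/USD'
AND timestamp >= '2024-01-01 00:00'
AND timestamp < '2024-01-02 00:00';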
```
2. **Candlestick generation**
```sql
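-- 1-minute OHLC candles (PostgreSQL syntax)
-- Stock PostgreSQL has no FIRST()/LAST() aggregates, so open and close
-- are read from time-ordered arrays of the bids in each minute.
SELECT
symbol,
DATE_TRUNC('minute', timestamp) AS minute,
(ARRAY_AGG(bid ORDER BY timestamp ASC))[1] AS open,
MAX(bid) AS high,
MIN(bid) AS low,
(ARRAY_AGG(bid ORDER BY timestamp DESC))[1] AS close
FROM ticks
GROUP BY symbol, minute
ORDER BY symbol, minute;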
```
3. **Spread metrics**
```sql
-- Average spread by hour
SELECT
symbol,
EXTRACT(HOUR FROM timestamp) AS hour,
AVG(ask - bid) AS avg_spread
FROM ticks
GROUP BY symbol, hour
```
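To check what the candle logic should return, it can help to run it on a handful of ticks first. The sketch below is an illustration under assumptions, not the production query: it uses Python's built-in `sqlite3` with an in-memory table as a stand-in for the real tick store, a hypothetical four-column `ticks` layout with text timestamps, and SQLite's `strftime` plus window functions in place of `DATE_TRUNC`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real ticks table may differ.
conn.execute("CREATE TABLE ticks (symbol TEXT, timestamp TEXT, bid REAL, ask REAL)")
conn.executemany(
    "INSERT INTO ticks VALUES (?, ?, ?, ?)",
    [
        ("EUR/USD", "2024-01-01 00:00:05", 1.1040, 1.1042),
        ("EUR/USD", "2024-01-01 00:00:20", 1.1046, 1.1048),
        ("EUR/USD", "2024-01-01 00:00:50", 1.1038, 1.1040),
        ("EUR/USD", "2024-01-01 00:01:10", 1.1041, 1.1043),
        ("EUR/USD", "2024-01-01 00:01:40", 1.1044, 1.1046),
    ],
)

# Bucket ticks by minute, then read open/high/low/close over each whole bucket.
# Every row in a bucket yields the same values, so DISTINCT leaves one candle.
CANDLES_SQL = """
WITH bucketed AS (
    SELECT symbol, timestamp, bid,
           strftime('%Y-%m-%d %H:%M', timestamp) AS minute
    FROM ticks
)
SELECT DISTINCT
    symbol,
    minute,
    FIRST_VALUE(bid) OVER w AS open,
    MAX(bid)         OVER w AS high,
    MIN(bid)         OVER w AS low,
    LAST_VALUE(bid)  OVER w AS close
FROM bucketed
WINDOW w AS (
    PARTITION BY symbol, minute
    ORDER BY timestamp
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY symbol, minute
"""

candles = conn.execute(CANDLES_SQL).fetchall()
for candle in candles:
    print(candle)
```

The first minute opens at the earliest bid and closes at the latest one, regardless of which bid was highest or lowest in between.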
### **Week 3-4: Session Analysis & Basic Signals**
**Goal:** Identify trading opportunities
**Key Skills:**
1. **Session volatility**
```sql
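-- London vs. NY session comparison (hours assumed to be UTC)
-- CASE returns the first matching branch, so the 13:00-15:59 overlap gets
-- its own bucket; otherwise it would be counted as London only.
SELECT
symbol,
CASE
WHEN EXTRACT(HOUR FROM timestamp) BETWEEN 13 AND 15 THEN 'London/NY overlap'
WHEN EXTRACT(HOUR FROM timestamp) BETWEEN 7 AND 12 THEN 'London'
WHEN EXTRACT(HOUR FROM timestamp) BETWEEN 16 AND 21 THEN 'NY'
ELSE 'Other'
END AS session,
STDDEV((bid+ask)/2) AS volatility -- dispersion of the mid-price
FROM ticks
GROUP BY symbol, session -- per symbol, so price levels of different pairs are not mixed
ORDER BY symbol, session;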
```
2. **Rolling spreads**
```sql
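-- 30-minute moving spread
-- ROWS BETWEEN 29 PRECEDING would average the last 30 ticks, not 30 minutes;
-- a RANGE frame with an interval offset (PostgreSQL 11+) follows the clock.
SELECT
timestamp,
AVG(ask - bid) OVER (
ORDER BY timestamp
RANGE BETWEEN INTERVAL '30 minutes' PRECEDING AND CURRENT ROW
) AS rolling_spread
FROM ticks
WHERE symbol = 'GBP/USD'
ORDER BY timestamp;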
```
3. **Basic alerts**
```sql
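-- Spread widening alert: ticks from the last day whose spread exceeds
-- 3x that symbol's own 1-day average (a global average would mix pairs).
WITH recent AS (
SELECT symbol, timestamp, (ask - bid) AS spread
FROM ticks
WHERE timestamp > NOW() - INTERVAL '1 day'
),
baseline AS (
SELECT symbol, AVG(spread) AS avg_spread
FROM recent
GROUP BY symbol
)
SELECT r.symbol, r.timestamp, r.spread
FROM recent r
JOIN baseline b ON b.symbol = r.symbol
WHERE r.spread > 3 * b.avg_spread;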
```
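A window counted in rows and a window counted in time give different answers when ticks arrive irregularly. The sketch below illustrates this under stated assumptions: Python's built-in `sqlite3` with an in-memory table stands in for the real tick store, `epoch_s` is a hypothetical integer column holding the tick time in seconds, and SQLite's numeric `RANGE` offset (SQLite 3.28 or newer) is used where PostgreSQL would take an interval.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; epoch_s is the tick time in seconds.
conn.execute("CREATE TABLE ticks (symbol TEXT, epoch_s INTEGER, bid REAL, ask REAL)")
conn.executemany(
    "INSERT INTO ticks VALUES ('GBP/USD', ?, ?, ?)",
    [
        (0,    1.2700, 1.2702),
        (600,  1.2700, 1.2704),
        (1200, 1.2700, 1.2706),
        (7200, 1.2700, 1.2710),  # arrives after a quiet 100 minutes
    ],
)

ROLLING_SQL = """
SELECT
    epoch_s,
    AVG(ask - bid) OVER (
        ORDER BY epoch_s
        ROWS BETWEEN 29 PRECEDING AND CURRENT ROW      -- last 30 ticks
    ) AS last_30_ticks,
    AVG(ask - bid) OVER (
        ORDER BY epoch_s
        RANGE BETWEEN 1800 PRECEDING AND CURRENT ROW   -- last 1800 seconds
    ) AS last_30_minutes
FROM ticks
WHERE symbol = 'GBP/USD'
ORDER BY epoch_s
"""

rolling = conn.execute(ROLLING_SQL).fetchall()
for epoch_s, by_ticks, by_time in rolling:
    print(epoch_s, round(by_ticks, 5), round(by_time, 5))
```

On the final tick the row-based average still includes spreads from two hours earlier, while the time-based average covers only that tick.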
### **Week 5-6: Advanced Pattern Detection**
**Goal:** Build automated signal detectors
**Key Skills:**
1. **Microprice calculation**
```sql
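-- Weighted mid-price (microprice): each price is weighted by the size
-- quoted on the opposite side of the book.
SELECT
timestamp,
(bid*ask_size + ask*bid_size) / NULLIF(bid_size + ask_size, 0) AS microprice -- NULL if both sizes are 0
FROM ticks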
```
2. **Order flow imbalance**
```sql
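-- Bid/ask size imbalance in [-1, 1]; keep only strongly one-sided ticks.
-- ::numeric avoids integer division; NULLIF avoids division by zero.
SELECT timestamp, imbalance
FROM (
SELECT
timestamp,
(bid_size - ask_size)::numeric / NULLIF(bid_size + ask_size, 0) AS imbalance
FROM ticks
) sized
WHERE ABS(imbalance) > 0.7;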
```
3. **Consecutive moves**
```sql
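-- 5+ consecutive bid increases, per symbol (gaps-and-islands).
-- Filtering is_up = 1 with LIMIT 5 only returns the first five up-ticks, not a streak.
WITH changes AS (
SELECT symbol, timestamp, bid,
CASE WHEN bid > LAG(bid) OVER w THEN 1 ELSE 0 END AS is_up
FROM ticks
WINDOW w AS (PARTITION BY symbol ORDER BY timestamp)
),
runs AS (
SELECT *,
-- The running count of non-up ticks stays constant within one streak of up-ticks
SUM(1 - is_up) OVER (PARTITION BY symbol ORDER BY timestamp) AS run_id
FROM changes
)
SELECT symbol, MIN(timestamp) AS run_start, MAX(timestamp) AS run_end, COUNT(*) AS up_ticks
FROM runs
WHERE is_up = 1
GROUP BY symbol, run_id
HAVING COUNT(*) >= 5
ORDER BY symbol, run_start;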
```
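Detecting a streak of consecutive increases is a gaps-and-islands problem: every tick that is not an increase starts a new group, and the increases inside each group are counted. The sketch below works through that on ten sample bids. It is an illustration under assumptions: Python's built-in `sqlite3` with an in-memory table stands in for the real tick store, and the three-column `ticks` layout with integer timestamps is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; timestamp is just a tick counter here.
conn.execute("CREATE TABLE ticks (symbol TEXT, timestamp INTEGER, bid REAL)")

# Five increases in a row (ticks 1-5), a drop, two increases, another drop.
bids = [1.00, 1.01, 1.02, 1.03, 1.04, 1.05, 1.04, 1.05, 1.06, 1.05]
conn.executemany(
    "INSERT INTO ticks VALUES ('EUR/USD', ?, ?)",
    list(enumerate(bids)),
)

STREAKS_SQL = """
WITH changes AS (
    SELECT symbol, timestamp, bid,
           CASE WHEN bid > LAG(bid) OVER (PARTITION BY symbol ORDER BY timestamp)
                THEN 1 ELSE 0 END AS is_up
    FROM ticks
),
runs AS (
    -- The running count of non-up ticks is constant inside one streak,
    -- so it serves as a streak identifier.
    SELECT *,
           SUM(1 - is_up) OVER (PARTITION BY symbol ORDER BY timestamp) AS run_id
    FROM changes
)
SELECT symbol,
       MIN(timestamp) AS run_start,
       MAX(timestamp) AS run_end,
       COUNT(*)       AS up_ticks
FROM runs
WHERE is_up = 1
GROUP BY symbol, run_id
HAVING COUNT(*) >= 5
ORDER BY symbol, run_start
"""

streaks = conn.execute(STREAKS_SQL).fetchall()
print(streaks)
```

Only the first streak qualifies; the two-tick rise later in the data is filtered out by the `HAVING` clause.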
### **Week 7-8: Optimization & Productionization**
**Goal:** Make scripts robust and efficient
**Key Skills:**
1. **Indexing for time-series**
```sql
CREATE INDEX idx_symbol_time ON ticks(symbol, timestamp);
```
2. **CTEs for complex logic**
```sql
WITH
london_ticks AS (
SELECT * FROM ticks
WHERE EXTRACT(HOUR FROM timestamp) BETWEEN 7 AND 15
),
spreads AS (
SELECT symbol, AVG(ask - bid) AS avg_spread
FROM london_ticks
GROUP BY symbol
)
SELECT * FROM spreads WHERE avg_spread > 0.0005;
```
3. **Partitioning large tables**
```sql
CREATE TABLE ticks_partitioned (
-- schema
) PARTITION BY RANGE (timestamp);
```
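An index only helps if the planner actually uses it, so it is worth inspecting the query plan before and after creating one. The sketch below shows the idea under assumptions: it uses Python's built-in `sqlite3` with an in-memory table and SQLite's `EXPLAIN QUERY PLAN`, whereas a PostgreSQL setup would use `EXPLAIN` instead, and the four-column `ticks` layout is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real ticks table may differ.
conn.execute("CREATE TABLE ticks (symbol TEXT, timestamp TEXT, bid REAL, ask REAL)")

QUERY = (
    "SELECT bid, ask FROM ticks "
    "WHERE symbol = ? AND timestamp >= ? AND timestamp < ?"
)
PARAMS = ("EUR/USD", "2024-01-01 00:00", "2024-01-02 00:00")


def query_plan(connection):
    """Return the planner's description of QUERY as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, PARAMS).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn)
conn.execute("CREATE INDEX idx_symbol_time ON ticks(symbol, timestamp)")
plan_after = query_plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

Putting `symbol` first and `timestamp` second matches queries that fix one pair and scan a time range, which is the access pattern used throughout this roadmap.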
### **Daily Practice Structure**
1. **Morning (5 min):** Run basic monitoring query
```sql
-- Current spread status
SELECT symbol, AVG(ask - bid) AS spread
FROM ticks
WHERE timestamp > NOW() - INTERVAL '15 minutes'
GROUP BY symbol;
```
2. **Evening (15 min):** Build one new analysis query
- Monday: Session comparisons
- Tuesday: Rolling metrics
- Wednesday: Alert conditions
- Thursday: Optimization tweaks
- Friday: Backtest old queries
### **Key Mindset Shifts**
1. **Think in ticks, not hours:** Tick data arrives at millisecond resolution, so queries should filter on indexed time ranges and avoid full-table scans
2. **Pre-compute everything:** Generate candlesticks/aggregates in SQL, not Python
3. **Log everything:** Every cron job should write results to a logging table
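
The "log everything" habit can be sketched as a small job function that runs one monitoring query and appends its results to a logging table. Everything specific here is an assumption for illustration: the `job_log` table and its columns, the `run_spread_job` function name, the job label, and the use of Python's built-in `sqlite3` with an in-memory database in place of the real server.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def run_spread_job(conn, now=None):
    """Compute the 15-minute average spread per symbol and log each result."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(minutes=15)).strftime(TS_FORMAT)

    # Hypothetical logging table: one row per symbol per run.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_log ("
        "run_at TEXT, job TEXT, symbol TEXT, value REAL)"
    )
    results = conn.execute(
        "SELECT symbol, AVG(ask - bid) FROM ticks "
        "WHERE timestamp > ? GROUP BY symbol",
        (cutoff,),
    ).fetchall()
    conn.executemany(
        "INSERT INTO job_log VALUES (?, 'avg_spread_15m', ?, ?)",
        [(now.strftime(TS_FORMAT), symbol, value) for symbol, value in results],
    )
    conn.commit()
    return len(results)


# Demo on an in-memory database with a fixed clock.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticks (symbol TEXT, timestamp TEXT, bid REAL, ask REAL)")
conn.executemany(
    "INSERT INTO ticks VALUES (?, ?, ?, ?)",
    [
        ("EUR/USD", "2024-01-01 10:00:00", 1.1040, 1.1090),  # outside the window
        ("EUR/USD", "2024-01-01 11:50:00", 1.1040, 1.1042),
        ("EUR/USD", "2024-01-01 11:58:00", 1.1040, 1.1044),
    ],
)
logged = run_spread_job(conn, now=datetime(2024, 1, 1, 12, 0, 0))
log_rows = conn.execute("SELECT run_at, job, symbol, value FROM job_log").fetchall()
print(logged, log_rows)
```

Passing `now` explicitly keeps the job testable; a cron entry would simply call the function without it.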