diff --git a/tech_docs/database/sql_getting_started.md b/tech_docs/database/sql_getting_started.md
index e351d90..757ffbb 100644
--- a/tech_docs/database/sql_getting_started.md
+++ b/tech_docs/database/sql_getting_started.md
@@ -1,3 +1,115 @@
+Here’s the **20% of SQL skills that will deliver 80% of your forex data analysis needs**, structured as a focused roadmap:
+
+---
+
+### **SQL for Forex Data: The 20% Priority Roadmap**
+#### **1. Core Skills (Weeks 1-2)**
+| Skill | Why It Matters | Key Syntax |
+|-------|---------------|------------|
+| **Filtering Data** | Isolate specific currency pairs/timeframes | `SELECT * FROM ticks WHERE symbol='EUR/USD' AND timestamp > '2023-01-01'` |
+| **Time Bucketing** | Convert ticks to candles (1min/5min/1H) | `DATE_TRUNC('hour', timestamp) AS hour` |
+| **Basic Aggregates** | Calculate spreads, averages, highs/lows | `AVG(ask-bid) AS avg_spread`, `MAX(ask) AS high` |
+| **Grouping** | Summarize by pair/time period | `GROUP BY symbol, DATE_TRUNC('day', timestamp)` |
+
+#### **2. Essential Techniques (Weeks 3-4)**
+| Skill | Forex Application | Example |
+|-------|-------------------|---------|
+| **Joins** | Combine tick data with economic calendars | `JOIN economic_events AS events ON CAST(ticks.timestamp AS DATE) = events.date` |
+| **Rolling Windows** | Calculate moving averages/volatility | `AVG(price) OVER (ORDER BY timestamp ROWS 30 PRECEDING)` |
+| **Correlations** | Compare pairs (EUR/USD vs. USD/JPY) | `CORR(eurusd_mid, usdjpy_mid)` |
+| **Session Analysis** | Compare London/NY/Asia volatility | `WHERE EXTRACT(HOUR FROM timestamp) IN (7,13,21)` |
+
+#### **3. Optimization (Week 5)**
+| Skill | Impact | Implementation |
+|-------|--------|----------------|
+| **Indexing** | Speed up time/symbol queries | `CREATE INDEX idx_symbol_time ON ticks(symbol, timestamp)` |
+| **CTEs** | Break complex queries into steps | `WITH filtered AS (...) SELECT * FROM filtered` |
+| **Partitioning** | Faster queries on large datasets | `PARTITION BY RANGE (timestamp)` (PostgreSQL table syntax) |
+
+---
+
+### **Prioritized Cheat Sheet**
+#### **4 Queries You’ll Use Daily**
+1. **Average Spread (Last Hour)**:
+   ```sql
+   SELECT symbol, AVG(ask-bid) AS spread
+   FROM ticks
+   WHERE timestamp > NOW() - INTERVAL '1 hour'
+   GROUP BY symbol;
+   ```
+
+2. **5-Min Candles**:
+   ```sql
+   -- time_bucket (DuckDB, TimescaleDB) is used because DATE_TRUNC has no '5 minutes' unit
+   SELECT
+     time_bucket(INTERVAL '5 minutes', timestamp) AS bucket,
+     MIN(bid) AS low,   -- bid-side candle; use (bid + ask) / 2 for mid prices
+     MAX(bid) AS high
+   FROM ticks
+   WHERE symbol = 'GBP/USD'
+   GROUP BY bucket;
+   ```
+
+3. **Rolling Volatility**:
+   ```sql
+   SELECT
+     timestamp,
+     STDDEV(ask) OVER (ORDER BY timestamp ROWS 100 PRECEDING) AS vol
+   FROM ticks
+   WHERE symbol = 'EUR/USD';
+   ```
+
+4. **Session Volume**:
+   ```sql
+   SELECT
+     CASE
+       WHEN EXTRACT(HOUR FROM timestamp) BETWEEN 7 AND 15 THEN 'London'
+       ELSE 'Other'
+     END AS session,
+     SUM(volume) AS total_volume
+   FROM ticks
+   GROUP BY session;
+   ```
+
+---
+
+### **Study Plan**
+1. **Weeks 1-2**: Master `SELECT`, `WHERE`, `GROUP BY`, `DATE_TRUNC`
+   → *Goal: Generate hourly high/low/close for 1 pair*
+
+2. **Weeks 3-4**: Learn `JOIN`, `AVG() OVER()`, `CORR()`
+   → *Goal: Compare 2 pairs’ correlation last week vs. last month*
+
+3. **Week 5**: Optimize with indexes + CTEs
+   → *Goal: Make a 1M-row query run in <1 sec*
+
+---
+
+### **Tools to Use Now**
+- **Data**: Free forex ticks from [Dukascopy](https://www.dukascopy.com/)
+- **Database**: DuckDB (simple, fast, no setup)
+- **Visualization**: Metabase (free) or Python’s Matplotlib
+
+---
+
+### **Avoid Until Later**
+- Stored procedures
+- Advanced indexing strategies
+- Machine learning in SQL
+- Recursive queries
+
+---
+
+**Key Principle**: Focus on **time-based analysis** (the core of forex data) first. Everything else builds on this.
+
+Want the absolute bare minimum? Learn these 4 things:
+1. `WHERE` + `DATE_TRUNC` (filter and bucket time)
+2. `GROUP BY` (summarize data)
+3. `AVG() OVER()` (rolling calculations)
+4. `CORR()` (pair relationships)
+
+---
+
 Here's a structured **Technical Guide & Roadmap for Forex Tick Data Analysis with SQL**, designed as a progressive learning path with clear milestones and reference examples:
 
 ---