Update tech_docs/database/sql_getting_started.md
@@ -1,3 +1,116 @@
Here’s the **20% of SQL skills that will deliver 80% of your forex data analysis needs**, structured as a focused roadmap:

---

### **SQL for Forex Data: The 20% Priority Roadmap**

#### **1. Core Skills (Weeks 1-2)**
| Skill | Why It Matters | Key Syntax |
|-------|---------------|------------|
| **Filtering Data** | Isolate specific currency pairs/timeframes | `SELECT * FROM ticks WHERE symbol = 'EUR/USD' AND timestamp > '2023-01-01'` |
| **Time Bucketing** | Convert ticks to candles (1min/5min/1H) | `DATE_TRUNC('hour', timestamp) AS hour` |
| **Basic Aggregates** | Calculate spreads, averages, highs/lows | `AVG(ask - bid) AS avg_spread`, `MAX(ask) AS high` |
| **Grouping** | Summarize by pair/time period | `GROUP BY symbol, DATE_TRUNC('day', timestamp)` |

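
The four skills above compose naturally into a single query. The sketch below assumes a hypothetical `ticks` table with `symbol`, `timestamp`, `bid`, `ask`, and `volume` columns; the column names follow the snippets in this guide, while the types and sample rows are made up for illustration. It is written with DuckDB or PostgreSQL in mind:

```sql
-- Assumed schema: adjust names and types to match your own tick feed.
CREATE TABLE ticks (
    symbol    VARCHAR,
    timestamp TIMESTAMP,
    bid       DECIMAL(12, 5),
    ask       DECIMAL(12, 5),
    volume    DECIMAL(12, 2)
);

-- A few made-up rows so the query returns something.
INSERT INTO ticks VALUES
    ('EUR/USD', '2023-01-02 09:00:01', 1.0701, 1.0703, 1.5),
    ('EUR/USD', '2023-01-02 09:30:15', 1.0705, 1.0708, 2.0),
    ('EUR/USD', '2023-01-02 10:05:42', 1.0698, 1.0700, 1.0),
    ('GBP/USD', '2023-01-02 09:10:00', 1.2050, 1.2053, 0.8);

-- Filter, bucket by hour, aggregate, and group in one pass.
SELECT
    symbol,
    DATE_TRUNC('hour', timestamp) AS hour_start,
    AVG(ask - bid)                AS avg_spread,
    MAX(ask)                      AS high,
    MIN(bid)                      AS low
FROM ticks
WHERE symbol = 'EUR/USD'
  AND timestamp > '2023-01-01'
GROUP BY symbol, DATE_TRUNC('hour', timestamp)
ORDER BY hour_start;
```

Repeating the `DATE_TRUNC` expression in `GROUP BY` rather than relying on the alias keeps the query portable across engines that do not allow aliases there.
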
#### **2. Essential Techniques (Weeks 3-4)**
| Skill | Forex Application | Example |
|-------|-------------------|---------|
| **Joins** | Combine tick data with economic calendars | `JOIN economic_events AS events ON CAST(ticks.timestamp AS DATE) = events.date` |
| **Rolling Windows** | Calculate moving averages/volatility | `AVG(price) OVER (ORDER BY timestamp ROWS 30 PRECEDING)` |
| **Correlations** | Compare pairs (EUR/USD vs. USD/JPY) | `CORR(eurusd_mid, usdjpy_mid)` |
| **Session Analysis** | Compare London/NY/Asia volatility | `WHERE EXTRACT(HOUR FROM timestamp) IN (7,13,21)` |

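
`CORR()` compares two columns, but ticks for different pairs arrive as separate rows at different times, so the two series need to be aligned on a shared time bucket first. One possible sketch, assuming the `ticks` columns used elsewhere in this guide and taking the mid price as `(bid + ask) / 2` (an assumption; the guide does not define how `eurusd_mid` and `usdjpy_mid` are derived):

```sql
-- Step 1: one row per minute, with each pair's average mid price in its own column.
WITH per_minute AS (
    SELECT
        DATE_TRUNC('minute', timestamp) AS minute_start,
        AVG(CASE WHEN symbol = 'EUR/USD' THEN (bid + ask) / 2 END) AS eurusd_mid,
        AVG(CASE WHEN symbol = 'USD/JPY' THEN (bid + ask) / 2 END) AS usdjpy_mid
    FROM ticks
    WHERE symbol IN ('EUR/USD', 'USD/JPY')
    GROUP BY DATE_TRUNC('minute', timestamp)
)
-- Step 2: correlate the two aligned series,
-- keeping only minutes in which both pairs traded.
SELECT CORR(eurusd_mid, usdjpy_mid) AS mid_correlation
FROM per_minute
WHERE eurusd_mid IS NOT NULL
  AND usdjpy_mid IS NOT NULL;
```

The one-minute bucket is an illustrative choice; a wider bucket leaves fewer gaps between the two series at the cost of fewer data points.
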
#### **3. Optimization (Week 5)**
| Skill | Impact | Implementation |
|-------|--------|----------------|
| **Indexing** | Speed up time/symbol queries | `CREATE INDEX idx_symbol_time ON ticks(symbol, timestamp)` |
| **CTEs** | Break complex queries into steps | `WITH filtered AS (...) SELECT * FROM filtered` |
| **Partitioning** | Faster queries on large datasets | `PARTITION BY RANGE (timestamp)` (a PostgreSQL `CREATE TABLE` clause) |

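
A sketch of how the index and a CTE might be used together, assuming the `ticks` table used throughout this guide. `EXPLAIN` is available in both DuckDB and PostgreSQL, though the plan output differs between engines, and whether the index is actually used depends on the planner and the data size. The date range and the `daily_range` column are illustrative:

```sql
-- Composite index: symbol first (equality filter), then timestamp (range filter).
CREATE INDEX idx_symbol_time ON ticks (symbol, timestamp);

-- CTE: narrow the data first, then aggregate the smaller intermediate result.
WITH filtered AS (
    SELECT timestamp, bid, ask
    FROM ticks
    WHERE symbol = 'EUR/USD'
      AND timestamp >= '2023-01-01'
      AND timestamp <  '2023-02-01'
)
SELECT
    DATE_TRUNC('day', timestamp) AS trade_day,
    MAX(ask) - MIN(bid)          AS daily_range
FROM filtered
GROUP BY DATE_TRUNC('day', timestamp)
ORDER BY trade_day;

-- Prefix a query with EXPLAIN to inspect the plan the engine chose.
EXPLAIN
SELECT COUNT(*)
FROM ticks
WHERE symbol = 'EUR/USD'
  AND timestamp >= '2023-01-01';
```
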

---

### **Prioritized Cheat Sheet**
#### **Queries You’ll Use Daily**
1. **Current Spread** (average over the last hour):
```sql
-- Average bid/ask spread per symbol over the last hour
SELECT symbol, AVG(ask - bid) AS spread
FROM ticks
WHERE timestamp > NOW() - INTERVAL '1 hour'
GROUP BY symbol;
```

2. **5-Min Candles**:
```sql
-- DATE_TRUNC only accepts whole units ('minute', 'hour', ...), not '5 minutes'.
-- DuckDB's time_bucket handles arbitrary widths (PostgreSQL 14+ offers date_bin).
SELECT
    time_bucket(INTERVAL '5 minutes', timestamp) AS time,
    MIN(bid) AS low,
    MAX(ask) AS high
FROM ticks
WHERE symbol = 'GBP/USD'
GROUP BY time
ORDER BY time;
```

3. **Rolling Volatility**:
```sql
-- Standard deviation of the ask over the current tick and the 100 before it
SELECT
    timestamp,
    STDDEV(ask) OVER (ORDER BY timestamp ROWS 100 PRECEDING) AS vol
FROM ticks
WHERE symbol = 'EUR/USD';
```

4. **Session Volume**:
```sql
SELECT
    CASE
        WHEN EXTRACT(HOUR FROM timestamp) BETWEEN 7 AND 15 THEN 'London'
        ELSE 'Other'
    END AS session,
    SUM(volume) AS total_volume
FROM ticks
GROUP BY session;
```

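
Volatility is often measured on returns rather than on raw prices, because the standard deviation of a price level also picks up any trend in the window. A possible variant of the rolling volatility query, assuming the same `ticks` columns and using tick-to-tick log returns of the mid price (the mid price, the CTE name `tick_returns`, and the 100-row window are illustrative choices, not something the guide prescribes):

```sql
-- Window functions cannot be nested, so compute the returns in a CTE first.
WITH tick_returns AS (
    SELECT
        timestamp,
        LN((bid + ask) / 2)
          - LN(LAG((bid + ask) / 2) OVER (ORDER BY timestamp)) AS log_return
    FROM ticks
    WHERE symbol = 'EUR/USD'
)
SELECT
    timestamp,
    -- ROWS 100 PRECEDING spans the current row plus the 100 rows before it.
    STDDEV(log_return) OVER (ORDER BY timestamp ROWS 100 PRECEDING) AS return_vol
FROM tick_returns
WHERE log_return IS NOT NULL   -- the first tick has no previous price
ORDER BY timestamp;
```
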

---

### **Study Plan**
1. **Weeks 1-2**: Master `SELECT`, `WHERE`, `GROUP BY`, `DATE_TRUNC`
   → *Goal: Generate hourly high/low/close for 1 pair*

2. **Weeks 3-4**: Learn `JOIN`, `AVG() OVER()`, `CORR()`
   → *Goal: Compare 2 pairs’ correlation last week vs. last month*

3. **Week 5**: Optimize with indexes + CTEs
   → *Goal: Make a 1M-row query run in <1 sec*

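
The first goal in the plan needs a close, which `MAX` and `MIN` alone cannot provide: the close is the price of the last tick in each bucket, not an extreme value. One portable way to sketch it, assuming the `ticks` columns used in this guide and taking the bid as the candle price (an assumption; the mid price would work the same way):

```sql
-- Rank ticks inside each hour from newest to oldest, so rn = 1 marks the close.
-- If two ticks share the same timestamp, which one gets rn = 1 is arbitrary.
WITH ranked AS (
    SELECT
        DATE_TRUNC('hour', timestamp) AS hour_start,
        bid,
        ROW_NUMBER() OVER (
            PARTITION BY DATE_TRUNC('hour', timestamp)
            ORDER BY timestamp DESC
        ) AS rn
    FROM ticks
    WHERE symbol = 'EUR/USD'
)
SELECT
    hour_start,
    MAX(bid)                           AS high_bid,
    MIN(bid)                           AS low_bid,
    MAX(CASE WHEN rn = 1 THEN bid END) AS close_bid
FROM ranked
GROUP BY hour_start
ORDER BY hour_start;
```
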

---

### **Tools to Use Now**
- **Data**: Free forex ticks from [Dukascopy](https://www.dukascopy.com/)
- **Database**: DuckDB (simple, fast, no setup)
- **Visualization**: Metabase (free) or Python’s Matplotlib


---

### **Avoid Until Later**
- Stored procedures
- Advanced indexing strategies
- Machine learning in SQL
- Recursive queries


---

**Key Principle**: Focus on **time-based analysis** (the core of forex data) first. Everything else builds on this.

Want the absolute bare minimum? Learn these 4 things:
1. `WHERE` + `DATE_TRUNC` (filter and bucket time)
2. `GROUP BY` (summarize data)
3. `AVG() OVER()` (rolling calculations)
4. `CORR()` (pair relationships)


---

Here's a structured **Technical Guide & Roadmap for Forex Tick Data Analysis with SQL**, designed as a progressive learning path with clear milestones and reference examples:

---