Update tech_docs/python/Python_programming.md

2024-06-27 06:36:18 +00:00
parent f08e663741
commit 9fe4341754


@@ -84,6 +84,526 @@ When working with data structures in Python, certain concepts become second natu
By committing these core concepts to memory, you'll be well-equipped to handle a wide range of programming challenges efficiently.
---
### Stage 1 Concepts to Master
Building on the core concepts from Stage 0, Stage 1 dives deeper into more advanced techniques, optimizations, and specialized data structures. These concepts will help you write more efficient, robust, and scalable code.
---
#### 1. **Advanced List Concepts**
- **List Comprehensions**: More complex uses, including nested comprehensions and conditional logic.
```python
# Nested comprehension for a 2D list
matrix = [[i * j for j in range(5)] for i in range(5)]
# Conditional comprehension
even_squares = [x*x for x in range(10) if x % 2 == 0]
```
- **List Slicing and Extended Slices**: More advanced slicing techniques, including steps and reversing lists.
```python
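# Sample data for illustration
my_list = list(range(10))
# Slicing with step: every second element
evens = my_list[::2]
# Reversing a list (returns a new list)
reversed_list = my_list[::-1]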
```
- **List Methods and Built-in Functions**: Efficiently using methods like `sort()`, `reverse()`, and functions like `map()`, `filter()`, `reduce()`.
```python
from functools import reduce
# Sample data for illustration
my_list = [1, 2, 3, 4, 5]
# Using map to apply a function to all elements
doubled = list(map(lambda x: x * 2, my_list))
# Using filter to keep only matching elements
even_numbers = list(filter(lambda x: x % 2 == 0, my_list))
# Using reduce to apply a rolling computation
product = reduce(lambda x, y: x * y, my_list)
```
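The bullet above names `sort()` and `reverse()` without showing them. A minimal sketch, using a hypothetical `words` list, of how the in-place methods differ from the built-in `sorted()`:

```python
# Hypothetical sample data for illustration
words = ["pear", "fig", "banana", "kiwi"]

# sorted() returns a new list and leaves the original untouched;
# key=len sorts by length, and ties keep their original order
by_length = sorted(words, key=len)

# list.sort() sorts in place and returns None
words.sort()

# list.reverse() reverses in place
words.reverse()

print(by_length)
print(words)
```

Prefer `sorted()` when the original order still matters elsewhere, and `sort()` when the list can be modified directly.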
---
#### 2. **Advanced Dictionary Concepts**
- **Defaultdict and Counter from collections**: Handling dictionaries with default values and counting elements efficiently.
```python
from collections import defaultdict, Counter
# defaultdict with a default value
default_dict = defaultdict(int)
default_dict['missing'] += 1 # {'missing': 1}
# Counter for counting elements
counter = Counter(['apple', 'banana', 'apple', 'orange'])
# Counter({'apple': 2, 'banana': 1, 'orange': 1})
```
- **Dictionary Comprehensions**: More complex comprehensions with conditional logic and transformations.
```python
# Conditional dictionary comprehension
squared_dict = {x: x*x for x in range(10) if x % 2 == 0}
```
- **Merging Dictionaries**: Techniques for merging multiple dictionaries.
```python
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 3, 'c': 4}
# Using the update() method
dict1.update(dict2) # {'a': 1, 'b': 3, 'c': 4}
# Using dictionary unpacking (Python 3.5+); later keys win
merged_dict = {**dict1, **dict2}
# Using the merge operator (Python 3.9+)
merged_dict = dict1 | dict2
```
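A common use of `defaultdict` and `Counter` together is grouping and counting. The following is a small sketch with a hypothetical `words` list, not a definitive pattern:

```python
from collections import Counter, defaultdict

# Hypothetical sample data for illustration
words = ["apple", "avocado", "banana", "blueberry", "cherry", "apple"]

# defaultdict(list) creates an empty list the first time a key is seen,
# so no explicit "if key not in dict" check is needed
groups = defaultdict(list)
for word in words:
    groups[word[0]].append(word)

# Counter tallies occurrences; most_common(1) returns the top entry
counts = Counter(words)
top = counts.most_common(1)

print(dict(groups))
print(top)
```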
---
#### 3. **Advanced Set Concepts**
- **Set Operations**: Advanced use of union, intersection, difference, and symmetric difference.
```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}
# Union
union_set = set1 | set2 # {1, 2, 3, 4, 5}
# Intersection
intersection_set = set1 & set2 # {3}
# Difference
difference_set = set1 - set2 # {1, 2}
# Symmetric Difference
sym_diff_set = set1 ^ set2 # {1, 2, 4, 5}
```
- **Frozensets**: Immutable sets for fixed collections of unique elements.
```python
frozenset1 = frozenset([1, 2, 3])
```
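Because a frozenset is immutable and hashable, it can serve as a dictionary key or as a member of another set, which a regular `set` cannot. A minimal sketch with hypothetical permission names:

```python
# Hypothetical permission sets used as dictionary keys
permissions = {
    frozenset(["read"]): "viewer",
    frozenset(["read", "write"]): "editor",
}

# Element order does not matter when looking up a frozenset key
allowed = frozenset(["write", "read"])
role = permissions[allowed]

# frozenset has no mutating methods such as add()
try:
    allowed.add("delete")
    can_mutate = True
except AttributeError:
    can_mutate = False

print(role, can_mutate)
```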
---
#### 4. **Advanced Tuple Concepts**
- **Named Tuples**: Using `collections.namedtuple` for more readable code when dealing with tuples.
```python
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)
```
- **Tuples as Keys in Dictionaries**: Using tuples as keys for multi-key access patterns.
```python
# (latitude, longitude); west longitudes are negative
coords_dict = {(40.7128, -74.0060): 'New York', (34.0522, -118.2437): 'Los Angeles'}
```
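To show what named tuples add over plain tuples, here is a short sketch. The `grid` dictionary and its labels are illustrative assumptions:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(10, 20)

# Fields are readable by name as well as by index
x_by_name = p.x
x_by_index = p[0]

# _replace returns a new instance; the original stays unchanged
moved = p._replace(x=15)

# _asdict converts the fields to a mapping
as_dict = dict(p._asdict())

# A named tuple compares and hashes like a plain tuple with the same values,
# so either form can be used for dictionary lookups
grid = {Point(0, 0): "origin"}
label = grid[(0, 0)]

print(moved, as_dict, label)
```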
---
#### 5. **Performance Optimization**
- **Time Complexity Analysis**: Understanding the time complexity of various operations on different data structures.
- **Lists**: Access O(1), Append amortized O(1), Insert/Delete O(n)
- **Dictionaries**: Access O(1) on average, Insert/Delete O(1) on average
- **Sets**: Membership Test O(1) on average, Insert/Delete O(1) on average
- **Tuples**: Access O(1), Fixed size, no insert/delete
- **Space Complexity**: Evaluating the memory usage of different data structures.
- Lists and Tuples are generally more memory efficient than Dictionaries and Sets, which carry hash-table overhead.
- **Efficient Iteration**: Using generator expressions and iterators to handle large datasets without excessive memory use.
```python
# Generator expression
gen_exp = (x*x for x in range(1000000))
```
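The memory difference between a list and a generator expression can be checked with `sys.getsizeof`. This is a rough sketch: `getsizeof` reports only the container object itself, and the size `n` is an arbitrary choice for illustration.

```python
import sys

n = 100_000  # arbitrary size for illustration

# The list stores a reference for every element up front
squares_list = [x * x for x in range(n)]

# The generator produces values one at a time, on demand
squares_gen = (x * x for x in range(n))

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

# Consuming the generator yields the same values as the list
total = sum(squares_gen)

# A generator can only be iterated once; it is now exhausted
leftover = list(squares_gen)

print(list_size, gen_size, total, leftover)
```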
---
### Summary of Stage 1 Concepts
1. **List Comprehensions and Slicing**: Advanced usage for cleaner, more efficient code.
2. **Defaultdict and Counter**: Leveraging `collections` for smarter dictionaries.
3. **Set Operations**: Utilizing advanced set operations for unique collections.
4. **Named Tuples**: More readable and structured tuples.
5. **Performance Analysis**: Understanding and optimizing time and space complexity.
6. **Efficient Iteration**: Using generators for large datasets.
By mastering these Stage 1 concepts, you'll be able to handle more complex data structures and write optimized, high-performance Python code.
---
### Stage 2: Advanced Data Structures and Techniques
At this stage, you'll delve into more specialized data structures and advanced techniques that are essential for handling complex problems efficiently. This includes understanding underlying implementations, mastering performance tuning, and using data structures in conjunction with algorithms.
---
#### 1. **Advanced List Techniques**
- **List vs. Deque**: When to use `collections.deque` for efficient appends and pops from both ends.
```python
from collections import deque
dq = deque([1, 2, 3])
dq.appendleft(0)
dq.append(4)
dq.pop()
dq.popleft()
```
- **List Optimization**: Techniques such as list pre-allocation for reducing reallocation overhead.
```python
size = 1000
preallocated_list = [None] * size
```
- **Memoryviews**: Efficiently handling large data buffers.
```python
import array
arr = array.array('h', [1, 2, 3, 4, 5])
mem_view = memoryview(arr)
```
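A `memoryview` lets you slice and modify a buffer without copying it. A minimal sketch using the same `array` type as above:

```python
import array

arr = array.array("h", [1, 2, 3, 4, 5])
view = memoryview(arr)

# Slicing a memoryview creates another view onto the same memory, not a copy
window = view[1:4]

# Writing through the view changes the underlying array
window[0] = 20

snapshot = window.tolist()

# Release the views once finished so the array can be resized again
window.release()
view.release()

print(arr.tolist(), snapshot)
```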
---
#### 2. **Advanced Dictionary Techniques**
- **OrderedDict**: Using `collections.OrderedDict` for maintaining insertion order in older Python versions (before 3.7, when plain `dict` did not guarantee insertion order).
```python
from collections import OrderedDict
ordered_dict = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
```
- **Dictionary Performance Tuning**: Minimizing collisions and optimizing hashing for custom objects.
```python
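class MyObject:
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        # Only compare against other MyObject instances
        if not isinstance(other, MyObject):
            return NotImplemented
        return self.value == other.value

my_dict = {MyObject(1): "one"}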
```
- **Multi-Level Dictionaries**: Efficiently managing nested dictionaries.
```python
multi_dict = {
    'level1': {
        'level2': {
            'key': 'value'
        }
    }
}
```
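Reading and writing deep keys in nested dictionaries usually needs some care around missing levels. The sketch below shows two possible approaches. The helper names `nested_get` and `tree` are hypothetical, not standard library functions.

```python
from collections import defaultdict


def nested_get(mapping, *keys, default=None):
    """Hypothetical helper: walk nested dicts, returning default if a key is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def tree():
    """Hypothetical helper: a dict that creates nested levels on demand."""
    return defaultdict(tree)


multi_dict = {"level1": {"level2": {"key": "value"}}}

found = nested_get(multi_dict, "level1", "level2", "key")
missing = nested_get(multi_dict, "level1", "nope", "key", default="n/a")

# Intermediate levels are created automatically on assignment
auto = tree()
auto["level1"]["level2"]["key"] = "value"

print(found, missing, auto["level1"]["level2"]["key"])
```

Note that the auto-creating variant also creates keys on read access, so it suits building structures more than querying them.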
---
#### 3. **Advanced Set Techniques**
- **Set Performance Tuning**: Handling large datasets with set operations and understanding underlying implementation.
```python
large_set = set(range(1000000))
```
- **Using Sets with Custom Objects**: Ensuring uniqueness based on custom attributes.
```python
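class MyObject:
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        # Only compare against other MyObject instances
        if not isinstance(other, MyObject):
            return NotImplemented
        return self.value == other.value

my_set = {MyObject(1), MyObject(2)}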
```
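To illustrate uniqueness based on a chosen attribute, here is a sketch with a hypothetical `User` class whose identity is its email address. Two objects with the same email collapse into one set member even if other attributes differ.

```python
class User:
    """Hypothetical class: equality and hashing are based on email only."""

    def __init__(self, email, name):
        self.email = email
        self.name = name

    def __hash__(self):
        return hash(self.email)

    def __eq__(self, other):
        if not isinstance(other, User):
            return NotImplemented
        return self.email == other.email


users = {
    User("a@example.com", "Ann"),
    User("a@example.com", "Annie"),  # same email, treated as a duplicate
    User("b@example.com", "Bob"),
}

emails = sorted(user.email for user in users)
print(len(users), emails)
```

The attribute used for hashing should not change while the object is in a set, otherwise lookups can fail.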
---
#### 4. **Advanced Tuple Techniques**
- **Named Tuples vs Data Classes**: When to use `collections.namedtuple` vs `dataclasses.dataclass`.
```python
from dataclasses import dataclass
@dataclass
class Point:
    x: int
    y: int
```
- **Tuple as Immutable Data Structure**: Leveraging immutability in multi-threaded environments for safe concurrency.
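The sketch below compares a named tuple with a frozen data class, one way to get immutability from `dataclasses`. The class names `PointNT` and `PointDC` are chosen here for illustration.

```python
from collections import namedtuple
from dataclasses import FrozenInstanceError, dataclass

PointNT = namedtuple("PointNT", ["x", "y"])


@dataclass(frozen=True)
class PointDC:
    x: int
    y: int


nt = PointNT(1, 2)
dc = PointDC(1, 2)

# A named tuple is still a tuple: it unpacks and equals a plain tuple
x, y = nt
nt_equals_tuple = nt == (1, 2)

# A data class is its own type and does not equal a tuple
dc_equals_tuple = dc == (1, 2)

# frozen=True blocks attribute assignment after creation
try:
    dc.x = 5
    frozen = False
except FrozenInstanceError:
    frozen = True

# Both are hashable, so both can be dictionary keys
lookup = {nt: "tuple", dc: "dataclass"}

print(nt_equals_tuple, dc_equals_tuple, frozen, len(lookup))
```

Immutability here is shallow: a tuple or frozen data class that holds a list still allows that list to be modified, which matters when sharing objects between threads.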
---
#### 5. **Understanding Internals**
- **Hashing**: Deep understanding of how Python's hash tables work, particularly in dictionaries and sets.
```python
def hash_example():
    obj = "example"
    return hash(obj)
```
- **Garbage Collection and Memory Management**: How Python manages memory, reference counting, and the garbage collector.
```python
import gc
gc.collect()
```
- **Efficient Iteration Patterns**: Using iterators and generators effectively.
```python
def efficient_iterator(n):
    for i in range(n):
        yield i*i
```
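A few observable rules follow from how hash tables work: objects that compare equal must have equal hashes, and mutable built-ins such as lists are not hashable. A short sketch of both:

```python
# Equal values hash equally, even across numeric types
equal_hashes = hash(1) == hash(1.0) == hash(True)

# Because 1 == 1.0 and their hashes match, they are the same dictionary key
lookup = {1: "int"}
lookup[1.0] = "float"  # overwrites the value stored under key 1

# Lists are mutable and therefore unhashable
try:
    hash([1, 2, 3])
    list_hashable = True
except TypeError:
    list_hashable = False

# Tuples of hashable items are hashable
tuple_hashable = isinstance(hash((1, 2, 3)), int)

print(equal_hashes, lookup, list_hashable, tuple_hashable)
```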
---
### Stage 2.5: Subject Matter Expert (SME) Level Concepts
At this level, you focus on mastering complex data structures, optimization strategies, and integration with algorithms. This includes designing custom data structures and optimizing performance at scale.
---
#### 1. **Custom Data Structures**
- **Implementing Custom Data Structures**: Designing and implementing data structures like linked lists, trees, and graphs.
```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    def append(self, value):
        new_node = Node(value)
        if not self.head:
            self.head = new_node
        else:
            # Walk to the last node, then link the new node after it
            current = self.head
            while current.next:
                current = current.next
            current.next = new_node
```
- **Balanced Trees**: Implementing and using AVL trees, Red-Black trees, and B-trees.
```python
# Simplified example of a node in an AVL tree
class AVLNode:
    def __init__(self, key, height=1, left=None, right=None):
        self.key = key
        self.height = height
        self.left = left
        self.right = right
```
- **Graphs**: Representing and manipulating graphs using adjacency lists and matrices.
```python
from collections import defaultdict, deque

class Graph:
    def __init__(self):
        self.graph = defaultdict(list)

    def add_edge(self, u, v):
        self.graph[u].append(v)

    def bfs(self, start):
        visited = set()
        # deque gives O(1) pops from the left; list.pop(0) is O(n)
        queue = deque([start])
        while queue:
            vertex = queue.popleft()
            if vertex not in visited:
                visited.add(vertex)
                queue.extend(n for n in self.graph[vertex] if n not in visited)
        return visited
```
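Trees are listed above but only an AVL node is shown. As a starting point, here is a minimal sketch of an unbalanced binary search tree with insertion and in-order traversal. The names `TreeNode`, `insert`, and `in_order` are illustrative, and no rebalancing is attempted.

```python
class TreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the subtree rooted at root and return the subtree root."""
    if root is None:
        return TreeNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    # Equal keys are ignored, so the tree holds unique keys
    return root


def in_order(root):
    """Yield keys in ascending order: left subtree, node, right subtree."""
    if root is not None:
        yield from in_order(root.left)
        yield root.key
        yield from in_order(root.right)


root = None
for key in [50, 30, 70, 20, 40, 60, 80, 30]:
    root = insert(root, key)

sorted_keys = list(in_order(root))
print(sorted_keys)
```

Without balancing, inserting already-sorted keys degrades this tree to a linked list, which is the problem AVL and Red-Black trees address.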
---
#### 2. **Advanced Algorithms**
- **Algorithm Integration**: Combining data structures with algorithms for problem-solving, like using heaps for priority queues.
```python
import heapq
heap = []
heapq.heappush(heap, (5, 'write code'))
heapq.heappush(heap, (1, 'write tests'))
heapq.heappush(heap, (3, 'review code'))
while heap:
    # Pops tuples in ascending priority order
    print(heapq.heappop(heap))
```
- **Dynamic Programming**: Using dictionaries and lists for memoization and tabulation techniques.
```python
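def fibonacci(n, memo=None):
    # Create the cache per top-level call; a mutable default would be shared across calls
    if memo is None:
        memo = {}
    if n in memo:
        return memo[n]
    if n <= 1:
        return n
    memo[n] = fibonacci(n - 1, memo) + fibonacci(n - 2, memo)
    return memo[n]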
```
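The bullet also mentions tabulation, the bottom-up counterpart to memoization. A minimal sketch of both styles, using a list as the table and `functools.lru_cache` as a ready-made memo. The function names `fib_tab` and `fib_cached` are chosen for illustration.

```python
from functools import lru_cache


def fib_tab(n):
    """Bottom-up tabulation: fill a list from the base cases upward."""
    if n <= 1:
        return n
    table = [0] * (n + 1)
    table[1] = 1
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]


@lru_cache(maxsize=None)
def fib_cached(n):
    """Top-down memoization handled by the standard library cache."""
    if n <= 1:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)


print(fib_tab(10), fib_cached(30))
```

Tabulation avoids recursion depth limits, while `lru_cache` keeps the recursive definition readable.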
---
#### 3. **Performance Engineering**
- **Profiling and Optimization**: Using tools like `cProfile`, `timeit`, and memory profilers to optimize code.
```python
import cProfile
def my_function():
    # Code to profile
    pass

cProfile.run('my_function()')
```
- **Concurrency and Parallelism**: Leveraging `multiprocessing`, `threading`, and `asyncio` for concurrent and parallel execution.
```python
import asyncio
async def async_function():
    await asyncio.sleep(1)
    return 'Completed'

asyncio.run(async_function())
```
- **Big Data and Scalability**: Techniques for handling and processing large datasets efficiently.
```python
# Example of using Dask for parallel computing on large datasets
import dask.dataframe as dd
df = dd.read_csv('large_file.csv')
result = df.groupby('column').sum().compute()
```
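`timeit` is mentioned above but not shown. A small sketch that times a membership test on a list against the same test on a set. The data size and repeat count are arbitrary assumptions, and absolute timings will vary by machine.

```python
import timeit

# Arbitrary sizes chosen for illustration
data_list = list(range(10_000))
data_set = set(data_list)
target = 9_999  # worst case for the list: the last element

# timeit.timeit runs the callable `number` times and returns total seconds
list_time = timeit.timeit(lambda: target in data_list, number=200)
set_time = timeit.timeit(lambda: target in data_set, number=200)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

The list test scans elements one by one, while the set test is a hash lookup, so the gap typically widens as the data grows.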
---
### Summary of Stage 2 and 2.5 Concepts
1. **Custom Data Structures**: Designing and implementing linked lists, trees, graphs.
2. **Advanced Algorithms**: Integrating data structures with algorithms, mastering dynamic programming.
3. **Performance Optimization**: Profiling, concurrency, parallelism, and scalability techniques.
4. **Deep Understanding of Internals**: Hashing, memory management, and efficient iteration.
5. **Specialized Libraries and Tools**: Utilizing advanced Python libraries for optimized performance.
By mastering these advanced concepts, you'll be equipped to tackle complex programming challenges, design efficient solutions, and optimize performance at scale, solidifying your expertise as a subject matter expert in Python data structures and algorithms.
---
### Key Core Concepts to Commit to Memory
When working with data structures in Python, certain concepts become second nature over time due to their fundamental importance and frequent use. Here's a roundup of these key concepts:
---
#### 1. **List (Array) Core Concepts**
- **Ordered Collection**: Lists maintain the order of elements, making them ideal for ordered data.
- **Indexing**: Elements in a list can be accessed via their index, starting from 0.
- **Mutability**: Lists are mutable, meaning elements can be changed, added, or removed.
- **Common Methods**:
- `append()`: Adds an element to the end.
- `insert()`: Adds an element at a specific position.
- `remove()`: Removes the first occurrence of an element.
- `pop()`: Removes and returns the element at a specific position.
- `sort()`: Sorts the list in place.
- **Slicing**: Lists support slicing, allowing you to create sublists.
```python
sublist = my_list[start:end]
```
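The list methods above can be traced step by step. A short sketch with a hypothetical `tasks` list, with comments showing the list after each call:

```python
tasks = ["write", "test"]

tasks.append("deploy")      # ['write', 'test', 'deploy']
tasks.insert(1, "review")   # ['write', 'review', 'test', 'deploy']
tasks.remove("test")        # ['write', 'review', 'deploy']

last = tasks.pop()          # removes and returns the last element
first = tasks.pop(0)        # removes and returns the element at index 0

# Slicing: start is inclusive, end is exclusive
letters = ["a", "b", "c", "d", "e"]
middle = letters[1:4]

print(tasks, last, first, middle)
```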
#### 2. **Dictionary Core Concepts**
- **Key-Value Pairs**: Dictionaries store data in key-value pairs, providing a mapping from keys to values.
- **Insertion-Ordered Collection**: Since Python 3.7, dictionaries preserve insertion order; earlier versions made no ordering guarantee.
- **Keys are Unique**: Each key in a dictionary must be unique.
- **Mutability**: Dictionaries are mutable; you can add, modify, or delete key-value pairs.
- **Common Methods**:
- `get()`: Returns the value for a key, with an optional default if the key is not found.
- `keys()`: Returns a view object of all keys.
- `values()`: Returns a view object of all values.
- `items()`: Returns a view object of all key-value pairs.
- `update()`: Updates the dictionary with key-value pairs from another dictionary or iterable.
- **Dictionary Comprehensions**: Compact way to create dictionaries.
```python
squares = {x: x*x for x in range(6)}
```
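A brief sketch of the dictionary methods listed above, using a hypothetical `stock` dictionary. It also shows that view objects reflect later changes to the dictionary.

```python
stock = {"apple": 3, "pear": 0}

# get() returns the default instead of raising KeyError
missing = stock.get("kiwi", 0)

# keys() returns a live view, not a snapshot
keys_view = stock.keys()

# update() adds new keys and overwrites existing ones
stock.update({"kiwi": 7, "pear": 2})

live_keys = list(keys_view)
total = sum(stock.values())

# items() pairs well with a comprehension for filtering
in_stock = {name: qty for name, qty in stock.items() if qty > 0}

print(missing, live_keys, total, in_stock)
```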
#### 3. **Set Core Concepts**
- **Unique Elements**: Sets store only unique elements.
- **Unordered Collection**: Sets do not maintain order.
- **Mutability**: Sets are mutable, allowing elements to be added or removed.
- **Common Operations**:
- `add()`: Adds an element to the set.
- `remove()`: Removes a specific element, raises KeyError if not found.
- `discard()`: Removes a specific element, does not raise an error if not found.
- `union()`: Returns a set containing all elements from the involved sets.
- `intersection()`: Returns a set containing only the common elements.
- `difference()`: Returns a set with elements in the first set but not in the second.
- **Set Comprehensions**: Similar to list comprehensions, used to create sets.
```python
unique_squares = {x*x for x in range(6)}
```
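The difference between `remove()` and `discard()` is easiest to see side by side. A minimal sketch with hypothetical values:

```python
tags = {"python", "sets"}

# Adding an existing element has no effect
tags.add("python")

# discard() silently ignores a missing element
tags.discard("missing")

# remove() raises KeyError for a missing element
try:
    tags.remove("missing")
    raised = False
except KeyError:
    raised = True

a = {1, 2, 3}
b = {3, 4}

# The method forms return new sets and leave a and b unchanged
combined = a.union(b)
common = a.intersection(b)
only_a = a.difference(b)

print(tags, raised, combined, common, only_a)
```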
#### 4. **Tuple Core Concepts**
- **Ordered Collection**: Tuples maintain the order of elements.
- **Indexing**: Elements in a tuple can be accessed via their index.
- **Immutability**: Tuples are immutable, meaning their elements cannot be changed after creation.
- **Common Uses**: Often used to group related but different types of data, such as coordinates or records.
- **Packing and Unpacking**:
```python
t = (1, 2, 3)
a, b, c = t
```
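Unpacking extends beyond the basic case. A short sketch of starred unpacking, swapping, and returning multiple values. The `record` data and the `min_max` helper are illustrative assumptions.

```python
# Hypothetical record for illustration
record = ("Ada", "Lovelace", 1815, "London")

# A starred name collects the remaining items into a list
first, last, *rest = record

# Swapping relies on packing the right side into a tuple first
a, b = 1, 2
a, b = b, a


def min_max(values):
    """Hypothetical helper: return two values packed as one tuple."""
    return min(values), max(values)


low, high = min_max([3, 1, 4])

print(first, last, rest, a, b, low, high)
```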
---
### Summary of Core Concepts
1. **Mutability**: Understanding which data structures are mutable (lists, dictionaries, sets) and which are immutable (tuples).
2. **Order and Uniqueness**:
- Lists and tuples maintain order.
- Sets enforce uniqueness.
- Dictionaries map unique keys to values.
3. **Indexing vs. Key Access**:
- Lists and tuples use indexing.
- Dictionaries use keys for access.
4. **Common Operations**: Familiarity with basic operations and methods for each data structure.
5. **Performance Considerations**: Awareness of time complexity for common operations like access, insertion, and deletion.
6. **Comprehensions**: Using list, dictionary, and set comprehensions for concise and readable code.
By committing these core concepts to memory, you'll be well-equipped to handle a wide range of programming challenges efficiently.
### Stage 1 Concepts to Master
Building on the core concepts from Stage 0, Stage 1 dives deeper into more advanced techniques, optimizations, and specialized data structures. These concepts will help you write more efficient, robust, and scalable code.