Update tech_docs/python/Python_programming.md

2024-06-27 06:36:18 +00:00
parent f08e663741
commit 9fe4341754
1 changed files with 520 additions and 0 deletions
--- a/tech_docs/python/Python_programming.md
+++ b/tech_docs/python/Python_programming.md
@@ -84,6 +84,526 @@ When working with data structures in Python, certain concepts become second natu
 By committing these core concepts to memory, you'll be well-equipped to handle a wide range of programming challenges efficiently.
 ---
 ### Stage 1 Concepts to Master
 Building on the core concepts from Stage 0, Stage 1 dives deeper into more advanced techniques, optimizations, and specialized data structures. These concepts will help you write more efficient, robust, and scalable code.
 ---
 #### 1. **Advanced List Concepts**
 - **List Comprehensions**: More complex uses, including nested comprehensions and conditional logic.
  ```python
  # Nested comprehension for a 2D list
  matrix = [[i * j for j in range(5)] for i in range(5)]
  # Conditional comprehension
  even_squares = [x*x for x in range(10) if x % 2 == 0]
  ```
 - **List Slicing and Extended Slices**: More advanced slicing techniques, including steps and reversing lists.
  ```python
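  # Example input (assumed for illustration)
  my_list = [0, 1, 2, 3, 4, 5]
  # Slicing with step: every second element, starting at index 0
  evens = my_list[::2]  # [0, 2, 4]
  # Reversing a list (returns a new list; the original is unchanged)
  reversed_list = my_list[::-1]  # [5, 4, 3, 2, 1, 0]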
  ```
 - **List Methods and Built-in Functions**: Efficiently using methods like `sort()`, `reverse()`, and functions like `map()`, `filter()`, `reduce()`.
  ```python
  from functools import reduce
  my_list = [1, 2, 3, 4, 5]  # example input (assumed for illustration)
  # Using map to apply a function to all elements
  doubled = list(map(lambda x: x * 2, my_list))  # [2, 4, 6, 8, 10]
  # Using filter to keep only the elements that pass a test
  even_numbers = list(filter(lambda x: x % 2 == 0, my_list))  # [2, 4]
  # Using reduce to fold the list into a single value
  product = reduce(lambda x, y: x * y, my_list)  # 120
  ```
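The bullet above names `sort()` but the snippet does not show it. A minimal sketch of the difference between the in-place `list.sort()` and the built-in `sorted()`, using made-up `records` data:

```python
# Hypothetical sample data: (name, score) pairs
records = [("ana", 82), ("bo", 95), ("cy", 78)]

# sorted() returns a new list and leaves its input untouched
by_score = sorted(records, key=lambda r: r[1], reverse=True)

# list.sort() sorts in place and returns None
names = [name for name, _ in records]
names.sort(reverse=True)

print(by_score)  # [('bo', 95), ('ana', 82), ('cy', 78)]
print(names)     # ['cy', 'bo', 'ana']
```

Prefer `sorted()` when the original order still matters elsewhere, and `sort()` when the list can be reordered freely and a copy would be wasteful.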
 ---
 #### 2. **Advanced Dictionary Concepts**
 - **Defaultdict and Counter from collections**: Handling dictionaries with default values and counting elements efficiently.
  ```python
  from collections import defaultdict, Counter
  # defaultdict with a default value
  default_dict = defaultdict(int)
  default_dict['missing'] += 1  # {'missing': 1}
  # Counter for counting elements
  counter = Counter(['apple', 'banana', 'apple', 'orange'])
  # Counter({'apple': 2, 'banana': 1, 'orange': 1})
  ```
 - **Dictionary Comprehensions**: More complex comprehensions with conditional logic and transformations.
  ```python
  # Conditional dictionary comprehension
  squared_dict = {x: x*x for x in range(10) if x % 2 == 0}
  ```
 - **Merging Dictionaries**: Techniques for merging multiple dictionaries.
  ```python
  dict1 = {'a': 1, 'b': 2}
  dict2 = {'b': 3, 'c': 4}
  # Using the update() method
  dict1.update(dict2)  # {'a': 1, 'b': 3, 'c': 4}
  # Using dictionary unpacking (Python 3.5+); builds a new dict
  merged_dict = {**dict1, **dict2}
  ```
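On Python 3.9 and later, the `|` and `|=` operators offer another way to merge. The sketch below assumes a 3.9+ interpreter; the `defaults` and `overrides` names are illustrative only.

```python
# Hypothetical configuration dictionaries
defaults = {"host": "localhost", "port": 8080}
overrides = {"port": 9090, "debug": True}

# Non-mutating merge: on key clashes the right-hand operand wins
merged = defaults | overrides

# In-place variant, equivalent to settings.update(overrides)
settings = dict(defaults)
settings |= overrides

print(merged)    # {'host': 'localhost', 'port': 9090, 'debug': True}
print(defaults)  # {'host': 'localhost', 'port': 8080}
```

Unlike `update()`, the `|` form leaves both inputs unchanged, which helps when the originals are shared with other code.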
 ---
 #### 3. **Advanced Set Concepts**
 - **Set Operations**: Advanced use of union, intersection, difference, and symmetric difference.
  ```python
  set1 = {1, 2, 3}
  set2 = {3, 4, 5}
  # Union
  union_set = set1 | set2  # {1, 2, 3, 4, 5}
  # Intersection
  intersection_set = set1 & set2  # {3}
  # Difference
  difference_set = set1 - set2  # {1, 2}
  # Symmetric Difference
  sym_diff_set = set1 ^ set2  # {1, 2, 4, 5}
  ```
 - **Frozensets**: Immutable sets for fixed collections of unique elements.
  ```python
  frozenset1 = frozenset([1, 2, 3])
  ```
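Because a frozenset is hashable, it can serve as a dictionary key or as a member of another set, which a plain `set` cannot. A small sketch with hypothetical `pair_prices` data:

```python
# Hypothetical lookup keyed by an unordered pair of items
pair_prices = {
    frozenset({"tea", "milk"}): 4.50,
    frozenset({"coffee", "sugar"}): 5.25,
}

# Element order does not matter when building the key
price = pair_prices[frozenset({"milk", "tea"})]
print(price)  # 4.5

# A mutable set is unhashable, so it cannot be used as a key
try:
    {set(): "not allowed"}
    plain_set_hashable = True
except TypeError:
    plain_set_hashable = False
print(plain_set_hashable)  # False
```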
 ---
 #### 4. **Advanced Tuple Concepts**
 - **Named Tuples**: Using `collections.namedtuple` for more readable code when dealing with tuples.
  ```python
  from collections import namedtuple
  Point = namedtuple('Point', ['x', 'y'])
  p = Point(10, 20)
  ```
 - **Tuples as Keys in Dictionaries**: Using tuples as keys for multi-key access patterns.
  ```python
  # (latitude, longitude); western longitudes are negative
  coords_dict = {(40.7128, -74.0060): 'New York', (34.0522, -118.2437): 'Los Angeles'}
  ```
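The two ideas above combine naturally: a named tuple is still a tuple, so it hashes by value and works as a dictionary key. A brief sketch, where `grid` is a made-up example mapping:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])

# Named tuples hash and compare by value, like plain tuples
grid = {Point(0, 0): "origin", Point(1, 2): "treasure"}

p = Point(1, 2)
print(p.x, p.y)      # 1 2
print(grid[p])       # treasure
print(grid[(1, 2)])  # treasure (a plain tuple with equal values matches too)

# _replace returns a modified copy; the original is unchanged
moved = p._replace(x=5)
print(moved)         # Point(x=5, y=2)
```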
 ---
 #### 5. **Performance Optimization**
 - **Time Complexity Analysis**: Understanding the time complexity of various operations on different data structures.
  - **Lists**: Index access O(1), append amortized O(1), insert/delete at an arbitrary position O(n)
  - **Dictionaries**: Lookup, insert, and delete O(1) on average
  - **Sets**: Membership test, insert, and delete O(1) on average
  - **Tuples**: Access O(1), Fixed size, no insert/delete
 - **Space Complexity**: Evaluating the memory usage of different data structures.
  - Lists and tuples generally use less memory than dictionaries and sets of the same length, because hash tables keep extra bookkeeping (stored hashes and spare slots) to make lookups fast.
 - **Efficient Iteration**: Using generator expressions and iterators to handle large datasets without excessive memory use.
  ```python
  # Generator expression
  gen_exp = (x*x for x in range(1000000))
  ```
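To make the memory point concrete, the sketch below compares a list with an equivalent generator expression. Note that `sys.getsizeof` reports only the size of the container object itself, so the absolute numbers vary by platform and interpreter.

```python
import sys

n = 100_000
squares_list = [x * x for x in range(n)]  # materializes every value up front
squares_gen = (x * x for x in range(n))   # produces values on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(gen_size < list_size)  # True

# Both yield the same total
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)

# A generator is exhausted after one pass
leftover = sum(squares_gen)
print(leftover)  # 0
```

The trade-off is that a generator can be consumed only once and does not support indexing or `len()`.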
 ---
 ### Summary of Stage 1 Concepts
 1. **List Comprehensions and Slicing**: Advanced usage for cleaner, more efficient code.
 2. **Defaultdict and Counter**: Leveraging `collections` for smarter dictionaries.
 3. **Set Operations**: Utilizing advanced set operations for unique collections.
 4. **Named Tuples**: More readable and structured tuples.
 5. **Performance Analysis**: Understanding and optimizing time and space complexity.
 6. **Efficient Iteration**: Using generators for large datasets.
 By mastering these Stage 1 concepts, you'll be able to handle more complex data structures and write optimized, high-performance Python code.
 ---
 ### Stage 2: Advanced Data Structures and Techniques
 At this stage, you'll delve into more specialized data structures and advanced techniques that are essential for handling complex problems efficiently. This includes understanding underlying implementations, mastering performance tuning, and using data structures in conjunction with algorithms.
 ---
 #### 1. **Advanced List Techniques**
 - **List vs. Deque**: When to use `collections.deque` for efficient appends and pops from both ends.
  ```python
  from collections import deque
  dq = deque([1, 2, 3])
  dq.appendleft(0)
  dq.append(4)
  dq.pop()
  dq.popleft()
  ```
 - **List Optimization**: Techniques such as list pre-allocation for reducing reallocation overhead.
  ```python
  size = 1000
  preallocated_list = [None] * size
  ```
 - **Memoryviews**: Efficiently handling large data buffers.
  ```python
  import array
  arr = array.array('h', [1, 2, 3, 4, 5])
  mem_view = memoryview(arr)
  ```
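What makes a `memoryview` useful is that slicing it does not copy the underlying buffer. A minimal sketch building on the `array` example above:

```python
import array

arr = array.array("h", [1, 2, 3, 4, 5])
view = memoryview(arr)

# Slicing a memoryview yields another view over the same buffer, not a copy
tail = view[2:]
tail[0] = 30  # writes through to arr

print(arr.tolist())   # [1, 2, 30, 4, 5]
print(tail.tolist())  # [30, 4, 5]

# nbytes is the element count times the per-item size
print(view.nbytes == view.itemsize * len(arr))  # True
```

By contrast, slicing `arr` directly (`arr[2:]`) would allocate a new array.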
 ---
 #### 2. **Advanced Dictionary Techniques**
 - **OrderedDict**: Using `collections.OrderedDict` for maintaining insertion order in older Python versions.
  ```python
  from collections import OrderedDict
  ordered_dict = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
  ```
 - **Dictionary Performance Tuning**: Minimizing collisions and optimizing hashing for custom objects.
  ```python
  class MyObject:
      def __init__(self, value):
          self.value = value
      def __hash__(self):
          return hash(self.value)
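      def __eq__(self, other):
          # Objects that compare equal must also hash equal
          if not isinstance(other, MyObject):
              return NotImplemented
          return self.value == other.value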
  my_dict = {MyObject(1): "one"}
  ```
 - **Multi-Level Dictionaries**: Efficiently managing nested dictionaries.
  ```python
  multi_dict = {
      'level1': {
          'level2': {
              'key': 'value'
          }
      }
  }
  ```
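Two patterns often help with nested dictionaries: creating intermediate levels automatically, and reading deep keys without a chain of `KeyError` checks. The helpers `nested_dict` and `deep_get` below are hypothetical names introduced for this sketch, not standard-library functions.

```python
from collections import defaultdict

def nested_dict():
    """Return a defaultdict whose missing keys create further nested dicts."""
    return defaultdict(nested_dict)

config = nested_dict()
config["db"]["primary"]["host"] = "localhost"  # intermediate levels are created on demand

def deep_get(mapping, keys, default=None):
    """Follow a sequence of keys, returning default if any level is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

plain = {"level1": {"level2": {"key": "value"}}}
print(deep_get(plain, ["level1", "level2", "key"]))          # value
print(deep_get(plain, ["level1", "missing", "key"], "n/a"))  # n/a
print(deep_get(config, ["db", "primary", "host"]))           # localhost
```

One caveat with the auto-creating variant: merely reading `config["typo"]` inserts an empty entry, so `deep_get` checks membership with `in` instead of indexing.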
 ---
 #### 3. **Advanced Set Techniques**
 - **Set Performance Tuning**: Handling large datasets with set operations and understanding underlying implementation.
  ```python
  large_set = set(range(1000000))
  ```
 - **Using Sets with Custom Objects**: Ensuring uniqueness based on custom attributes.
  ```python
  class MyObject:
      def __init__(self, value):
          self.value = value
      def __hash__(self):
          return hash(self.value)
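      def __eq__(self, other):
          # Equality and hashing must agree for set membership to work
          if not isinstance(other, MyObject):
              return NotImplemented
          return self.value == other.value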
  my_set = {MyObject(1), MyObject(2)}
  ```
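To see the uniqueness rule in action, the sketch below uses a hypothetical `User` class whose identity is its `user_id`; the other attributes are ignored for equality and hashing.

```python
class User:
    """Hypothetical record type; two users are equal if their ids match."""

    def __init__(self, user_id, name):
        self.user_id = user_id
        self.name = name

    def __hash__(self):
        return hash(self.user_id)

    def __eq__(self, other):
        if not isinstance(other, User):
            return NotImplemented
        return self.user_id == other.user_id

users = {User(1, "ana"), User(2, "bo"), User(1, "ana (duplicate)")}
print(len(users))                    # 2
print(User(2, "anything") in users)  # True
```

The attributes used in `__hash__` should not change while the object is in a set or used as a dictionary key; otherwise lookups can silently fail.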
 ---
 #### 4. **Advanced Tuple Techniques**
 - **Named Tuples vs Data Classes**: When to use `collections.namedtuple` vs `dataclasses.dataclass`.
  ```python
  from dataclasses import dataclass
  @dataclass
  class Point:
      x: int
      y: int
  ```
 - **Tuple as Immutable Data Structure**: Leveraging immutability in multi-threaded environments for safe concurrency.
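A short comparison may help with the choice. In this sketch `PointNT` and `PointDC` are illustrative names: the named tuple behaves like a tuple (unpacking, equality with plain tuples), while the frozen data class is a regular class that happens to reject attribute assignment. The last lines show that tuple immutability is shallow.

```python
from collections import namedtuple
from dataclasses import dataclass

PointNT = namedtuple("PointNT", ["x", "y"])

@dataclass(frozen=True)
class PointDC:
    x: int
    y: int

nt = PointNT(1, 2)
dc = PointDC(1, 2)

x, y = nt              # named tuples unpack like tuples
print(nt == (1, 2))    # True
print(dc == (1, 2))    # False: a data class only equals its own type

# Both reject attribute assignment (FrozenInstanceError subclasses AttributeError)
def is_frozen(obj):
    try:
        obj.x = 10
    except AttributeError:
        return True
    return False

print(is_frozen(nt), is_frozen(dc))  # True True

# Immutability is shallow: a mutable object inside a tuple can still change
record = (1, [2, 3])
record[1].append(4)
print(record)  # (1, [2, 3, 4])
```

For data shared across threads, this means a tuple is only safe to share without locks when everything it contains is immutable as well.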
 ---
 #### 5. **Understanding Internals**
 - **Hashing**: Deep understanding of how Python's hash tables work, particularly in dictionaries and sets.
  ```python
  def hash_example():
      obj = "example"
      return hash(obj)
  ```
 - **Garbage Collection and Memory Management**: How Python manages memory, reference counting, and the garbage collector.
  ```python
  import gc
  gc.collect()
  ```
 - **Efficient Iteration Patterns**: Using iterators and generators effectively.
  ```python
  def efficient_iterator(n):
      for i in range(n):
          yield i*i
  ```
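Reference counting alone cannot free objects that refer to each other, which is the case the cyclic garbage collector handles. The sketch below assumes CPython and uses a weak reference as a probe to observe when the objects are actually reclaimed; `Node` here is a throwaway class for the demonstration.

```python
import gc
import weakref

class Node:
    def __init__(self, name):
        self.name = name
        self.partner = None

a = Node("a")
b = Node("b")
a.partner, b.partner = b, a  # build a reference cycle

probe = weakref.ref(a)       # does not keep `a` alive
del a, b                     # the cycle keeps both reference counts above zero

gc.collect()                 # the cycle detector reclaims the pair
print(probe() is None)       # True
```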
 ---
 ### Stage 2.5: Subject Matter Expert (SME) Level Concepts
 At this level, you focus on mastering complex data structures, optimization strategies, and integration with algorithms. This includes designing custom data structures and optimizing performance at scale.
 ---
 #### 1. **Custom Data Structures**
 - **Implementing Custom Data Structures**: Designing and implementing data structures like linked lists, trees, and graphs.
  ```python
  class Node:
      def __init__(self, value):
          self.value = value
          self.next = None
  class LinkedList:
      def __init__(self):
          self.head = None
      def append(self, value):
          new_node = Node(value)
          if not self.head:
              self.head = new_node
          else:
              current = self.head
              while current.next:
                  current = current.next
              current.next = new_node
  ```
 - **Balanced Trees**: Implementing and using AVL trees, Red-Black trees, and B-trees.
  ```python
  # Simplified example of a node in an AVL tree
  class AVLNode:
      def __init__(self, key, height=1, left=None, right=None):
          self.key = key
          self.height = height
          self.left = left
          self.right = right
  ```
 - **Graphs**: Representing and manipulating graphs using adjacency lists and matrices.
  ```python
  from collections import defaultdict, deque
  class Graph:
      def __init__(self):
          self.graph = defaultdict(list)
      def add_edge(self, u, v):
          self.graph[u].append(v)
      def bfs(self, start):
          visited = set()
          # deque gives O(1) pops from the left; list.pop(0) is O(n)
          queue = deque([start])
          while queue:
              vertex = queue.popleft()
              if vertex not in visited:
                  visited.add(vertex)
                  queue.extend(n for n in self.graph[vertex] if n not in visited)
          return visited
  ```
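The graph bullet mentions adjacency matrices, while the class above uses an adjacency list. A minimal matrix-based sketch, assuming vertices are numbered `0..n-1` (`MatrixGraph` is a hypothetical name):

```python
class MatrixGraph:
    """Directed graph over vertices 0..n-1 stored as an n x n adjacency matrix."""

    def __init__(self, n):
        self.n = n
        # Build each row separately so the rows do not alias one another
        self.matrix = [[0] * n for _ in range(n)]

    def add_edge(self, u, v):
        self.matrix[u][v] = 1

    def has_edge(self, u, v):
        return self.matrix[u][v] == 1  # O(1) lookup

    def neighbors(self, u):
        # O(n): scans the whole row regardless of how many edges exist
        return [v for v, connected in enumerate(self.matrix[u]) if connected]

g = MatrixGraph(4)
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(2, 3)
print(g.has_edge(0, 2))  # True
print(g.has_edge(2, 0))  # False
print(g.neighbors(0))    # [1, 2]
```

The matrix uses O(n²) memory, so it tends to suit dense graphs; adjacency lists are usually the better fit for sparse ones.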
 ---
 #### 2. **Advanced Algorithms**
 - **Algorithm Integration**: Combining data structures with algorithms for problem-solving, like using heaps for priority queues.
  ```python
  import heapq
  heap = []
  heapq.heappush(heap, (5, 'write code'))
  heapq.heappush(heap, (1, 'write tests'))
  heapq.heappush(heap, (3, 'review code'))
  while heap:
      print(heapq.heappop(heap))
  ```
 - **Dynamic Programming**: Using dictionaries and lists for memoization and tabulation techniques.
  ```python
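  def fibonacci(n, memo=None):
      # Avoid a mutable default argument; create the cache on the first call
      if memo is None:
          memo = {}
      if n in memo:
          return memo[n]
      if n <= 1:
          return n
      # Pass the cache down so every recursive call shares it
      memo[n] = fibonacci(n-1, memo) + fibonacci(n-2, memo)
      return memo[n]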
  ```
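The memoized version above works top-down. For the tabulation technique the bullet also mentions, a bottom-up sketch fills a list from the base cases upward. The function names here are illustrative.

```python
def fibonacci_tab(n):
    """Bottom-up Fibonacci using a table of all intermediate results."""
    if n <= 1:
        return n
    table = [0] * (n + 1)
    table[1] = 1
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

def fibonacci_iter(n):
    """Same recurrence, keeping only the last two values (O(1) extra space)."""
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev

print(fibonacci_tab(10))   # 55
print(fibonacci_iter(50))  # 12586269025
```

Tabulation avoids recursion depth limits, and when only the most recent entries are needed the table can shrink to a couple of variables.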
 ---
 #### 3. **Performance Engineering**
 - **Profiling and Optimization**: Using tools like `cProfile`, `timeit`, and memory profilers to optimize code.
  ```python
  import cProfile
  def my_function():
      # Code to profile
      pass
  cProfile.run('my_function()')
  ```
 - **Concurrency and Parallelism**: Leveraging `multiprocessing`, `threading`, and `asyncio` for concurrent and parallel execution.
  ```python
  import asyncio
  async def async_function():
      await asyncio.sleep(1)
      return 'Completed'
  result = asyncio.run(async_function())  # 'Completed'
  ```
 - **Big Data and Scalability**: Techniques for handling and processing large datasets efficiently.
  ```python
  # Example of using Dask for parallel computing on large datasets
  import dask.dataframe as dd
  df = dd.read_csv('large_file.csv')
  result = df.groupby('column').sum().compute()
  ```
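For the `timeit` tool mentioned under Profiling and Optimization, a small sketch that times membership tests on a list and a set. The absolute numbers depend on the machine, so none are shown; only the relative difference is meaningful.

```python
import timeit

data_list = list(range(100_000))
data_set = set(data_list)
target = 99_999  # last element: the worst case for a linear list scan

# Each call runs the membership test 100 times and returns total seconds
list_time = timeit.timeit(lambda: target in data_list, number=100)
set_time = timeit.timeit(lambda: target in data_set, number=100)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

`timeit` suits micro-benchmarks of a single expression like this, while `cProfile` is the better fit for finding which functions dominate a whole program's run time.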
 ---
 ### Summary of Stage 2 and 2.5 Concepts
 1. **Custom Data Structures**: Designing and implementing linked lists, trees, graphs.
 2. **Advanced Algorithms**: Integrating data structures with algorithms, mastering dynamic programming.
 3. **Performance Optimization**: Profiling, concurrency, parallelism, and scalability techniques.
 4. **Deep Understanding of Internals**: Hashing, memory management, and efficient iteration.
 5. **Specialized Libraries and Tools**: Utilizing advanced Python libraries for optimized performance.
 By mastering these advanced concepts, you'll be equipped to tackle complex programming challenges, design efficient solutions, and optimize performance at scale, solidifying your expertise as a subject matter expert in Python data structures and algorithms.
 ---
 ### Key Core Concepts to Commit to Memory
 When working with data structures in Python, certain concepts become second nature over time due to their fundamental importance and frequent use. Here’s a roundup of these key concepts:
 ---
 #### 1. **List (Array) Core Concepts**
 - **Ordered Collection**: Lists maintain the order of elements, making them ideal for ordered data.
 - **Indexing**: Elements in a list can be accessed via their index, starting from 0.
 - **Mutability**: Lists are mutable, meaning elements can be changed, added, or removed.
 - **Common Methods**:
  - `append()`: Adds an element to the end.
  - `insert()`: Adds an element at a specific position.
  - `remove()`: Removes the first occurrence of an element.
  - `pop()`: Removes and returns the element at a given index (the last element by default).
  - `sort()`: Sorts the list in place.
 - **Slicing**: Lists support slicing, allowing you to create sublists.
  ```python
  sublist = my_list[start:end]
  ```
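A quick walk-through of the list methods above on a small example list, with the intermediate state shown in comments:

```python
nums = [3, 1, 2]
nums.append(5)       # [3, 1, 2, 5]
nums.insert(1, 9)    # [3, 9, 1, 2, 5]
nums.remove(1)       # [3, 9, 2, 5]  (removes the first element equal to 1)
last = nums.pop()    # returns 5, nums is [3, 9, 2]
first = nums.pop(0)  # returns 3, nums is [9, 2]
nums.sort()          # [2, 9]
sub = nums[0:1]      # [2]  (the end index is exclusive)
print(nums, last, first, sub)  # [2, 9] 5 3 [2]
```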
 #### 2. **Dictionary Core Concepts**
 - **Key-Value Pairs**: Dictionaries store data in key-value pairs, providing a mapping from keys to values.
 - **Insertion Order**: Since Python 3.7, dictionaries preserve insertion order; earlier versions make no ordering guarantee.
 - **Keys are Unique**: Each key in a dictionary must be unique.
 - **Mutability**: Dictionaries are mutable; you can add, modify, or delete key-value pairs.
 - **Common Methods**:
  - `get()`: Returns the value for a key, with an optional default if the key is not found.
  - `keys()`: Returns a view object of all keys.
  - `values()`: Returns a view object of all values.
  - `items()`: Returns a view object of all key-value pairs.
  - `update()`: Updates the dictionary with key-value pairs from another dictionary or iterable.
 - **Dictionary Comprehensions**: Compact way to create dictionaries.
  ```python
  squares = {x: x*x for x in range(6)}
  ```
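A short sketch of `get()`, view objects, and `update()` on a made-up `inventory` dictionary. Note that views are live: they reflect later changes to the dictionary.

```python
inventory = {"apples": 3, "pears": 0}

# get() returns the default instead of raising KeyError
print(inventory.get("kiwis", 0))  # 0

# A view reflects changes made after it was created
keys = inventory.keys()
inventory["plums"] = 7
print(list(keys))  # ['apples', 'pears', 'plums']

# update() accepts a mapping and/or keyword arguments
inventory.update({"pears": 5}, figs=2)
print(inventory)  # {'apples': 3, 'pears': 5, 'plums': 7, 'figs': 2}
```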
 #### 3. **Set Core Concepts**
 - **Unique Elements**: Sets store only unique elements.
 - **Unordered Collection**: Sets do not maintain order.
 - **Mutability**: Sets are mutable, allowing elements to be added or removed.
 - **Common Operations**:
  - `add()`: Adds an element to the set.
  - `remove()`: Removes a specific element, raises KeyError if not found.
  - `discard()`: Removes a specific element, does not raise an error if not found.
  - `union()`: Returns a set containing all elements from the involved sets.
  - `intersection()`: Returns a set containing only the common elements.
  - `difference()`: Returns a set with elements in the first set but not in the second.
 - **Set Comprehensions**: Similar to list comprehensions, used to create sets.
  ```python
  unique_squares = {x*x for x in range(6)}
  ```
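The difference between `remove()` and `discard()` is easiest to see side by side. A minimal sketch with a hypothetical `tags` set:

```python
tags = {"python", "sets"}

tags.discard("missing")  # silently does nothing

try:
    tags.remove("missing")
    raised = False
except KeyError:
    raised = True

print(raised)      # True
print(len(tags))   # 2

# The method forms accept any iterable, not only sets
print(sorted(tags.union(["lists"])))  # ['lists', 'python', 'sets']
```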
 #### 4. **Tuple Core Concepts**
 - **Ordered Collection**: Tuples maintain the order of elements.
 - **Indexing**: Elements in a tuple can be accessed via their index.
 - **Immutability**: Tuples are immutable, meaning their elements cannot be changed after creation.
 - **Common Uses**: Often used to group related but different types of data, such as coordinates or records.
 - **Packing and Unpacking**:
  ```python
  t = (1, 2, 3)
  a, b, c = t
  ```
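Unpacking extends beyond the basic case above. The sketch below shows starred unpacking, swapping, and multiple return values; `min_max` is a hypothetical helper written for the example.

```python
# Starred unpacking collects the remainder into a list
first, *rest = (1, 2, 3, 4)
print(first, rest)  # 1 [2, 3, 4]

# Swapping builds a tuple on the right and unpacks it on the left
a, b = 1, 2
a, b = b, a
print(a, b)  # 2 1

# Functions return multiple values as a tuple
def min_max(values):
    return min(values), max(values)

lowest, highest = min_max([3, 1, 4])
print(lowest, highest)  # 1 4
```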
 ---
 ### Summary of Core Concepts
 1. **Mutability**: Understanding which data structures are mutable (lists, dictionaries, sets) and which are immutable (tuples).
 2. **Order and Uniqueness**:
   - Lists and tuples maintain order.
   - Sets enforce uniqueness.
   - Dictionaries map unique keys to values.
 3. **Indexing vs. Key Access**:
   - Lists and tuples use indexing.
   - Dictionaries use keys for access.
 4. **Common Operations**: Familiarity with basic operations and methods for each data structure.
 5. **Performance Considerations**: Awareness of time complexity for common operations like access, insertion, and deletion.
 6. **Comprehensions**: Using list, dictionary, and set comprehensions for concise and readable code.
 By committing these core concepts to memory, you'll be well-equipped to handle a wide range of programming challenges efficiently.
 ### Stage 1 Concepts to Master
 Building on the core concepts from Stage 0, Stage 1 dives deeper into more advanced techniques, optimizations, and specialized data structures. These concepts will help you write more efficient, robust, and scalable code.