diff --git a/tech_docs/python/Python_programming.md b/tech_docs/python/Python_programming.md index 9c545a7..ba4bdcb 100644 --- a/tech_docs/python/Python_programming.md +++ b/tech_docs/python/Python_programming.md @@ -1,3 +1,694 @@ +### Key Core Concepts to Commit to Memory + +When working with data structures in Python, certain concepts become second nature over time because they are fundamental and used constantly. Here’s a roundup of these key concepts: + +--- + +#### 1. **List (Array) Core Concepts** + +- **Ordered Collection**: Lists maintain the order of elements, making them ideal for ordered data. +- **Indexing**: Elements in a list can be accessed via their index, starting from 0. +- **Mutability**: Lists are mutable, meaning elements can be changed, added, or removed. +- **Common Methods**: + - `append()`: Adds an element to the end. + - `insert()`: Adds an element at a specific position. + - `remove()`: Removes the first occurrence of an element (raises `ValueError` if it is absent). + - `pop()`: Removes and returns the element at a specific position (the last element by default). + - `sort()`: Sorts the list in place. +- **Slicing**: Lists support slicing, allowing you to create sublists. + ```python + sublist = my_list[start:end] # elements from start up to, but not including, end + ``` + +#### 2. **Dictionary Core Concepts** + +- **Key-Value Pairs**: Dictionaries store data in key-value pairs, providing a mapping from keys to values. +- **Insertion-Ordered Collection**: Dictionaries preserve insertion order as of Python 3.7; earlier versions did not guarantee any order. +- **Keys are Unique**: Each key in a dictionary must be unique. +- **Mutability**: Dictionaries are mutable; you can add, modify, or delete key-value pairs. +- **Common Methods**: + - `get()`: Returns the value for a key, with an optional default if the key is not found. + - `keys()`: Returns a view object of all keys. + - `values()`: Returns a view object of all values. + - `items()`: Returns a view object of all key-value pairs. 
+ - `update()`: Updates the dictionary with key-value pairs from another dictionary or iterable. +- **Dictionary Comprehensions**: Compact way to create dictionaries. + ```python + squares = {x: x*x for x in range(6)} + ``` + +#### 3. **Set Core Concepts** + +- **Unique Elements**: Sets store only unique elements. +- **Unordered Collection**: Sets do not maintain order. +- **Mutability**: Sets are mutable, allowing elements to be added or removed. +- **Common Operations**: + - `add()`: Adds an element to the set. + - `remove()`: Removes a specific element, raises KeyError if not found. + - `discard()`: Removes a specific element, does not raise an error if not found. + - `union()`: Returns a set containing all elements from the involved sets. + - `intersection()`: Returns a set containing only the common elements. + - `difference()`: Returns a set with elements in the first set but not in the second. +- **Set Comprehensions**: Similar to list comprehensions, used to create sets. + ```python + unique_squares = {x*x for x in range(6)} + ``` + +#### 4. **Tuple Core Concepts** + +- **Ordered Collection**: Tuples maintain the order of elements. +- **Indexing**: Elements in a tuple can be accessed via their index. +- **Immutability**: Tuples are immutable, meaning their elements cannot be changed after creation. +- **Common Uses**: Often used to group related but different types of data, such as coordinates or records. +- **Packing and Unpacking**: + ```python + t = (1, 2, 3) + a, b, c = t + ``` + +--- + +### Summary of Core Concepts + +1. **Mutability**: Understanding which data structures are mutable (lists, dictionaries, sets) and which are immutable (tuples). +2. **Order and Uniqueness**: + - Lists and tuples maintain order. + - Sets enforce uniqueness. + - Dictionaries map unique keys to values. +3. **Indexing vs. Key Access**: + - Lists and tuples use indexing. + - Dictionaries use keys for access. +4. 
**Common Operations**: Familiarity with basic operations and methods for each data structure. +5. **Performance Considerations**: Awareness of time complexity for common operations like access, insertion, and deletion. +6. **Comprehensions**: Using list, dictionary, and set comprehensions for concise and readable code. + +By committing these core concepts to memory, you'll be well-equipped to handle a wide range of programming challenges efficiently. + +### Stage 1 Concepts to Master + +Building on the core concepts from Stage 0, Stage 1 dives deeper into more advanced techniques, optimizations, and specialized data structures. These concepts will help you write more efficient, robust, and scalable code. + +--- + +#### 1. **Advanced List Concepts** + +- **List Comprehensions**: More complex uses, including nested comprehensions and conditional logic. + ```python + # Nested comprehension for a 2D list + matrix = [[i * j for j in range(5)] for i in range(5)] + + # Conditional comprehension + even_squares = [x*x for x in range(10) if x % 2 == 0] + ``` + +- **List Slicing and Extended Slices**: More advanced slicing techniques, including steps and reversing lists. + ```python + # Slicing with step + evens = my_list[::2] + + # Reversing a list + reversed_list = my_list[::-1] + ``` + +- **List Methods and Built-in Functions**: Efficiently using methods like `sort()`, `reverse()`, and functions like `map()`, `filter()`, `reduce()`. + ```python + from functools import reduce + + # Using map to apply a function to all elements + doubled = list(map(lambda x: x * 2, my_list)) + + # Using filter to filter elements + even_numbers = list(filter(lambda x: x % 2 == 0, my_list)) + + # Using reduce to apply a rolling computation + product = reduce(lambda x, y: x * y, my_list) + ``` + +--- + +#### 2. **Advanced Dictionary Concepts** + +- **Defaultdict and Counter from collections**: Handling dictionaries with default values and counting elements efficiently. 
+ ```python + from collections import defaultdict, Counter + + # defaultdict(int) supplies 0 for missing keys + default_dict = defaultdict(int) + default_dict['missing'] += 1 # {'missing': 1} + + # Counter for counting elements + counter = Counter(['apple', 'banana', 'apple', 'orange']) + # Counter({'apple': 2, 'banana': 1, 'orange': 1}) + ``` + +- **Dictionary Comprehensions**: More complex comprehensions with conditional logic and transformations. + ```python + # Conditional dictionary comprehension + squared_dict = {x: x*x for x in range(10) if x % 2 == 0} + ``` + +- **Merging Dictionaries**: Techniques for merging multiple dictionaries. + ```python + dict1 = {'a': 1, 'b': 2} + dict2 = {'b': 3, 'c': 4} + + # Using the update() method (modifies dict1 in place) + dict1.update(dict2) # {'a': 1, 'b': 3, 'c': 4} + + # Using dictionary unpacking (Python 3.5+; Python 3.9+ also offers dict1 | dict2) + merged_dict = {**dict1, **dict2} + ``` + +--- + +#### 3. **Advanced Set Concepts** + +- **Set Operations**: Advanced use of union, intersection, difference, and symmetric difference. + ```python + set1 = {1, 2, 3} + set2 = {3, 4, 5} + + # Union + union_set = set1 | set2 # {1, 2, 3, 4, 5} + + # Intersection + intersection_set = set1 & set2 # {3} + + # Difference + difference_set = set1 - set2 # {1, 2} + + # Symmetric Difference + sym_diff_set = set1 ^ set2 # {1, 2, 4, 5} + ``` + +- **Frozensets**: Immutable, hashable sets for fixed collections of unique elements. + ```python + frozenset1 = frozenset([1, 2, 3]) + ``` + +--- + +#### 4. **Advanced Tuple Concepts** + +- **Named Tuples**: Using `collections.namedtuple` for more readable code when dealing with tuples. + ```python + from collections import namedtuple + + Point = namedtuple('Point', ['x', 'y']) + p = Point(10, 20) + ``` + +- **Tuples as Keys in Dictionaries**: Using tuples as keys for multi-key access patterns. + ```python + coords_dict = {(40.7128, -74.0060): 'New York', (34.0522, -118.2437): 'Los Angeles'} + ``` + +--- + +#### 5. 
**Performance Optimization** + +- **Time Complexity Analysis**: Understanding the time complexity of various operations on different data structures. + - **Lists**: Index access O(1), Append O(1) amortized, Insert/Delete O(n) + - **Dictionaries**: Lookup O(1) average, Insert/Delete O(1) average + - **Sets**: Membership Test O(1) average, Insert/Delete O(1) average + - **Tuples**: Index access O(1), Fixed size, no insert/delete + +- **Space Complexity**: Evaluating the memory usage of different data structures. + - Lists and tuples are generally more memory efficient than dictionaries and sets because they carry no hash-table overhead. + +- **Efficient Iteration**: Using generator expressions and iterators to handle large datasets without excessive memory use. + ```python + # Generator expression: values are produced lazily, one at a time + gen_exp = (x*x for x in range(1000000)) + ``` + +--- + +### Summary of Stage 1 Concepts + +1. **List Comprehensions and Slicing**: Advanced usage for cleaner, more efficient code. +2. **Defaultdict and Counter**: Leveraging `collections` for smarter dictionaries. +3. **Set Operations**: Utilizing advanced set operations for unique collections. +4. **Named Tuples**: More readable and structured tuples. +5. **Performance Analysis**: Understanding and optimizing time and space complexity. +6. **Efficient Iteration**: Using generators for large datasets. + +By mastering these Stage 1 concepts, you'll be able to handle more complex data structures and write optimized, high-performance Python code. + +### Stage 2: Advanced Data Structures and Techniques + +At this stage, you'll delve into more specialized data structures and advanced techniques that are essential for handling complex problems efficiently. This includes understanding underlying implementations, mastering performance tuning, and using data structures in conjunction with algorithms. + +--- + +#### 1. **Advanced List Techniques** + +- **List vs. Deque**: Use `collections.deque` when you need O(1) appends and pops at both ends; `list.pop(0)` and `list.insert(0, x)` are O(n). 
+ ```python + from collections import deque + + dq = deque([1, 2, 3]) + dq.appendleft(0) # deque([0, 1, 2, 3]) + dq.append(4) # deque([0, 1, 2, 3, 4]) + dq.pop() # returns 4 + dq.popleft() # returns 0 + ``` + +- **List Optimization**: Techniques such as list pre-allocation for reducing reallocation overhead. + ```python + size = 1000 + preallocated_list = [None] * size + ``` + +- **Memoryviews**: Accessing large data buffers without copying them. + ```python + import array + + arr = array.array('h', [1, 2, 3, 4, 5]) + mem_view = memoryview(arr) # shares arr's buffer; no copy is made + ``` + +--- + +#### 2. **Advanced Dictionary Techniques** + +- **OrderedDict**: Using `collections.OrderedDict` for maintaining insertion order in Python versions before 3.7. + ```python + from collections import OrderedDict + + ordered_dict = OrderedDict([('a', 1), ('b', 2), ('c', 3)]) + ``` + +- **Dictionary Performance Tuning**: Minimizing collisions and optimizing hashing for custom objects. + ```python + class MyObject: + def __init__(self, value): + self.value = value + + def __hash__(self): + return hash(self.value) + + def __eq__(self, other): + return isinstance(other, MyObject) and self.value == other.value + + my_dict = {MyObject(1): "one"} + ``` + +- **Multi-Level Dictionaries**: Efficiently managing nested dictionaries. + ```python + multi_dict = { + 'level1': { + 'level2': { + 'key': 'value' + } + } + } + ``` + +--- + +#### 3. **Advanced Set Techniques** + +- **Set Performance Tuning**: Handling large datasets with set operations and understanding underlying implementation. + ```python + large_set = set(range(1000000)) + ``` + +- **Using Sets with Custom Objects**: Ensuring uniqueness based on custom attributes. + ```python + class MyObject: + def __init__(self, value): + self.value = value + + def __hash__(self): + return hash(self.value) + + def __eq__(self, other): + return isinstance(other, MyObject) and self.value == other.value + + my_set = {MyObject(1), MyObject(2)} + ``` + +--- + +#### 4. **Advanced Tuple Techniques** + +- **Named Tuples vs Data Classes**: When to use `collections.namedtuple` vs `dataclasses.dataclass`. 
+ ```python + from dataclasses import dataclass + + @dataclass + class Point: + x: int + y: int + ``` + +- **Tuple as Immutable Data Structure**: Leveraging immutability in multi-threaded environments for safe concurrency. + +--- + +#### 5. **Understanding Internals** + +- **Hashing**: Deep understanding of how Python's hash tables work, particularly in dictionaries and sets. + ```python + def hash_example(): + obj = "example" + return hash(obj) + ``` + +- **Garbage Collection and Memory Management**: How Python manages memory, reference counting, and the garbage collector. + ```python + import gc + + gc.collect() + ``` + +- **Efficient Iteration Patterns**: Using iterators and generators effectively. + ```python + def efficient_iterator(n): + for i in range(n): + yield i*i + ``` + +--- + +### Stage 2.5: Subject Matter Expert (SME) Level Concepts + +At this level, you focus on mastering complex data structures, optimization strategies, and integration with algorithms. This includes designing custom data structures and optimizing performance at scale. + +--- + +#### 1. **Custom Data Structures** + +- **Implementing Custom Data Structures**: Designing and implementing data structures like linked lists, trees, and graphs. + ```python + class Node: + def __init__(self, value): + self.value = value + self.next = None + + class LinkedList: + def __init__(self): + self.head = None + + def append(self, value): + new_node = Node(value) + if not self.head: + self.head = new_node + else: + current = self.head + while current.next: + current = current.next + current.next = new_node + ``` + +- **Balanced Trees**: Implementing and using AVL trees, Red-Black trees, and B-trees. 
+ ```python + # Simplified example of a node in an AVL tree + class AVLNode: + def __init__(self, key, height=1, left=None, right=None): + self.key = key + self.height = height + self.left = left + self.right = right + ``` + +- **Graphs**: Representing and manipulating graphs using adjacency lists and matrices. + ```python + from collections import defaultdict, deque + + class Graph: + def __init__(self): + self.graph = defaultdict(list) + + def add_edge(self, u, v): + self.graph[u].append(v) + + def bfs(self, start): + visited = set() + queue = deque([start]) + while queue: + vertex = queue.popleft() + if vertex not in visited: + visited.add(vertex) + queue.extend(n for n in self.graph[vertex] if n not in visited) + return visited + ``` + +--- + +#### 2. **Advanced Algorithms** + +- **Algorithm Integration**: Combining data structures with algorithms for problem-solving, like using heaps for priority queues. + ```python + import heapq + + heap = [] + heapq.heappush(heap, (5, 'write code')) + heapq.heappush(heap, (1, 'write tests')) + heapq.heappush(heap, (3, 'review code')) + + while heap: + print(heapq.heappop(heap)) # pops the lowest priority number first + ``` + +- **Dynamic Programming**: Using dictionaries and lists for memoization and tabulation techniques. + ```python + def fibonacci(n, memo=None): + memo = {} if memo is None else memo + if n <= 1: + return n + if n not in memo: + memo[n] = fibonacci(n-1, memo) + fibonacci(n-2, memo) + return memo[n] + ``` + +--- + +#### 3. **Performance Engineering** + +- **Profiling and Optimization**: Using tools like `cProfile`, `timeit`, and memory profilers to optimize code. + ```python + import cProfile + + def my_function(): + # Code to profile + pass + + cProfile.run('my_function()') + ``` + +- **Concurrency and Parallelism**: Leveraging `multiprocessing`, `threading`, and `asyncio` for concurrent and parallel execution. 
+ ```python + import asyncio + + async def async_function(): + await asyncio.sleep(1) + return 'Completed' + + print(asyncio.run(async_function())) # prints "Completed" after about one second + ``` + +- **Big Data and Scalability**: Techniques for handling and processing large datasets efficiently. + ```python + # Example of using Dask (a third-party library) for parallel computing; the file and column names are placeholders + import dask.dataframe as dd + + df = dd.read_csv('large_file.csv') + result = df.groupby('column').sum().compute() + ``` + +--- + +### Summary of Stage 2 and 2.5 Concepts + +1. **Custom Data Structures**: Designing and implementing linked lists, trees, graphs. +2. **Advanced Algorithms**: Integrating data structures with algorithms, mastering dynamic programming. +3. **Performance Optimization**: Profiling, concurrency, parallelism, and scalability techniques. +4. **Deep Understanding of Internals**: Hashing, memory management, and efficient iteration. +5. **Specialized Libraries and Tools**: Utilizing advanced Python libraries for optimized performance. + +By mastering these advanced concepts, you'll be equipped to tackle complex programming challenges, design efficient solutions, and optimize performance at scale, solidifying your expertise as a subject matter expert in Python data structures and algorithms. + +--- + +The reference guide below covers lists, dictionaries, sets, and tuples, along with their use cases, performance considerations, and real-world examples. + +--- + +## Reference Guide: Choosing the Right Data Structure in Python + +### Overview + +This guide covers the fundamental data structures in Python: lists, dictionaries, sets, and tuples. It provides a detailed comparison to help you decide which data structure to use based on the nature of your data and the operations you need to perform. + +--- + +### 1. Lists (Arrays) + +#### Characteristics + +- **Order**: Maintains the order of elements. +- **Indexed**: Accessed via indices. 
+- **Homogeneous Data**: Typically used for similar types of elements. +- **Mutability**: Mutable (elements can be changed, added, or removed). + +#### Operations + +- **Access by Index**: O(1) time complexity. +- **Insertion/Deletion**: O(n) for arbitrary positions; amortized O(1) for appending. + +#### Use Cases + +- **Sequential Data**: Storing a list of items, such as grades or prices. +- **Ordered Items**: When you need to maintain a specific order. + +#### Example: Storing Student Grades + +```python +grades = [85, 92, 78, 90, 88] +second_student_grade = grades[1] # 92 +average_grade = sum(grades) / len(grades) # 86.6 +grades.append(95) # [85, 92, 78, 90, 88, 95] +grades.insert(2, 83) # [85, 92, 83, 78, 90, 88, 95] +``` + +--- + +### 2. Dictionaries (Dicts) + +#### Characteristics + +- **Key-Value Pairs**: Each element is stored with a unique key. +- **Insertion-Ordered**: Preserves insertion order as of Python 3.7; earlier versions did not guarantee any order. +- **Heterogeneous Data**: Suitable for diverse elements. +- **Mutability**: Mutable. + +#### Operations + +- **Access by Key**: O(1) average time complexity. +- **Insertion/Deletion**: O(1) average time complexity. + +#### Use Cases + +- **Mapping Relationships**: Storing product prices, contact information. +- **Fast Lookups**: When quick access by a unique key is needed. + +#### Example: Storing Product Prices + +```python +prices = {'Apple': 10.99, 'Banana': 5.49, 'Carrot': 7.99, 'Doughnut': 2.99, 'Eggplant': 15.99} +banana_price = prices['Banana'] # 5.49 +total_price = sum(prices.values()) # approximately 43.45 (floating-point sum) +prices['Fig'] = 9.99 +prices['Carrot'] = 6.99 +``` + +--- + +### 3. Sets + +#### Characteristics + +- **Unique Elements**: All elements are unique. +- **Unordered**: Does not maintain order. +- **Heterogeneous Data**: Can store different types of elements, as long as they are hashable. +- **Mutability**: Mutable (elements can be added or removed). + +#### Operations + +- **Membership Test**: O(1) average time complexity. +- **Insertion/Deletion**: O(1) average time complexity. + +#### Use Cases + +- **Unique Collections**: Storing unique items, such as user IDs. 
+- **Set Operations**: Mathematical set operations like union, intersection, difference. + +#### Example: Managing a Collection of Unique Tags + +```python +tags = {'python', 'data', 'science'} +tags.add('machine learning') +tags.remove('data') +intersection = tags.intersection({'python', 'AI'}) # {'python'} +``` + +--- + +### 4. Tuples + +#### Characteristics + +- **Order**: Maintains the order of elements. +- **Indexed**: Accessed via indices. +- **Immutable**: Elements cannot be changed after creation. +- **Homogeneous or Heterogeneous Data**: Can store similar or different types of elements. + +#### Operations + +- **Access**: O(1) time complexity. +- **Insertion/Deletion**: Not supported; build a new tuple instead. + +#### Use Cases + +- **Fixed Data**: Storing a fixed collection of items, such as coordinates. +- **Heterogeneous Grouping**: Grouping different types of data together. + +#### Example: Storing Geographic Coordinates + +```python +coordinates = (40.7128, -74.0060) +latitude = coordinates[0] # 40.7128 +longitude = coordinates[1] # -74.0060 +``` + +--- + +### Detailed Comparison + +| Feature | List | Dictionary | Set | Tuple | +|------------------|----------------------------|---------------------------|---------------------------|--------------------------| +| **Order** | Ordered | Insertion order (3.7+) | Unordered | Ordered | +| **Access** | Indexed by integer | Accessed by key | Membership test only | Indexed by integer | +| **Mutability** | Mutable | Mutable | Mutable | Immutable | +| **Uniqueness** | Allows duplicates | Keys are unique | Unique elements | Allows duplicates | +| **Access Time** | O(1) | O(1) average | O(1) average (membership) | O(1) | +| **Insertion Time**| O(1) at end, O(n) elsewhere| O(1) average | O(1) average | N/A | +| **Deletion Time**| O(1) at end, O(n) elsewhere| O(1) average | O(1) average | N/A | +| **Use Case** | Sequential data | Mapping relationships | Unique collections | Fixed data | + +--- + +### Advanced Topics + +#### 1. 
Performance Considerations + +- **Memory Overhead**: Dictionaries use more memory due to storing keys and values, while lists and tuples are more memory efficient. +- **Speed**: For large datasets with frequent insertions and deletions, dictionaries and sets can be faster due to O(1) average time complexity. + +#### 2. Common Pitfalls + +- **Improper Use**: Using lists for key-value pairs or dictionaries for simple sequences can lead to inefficient code. +- **Mutability**: Both lists and dictionaries are mutable; unintended changes can occur if not handled carefully. Tuples, being immutable, prevent accidental changes. + +#### 3. Choosing the Right Data Structure + +- **Lists**: Use for ordered, homogeneous data where sequence matters. +- **Dictionaries**: Use for key-value pairs and fast lookups. +- **Sets**: Use for collections of unique items and set operations. +- **Tuples**: Use for fixed, ordered collections and heterogeneous data grouping. + +--- + +### Conclusion + +Selecting the appropriate data structure depends on the specific needs of your application. By understanding the characteristics, operations, and use cases of lists, dictionaries, sets, and tuples, you can make informed decisions to write efficient and maintainable code. + +--- + +--- + Sorting is a fundamental operation in computer science and programming, used to arrange the elements of a list or array in a particular order (ascending or descending). There are numerous sorting algorithms, each with its own characteristics and use cases. Below are some common sorting algorithms along with their time complexities and a brief description. ### Common Sorting Algorithms