2024-05-01 12:28:44 -06:00

YAML, a recursive acronym for "YAML Ain't Markup Language," is a human-readable data serialization format. Libraries for it exist in most programming languages, and it is often used for configuration files. Python's standard library does not include a YAML module, so the third-party library PyYAML is widely used for parsing and generating YAML.

PyYAML Usage Guide

Installation

PyYAML can be installed with pip. The package is published as PyYAML but imported as yaml:

pip install PyYAML

Basic Operations

Loading YAML

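PyYAML provides yaml.load() and yaml.safe_load() to parse YAML from a string or an open file. safe_load() builds only standard YAML types such as mappings, sequences, strings, and numbers, so it is the recommended choice for untrusted input. yaml.load() can construct arbitrary Python objects depending on the loader it is given, and since PyYAML 6.0 it requires an explicit Loader argument.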
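
As a rough illustration of that difference, the sketch below hands safe_load() a document that uses a Python-specific tag. The input string and the os.getcwd target are made up for this example; the expectation is that safe_load() refuses the tag instead of calling the function.

```python
import yaml

# Illustrative input: a Python-specific tag that an unsafe loader would act on.
untrusted = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError as exc:
    # safe_load() has no constructor for python/* tags, so it raises instead.
    rejected = True
    print(f"rejected: {type(exc).__name__}")

# Plain YAML types still load normally.
plain = yaml.safe_load("{name: John Doe, age: 30}")
print(plain)
```

Catching yaml.YAMLError, the base class of PyYAML's exceptions, keeps the handler independent of the exact error subclass.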

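Parsing YAML from a String

import yaml

# A YAML sequence of two single-key mappings.
yaml_string = """
- hero:
    name: John Doe
    age: 30
- villain:
    name: Jane Doe
    age: 25
"""

# safe_load() returns plain Python lists, dicts, strings, and numbers.
data = yaml.safe_load(yaml_string)
print(data)

Reading YAML from a File

import yaml

# safe_load() accepts an open file object as well as a string.
with open('data.yaml', 'r', encoding='utf-8') as file:
    data = yaml.safe_load(file)

print(data)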

Dumping YAML

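To serialize Python objects into a YAML string or file, use yaml.dump(). Called without a stream, it returns the YAML text as a string; called with an open file, it writes to that file. By default it sorts mapping keys alphabetically. yaml.safe_dump() is the counterpart of safe_load() and emits only standard YAML tags.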

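Converting Python Object to YAML String

import yaml

data = {
    'hero': {'name': 'John Doe', 'age': 30},
    'villain': {'name': 'Jane Doe', 'age': 25}
}

# With no stream argument, dump() returns the YAML text as a string.
yaml_string = yaml.dump(data)
print(yaml_string)

Writing YAML Data to a File

# Reuses the data dictionary defined above.
with open('output.yaml', 'w', encoding='utf-8') as file:
    yaml.dump(data, file)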
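
The dump functions also accept formatting options. The following is a minimal sketch, assuming PyYAML 5.1 or later (where the sort_keys parameter is available), that keeps the dictionary's insertion order and then checks that the output loads back to the same structure.

```python
import yaml

data = {
    "hero": {"name": "John Doe", "age": 30},
    "villain": {"name": "Jane Doe", "age": 25},
}

# sort_keys=False keeps insertion order; default_flow_style=False forces block style.
text = yaml.safe_dump(data, sort_keys=False, default_flow_style=False)
print(text)

# Loading the dumped text should give back an equal structure.
round_tripped = yaml.safe_load(text)
print(round_tripped == data)
```

Preserving key order tends to matter most for files that people edit by hand, since alphabetical sorting can separate related settings.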

Advanced Usage

Custom Python Objects

PyYAML can serialize and deserialize custom Python objects through representers, which control how an object is dumped, and constructors, which control how a tagged node is loaded.

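Serializing Custom Objects

import yaml

class Hero:
    def __init__(self, name, age):
        self.name = name
        self.age = age

def hero_representer(dumper, obj):
    # Emit a mapping tagged !Hero so the constructor below can rebuild the object.
    return dumper.represent_mapping('!Hero', {'name': obj.name, 'age': obj.age})

yaml.add_representer(Hero, hero_representer)

hero = Hero("John Doe", 30)
print(yaml.dump(hero))

Deserializing Custom Objects

def hero_constructor(loader, node):
    fields = loader.construct_mapping(node)
    return Hero(**fields)

# safe_load() uses yaml.SafeLoader, so the constructor must be registered on that
# class. Without the Loader argument, add_constructor() skips SafeLoader and
# safe_load() raises a ConstructorError for the !Hero tag.
yaml.add_constructor('!Hero', hero_constructor, Loader=yaml.SafeLoader)

yaml_string = "!Hero {name: John Doe, age: 30}"
hero = yaml.safe_load(yaml_string)
print(hero.name, hero.age)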
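
Registering handlers on yaml.SafeLoader changes behavior for every caller of safe_load() in the process. One possible alternative, sketched here under the assumption that a dedicated loader and dumper are acceptable, is to subclass the safe classes and register the handlers there. HeroLoader and HeroDumper are hypothetical names introduced for this example.

```python
import yaml

class Hero:
    def __init__(self, name, age):
        self.name = name
        self.age = age

# Hypothetical subclasses, so the shared SafeLoader and SafeDumper stay untouched.
class HeroLoader(yaml.SafeLoader):
    pass

class HeroDumper(yaml.SafeDumper):
    pass

def represent_hero(dumper, obj):
    # Dump the object as a mapping carrying the !Hero tag.
    return dumper.represent_mapping("!Hero", {"name": obj.name, "age": obj.age})

def construct_hero(loader, node):
    # Rebuild the object from the tagged mapping's fields.
    return Hero(**loader.construct_mapping(node))

HeroDumper.add_representer(Hero, represent_hero)
HeroLoader.add_constructor("!Hero", construct_hero)

text = yaml.dump(Hero("John Doe", 30), Dumper=HeroDumper)
restored = yaml.load(text, Loader=HeroLoader)
print(text)
print(restored.name, restored.age)
```

Because both subclasses derive from the safe variants, only the !Hero tag is added on top of the standard YAML types.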

Use Cases

  • Configuration Files: YAML's readability makes it ideal for configuration files used in applications and services.

  • Data Serialization: YAML is useful for serializing complex data structures, such as trees or objects, in a format that can be edited by humans.

  • Infrastructure as Code (IaC): In DevOps, YAML is commonly used to define and manage infrastructure through code for cloud services, container orchestration (like Kubernetes), and automation tools.
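
For the configuration-file case, a small sketch might look like the following. The setting names (host, port, debug), the DEFAULTS dictionary, and the load_config helper are all assumptions made for illustration, not part of PyYAML.

```python
import yaml

# Hypothetical default settings for an imaginary application.
DEFAULTS = {"host": "localhost", "port": 8080, "debug": False}

def load_config(text):
    """Parse YAML config text and overlay it on the defaults."""
    # safe_load() returns None for an empty document, so fall back to {}.
    loaded = yaml.safe_load(text) or {}
    if not isinstance(loaded, dict):
        raise ValueError("configuration must be a YAML mapping at the top level")
    return {**DEFAULTS, **loaded}

config = load_config("port: 9000\ndebug: true\n")
print(config)
```

Checking that the top level is a mapping catches a common mistake where a configuration file accidentally contains a list or a bare string.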

Best Practices

  • Use safe_load(): Prefer safe_load() over load() when parsing YAML, especially from untrusted sources. Unsafe loaders can construct arbitrary Python objects described in the document, which can lead to code execution.

  • Keep It Simple: Although YAML supports complex structures, maintaining simplicity in your YAML documents ensures they remain readable and maintainable.

  • Indentation: YAML uses indentation to express structure, so inconsistent indentation is a common source of errors. Indent with spaces only; tab characters are not allowed for indentation.
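
Indentation mistakes usually surface as parse errors. The sketch below shows one way to catch them and report where they occurred; try_parse is a hypothetical helper, and the sample input is deliberately mis-indented.

```python
import yaml

# 'age' is indented one space less than 'name', so it matches no open block.
bad_yaml = "hero:\n  name: John Doe\n age: 30\n"

def try_parse(text):
    """Return (data, None) on success or (None, message) on a YAML error."""
    try:
        return yaml.safe_load(text), None
    except yaml.YAMLError as exc:
        # Most parse errors carry a problem_mark with a 0-based line and column.
        mark = getattr(exc, "problem_mark", None)
        where = f"line {mark.line + 1}, column {mark.column + 1}" if mark else "unknown position"
        return None, f"{type(exc).__name__} at {where}"

data, error = try_parse(bad_yaml)
print(error)

good, no_error = try_parse("hero:\n  name: John Doe\n  age: 30\n")
print(good)
```

Reporting the position from problem_mark makes it easier to find the offending line in a longer file.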

PyYAML provides a powerful interface for working with YAML in Python, combining the flexibility of YAML with the expressiveness of Python. Whether you're configuring software, defining infrastructure, or simply need a readable format for data serialization, PyYAML equips you with the tools necessary to work effectively with YAML data.