Ansible Collections Development and Usage

Ansible Collections are a packaging format for distributing Ansible content (roles, modules, plugins, and playbooks) as versioned, reusable units published to Ansible Galaxy or private repositories. Collections let organizations share automation code across teams, follow semantic versioning, and manage dependencies declaratively, replacing the older roles-only sharing model.

Prerequisites

  • Ansible 2.9+ (Collections support added in 2.9, mature in 2.10+)
  • Python 3.8+
  • Ansible Galaxy account (for publishing)
  • Git for version control

# Verify Ansible version
ansible --version

# Install collection development tools
pip install ansible-core ansible-lint flake8 pylint pytest molecule

# List installed collections
ansible-galaxy collection list

Collection Structure

namespace.collection_name/
├── docs/                   # Documentation files
├── galaxy.yml              # Collection metadata (required)
├── plugins/
│   ├── modules/            # Custom Ansible modules
│   │   └── my_module.py
│   ├── filter/             # Jinja2 filter plugins
│   │   └── my_filters.py
│   ├── inventory/          # Dynamic inventory plugins
│   ├── lookup/             # Lookup plugins
│   ├── callback/           # Callback plugins
│   └── module_utils/       # Shared utilities for modules
├── roles/                  # Ansible roles
│   └── my_role/
│       ├── defaults/
│       ├── tasks/
│       └── ...
├── playbooks/              # Playbooks (optional)
│   └── site.yml
├── tests/                  # Unit and integration tests
│   ├── unit/
│   └── integration/
├── changelogs/
│   └── CHANGELOG.rst
└── README.md
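
The layout above can be inspected mechanically before a build. The following is a minimal sketch, not part of ansible-galaxy: `summarize_layout` is a hypothetical helper, and it only reports what it finds (whether galaxy.yml exists, which plugin directories and roles are present) without judging whether the layout is valid.

```python
import tempfile
from pathlib import Path


def _subdirs(path):
    """Return the sorted names of the directories directly under path."""
    if not path.is_dir():
        return []
    return sorted(child.name for child in path.iterdir() if child.is_dir())


def summarize_layout(root):
    """Summarize a collection directory (hypothetical helper, for illustration)."""
    root = Path(root)
    return {
        "has_galaxy_yml": (root / "galaxy.yml").is_file(),
        "plugin_types": _subdirs(root / "plugins"),
        "roles": _subdirs(root / "roles"),
    }


if __name__ == "__main__":
    # Build a throwaway skeleton and print its summary
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "plugins" / "modules").mkdir(parents=True)
        (Path(tmp) / "galaxy.yml").write_text("namespace: mycompany\n")
        print(summarize_layout(tmp))
```

Running it against a real checkout (for example `summarize_layout("mycompany/infrastructure")`) gives a quick view of which content types the collection ships.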

Creating a Collection

# Initialize a new collection scaffold
ansible-galaxy collection init mycompany.infrastructure

cd mycompany/infrastructure

# Update galaxy.yml with your metadata
cat > galaxy.yml <<EOF
namespace: mycompany
name: infrastructure
version: 1.0.0
readme: README.md
description: MyCompany infrastructure automation collection
authors:
  - Your Name <[email protected]>
license:
  - Apache-2.0
tags:
  - infrastructure
  - linux
  - networking
  - cloud
dependencies:
  ansible.posix: ">=1.4.0"
  community.general: ">=6.0.0"
repository: https://github.com/mycompany/ansible-collection-infrastructure
documentation: https://docs.mycompany.com/ansible/infrastructure
homepage: https://github.com/mycompany/ansible-collection-infrastructure
issues: https://github.com/mycompany/ansible-collection-infrastructure/issues
build_ignore:
  - "*.tar.gz"
  - .git
  - tests/output
EOF
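
Before building, it can help to sanity-check the metadata written above. This is a rough sketch that works on an already-parsed mapping (for example the result of `yaml.safe_load`); `validate_metadata` is a hypothetical helper, the identifier rule is a simplified assumption about Galaxy naming, and the version pattern is a simplified semantic-version check rather than the full specification.

```python
import re

# Keys that galaxy.yml must define
REQUIRED_KEYS = ("namespace", "name", "version", "readme", "authors")
# Assumed naming rule: lowercase letters, digits, underscores; starts with a letter
IDENTIFIER_RE = re.compile(r"^[a-z][a-z0-9_]*$")
# Simplified semantic version: MAJOR.MINOR.PATCH plus optional pre-release/build
SEMVER_RE = re.compile(r"^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$")


def validate_metadata(meta):
    """Return a list of problems found in a parsed galaxy.yml mapping."""
    problems = [
        f"missing required key: {key}"
        for key in REQUIRED_KEYS
        if not meta.get(key)
    ]
    for key in ("namespace", "name"):
        value = meta.get(key)
        if value and not IDENTIFIER_RE.match(str(value)):
            problems.append(f"{key} {value!r} is not a valid identifier")
    version = meta.get("version")
    if version and not SEMVER_RE.match(str(version)):
        problems.append(f"version {version!r} is not MAJOR.MINOR.PATCH")
    return problems


if __name__ == "__main__":
    metadata = {
        "namespace": "mycompany",
        "name": "infrastructure",
        "version": "1.0.0",
        "readme": "README.md",
        "authors": ["Your Name"],
    }
    print(validate_metadata(metadata))
```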

Module Development

#!/usr/bin/python
# -*- coding: utf-8 -*-
# plugins/modules/manage_service_account.py

"""
Module to manage service accounts with specific requirements
"""

DOCUMENTATION = r'''
---
module: manage_service_account
short_description: Manage service accounts
description:
  - Creates, modifies, or deletes service accounts with standardized naming.
  - Handles SSH key management and group membership.
version_added: "1.0.0"
author:
  - Your Name (@yourhandle)
options:
  name:
    description:
      - The username for the service account. Must start with 'svc-'.
    required: true
    type: str
  state:
    description:
      - Whether the account should exist or not.
    choices: [ absent, present ]
    default: present
    type: str
  ssh_public_key:
    description:
      - SSH public key to authorize for this account.
    type: str
  groups:
    description:
      - Additional groups the account should belong to.
    type: list
    elements: str
    default: []
seealso:
  - module: ansible.builtin.user
requirements:
  - The executing user must have sudo privileges.
'''

EXAMPLES = r'''
- name: Create service account for application
  mycompany.infrastructure.manage_service_account:
    name: svc-webapp
    state: present
    ssh_public_key: "ssh-rsa AAAA..."
    groups:
      - docker
      - www-data

- name: Remove service account
  mycompany.infrastructure.manage_service_account:
    name: svc-webapp
    state: absent
'''

RETURN = r'''
account:
  description: Details of the service account.
  returned: when state=present
  type: dict
  sample:
    name: svc-webapp
    uid: 1050
    home: /home/svc-webapp
    created: true
'''

from ansible.module_utils.basic import AnsibleModule
import pwd
import os

def get_user_info(username):
    """Get user information if user exists"""
    try:
        return pwd.getpwnam(username)
    except KeyError:
        return None

def create_user(module, name, groups):
    """Create a system user"""
    cmd = ['useradd', '--system', '--shell', '/bin/bash',
           '--create-home', '--home-dir', f'/home/{name}',
           '--comment', f'Service account: {name}']
    
    if groups:
        cmd.extend(['--groups', ','.join(groups)])
    
    cmd.append(name)
    
    rc, stdout, stderr = module.run_command(cmd)
    if rc != 0:
        module.fail_json(msg=f"Failed to create user: {stderr}")
    return True

def manage_ssh_key(module, name, ssh_public_key):
    """Ensure authorized_keys holds the given key; return True if it changed"""
    home_dir = f'/home/{name}'
    ssh_dir = f'{home_dir}/.ssh'
    auth_keys = f'{ssh_dir}/authorized_keys'
    wanted = ssh_public_key.strip() + '\n'

    # Skip the write when the file already has the desired content
    if os.path.exists(auth_keys):
        with open(auth_keys) as f:
            if f.read() == wanted:
                return False
    if module.check_mode:
        return True

    os.makedirs(ssh_dir, mode=0o700, exist_ok=True)
    with open(auth_keys, 'w') as f:
        f.write(wanted)

    # Set ownership and permissions; check_rc fails the task on a non-zero exit
    module.run_command(['chown', '-R', f'{name}:{name}', ssh_dir], check_rc=True)
    module.run_command(['chmod', '600', auth_keys], check_rc=True)
    return True

def run_module():
    module_args = dict(
        name=dict(type='str', required=True),
        state=dict(type='str', default='present', choices=['absent', 'present']),
        ssh_public_key=dict(type='str', required=False, no_log=False),
        groups=dict(type='list', elements='str', default=[]),
    )

    result = dict(changed=False, account={})
    module = AnsibleModule(argument_spec=module_args, supports_check_mode=True)

    name = module.params['name']
    state = module.params['state']
    ssh_public_key = module.params.get('ssh_public_key')
    groups = module.params['groups']

    # Validate naming convention
    if not name.startswith('svc-'):
        module.fail_json(msg=f"Service account name must start with 'svc-', got: {name}")

    user_info = get_user_info(name)

    if state == 'present':
        created = user_info is None
        if created:
            if not module.check_mode:
                create_user(module, name, groups)
                user_info = get_user_info(name)
            result['changed'] = True

        # Manage the key for existing accounts too, so reruns converge
        if ssh_public_key and user_info is not None:
            if manage_ssh_key(module, name, ssh_public_key):
                result['changed'] = True

        result['account'] = {
            'name': name,
            'uid': user_info.pw_uid if user_info else None,
            'home': f'/home/{name}',
            'created': created
        }
    
    elif state == 'absent':
        if user_info is not None:
            if not module.check_mode:
                rc, stdout, stderr = module.run_command(['userdel', '-r', name])
                if rc != 0:
                    module.fail_json(msg=f"Failed to delete user: {stderr}")
            result['changed'] = True

    module.exit_json(**result)

def main():
    run_module()

if __name__ == '__main__':
    main()
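
The branching in run_module (validate the name, then create, delete, or do nothing depending on state and whether the account exists) can be isolated into a pure function, which makes it testable without an AnsibleModule instance. This is only a sketch of that idea; `plan_action` is a hypothetical helper that does not appear in the module above.

```python
def plan_action(name, state, exists):
    """Decide what to do with the account itself.

    Returns 'create', 'delete', or 'none'. Raises ValueError for a name that
    breaks the 'svc-' convention or for an unknown state.
    """
    if not name.startswith("svc-"):
        raise ValueError(f"Service account name must start with 'svc-', got: {name}")
    if state == "present":
        return "none" if exists else "create"
    if state == "absent":
        return "delete" if exists else "none"
    raise ValueError(f"Unknown state: {state}")


if __name__ == "__main__":
    # Walk through the four state/existence combinations
    for state in ("present", "absent"):
        for exists in (False, True):
            print(state, exists, plan_action("svc-webapp", state, exists))
```

A module written this way would report `changed=True` whenever the planned action is not 'none', and would skip executing it in check mode.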

Plugin Creation

# plugins/filter/my_filters.py
"""
Custom Jinja2 filter plugins for mycompany.infrastructure collection
"""

from __future__ import absolute_import, division, print_function
__metaclass__ = type

DOCUMENTATION = '''
name: mycompany.infrastructure
short_description: Filters for infrastructure automation
description:
  - Collection of Jinja2 filters for common infrastructure tasks.
'''

def to_service_name(value, prefix='svc-'):
    """Convert a name to a service account name format"""
    name = str(value).lower()
    name = name.replace(' ', '-').replace('_', '-')
    if not name.startswith(prefix):
        name = f"{prefix}{name}"
    return name

def parse_cidr(value):
    """Parse a CIDR notation and return a dict with network components"""
    import ipaddress
    from ansible.errors import AnsibleFilterError
    try:
        network = ipaddress.ip_network(value, strict=False)
    except ValueError as e:
        raise AnsibleFilterError(f"Invalid CIDR notation '{value}': {e}")

    # Index into the network instead of materializing every host address,
    # which would be far too slow for large IPv4 ranges or any IPv6 range
    if network.num_addresses > 2:
        first_host = network[1]
        # IPv4 reserves the last address for broadcast; IPv6 has no broadcast
        last_host = network[-2] if network.version == 4 else network[-1]
    else:
        first_host, last_host = network.network_address, network.broadcast_address

    return {
        'network': str(network.network_address),
        'netmask': str(network.netmask),
        'prefix': network.prefixlen,
        'broadcast': str(network.broadcast_address),
        'first_host': str(first_host),
        'last_host': str(last_host),
    }

def filter_by_tag(hosts, tag_key, tag_value=None):
    """Filter a list of hosts by tag value"""
    result = []
    for host in hosts:
        tags = host.get('tags', {})
        if tag_key in tags:
            if tag_value is None or tags[tag_key] == tag_value:
                result.append(host)
    return result

class FilterModule(object):
    """Ansible jinja2 filters"""

    def filters(self):
        return {
            'to_service_name': to_service_name,
            'parse_cidr': parse_cidr,
            'filter_by_tag': filter_by_tag,
        }
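
To see how the dictionary returned by filters() is used, the sketch below calls one filter directly in Python. It restates filter_by_tag in condensed form so the snippet runs on its own; the host data is made up for illustration.

```python
def filter_by_tag(hosts, tag_key, tag_value=None):
    """Condensed restatement of the plugin's filter_by_tag."""
    return [
        host for host in hosts
        if tag_key in host.get("tags", {})
        and (tag_value is None or host["tags"][tag_key] == tag_value)
    ]


class FilterModule(object):
    def filters(self):
        return {"filter_by_tag": filter_by_tag}


# Hypothetical inventory-like data
hosts = [
    {"name": "web1", "tags": {"env": "prod", "role": "web"}},
    {"name": "web2", "tags": {"env": "staging"}},
    {"name": "db1"},  # no 'tags' key at all
]

# Filters are looked up by name in the dict returned from filters()
by_tag = FilterModule().filters()["filter_by_tag"]
prod = by_tag(hosts, "env", "prod")
any_env = by_tag(hosts, "env")

print([h["name"] for h in prod])     # ['web1']
print([h["name"] for h in any_env])  # ['web1', 'web2']
```

In a template the piped value becomes the first argument, so the equivalent expression would be along the lines of `{{ hosts | mycompany.infrastructure.filter_by_tag('env', 'prod') }}`.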

Testing and Validation

# Unit tests for modules
mkdir -p tests/unit/plugins/modules

cat > tests/unit/plugins/modules/test_manage_service_account.py <<'EOF'
import pytest
from unittest.mock import MagicMock, patch
import sys
import os

# Add the collection to path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../../../../'))

from plugins.modules.manage_service_account import get_user_info

class TestManageServiceAccount:
    def test_get_user_info_missing_user(self):
        """A lookup for an unknown account returns None"""
        with patch('pwd.getpwnam', side_effect=KeyError('svc-missing')):
            assert get_user_info('svc-missing') is None

class TestFilters:
    def test_to_service_name(self):
        from plugins.filter.my_filters import to_service_name
        assert to_service_name("webapp") == "svc-webapp"
        assert to_service_name("svc-webapp") == "svc-webapp"
        assert to_service_name("My App") == "svc-my-app"

    def test_parse_cidr(self):
        from plugins.filter.my_filters import parse_cidr
        result = parse_cidr("192.168.1.0/24")
        assert result['network'] == "192.168.1.0"
        assert result['prefix'] == 24
        assert result['netmask'] == "255.255.255.0"
EOF

# Run unit tests
pytest tests/unit/ -v

# Lint checks
ansible-lint
flake8 plugins/

# Validate collection metadata
ansible-galaxy collection build --output-path /tmp/test-build/
ansible-galaxy collection install /tmp/test-build/*.tar.gz --force

# Integration tests using Molecule
molecule test

# Validate module documentation
ansible-doc -t module mycompany.infrastructure.manage_service_account
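
Module helpers that shell out through module.run_command can be unit tested by passing a mock in place of the AnsibleModule. The sketch below restates create_user from the module section so it runs standalone, then checks the command it builds; note that with a mock, fail_json returns instead of exiting, which is an artifact of the test double rather than real module behavior.

```python
from unittest.mock import MagicMock


def create_user(module, name, groups):
    """Restated from the module above so the snippet is self-contained."""
    cmd = ['useradd', '--system', '--shell', '/bin/bash',
           '--create-home', '--home-dir', f'/home/{name}',
           '--comment', f'Service account: {name}']
    if groups:
        cmd.extend(['--groups', ','.join(groups)])
    cmd.append(name)
    rc, stdout, stderr = module.run_command(cmd)
    if rc != 0:
        module.fail_json(msg=f"Failed to create user: {stderr}")
    return True


def test_create_user_builds_expected_command():
    module = MagicMock()
    module.run_command.return_value = (0, '', '')
    assert create_user(module, 'svc-webapp', ['docker', 'www-data']) is True
    cmd = module.run_command.call_args[0][0]  # first positional argument
    assert cmd[0] == 'useradd'
    assert cmd[-1] == 'svc-webapp'
    assert cmd[cmd.index('--groups') + 1] == 'docker,www-data'
    module.fail_json.assert_not_called()


def test_create_user_reports_failure():
    module = MagicMock()
    module.run_command.return_value = (1, '', 'useradd: user exists')
    create_user(module, 'svc-webapp', [])
    module.fail_json.assert_called_once_with(
        msg='Failed to create user: useradd: user exists')
```

pytest would collect both functions automatically if the snippet were saved under tests/unit/.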

Galaxy Publishing

# Get your Ansible Galaxy API token
# Go to https://galaxy.ansible.com/ui/token/

# Build the collection
ansible-galaxy collection build

# This creates: mycompany-infrastructure-1.0.0.tar.gz

# Publish to Ansible Galaxy
ansible-galaxy collection publish \
  mycompany-infrastructure-1.0.0.tar.gz \
  --api-key=YOUR_GALAXY_API_TOKEN

# Publish to a private Automation Hub
ansible-galaxy collection publish \
  mycompany-infrastructure-1.0.0.tar.gz \
  --server=https://hub.mycompany.com/api/galaxy/ \
  --api-key=YOUR_TOKEN

# Configure private server in ansible.cfg (this overwrites any existing ~/.ansible.cfg)
cat > ~/.ansible.cfg <<EOF
[galaxy]
server_list = automation_hub, release_galaxy

[galaxy_server.automation_hub]
url=https://hub.mycompany.com/api/galaxy/
# auth_url is only needed when the server uses an SSO (Keycloak) token endpoint
token=your_token_here

[galaxy_server.release_galaxy]
url=https://galaxy.ansible.com
token=your_galaxy_token
EOF

# Tag and release (GitHub Actions example)
git tag 1.0.0
git push origin 1.0.0
# CI will build and publish automatically
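
A CI job that publishes on tag push usually wants to confirm that the tag agrees with the version in galaxy.yml before uploading. The sketch below shows one way to express that check; `expected_tarball` and `release_problems` are hypothetical helpers, and accepting a leading "v" on the tag is an assumption rather than something the workflow above requires.

```python
def expected_tarball(namespace, name, version):
    """File name produced by the build step: namespace-name-version.tar.gz."""
    return f"{namespace}-{name}-{version}.tar.gz"


def release_problems(tag, metadata):
    """Return a list of reasons the tag should not be published."""
    problems = []
    version = str(metadata.get("version", ""))
    # Assumption: tags may be written as '1.0.0' or 'v1.0.0'
    if tag not in (version, f"v{version}"):
        problems.append(f"tag {tag!r} does not match galaxy.yml version {version!r}")
    return problems


if __name__ == "__main__":
    meta = {"namespace": "mycompany", "name": "infrastructure", "version": "1.0.0"}
    print(expected_tarball(meta["namespace"], meta["name"], meta["version"]))
    print(release_problems("1.0.1", meta))
```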

Dependency Management

# requirements.yml - install collection dependencies
---
collections:
  - name: mycompany.infrastructure
    version: ">=1.0.0,<2.0.0"
    source: https://galaxy.ansible.com

  - name: ansible.posix
    version: ">=1.4.0"

  - name: community.general
    version: ">=6.0.0"

  # From a Git repository
  - name: https://github.com/mycompany/private-collection.git
    type: git
    version: main

  # From a tarball URL
  - name: https://internal.example.com/files/mycompany-network-2.0.0.tar.gz
    type: url

# Install all dependencies
ansible-galaxy collection install -r requirements.yml

# Install to a specific path
ansible-galaxy collection install -r requirements.yml \
  -p ./collections/

# For offline/air-gapped environments, download first
ansible-galaxy collection download -r requirements.yml \
  --download-path /tmp/collections/
# Then copy to target and install:
ansible-galaxy collection install /tmp/collections/*.tar.gz \
  --offline
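
Version ranges such as ">=1.0.0,<2.0.0" are comma-separated clauses that must all hold. The sketch below illustrates that evaluation for plain MAJOR.MINOR.PATCH versions; `satisfies` is a hypothetical helper that ignores pre-release identifiers, so it is a simplification and not the resolver ansible-galaxy itself uses.

```python
import operator

# Comparison operators, with two-character symbols listed before their prefixes
OPERATORS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    ("!=", operator.ne),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(text):
    """Turn '1.4.0' into (1, 4, 0) so tuples compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, spec):
    """Return True if version meets every comma-separated clause in spec."""
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        if clause in ("", "*"):
            continue  # '*' matches any version
        for symbol, compare in OPERATORS:
            if clause.startswith(symbol):
                if not compare(current, parse_version(clause[len(symbol):])):
                    return False
                break
        else:
            # A bare version is treated as an exact match
            if current != parse_version(clause):
                return False
    return True


if __name__ == "__main__":
    print(satisfies("1.4.2", ">=1.0.0,<2.0.0"))
    print(satisfies("2.0.0", ">=1.0.0,<2.0.0"))
```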

Using Collections in Playbooks

# playbooks/deploy.yml
---
- name: Deploy application
  hosts: webservers
  collections:
    # Optional: lets tasks in this play use short module names; FQCNs are still recommended
    - mycompany.infrastructure
    - ansible.posix

  tasks:
    # Use FQCN (Fully Qualified Collection Name) - always works
    - name: Create service account
      mycompany.infrastructure.manage_service_account:
        name: svc-webapp
        state: present
        ssh_public_key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"
        groups:
          - docker

    # Use role from collection with FQCN
    - name: Include web server role
      ansible.builtin.include_role:
        name: mycompany.infrastructure.nginx

  vars:
    # Use custom filter from collection
    service_name: "{{ 'webapp' | mycompany.infrastructure.to_service_name }}"

# ansible.cfg configuration for collections
[defaults]
collections_path = ~/.ansible/collections:/usr/share/ansible/collections:./collections
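
Each entry in collections_path is searched for an ansible_collections/<namespace>/<name> directory. The sketch below lists the directories that would be candidates for a given collection, which can help when a collection installs but is not found; `candidate_dirs` is a hypothetical helper, not an Ansible API.

```python
import os


def candidate_dirs(collections_path, collection):
    """List where 'namespace.name' would live under each search path entry."""
    parts = collection.split(".")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'namespace.name', got: {collection}")
    namespace, name = parts
    return [
        os.path.join(os.path.expanduser(entry), "ansible_collections", namespace, name)
        for entry in collections_path.split(":")
        if entry
    ]


if __name__ == "__main__":
    search_path = "~/.ansible/collections:/usr/share/ansible/collections:./collections"
    for path in candidate_dirs(search_path, "mycompany.infrastructure"):
        # Show which candidate directories actually exist on this machine
        print(path, os.path.isdir(path))
```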

Troubleshooting

Collection not found after install:

# Check where collections are installed
ansible-galaxy collection list

# Verify COLLECTIONS_PATHS
ansible-config dump | grep COLLECTIONS

# Force reinstall
ansible-galaxy collection install mycompany.infrastructure --force

Module documentation not generating:

# Check DOCUMENTATION string syntax
python3 -c "import yaml; yaml.safe_load(open('plugins/modules/my_module.py').read().split('r\'\'\'')[1].split('\'\'\'')[0])"

# Validate with ansible-doc
ansible-doc -t module mycompany.infrastructure.my_module
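
The split-based one-liner above only works when the block is written exactly as r'''...'''. As a sturdier alternative, the sketch below pulls the documentation strings out with the standard ast module, without importing the module file; `extract_doc_blocks` is a hypothetical helper, and the returned strings could then be handed to yaml.safe_load as in the one-liner.

```python
import ast


def extract_doc_blocks(source):
    """Return top-level DOCUMENTATION/EXAMPLES/RETURN strings from module source."""
    wanted = {"DOCUMENTATION", "EXAMPLES", "RETURN"}
    found = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        # Only plain string literals are of interest
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str)):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id in wanted:
                found[target.id] = value.value
    return found


if __name__ == "__main__":
    # Tiny stand-in for a module file; a real run would read plugins/modules/*.py
    sample = (
        "DOCUMENTATION = r'''\n"
        "---\n"
        "module: manage_service_account\n"
        "'''\n"
        "EXAMPLES = r'''\n"
        "- name: demo\n"
        "'''\n"
    )
    blocks = extract_doc_blocks(sample)
    missing = {"DOCUMENTATION", "EXAMPLES", "RETURN"} - set(blocks)
    print("missing blocks:", sorted(missing))
```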

Build fails:

# Check galaxy.yml syntax
python3 -c "import yaml; yaml.safe_load(open('galaxy.yml'))"

# Ensure build_ignore patterns are correct
ansible-galaxy collection build --output-path /tmp/ --force

# List what's in the tarball
tar -tzf /tmp/mycompany-infrastructure-*.tar.gz | head -20

Conclusion

Ansible Collections provide a mature packaging model for sharing and versioning automation code across teams and organizations. By developing custom modules for infrastructure-specific operations, creating reusable filter plugins, and publishing to Galaxy or a private Automation Hub, you build an organization-wide automation library that teams can consume with simple requirements.yml dependencies. Pair collection development with Molecule testing and semantic versioning for a professional-grade Ansible automation workflow.