Pipeline Configuration

Execute complete LiDAR processing workflows using declarative YAML configuration files. The pipeline command automates the whole process, from download to patch creation.

Overview

The pipeline command runs a multi-stage workflow from a single configuration file. This approach is:

Declarative: Define your workflow in a YAML file
Reproducible: Version control your processing parameters
Flexible: Configure only the stages you need
Shareable: Easy collaboration with configuration files

Pipeline Architecture

Workflow Stages

Quick Start

1. Create Example Configuration

# Create a full pipeline configuration
ign-lidar-hd pipeline my_config.yaml --create-example full

# Or create stage-specific configs
ign-lidar-hd pipeline enrich_config.yaml --create-example enrich
ign-lidar-hd pipeline patch_config.yaml --create-example patch

2. Edit Configuration

# my_config.yaml
global:
  num_workers: 4

enrich:
  input_dir: "data/raw"
  output: "data/enriched"
  mode: "full"
  add_rgb: true
  rgb_cache_dir: "cache/orthophotos"

patch:
  input_dir: "data/enriched"
  output: "data/patches"
  lod_level: "LOD2"
  num_points: 16384

3. Run Pipeline

ign-lidar-hd pipeline my_config.yaml

Configuration Structure

Global Settings

Settings that apply to all stages:

global:
  num_workers: 4 # Number of parallel workers
  output_dir: "data/" # Base output directory (optional)
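
How global values interact with stage-level ones (for example `num_workers`, which appears in both places) is not spelled out here. The sketch below assumes that a value set on a stage takes precedence over the same key under `global`; `resolve_stage_settings` is a hypothetical helper for illustration, not part of `ign_lidar`.

```python
def resolve_stage_settings(config: dict, stage: str) -> dict:
    """Merge global settings into one stage's settings.

    Assumption (not confirmed by the tool's docs): a key set on the
    stage overrides the same key under `global`.
    """
    merged = dict(config.get("global", {}))
    merged.update(config.get(stage, {}))
    return merged


config = {
    "global": {"num_workers": 4},
    "download": {"output": "data/raw", "num_workers": 3},
    "enrich": {"input_dir": "data/raw", "output": "data/enriched"},
}

# download sets its own num_workers; enrich falls back to the global value
print(resolve_stage_settings(config, "download")["num_workers"])  # 3
print(resolve_stage_settings(config, "enrich")["num_workers"])    # 4
```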

Download Stage

Download LiDAR tiles from IGN:

download:
  bbox: "2.3, 48.8, 2.4, 48.9" # WGS84: lon_min,lat_min,lon_max,lat_max
  output: "data/raw" # Output directory
  max_tiles: 10 # Optional: limit tiles
  num_workers: 3 # Optional: parallel downloads
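
The `bbox` string packs four WGS84 values in a fixed order, so a quick sanity check before a long download can be useful. The following is a minimal sketch; `parse_bbox` is a hypothetical helper, and the tool's own parsing may be stricter or more lenient.

```python
def parse_bbox(bbox: str) -> tuple[float, float, float, float]:
    """Parse 'lon_min,lat_min,lon_max,lat_max' (WGS84 degrees) and check its ordering."""
    parts = [float(p) for p in bbox.split(",")]  # float() ignores surrounding spaces
    if len(parts) != 4:
        raise ValueError(f"expected 4 values, got {len(parts)}")
    lon_min, lat_min, lon_max, lat_max = parts
    if not (-180.0 <= lon_min < lon_max <= 180.0):
        raise ValueError("need -180 <= lon_min < lon_max <= 180")
    if not (-90.0 <= lat_min < lat_max <= 90.0):
        raise ValueError("need -90 <= lat_min < lat_max <= 90")
    return lon_min, lat_min, lon_max, lat_max


print(parse_bbox("2.3, 48.8, 2.4, 48.9"))  # (2.3, 48.8, 2.4, 48.9)
```

This catches a wrong number of values and inverted min/max pairs. It cannot detect swapped longitude and latitude when both happen to fall within valid ranges.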

Enrich Stage

Enrich LAZ files with geometric features and RGB:

enrich:
  input_dir: "data/raw" # Input LAZ files
  output: "data/enriched" # Output directory
  mode: "full" # 'core' or 'full'
  k_neighbors: 10 # Neighbors for features
  use_gpu: true # GPU acceleration
  add_rgb: true # Add RGB from orthophotos
  rgb_cache_dir: "cache/ortho" # RGB cache directory
  num_workers: 4 # Parallel processing
  auto_convert_qgis: false # QGIS format conversion
  force: false # Force reprocessing

Patch Stage

Create training patches:

patch:
  input_dir: "data/enriched" # Input LAZ files
  output: "data/patches" # Output directory
  lod_level: "LOD2" # 'LOD2' or 'LOD3'
  patch_size: 150.0 # Patch size in meters
  patch_overlap: 0.1 # Overlap ratio (0.0-1.0)
  num_points: 16384 # Points per patch
  augment: true # Data augmentation
  num_augmentations: 3 # Augmented versions
  num_workers: 4 # Parallel processing
  include_architectural_style: false # Style features
  style_encoding: "constant" # 'constant' or 'multihot'
  force: false # Force reprocessing
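
To see how `patch_size` and `patch_overlap` interact, the arithmetic below assumes that the overlap ratio is the fraction of a patch shared with its neighbor, so the patch origin advances by `patch_size * (1 - patch_overlap)`. The actual tiling logic may differ; the 1000 m extent and both function names are illustrative only.

```python
import math


def patch_stride(patch_size: float, patch_overlap: float) -> float:
    """Distance in meters between neighboring patch origins (assumed definition)."""
    if not 0.0 <= patch_overlap < 1.0:
        raise ValueError("patch_overlap must be in [0.0, 1.0) for a positive stride")
    return patch_size * (1.0 - patch_overlap)


def patches_per_axis(extent: float, patch_size: float, patch_overlap: float) -> int:
    """Number of patch origins needed to cover `extent` meters along one axis."""
    if extent <= patch_size:
        return 1
    stride = patch_stride(patch_size, patch_overlap)
    return math.ceil((extent - patch_size) / stride) + 1


stride = patch_stride(150.0, 0.1)
n = patches_per_axis(1000.0, 150.0, 0.1)
print(f"stride: {stride:.1f} m")        # stride: 135.0 m
print(f"patches per axis: {n}")         # patches per axis: 8
print(f"patches per square: {n * n}")   # patches per square: 64
```

Under this assumption, raising `patch_overlap` shrinks the stride and increases the patch count roughly quadratically over a 2D area.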

Example Workflows

Full Pipeline

Download, enrich, and create patches in one workflow:

# pipeline_full.yaml
global:
  num_workers: 4

download:
  bbox: "2.3, 48.8, 2.4, 48.9"
  output: "data/raw"
  max_tiles: 10

enrich:
  input_dir: "data/raw"
  output: "data/enriched"
  mode: "full"
  use_gpu: true
  add_rgb: true
  rgb_cache_dir: "cache/orthophotos"

patch:
  input_dir: "data/enriched"
  output: "data/patches"
  lod_level: "LOD2"
  patch_size: 150.0
  num_points: 16384
  augment: true

Run with:

ign-lidar-hd pipeline pipeline_full.yaml
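
In the full pipeline each stage reads what the previous one wrote: `download.output` feeds `enrich.input_dir`, and `enrich.output` feeds `patch.input_dir`. A small pre-flight check along these lines can catch a mismatched path before a long run. `check_stage_chain` is a hypothetical helper, not part of the tool, and it ignores any effect the optional global `output_dir` may have on path resolution.

```python
from pathlib import PurePosixPath

# Stage order as described in this guide
STAGE_ORDER = ["download", "enrich", "patch"]


def check_stage_chain(config: dict) -> list[str]:
    """Report adjacent stages whose output and input_dir do not match."""
    problems = []
    for prev, nxt in zip(STAGE_ORDER, STAGE_ORDER[1:]):
        if prev not in config or nxt not in config:
            continue  # only compare stages that are both configured
        out = config[prev].get("output")
        inp = config[nxt].get("input_dir")
        if out is None or inp is None:
            continue  # nothing to compare
        # PurePosixPath normalizes trailing slashes: "data/raw/" == "data/raw"
        if PurePosixPath(out) != PurePosixPath(inp):
            problems.append(f"{prev}.output ({out}) != {nxt}.input_dir ({inp})")
    return problems


full = {
    "download": {"output": "data/raw"},
    "enrich": {"input_dir": "data/raw", "output": "data/enriched"},
    "patch": {"input_dir": "data/enriched", "output": "data/patches"},
}
print(check_stage_chain(full))  # []
```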

Enrich Only

Process existing tiles with geometric features and RGB:

# pipeline_enrich.yaml
global:
  num_workers: 4

enrich:
  input_dir: "data/raw"
  output: "data/enriched"
  mode: "full"
  k_neighbors: 10
  use_gpu: true
  add_rgb: true
  rgb_cache_dir: "cache/orthophotos"

Patch Only

Create patches from already enriched tiles:

# pipeline_patch.yaml
global:
  num_workers: 4

patch:
  input_dir: "data/enriched"
  output: "data/patches"
  lod_level: "LOD2"
  patch_size: 150.0
  num_points: 16384

Use Cases

Production Workflow

High-quality processing with all features:

global:
  num_workers: 8

enrich:
  mode: "full" # All features
  use_gpu: true
  add_rgb: true

patch:
  num_points: 16384 # Full patches
  augment: true
  num_augmentations: 5

Development/Testing

Fast iteration with minimal processing:

global:
  num_workers: 2

enrich:
  mode: "core" # Basic features
  use_gpu: false
  add_rgb: false

patch:
  num_points: 4096 # Smaller patches
  augment: false

Regional Processing

Process different regions with specific settings:

# paris_urban.yaml
enrich:
  input_dir: "paris_tiles/"
  mode: "full"
  add_rgb: true

patch:
  input_dir: "paris_enriched/"
  lod_level: "LOD3"
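
When several regions share most settings, the per-region files can be derived from one template. The sketch below builds the configuration as a plain dictionary; the `<region>_tiles/` and `<region>_enriched/` names follow the example above, while the `<region>_patches/` output name, the `lyon` region, and `region_config` itself are assumptions made for illustration.

```python
import copy

# Settings shared by every region
BASE = {
    "enrich": {"mode": "full", "add_rgb": True},
    "patch": {"lod_level": "LOD2"},
}


def region_config(base: dict, region: str, lod_level: str) -> dict:
    """Return a per-region copy of `base` with region-specific paths filled in."""
    cfg = copy.deepcopy(base)  # leave the shared template untouched
    cfg["enrich"]["input_dir"] = f"{region}_tiles/"
    cfg["enrich"]["output"] = f"{region}_enriched/"
    cfg["patch"]["input_dir"] = f"{region}_enriched/"
    cfg["patch"]["output"] = f"{region}_patches/"  # assumed naming
    cfg["patch"]["lod_level"] = lod_level
    return cfg


configs = {
    "paris_urban": region_config(BASE, "paris", "LOD3"),
    "lyon_urban": region_config(BASE, "lyon", "LOD2"),  # hypothetical second region
}
for name, cfg in configs.items():
    print(name, cfg["patch"])
```

Each dictionary can then be written to its own YAML file with `yaml.dump`, as in the "Create Configuration Programmatically" section.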

Python API

Use configurations programmatically:

from pathlib import Path
from ign_lidar.pipeline_config import PipelineConfig
from ign_lidar.cli import cmd_pipeline

# Load configuration
config = PipelineConfig(Path("my_config.yaml"))

# Check configured stages
print(f"Has download: {config.has_download}")
print(f"Has enrich: {config.has_enrich}")
print(f"Has patch: {config.has_patch}")

# Get stage configuration
if config.has_enrich:
    enrich_cfg = config.get_enrich_config()
    print(f"Mode: {enrich_cfg['mode']}")
    print(f"RGB: {enrich_cfg.get('add_rgb', False)}")

Create Configuration Programmatically

import yaml
from pathlib import Path

config = {
    'global': {'num_workers': 4},
    'enrich': {
        'input_dir': 'data/raw',
        'output': 'data/enriched',
        'mode': 'full',
        'add_rgb': True,
    },
    'patch': {
        'input_dir': 'data/enriched',
        'output': 'data/patches',
        'lod_level': 'LOD2',
    },
}

# sort_keys=False (PyYAML >= 5.1) keeps the stages in the order written above
with open('config.yaml', 'w') as f:
    yaml.dump(config, f, default_flow_style=False, sort_keys=False)
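
A generated configuration can be checked against the option values documented in this guide before it is written out. This is a minimal sketch: the allowed values come from the comments in the stage references above, `validate_config` is a hypothetical helper, and the tool itself may accept values not listed here.

```python
# Option values documented in the Enrich Stage and Patch Stage sections
ALLOWED = {
    ("enrich", "mode"): {"core", "full"},
    ("patch", "lod_level"): {"LOD2", "LOD3"},
    ("patch", "style_encoding"): {"constant", "multihot"},
}


def validate_config(config: dict) -> list[str]:
    """Return human-readable errors; an empty list means no problem was found."""
    errors = []
    for (stage, key), allowed in ALLOWED.items():
        value = config.get(stage, {}).get(key)
        if value is not None and value not in allowed:
            errors.append(f"{stage}.{key}: {value!r} not in {sorted(allowed)}")
    overlap = config.get("patch", {}).get("patch_overlap")
    if overlap is not None and not 0.0 <= overlap <= 1.0:
        errors.append(f"patch.patch_overlap: {overlap} outside 0.0-1.0")
    return errors


candidate = {
    "enrich": {"input_dir": "data/raw", "output": "data/enriched", "mode": "building"},
    "patch": {"input_dir": "data/enriched", "output": "data/patches", "lod_level": "LOD2"},
}
for error in validate_config(candidate):
    print(error)  # reports the unknown enrich.mode value
```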

Benefits

Reproducibility

Version control your configurations
Exact same parameters every time
Easy to track changes

Simplicity

Before (multiple commands):

ign-lidar-hd download --bbox "..." --output data/raw
ign-lidar-hd enrich --input-dir data/raw --output data/enriched ...
ign-lidar-hd patch --input-dir data/enriched --output data/patches ...

After (single command):

ign-lidar-hd pipeline my_workflow.yaml

Collaboration

Share configuration files with team
Document processing workflows
Create reusable templates

Best Practices

1. Use Descriptive Names

✅ Good:
├── paris_urban_LOD2.yaml
├── rural_buildings_LOD3.yaml
└── test_small_dataset.yaml

❌ Avoid:
├── config1.yaml
├── test.yaml
└── new.yaml

2. Add Comments

enrich:
  # Use full mode for urban areas with complex geometry
  mode: "full"

  # RGB improves classification accuracy by 5-10%
  add_rgb: true

  # GPU cuts processing time by 60%
  use_gpu: true

3. Use Relative Paths

# ✅ Good - portable
enrich:
  input_dir: "data/raw"
  output: "data/enriched"

# ❌ Avoid - hard to share
enrich:
  input_dir: "/home/user/project/data/raw"

4. Version Your Configs

# Version: 1.2
# Date: 2025-10-03
# Author: Data Team
# Purpose: Production pipeline for urban classification

global:
  num_workers: 8

Troubleshooting

Configuration Not Found

Error: Configuration file not found: my_config.yaml

Solution: Confirm the file exists, then pass its absolute path (quoted, in case the path contains spaces)

ign-lidar-hd pipeline "$(pwd)/my_config.yaml"
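
When the pipeline is launched from a script, the same check can be done in Python before the CLI is called. The sketch below uses only the standard library; `locate_config` is a hypothetical helper and not part of `ign_lidar`.

```python
from pathlib import Path


def locate_config(name: str) -> Path:
    """Resolve `name` against the current directory and confirm the file exists."""
    path = Path(name).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"{path} does not exist (cwd: {Path.cwd()})")
    return path


if __name__ == "__main__":
    try:
        print(locate_config("my_config.yaml"))
    except FileNotFoundError as exc:
        # The message shows the resolved path and the working directory
        print(f"Configuration file not found: {exc}")
```

Printing the working directory alongside the resolved path usually makes it obvious when the command was run from the wrong folder.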

Invalid YAML Syntax

Error: YAML parse error

Solution: Validate YAML

python -c "import yaml; yaml.safe_load(open('config.yaml'))"

Stage Failed

Error: Enrich stage failed

Solution: Run stage separately to debug

ign-lidar-hd enrich --input-dir data/raw --output data/test
