Aller au contenu principal

Pipeline Configuration

Execute complete LiDAR processing workflows using declarative YAML configuration files. The pipeline command allows you to automate the entire process from download to patch creation.

Overview

The pipeline command provides a powerful way to manage complex workflows:

  • Declarative: Define your workflow in a YAML file
  • Reproducible: Version control your processing parameters
  • Flexible: Configure only the stages you need
  • Shareable: Easy collaboration with configuration files

Pipeline Architecture

Workflow Stages

Quick Start

1. Create Example Configuration

# Create a full pipeline configuration
ign-lidar-hd pipeline my_config.yaml --create-example full

# Or create stage-specific configs
ign-lidar-hd pipeline enrich_config.yaml --create-example enrich
ign-lidar-hd pipeline patch_config.yaml --create-example patch

2. Edit Configuration

# my_config.yaml
global:
num_workers: 4

enrich:
input_dir: "data/raw"
output: "data/enriched"
mode: "full"
add_rgb: true
rgb_cache_dir: "cache/orthophotos"

patch:
input_dir: "data/enriched"
output: "data/patches"
lod_level: "LOD2"
num_points: 16384

3. Run Pipeline

ign-lidar-hd pipeline my_config.yaml

Configuration Structure

Global Settings

Settings that apply to all stages:

global:
num_workers: 4 # Number of parallel workers
output_dir: "data/" # Base output directory (optional)

Download Stage

Download LiDAR tiles from IGN:

download:
bbox: "2.3, 48.8, 2.4, 48.9" # WGS84: lon_min,lat_min,lon_max,lat_max
output: "data/raw" # Output directory
max_tiles: 10 # Optional: limit tiles
num_workers: 3 # Optional: parallel downloads

Enrich Stage

Enrich LAZ files with geometric features and RGB:

enrich:
input_dir: "data/raw" # Input LAZ files
output: "data/enriched" # Output directory
mode: "full" # 'core' or 'full'
k_neighbors: 10 # Neighbors for features
use_gpu: true # GPU acceleration
add_rgb: true # Add RGB from orthophotos
rgb_cache_dir: "cache/ortho" # RGB cache directory
num_workers: 4 # Parallel processing
auto_convert_qgis: false # QGIS format conversion
force: false # Force reprocessing

Patch Stage

Create training patches:

patch:
input_dir: "data/enriched" # Input LAZ files
output: "data/patches" # Output directory
lod_level: "LOD2" # 'LOD2' or 'LOD3'
patch_size: 150.0 # Patch size in meters
patch_overlap: 0.1 # Overlap ratio (0.0-1.0)
num_points: 16384 # Points per patch
augment: true # Data augmentation
num_augmentations: 3 # Augmented versions
num_workers: 4 # Parallel processing
include_architectural_style: false # Style features
style_encoding: "constant" # 'constant' or 'multihot'
force: false # Force reprocessing

Example Workflows

Full Pipeline

Download, enrich, and create patches in one workflow:

# pipeline_full.yaml
global:
num_workers: 4

download:
bbox: "2.3, 48.8, 2.4, 48.9"
output: "data/raw"
max_tiles: 10

enrich:
input_dir: "data/raw"
output: "data/enriched"
mode: "full"
use_gpu: true
add_rgb: true
rgb_cache_dir: "cache/orthophotos"

patch:
input_dir: "data/enriched"
output: "data/patches"
lod_level: "LOD2"
patch_size: 150.0
num_points: 16384
augment: true

Run with:

ign-lidar-hd pipeline pipeline_full.yaml

Enrich Only

Process existing tiles with geometric features and RGB:

# pipeline_enrich.yaml
global:
num_workers: 4

enrich:
input_dir: "data/raw"
output: "data/enriched"
mode: "full"
k_neighbors: 10
use_gpu: true
add_rgb: true
rgb_cache_dir: "cache/orthophotos"

Patch Only

Create patches from already enriched tiles:

# pipeline_patch.yaml
global:
num_workers: 4

patch:
input_dir: "data/enriched"
output: "data/patches"
lod_level: "LOD2"
patch_size: 150.0
num_points: 16384

Use Cases

Production Workflow

High-quality processing with all features:

global:
num_workers: 8

enrich:
mode: "full" # All features
use_gpu: true
add_rgb: true

patch:
num_points: 16384 # Full patches
augment: true
num_augmentations: 5

Development/Testing

Fast iteration with minimal processing:

global:
num_workers: 2

enrich:
mode: "core" # Basic features
use_gpu: false
add_rgb: false

patch:
num_points: 4096 # Smaller patches
augment: false

Regional Processing

Process different regions with specific settings:

# paris_urban.yaml
enrich:
input_dir: "paris_tiles/"
mode: "full"
add_rgb: true

patch:
input_dir: "paris_enriched/"
lod_level: "LOD3"

Python API

Use configurations programmatically:

from pathlib import Path
from ign_lidar.pipeline_config import PipelineConfig
from ign_lidar.cli import cmd_pipeline

# Load configuration
config = PipelineConfig(Path("my_config.yaml"))

# Check configured stages
print(f"Has download: {config.has_download}")
print(f"Has enrich: {config.has_enrich}")
print(f"Has patch: {config.has_patch}")

# Get stage configuration
if config.has_enrich:
enrich_cfg = config.get_enrich_config()
print(f"Mode: {enrich_cfg['mode']}")
print(f"RGB: {enrich_cfg.get('add_rgb', False)}")

Create Configuration Programmatically

import yaml
from pathlib import Path

config = {
'global': {'num_workers': 4},
'enrich': {
'input_dir': 'data/raw',
'output': 'data/enriched',
'mode': 'full',
'add_rgb': True,
},
'patch': {
'input_dir': 'data/enriched',
'output': 'data/patches',
'lod_level': 'LOD2',
},
}

with open('config.yaml', 'w') as f:
yaml.dump(config, f, default_flow_style=False)

Benefits

Reproducibility

  • Version control your configurations
  • Exact same parameters every time
  • Easy to track changes

Simplicity

Before (multiple commands):

ign-lidar-hd download --bbox "..." --output data/raw
ign-lidar-hd enrich --input-dir data/raw --output data/enriched ...
ign-lidar-hd patch --input-dir data/enriched --output data/patches ...

After (single command):

ign-lidar-hd pipeline my_workflow.yaml

Collaboration

  • Share configuration files with team
  • Document processing workflows
  • Create reusable templates

Best Practices

1. Use Descriptive Names

✅ Good:
├── paris_urban_LOD2.yaml
├── rural_buildings_LOD3.yaml
└── test_small_dataset.yaml

❌ Avoid:
├── config1.yaml
├── test.yaml
└── new.yaml

2. Add Comments

enrich:
# Use building mode for urban areas with complex geometry
mode: "full"

# RGB improves classification accuracy by 5-10%
add_rgb: true

# GPU cuts processing time by 60%
use_gpu: true

3. Use Relative Paths

# ✅ Good - portable
enrich:
input_dir: "data/raw"
output: "data/enriched"

# ❌ Avoid - hard to share
enrich:
input_dir: "/home/user/project/data/raw"

4. Version Your Configs

# Version: 1.2
# Date: 2025-10-03
# Author: Data Team
# Purpose: Production pipeline for urban classification

global:
num_workers: 8

Troubleshooting

Configuration Not Found

Error: Configuration file not found: my_config.yaml

Solution: Use absolute path

ign-lidar-hd pipeline $(pwd)/my_config.yaml

Invalid YAML Syntax

Error: YAML parse error

Solution: Validate YAML

python -c "import yaml; yaml.safe_load(open('config.yaml'))"

Stage Failed

Error: Enrich stage failed

Solution: Run stage separately to debug

ign-lidar-hd enrich --input-dir data/raw --output data/test

See Also