Pipeline Configuration
Execute complete LiDAR processing workflows using declarative YAML configuration files. The pipeline command allows you to automate the entire process from download to patch creation.
Overview
The pipeline command provides a powerful way to manage complex workflows:
- Declarative: Define your workflow in a YAML file
- Reproducible: Version control your processing parameters
- Flexible: Configure only the stages you need
- Shareable: Easy collaboration with configuration files
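Pipeline Architecture
Workflow Stages
A configuration file can define up to three stages, which run in order: download (fetch LiDAR tiles from IGN), enrich (add geometric features and optional RGB), and patch (create training patches). Stages that are not configured are skipped.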
Quick Start
1. Create Example Configuration
# Create a full pipeline configuration
ign-lidar-hd pipeline my_config.yaml --create-example full
# Or create stage-specific configs
ign-lidar-hd pipeline enrich_config.yaml --create-example enrich
ign-lidar-hd pipeline patch_config.yaml --create-example patch
2. Edit Configuration
# my_config.yaml
global:
  num_workers: 4

enrich:
  input_dir: "data/raw"
  output: "data/enriched"
  mode: "full"
  add_rgb: true
  rgb_cache_dir: "cache/orthophotos"

patch:
  input_dir: "data/enriched"
  output: "data/patches"
  lod_level: "LOD2"
  num_points: 16384
3. Run Pipeline
ign-lidar-hd pipeline my_config.yaml
Configuration Structure
Global Settings
Settings that apply to all stages:
global:
  num_workers: 4       # Number of parallel workers
  output_dir: "data/"  # Base output directory (optional)
Download Stage
Download LiDAR tiles from IGN:
download:
  bbox: "2.3, 48.8, 2.4, 48.9"  # WGS84: lon_min,lat_min,lon_max,lat_max
  output: "data/raw"            # Output directory
  max_tiles: 10                 # Optional: limit tiles
  num_workers: 3                # Optional: parallel downloads
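The `bbox` value is a single string holding four numbers, so a typo or an inverted min/max pair only shows up once the download starts. A minimal sketch of a pre-flight check in plain Python follows; `parse_bbox` is a hypothetical helper written for illustration, not part of `ign_lidar`, and it cannot detect longitude and latitude being swapped when both orderings happen to be valid.

```python
def parse_bbox(bbox: str) -> tuple[float, float, float, float]:
    """Parse 'lon_min,lat_min,lon_max,lat_max' (WGS84) and sanity-check it."""
    parts = [p.strip() for p in bbox.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(parts)}")
    # float() raises ValueError itself if a part is not numeric
    lon_min, lat_min, lon_max, lat_max = (float(p) for p in parts)
    if not (-180.0 <= lon_min < lon_max <= 180.0):
        raise ValueError("longitudes must satisfy -180 <= lon_min < lon_max <= 180")
    if not (-90.0 <= lat_min < lat_max <= 90.0):
        raise ValueError("latitudes must satisfy -90 <= lat_min < lat_max <= 90")
    return lon_min, lat_min, lon_max, lat_max


print(parse_bbox("2.3, 48.8, 2.4, 48.9"))
```

Running this on the value from the configuration before launching the pipeline catches malformed strings early.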
Enrich Stage
Enrich LAZ files with geometric features and RGB:
enrich:
  input_dir: "data/raw"         # Input LAZ files
  output: "data/enriched"       # Output directory
  mode: "full"                  # 'core' or 'full'
  k_neighbors: 10               # Neighbors for features
  use_gpu: true                 # GPU acceleration
  add_rgb: true                 # Add RGB from orthophotos
  rgb_cache_dir: "cache/ortho"  # RGB cache directory
  num_workers: 4                # Parallel processing
  auto_convert_qgis: false      # QGIS format conversion
  force: false                  # Force reprocessing
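Several of these options have a small set of legal values, and YAML will happily load `use_gpu: "true"` as a string instead of a boolean. The sketch below checks an already-loaded `enrich` section against the constraints listed in the comments above. `validate_enrich` is a hypothetical helper; whether the pipeline command runs comparable checks itself is not documented here.

```python
ALLOWED_MODES = {"core", "full"}
BOOL_KEYS = ("use_gpu", "add_rgb", "auto_convert_qgis", "force")


def validate_enrich(cfg: dict) -> list[str]:
    """Return a list of human-readable problems found in an enrich section."""
    errors = []
    mode = cfg.get("mode")
    if mode is not None and mode not in ALLOWED_MODES:
        errors.append(f"enrich.mode must be 'core' or 'full', got {mode!r}")
    k = cfg.get("k_neighbors")
    # bool is a subclass of int in Python, so exclude it explicitly
    if k is not None and (isinstance(k, bool) or not isinstance(k, int) or k < 1):
        errors.append("enrich.k_neighbors must be a positive integer")
    for key in BOOL_KEYS:
        if key in cfg and not isinstance(cfg[key], bool):
            errors.append(f"enrich.{key} must be true or false, got {cfg[key]!r}")
    return errors


for problem in validate_enrich({"mode": "full", "k_neighbors": 10, "use_gpu": "true"}):
    print(problem)
```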
Patch Stage
Create training patches:
patch:
  input_dir: "data/enriched"          # Input LAZ files
  output: "data/patches"              # Output directory
  lod_level: "LOD2"                   # 'LOD2' or 'LOD3'
  patch_size: 150.0                   # Patch size in meters
  patch_overlap: 0.1                  # Overlap ratio (0.0-1.0)
  num_points: 16384                   # Points per patch
  augment: true                       # Data augmentation
  num_augmentations: 3                # Augmented versions
  num_workers: 4                      # Parallel processing
  include_architectural_style: false  # Style features
  style_encoding: "constant"          # 'constant' or 'multihot'
  force: false                        # Force reprocessing
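To get a feel for how `patch_size` and `patch_overlap` interact, the sketch below estimates how many patch windows fit along one axis of a tile. It assumes a sliding-window layout with stride `patch_size * (1 - patch_overlap)` and a 1000 m tile edge; the tool's actual tiling logic is not described here and may differ. `patches_per_axis` is a hypothetical helper.

```python
import math


def patches_per_axis(tile_size: float, patch_size: float, overlap: float) -> int:
    """Count windows of `patch_size` that fit along one axis of a tile.

    Assumption: windows slide by stride = patch_size * (1 - overlap) and
    must lie fully inside the tile.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0.0, 1.0)")
    if patch_size > tile_size:
        return 0
    stride = patch_size * (1.0 - overlap)
    return math.floor((tile_size - patch_size) / stride) + 1


n = patches_per_axis(tile_size=1000.0, patch_size=150.0, overlap=0.1)
print(n, n * n)
```

Under these assumptions, the values from the example configuration give 7 windows per axis, or 49 per tile before augmentation.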
Example Workflows
Full Pipeline
Download, enrich, and create patches in one workflow:
# pipeline_full.yaml
global:
  num_workers: 4

download:
  bbox: "2.3, 48.8, 2.4, 48.9"
  output: "data/raw"
  max_tiles: 10

enrich:
  input_dir: "data/raw"
  output: "data/enriched"
  mode: "full"
  use_gpu: true
  add_rgb: true
  rgb_cache_dir: "cache/orthophotos"

patch:
  input_dir: "data/enriched"
  output: "data/patches"
  lod_level: "LOD2"
  patch_size: 150.0
  num_points: 16384
  augment: true
Run with:
ign-lidar-hd pipeline pipeline_full.yaml
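In a multi-stage file each stage reads what the previous one wrote, so `enrich.input_dir` should match `download.output` and `patch.input_dir` should match `enrich.output`. A small sketch of that consistency check on an already-loaded configuration dictionary follows. `check_stage_chain` is a hypothetical helper, and the stage order it uses is taken from the workflow described in this document.

```python
from pathlib import PurePosixPath

STAGE_ORDER = ["download", "enrich", "patch"]


def check_stage_chain(config: dict) -> list[str]:
    """Return warnings for adjacent configured stages whose paths don't line up."""
    warnings = []
    stages = [s for s in STAGE_ORDER if s in config]
    for prev, nxt in zip(stages, stages[1:]):
        out_dir = config[prev].get("output")
        in_dir = config[nxt].get("input_dir")
        if out_dir is None or in_dir is None:
            continue  # nothing to compare
        # PurePosixPath normalizes trailing slashes, so "data/raw/" == "data/raw"
        if PurePosixPath(out_dir) != PurePosixPath(in_dir):
            warnings.append(
                f"{nxt}.input_dir ({in_dir}) differs from {prev}.output ({out_dir})"
            )
    return warnings


config = {
    "download": {"output": "data/raw"},
    "enrich": {"input_dir": "data/raw", "output": "data/enriched"},
    "patch": {"input_dir": "data/enrich", "output": "data/patches"},  # typo on purpose
}
for warning in check_stage_chain(config):
    print("warning:", warning)
```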
Enrich Only
Process existing tiles with geometric features and RGB:
# pipeline_enrich.yaml
global:
  num_workers: 4

enrich:
  input_dir: "data/raw"
  output: "data/enriched"
  mode: "full"
  k_neighbors: 10
  use_gpu: true
  add_rgb: true
  rgb_cache_dir: "cache/orthophotos"
Patch Only
Create patches from already enriched tiles:
# pipeline_patch.yaml
global:
  num_workers: 4

patch:
  input_dir: "data/enriched"
  output: "data/patches"
  lod_level: "LOD2"
  patch_size: 150.0
  num_points: 16384
Use Cases
Production Workflow
High-quality processing with all features:
global:
  num_workers: 8

enrich:
  mode: "full"  # All features
  use_gpu: true
  add_rgb: true

patch:
  num_points: 16384  # Full patches
  augment: true
  num_augmentations: 5
Development/Testing
Fast iteration with minimal processing:
global:
  num_workers: 2

enrich:
  mode: "core"  # Basic features
  use_gpu: false
  add_rgb: false

patch:
  num_points: 4096  # Smaller patches
  augment: false
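Because the development profile differs from the production one in only a few keys, one option is to keep a single base dictionary and apply overrides instead of maintaining two files by hand. A minimal sketch follows; `deep_merge` is a hypothetical helper, and the keys are the ones shown in the two profiles above.

```python
import copy


def deep_merge(base: dict, overrides: dict) -> dict:
    """Return a copy of `base` with `overrides` applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


production = {
    "global": {"num_workers": 8},
    "enrich": {"mode": "full", "use_gpu": True, "add_rgb": True},
    "patch": {"num_points": 16384, "augment": True, "num_augmentations": 5},
}
dev_overrides = {
    "global": {"num_workers": 2},
    "enrich": {"mode": "core", "use_gpu": False, "add_rgb": False},
    "patch": {"num_points": 4096, "augment": False},
}
development = deep_merge(production, dev_overrides)
print(development["patch"])
```

Keys that are not overridden, such as `num_augmentations`, are carried over unchanged, and the production dictionary itself is left untouched. The result can be written out with `yaml.dump` as in the Python API section.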
Regional Processing
Process different regions with specific settings:
# paris_urban.yaml
enrich:
  input_dir: "paris_tiles/"
  mode: "full"
  add_rgb: true

patch:
  input_dir: "paris_enriched/"
  lod_level: "LOD3"
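When several regions share the same structure, the per-region files can be generated from one template. The sketch below follows the `<region>_tiles/` and `<region>_enriched/` naming used in the example above; the `<region>_patches/` output name and the second region are assumptions added for illustration, and `region_config` is a hypothetical helper.

```python
def region_config(region: str, lod_level: str = "LOD2") -> dict:
    """Build a per-region config following the `<region>_tiles/` naming.

    The `<region>_patches/` output directory is an assumed convention.
    """
    if lod_level not in ("LOD2", "LOD3"):
        raise ValueError("lod_level must be 'LOD2' or 'LOD3'")
    return {
        "enrich": {
            "input_dir": f"{region}_tiles/",
            "output": f"{region}_enriched/",
            "mode": "full",
            "add_rgb": True,
        },
        "patch": {
            "input_dir": f"{region}_enriched/",
            "output": f"{region}_patches/",
            "lod_level": lod_level,
        },
    }


# "lyon" is only an illustrative second region
configs = {
    name: region_config(name, lod)
    for name, lod in [("paris", "LOD3"), ("lyon", "LOD2")]
}
print(configs["paris"]["patch"])
```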
Python API
Use configurations programmatically:
from pathlib import Path
from ign_lidar.pipeline_config import PipelineConfig
from ign_lidar.cli import cmd_pipeline

# Load configuration
config = PipelineConfig(Path("my_config.yaml"))

# Check configured stages
print(f"Has download: {config.has_download}")
print(f"Has enrich: {config.has_enrich}")
print(f"Has patch: {config.has_patch}")

# Get stage configuration
if config.has_enrich:
    enrich_cfg = config.get_enrich_config()
    print(f"Mode: {enrich_cfg['mode']}")
    print(f"RGB: {enrich_cfg.get('add_rgb', False)}")
Create Configuration Programmatically
import yaml
from pathlib import Path

config = {
    'global': {'num_workers': 4},
    'enrich': {
        'input_dir': 'data/raw',
        'output': 'data/enriched',
        'mode': 'full',
        'add_rgb': True,
    },
    'patch': {
        'input_dir': 'data/enriched',
        'output': 'data/patches',
        'lod_level': 'LOD2',
    },
}

with open('config.yaml', 'w') as f:
    yaml.dump(config, f, default_flow_style=False)
Benefits
Reproducibility
- Version control your configurations
- Exact same parameters every time
- Easy to track changes
Simplicity
Before (multiple commands):
ign-lidar-hd download --bbox "..." --output data/raw
ign-lidar-hd enrich --input-dir data/raw --output data/enriched ...
ign-lidar-hd patch --input-dir data/enriched --output data/patches ...
After (single command):
ign-lidar-hd pipeline my_workflow.yaml
Collaboration
- Share configuration files with your team
- Document processing workflows
- Create reusable templates
Best Practices
1. Use Descriptive Names
✅ Good:
├── paris_urban_LOD2.yaml
├── rural_buildings_LOD3.yaml
└── test_small_dataset.yaml
❌ Avoid:
├── config1.yaml
├── test.yaml
└── new.yaml
2. Add Comments
enrich:
  # Use full mode for urban areas with complex geometry
  mode: "full"
  # RGB improves classification accuracy by 5-10%
  add_rgb: true
  # GPU cuts processing time by 60%
  use_gpu: true
3. Use Relative Paths
# ✅ Good - portable
enrich:
  input_dir: "data/raw"
  output: "data/enriched"

# ❌ Avoid - hard to share
enrich:
  input_dir: "/home/user/project/data/raw"
4. Version Your Configs
# Version: 1.2
# Date: 2025-10-03
# Author: Data Team
# Purpose: Production pipeline for urban classification
global:
  num_workers: 8
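Header comments like these are ignored by YAML parsers, so reading them back (for example, to log which config version produced a dataset) needs a separate pass over the file. A sketch of one way to do that with the standard library follows; it relies on the `# Key: value` convention shown above, and `read_header_metadata` is a hypothetical helper.

```python
import re

HEADER_LINE = re.compile(r"^#\s*([A-Za-z][\w ]*?)\s*:\s*(.+?)\s*$")


def read_header_metadata(text: str) -> dict[str, str]:
    """Collect '# Key: value' comments from the top of a config file.

    Stops at the first line that is not a comment, so comments placed
    further down the file are not picked up.
    """
    metadata = {}
    for line in text.splitlines():
        if not line.startswith("#"):
            break
        match = HEADER_LINE.match(line)
        if match:
            metadata[match.group(1).lower()] = match.group(2)
    return metadata


sample = """\
# Version: 1.2
# Date: 2025-10-03
# Author: Data Team
# Purpose: Production pipeline for urban classification
global:
  num_workers: 8
"""
print(read_header_metadata(sample))
```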
Troubleshooting
Configuration Not Found
Error: Configuration file not found: my_config.yaml
Solution: Check that the file exists in the current working directory, or pass an absolute path
ign-lidar-hd pipeline "$(pwd)/my_config.yaml"
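When the pipeline is launched from a script, it can help to resolve the path first and fail early with a message that shows where the file was looked for. A minimal sketch with `pathlib` follows; `resolve_config` is a hypothetical helper, not part of the CLI.

```python
from pathlib import Path


def resolve_config(path_str: str) -> Path:
    """Return the absolute path to a config file, or raise with context."""
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(
            f"Configuration file not found: {path} (cwd: {Path.cwd()})"
        )
    return path
```

The returned path can then be passed to the `ign-lidar-hd pipeline` command or to `PipelineConfig`.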
Invalid YAML Syntax
Error: YAML parse error
Solution: Validate YAML
python -c "import yaml; yaml.safe_load(open('config.yaml'))"
Stage Failed
Error: Enrich stage failed
Solution: Run stage separately to debug
ign-lidar-hd enrich --input-dir data/raw --output data/test
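To reproduce a failing stage outside the pipeline, the relevant section of the YAML file can be translated into the equivalent stand-alone command. The sketch below maps only the two flags that appear in this document (`--input-dir` and `--output`); any other option would need to be added by hand after checking the CLI help. `stage_command` is a hypothetical helper.

```python
import shlex


def stage_command(stage: str, stage_cfg: dict) -> list[str]:
    """Build an argv list for running one stage on its own.

    Only `input_dir` and `output` are mapped, because those are the
    only flags confirmed by the examples in this document.
    """
    argv = ["ign-lidar-hd", stage]
    if "input_dir" in stage_cfg:
        argv += ["--input-dir", str(stage_cfg["input_dir"])]
    if "output" in stage_cfg:
        argv += ["--output", str(stage_cfg["output"])]
    return argv


cmd = stage_command(
    "enrich", {"input_dir": "data/raw", "output": "data/test", "mode": "full"}
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to run the stage and surface its own error output.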