System Architecture
Version 2.0 - Modular Architecture
Understanding the v2.0 modular architecture helps you make the most of its capabilities and customize it for your specific needs.
A complete architectural redesign with 6 dedicated modules, a Hydra CLI, and a unified pipeline. See the Migration Guide if upgrading from v1.x.
Modular Architecture (v2.0)
Module Structure
The v2.0 architecture organizes code into 6 specialized modules:
ign_lidar/
├── core/                  # Core processing orchestration
│   ├── processor.py       # Main LiDARProcessor class
│   └── pipeline.py        # Unified RAW→Patches pipeline
│
├── features/              # Feature computation
│   ├── computer.py        # Base feature computation
│   ├── gpu_chunked.py     # GPU chunked processing
│   ├── boundary.py        # Boundary-aware features
│   └── ultra.py           # Ultra-high quality features
│
├── preprocessing/         # Data preprocessing
│   ├── outliers.py        # Outlier removal (statistical/SOR, radius)
│   ├── normalization.py   # Ground height normalization
│   ├── tiling.py          # Tile operations
│   └── filters.py         # Classification filtering
│
├── io/                    # Input/Output operations
│   ├── laz.py             # LAZ file reading/writing
│   ├── metadata.py        # Metadata management
│   ├── stitching.py       # Tile stitching
│   └── downloader.py      # IGN data download
│
├── config/                # Configuration management
│   └── hydra_config.py    # Hydra integration
│
├── configs/               # Hydra config files (YAML)
│   ├── config.yaml        # Root configuration
│   ├── preset/            # Presets (fast/balanced/quality/ultra)
│   ├── processor/         # Processor configs (cpu/gpu)
│   ├── features/          # Feature configs
│   └── preprocess/        # Preprocessing configs
│
└── datasets/              # ML dataset classes
    ├── multi_arch.py      # Architecture-agnostic dataset
    ├── pointnet_pp.py     # PointNet++ format
    ├── octree.py          # Octree format
    └── transformer.py     # Transformer format
Module Responsibilities
Module | Purpose | Key Classes |
---|---|---|
core | Pipeline orchestration, workflow management | LiDARProcessor, Pipeline |
features | Feature computation, GPU acceleration | FeatureComputer, GPUChunked |
preprocessing | Data cleaning, normalization | remove_outliers, normalize_ground |
io | File operations, metadata | read_laz_file, MetadataManager |
config | Configuration management | HydraConfig, load_config |
datasets | PyTorch datasets for ML training | MultiArchDataset, PointNetDataset |
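For orientation, the entry points named in the table can be imported directly from their modules. This import sketch only shows where the names live; the paths are taken from the examples later on this page, except load_config, whose exact import path is an assumption based on the table above:
import ign_lidar

from ign_lidar.core import LiDARProcessor
from ign_lidar.features import FeatureComputer
from ign_lidar.preprocessing import remove_outliers, normalize_ground
from ign_lidar.io import read_laz_file
from ign_lidar.config import load_config  # import path assumed from the table
from ign_lidar.datasets import MultiArchDataset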
Unified Pipeline (v2.0)
Major Change: Single-step RAW→Patches workflow!
Pipeline Flow
Core Module
The brain of the system: it orchestrates the entire pipeline.
LiDARProcessor
The main class that coordinates all processing:
from ign_lidar.core import LiDARProcessor

processor = LiDARProcessor(
    input_dir="data/raw/",
    output_dir="output/",
    preset="balanced",  # fast / balanced / quality / ultra
    use_gpu=True
)
processor.run()  # Single call for the entire pipeline!
Responsibilities:
- Pipeline orchestration
- Multi-worker parallelization
- Progress tracking
- Error handling and recovery
- Metadata generation
Pipeline
Unified workflow manager:
- RAW→Patches in one step
- Automatic preprocessing
- Feature computation
- Patch generation
- Optional stitching
Features Module
Advanced geometric and RGB feature computation.
FeatureComputer
Base class for feature computation:
from ign_lidar.features import FeatureComputer

computer = FeatureComputer(
    use_rgb=True,
    compute_ndvi=True,
    boundary_aware=False
)
features = computer.compute(points, colors)
Computed Features:
- Geometric: Linearity, planarity, sphericity, anisotropy
- Curvature: Mean, Gaussian, principal curvatures
- Local: Density, verticality, roughness
- RGB: Color features, NDVI (vegetation index)
- Infrared: Near-infrared intensity (optional)
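The return type of compute() is not spelled out here; assuming it behaves like a mapping from feature name to per-point array (as the features['custom'] usage in Extension Points below suggests), the result from the snippet above can be inspected like this:
# Inspect computed features; this assumes a dict-like mapping of
# feature name -> per-point array (an assumption, not confirmed API)
for name, values in features.items():
    print(f"{name}: {values.shape}")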
GPUChunked
Large-scale GPU processing:
- Automatic memory management
- Chunked processing for huge tiles
- 90%+ GPU utilization
- Memory-aware chunk sizing
BoundaryFeatures
Cross-tile feature computation:
- Eliminates edge artifacts
- Automatic neighbor loading
- Seamless feature computation
- Configurable buffer zones
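Boundary-aware computation is switched on through the boundary_aware flag shown in the FeatureComputer example above; a minimal sketch follows. The buffer size itself is set via the boundary_buffer parameter listed under Key Parameters, not in this constructor:
from ign_lidar.features import FeatureComputer

# Enable cross-tile (boundary-aware) features to avoid edge artifacts
computer = FeatureComputer(
    use_rgb=True,
    compute_ndvi=True,
    boundary_aware=True
)
features = computer.compute(points, colors)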
Preprocessing Module
Data cleaning and normalization before feature computation.
Outlier Removal
from ign_lidar.preprocessing import remove_outliers

# Statistical outlier removal
points = remove_outliers(
    points,
    method="statistical",
    nb_neighbors=20,
    std_ratio=2.0
)
Methods:
- statistical: Statistical Outlier Removal (SOR)
- radius: Radius-based outlier removal
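For the radius method, a call might look like the following; the keyword arguments radius and min_neighbors are illustrative assumptions, not confirmed parameter names:
from ign_lidar.preprocessing import remove_outliers

# Radius-based outlier removal (keyword names below are assumptions)
points = remove_outliers(
    points,
    method="radius",
    radius=1.0,       # search radius in meters (assumed)
    min_neighbors=5   # minimum neighbors within the radius (assumed)
)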
Ground Normalization
from ign_lidar.preprocessing import normalize_ground

# Normalize heights to ground level
points = normalize_ground(
    points,
    max_distance=5.0,  # Max search distance
    resolution=1.0     # Grid resolution
)
Filtering
from ign_lidar.preprocessing import filter_by_classification

# Keep only building-related classes
points = filter_by_classification(
    points,
    keep_classes=[2, 3, 4, 5, 6]  # Ground, low/med/high veg, building
)
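Putting the three steps together, a typical cleaning pass before feature computation could look like this (a sketch reusing the parameters from the snippets above):
from ign_lidar.preprocessing import (
    remove_outliers,
    normalize_ground,
    filter_by_classification,
)

# 1. Drop statistical outliers
points = remove_outliers(points, method="statistical", nb_neighbors=20, std_ratio=2.0)
# 2. Normalize heights relative to the local ground surface
points = normalize_ground(points, max_distance=5.0, resolution=1.0)
# 3. Keep ground, vegetation, and building classes
points = filter_by_classification(points, keep_classes=[2, 3, 4, 5, 6])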
IO Module
File operations, metadata, and multi-tile workflows.
LAZ Handler
from ign_lidar.io import read_laz_file, write_laz_file

# Read LAZ file
points, colors, features = read_laz_file("tile.laz")

# Write enriched LAZ with features
write_laz_file(
    "enriched_tile.laz",
    points,
    colors,
    features=features  # NEW in v2.0
)
Metadata Manager
Tracks processing provenance:
{
  "version": "2.0.1",
  "timestamp": "2025-10-08T12:00:00Z",
  "input_files": ["tile_1234_5678.laz"],
  "processing": {
    "preset": "balanced",
    "features": ["geometric", "rgb", "ndvi"],
    "preprocessing": ["outliers", "ground_norm"]
  },
  "statistics": {
    "total_points": 17234567,
    "patches_generated": 423
  }
}
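The metadata is plain JSON, so it can be read back with the standard library; the file name below is an assumption, so check your output directory for the actual name:
import json

# Hypothetical metadata file name; adjust to what your run produces
with open("output/metadata.json") as f:
    meta = json.load(f)

print(meta["processing"]["preset"], meta["statistics"]["total_points"])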
Stitching
Multi-tile workflows:
from ign_lidar.io.stitching import TileStitcher

stitcher = TileStitcher(
    tile_dir="tiles/",
    output_dir="stitched/",
    buffer=10.0
)
stitcher.process_all()
Config Module
Hydra-based configuration management.
Configuration Hierarchy
# config.yaml (root)
defaults:
  - preset: balanced
  - processor: gpu
  - features: standard
  - preprocess: standard

input_dir: "data/"
output_dir: "output/"
num_workers: 4
Presets
Pre-configured workflows:
Preset | Speed | Quality | Features | Use Case |
---|---|---|---|---|
fast | ⚡⚡⚡ | ⭐ | Basic | Quick testing |
balanced | ⚡⚡ | ⭐⭐⭐ | Standard | Production (recommended) |
quality | ⚡ | ⭐⭐⭐⭐ | Full | High-quality datasets |
ultra | Slowest | ⭐⭐⭐⭐⭐ | All + boundary | Research, seamless output |
Dynamic Configuration
Override any parameter:
ign-lidar-hd process \
  preset=balanced \
  processor=gpu \
  features.use_rgb=true \
  features.compute_ndvi=true \
  num_workers=8
Datasets Module
PyTorch datasets for multiple ML architectures.
MultiArchDataset
Architecture-agnostic dataset loader:
from ign_lidar.datasets import MultiArchDataset

dataset = MultiArchDataset(
    data_dir="output/patches/",
    architecture="pointnet++",  # or "octree", "transformer", "sparse_conv"
    transform=None,
    augment=True
)

# Use with a PyTorch DataLoader
from torch.utils.data import DataLoader

loader = DataLoader(dataset, batch_size=32, shuffle=True)
Supported Architectures:
- PointNet++ - Hierarchical point cloud learning
- Octree - Spatial partitioning
- Transformer - Attention-based
- Sparse Conv - 3D sparse convolutions
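A minimal iteration sketch; the batch structure depends on the selected architecture, so the (points, labels) unpacking below is an assumption:
# Peek at the first batch; the unpacking is illustrative
for points, labels in loader:
    print(points.shape, labels.shape)
    break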
GPU Acceleration
Optimized GPU processing across modules.
GPU Pipeline
Performance Benefits:
- 10-50x faster feature computation
- 24x faster RGB augmentation
- 90%+ GPU utilization
- Chunked processing for large tiles
GPU Configuration
# Enable GPU
ign-lidar-hd process processor=gpu

# GPU with chunking (for large tiles)
ign-lidar-hd process \
  processor=gpu \
  features.gpu_chunk_size=1000000
Performance Characteristics
Processing Speed (v2.0)
Preset | CPU (tiles/min) | GPU (tiles/min) | Speedup | Time per Tile |
---|---|---|---|---|
fast | ~3-4 | ~10-15 | 3-4x | 5-10 min |
balanced | ~2-3 | ~6-9 | 3x | 15-20 min |
quality | ~1-2 | ~3-5 | 3x | 30-45 min |
ultra | ~0.5-1 | ~1.5-2 | 2-3x | 60+ min |
Memory Usage
Memory Optimization:
- Automatic chunking for large tiles
- Streaming processing option
- Memory-aware worker scaling
Output Size
Output Type | Size (per tile) | Compression | Use Case |
---|---|---|---|
RAW LAZ | 50-200 MB | High | Original data |
Enriched LAZ | 80-300 MB (+60%) | High | Visualization, analysis |
Patches (NPZ) | 10-50 MB per file | Medium | ML training |
Full Output | 100-400 MB | Mixed | Complete workflow |
Configuration System (Hydra)
Hierarchical Composition
Configuration Precedence
1. Base defaults - built-in optimal defaults
2. Preset selection - choose a workflow preset
3. Config files - project-specific YAML files
4. Command-line overrides - immediate parameter changes
Example:
# Uses the balanced preset + custom overrides
ign-lidar-hd process \
  preset=balanced \
  processor=gpu \
  features.use_rgb=true \
  num_workers=8
Key Parameters
Category | Parameters | Default (balanced) | Impact |
---|---|---|---|
Input/Output | input_dir, output_dir, output | output=patches | Data flow |
Performance | processor, num_workers | cpu, 4 | Speed |
Features | features, use_rgb, compute_ndvi | standard | Quality |
Preprocess | preprocess, remove_outliers | standard | Data cleaning |
Dataset | patch_size, architecture | 50m, pointnet++ | ML format |
Boundary | boundary_aware, boundary_buffer | false, 10.0 | Edge handling |
Stitching | stitching, tile_overlap | none, 0.0 | Multi-tile |
Extension Points
The modular v2.0 architecture supports extensive customization:
1. Custom Feature Extractors
from ign_lidar.features import FeatureComputer

class CustomFeatureComputer(FeatureComputer):
    def compute_custom_feature(self, points):
        # Your custom feature logic
        return custom_features

    def compute(self, points, colors):
        # Call parent for standard features
        features = super().compute(points, colors)
        # Add custom features
        features['custom'] = self.compute_custom_feature(points)
        return features
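The subclass is then a drop-in replacement for the stock computer; a usage sketch reusing the constructor arguments shown earlier:
computer = CustomFeatureComputer(
    use_rgb=True,
    compute_ndvi=True,
    boundary_aware=False
)
features = computer.compute(points, colors)  # now also contains features['custom']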
2. Custom Preprocessing
from ign_lidar.preprocessing import BasePreprocessor

class CustomPreprocessor(BasePreprocessor):
    def preprocess(self, points):
        # Your custom preprocessing
        return processed_points
3. Custom Dataset Formats
from ign_lidar.datasets import BaseDataset

class CustomArchDataset(BaseDataset):
    def __init__(self, data_dir, **kwargs):
        super().__init__(data_dir, **kwargs)

    def __getitem__(self, idx):
        # Load and format data for your architecture
        return data, labels
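A custom dataset plugs into the same DataLoader pattern shown for MultiArchDataset; constructor arguments beyond data_dir are up to your subclass:
from torch.utils.data import DataLoader

dataset = CustomArchDataset(data_dir="output/patches/")
loader = DataLoader(dataset, batch_size=32, shuffle=True)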
4. Processing Hooks
from ign_lidar.core import LiDARProcessor

class CustomProcessor(LiDARProcessor):
    def post_feature_hook(self, points, features):
        # Custom logic after feature computation
        return points, features

    def pre_patch_hook(self, points):
        # Custom logic before patch generation
        return points
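A hooked processor is constructed and run exactly like the base class (a sketch reusing the Core Module example's arguments):
processor = CustomProcessor(
    input_dir="data/raw/",
    output_dir="output/",
    preset="balanced",
    use_gpu=True
)
processor.run()  # hooks fire during the normal pipeline run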
5. Custom Configuration
# configs/custom/my_workflow.yaml
defaults:
  - /preset: quality
  - /processor: gpu
  - _self_

# Custom parameters
custom_param: true
my_threshold: 0.85

# Override preset values
features:
  use_rgb: true
  compute_ndvi: true
  custom_feature: true  # Your extension
Design Principles
The v2.0 architecture follows these principles:
1. Modularity
- Clear separation of concerns
- Independent, testable modules
- Minimal inter-module dependencies
2. Composability
- Mix and match presets and configs
- Hierarchical configuration
- Override any parameter
3. Extensibility
- Plugin architecture for features
- Custom preprocessing pipelines
- Multiple dataset formats
4. Performance
- GPU acceleration throughout
- Parallel processing
- Memory-efficient chunking
5. Reliability
- Comprehensive error handling
- Automatic recovery
- Detailed logging
6. Usability
- Sensible defaults
- Progressive disclosure
- Backward compatibility
Evolution from v1.x
Aspect | v1.x | v2.0 | Improvement |
---|---|---|---|
Structure | Flat, monolithic | Modular, 6 modules | Better organization |
CLI | Legacy only | Hydra + Legacy | Modern + backward compat |
Pipeline | Multi-step | Unified single-step | Simpler workflow |
Config | Command args | Hydra hierarchical | More flexible |
Features | Per-tile | Boundary-aware option | No edge artifacts |
GPU | Basic | Chunked + optimized | 90%+ utilization |
Extensibility | Limited | Plugin architecture | Easy customization |
Testing | Limited | Comprehensive suite | Higher quality |
Next Steps
- Quick Start - Get started with v2.0
- Hydra CLI Guide - Master the new CLI
- Migration Guide - Upgrade from v1.x
- API Reference - Complete API documentation
- Configuration Guide - Deep dive into configs
The modular v2.0 architecture provides the flexibility and performance needed for production LiDAR processing workflows.