System Architecture

Version 2.0 - Modular Architecture

Understanding the v2.0 modular architecture helps you make the most of its capabilities and customize it for your specific needs.

What's New in v2.0?

Complete architectural redesign with 6 dedicated modules, Hydra CLI, and unified pipeline. See Migration Guide if upgrading from v1.x.

πŸ—οΈ Modular Architecture (v2.0)​

Module Structure​

The v2.0 architecture organizes code into 6 specialized modules:

```text
ign_lidar/
β”œβ”€β”€ core/                  # 🎯 Core processing orchestration
β”‚   β”œβ”€β”€ processor.py       # Main LiDARProcessor class
β”‚   └── pipeline.py        # Unified RAW→Patches pipeline
β”‚
β”œβ”€β”€ features/              # πŸ”¬ Feature computation
β”‚   β”œβ”€β”€ computer.py        # Base feature computation
β”‚   β”œβ”€β”€ gpu_chunked.py     # GPU chunked processing
β”‚   β”œβ”€β”€ boundary.py        # Boundary-aware features
β”‚   └── ultra.py           # Ultra-high quality features
β”‚
β”œβ”€β”€ preprocessing/         # πŸ”§ Data preprocessing
β”‚   β”œβ”€β”€ outliers.py        # Outlier removal (SOR, statistical)
β”‚   β”œβ”€β”€ normalization.py   # Ground height normalization
β”‚   β”œβ”€β”€ tiling.py          # Tile operations
β”‚   └── filters.py         # Classification filtering
β”‚
β”œβ”€β”€ io/                    # πŸ’Ύ Input/Output operations
β”‚   β”œβ”€β”€ laz.py             # LAZ file reading/writing
β”‚   β”œβ”€β”€ metadata.py        # Metadata management
β”‚   β”œβ”€β”€ stitching.py       # Tile stitching
β”‚   └── downloader.py      # IGN data download
β”‚
β”œβ”€β”€ config/                # βš™οΈ Configuration management
β”‚   └── hydra_config.py    # Hydra integration
β”‚
β”œβ”€β”€ configs/               # πŸ“‹ Hydra config files (YAML)
β”‚   β”œβ”€β”€ config.yaml        # Root configuration
β”‚   β”œβ”€β”€ preset/            # Presets (fast/balanced/quality/ultra)
β”‚   β”œβ”€β”€ processor/         # Processor configs (cpu/gpu)
β”‚   β”œβ”€β”€ features/          # Feature configs
β”‚   └── preprocess/        # Preprocessing configs
β”‚
└── datasets/              # πŸ—‚οΈ ML dataset classes
    β”œβ”€β”€ multi_arch.py      # Architecture-agnostic dataset
    β”œβ”€β”€ pointnet_pp.py     # PointNet++ format
    β”œβ”€β”€ octree.py          # Octree format
    └── transformer.py     # Transformer format
```

Module Responsibilities​

| Module | Purpose | Key Classes |
| --- | --- | --- |
| `core` | Pipeline orchestration, workflow management | `LiDARProcessor`, `Pipeline` |
| `features` | Feature computation, GPU acceleration | `FeatureComputer`, `GPUChunked` |
| `preprocessing` | Data cleaning, normalization | `remove_outliers`, `normalize_ground` |
| `io` | File operations, metadata | `read_laz_file`, `MetadataManager` |
| `config` | Configuration management | `HydraConfig`, `load_config` |
| `datasets` | PyTorch datasets for ML training | `MultiArchDataset`, `PointNetDataset` |

πŸ”„ Unified Pipeline (v2.0)​

Major Change: Single-step RAW→Patches workflow!

Pipeline Flow​

RAW LAZ tiles β†’ preprocessing (outlier removal, ground normalization) β†’ feature computation β†’ patch generation β†’ optional tile stitching. In v1.x these were separate commands; in v2.0 they run as a single step.
🎯 Core Module​

The brain of the system - orchestrates the entire pipeline.

LiDARProcessor​

Main class that coordinates all processing:

```python
from ign_lidar.core import LiDARProcessor

processor = LiDARProcessor(
    input_dir="data/raw/",
    output_dir="output/",
    preset="balanced",  # fast/balanced/quality/ultra
    use_gpu=True
)

processor.run()  # Single call for entire pipeline!
```

Responsibilities:

  • βœ… Pipeline orchestration
  • βœ… Multi-worker parallelization
  • βœ… Progress tracking
  • βœ… Error handling and recovery
  • βœ… Metadata generation

Pipeline​

Unified workflow manager:

  • RAWβ†’Patches in one step
  • Automatic preprocessing
  • Feature computation
  • Patch generation
  • Optional stitching
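
The single-step flow above can be sketched in plain Python. This is a conceptual illustration only; the real `Pipeline` class in `ign_lidar.core` has its own API, and the stage functions here are stand-ins:

```python
# Conceptual sketch of a unified pipeline: each stage transforms the data
# and hands it to the next, so one run() call covers RAW -> patches.
# This is NOT the library's actual Pipeline implementation.
class Pipeline:
    def __init__(self, stages):
        self.stages = stages  # ordered list of (name, callable)

    def run(self, data):
        for name, stage in self.stages:
            data = stage(data)
        return data

# Toy stages standing in for preprocess -> features -> patch generation
pipeline = Pipeline([
    ("preprocess", lambda pts: [p for p in pts if p is not None]),
    ("features",   lambda pts: [(p, p * 2) for p in pts]),
    ("patches",    lambda feats: [feats[i:i + 2] for i in range(0, len(feats), 2)]),
])
patches = pipeline.run([1, None, 2, 3])
```

The point of the design is that intermediate results never touch disk between stages, which is what makes the v2.0 workflow both simpler and faster than the multi-step v1.x flow.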

πŸ”¬ Features Module​

Advanced geometric and RGB feature computation.

FeatureComputer​

Base class for feature computation:

```python
from ign_lidar.features import FeatureComputer

computer = FeatureComputer(
    use_rgb=True,
    compute_ndvi=True,
    boundary_aware=False
)

features = computer.compute(points, colors)
```

Computed Features:

  • Geometric: Linearity, planarity, sphericity, anisotropy
  • Curvature: Mean, Gaussian, principal curvatures
  • Local: Density, verticality, roughness
  • RGB: Color features, NDVI (vegetation index)
  • Infrared: Near-infrared intensity (optional)

GPUChunked​

Large-scale GPU processing:

  • Automatic memory management
  • Chunked processing for huge tiles
  • 90%+ GPU utilization
  • Memory-aware chunk sizing
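
The chunk-sizing idea can be illustrated with a small standalone helper. This is an assumption about how memory-aware sizing might work, not the actual `GPUChunked` code, and the parameter names are hypothetical:

```python
# Hedged sketch of memory-aware chunk sizing: pick the largest chunk
# (in points) that fits within a fraction of free GPU memory, then
# iterate the tile in contiguous index ranges.
def chunk_size(n_points, bytes_per_point, gpu_free_bytes, safety=0.8):
    """Largest chunk fitting within `safety` * free GPU memory."""
    budget = int(gpu_free_bytes * safety)
    return min(n_points, max(1, budget // bytes_per_point))

def iter_chunks(n_points, size):
    """Yield (start, stop) index ranges covering all points."""
    for start in range(0, n_points, size):
        yield start, min(start + size, n_points)

# Example: 10M points, 64 bytes/point, 256 MiB free on the GPU
size = chunk_size(n_points=10_000_000, bytes_per_point=64,
                  gpu_free_bytes=256 * 1024**2, safety=0.8)
ranges = list(iter_chunks(10_000_000, size))
```

The safety margin leaves headroom for neighbor-search scratch buffers, which is why utilization can stay high without out-of-memory failures.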

BoundaryFeatures​

Cross-tile feature computation:

  • Eliminates edge artifacts
  • Automatic neighbor loading
  • Seamless feature computation
  • Configurable buffer zones
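
A minimal sketch of buffer-zone selection, assuming a rectangular tile and a metric buffer; the actual `boundary.py` logic may differ:

```python
# Illustrative buffer-zone test (an assumption, not the library's code):
# points from neighboring tiles that fall within `buffer` meters of the
# tile bounds are pulled in as context so edge features are seamless.
def in_buffer(point, tile_bounds, buffer):
    """True if `point` lies outside the tile but within `buffer` of it."""
    xmin, ymin, xmax, ymax = tile_bounds
    x, y = point
    inside = xmin <= x <= xmax and ymin <= y <= ymax
    near = (xmin - buffer <= x <= xmax + buffer and
            ymin - buffer <= y <= ymax + buffer)
    return near and not inside

tile = (0.0, 0.0, 100.0, 100.0)
neighbors = [(-5.0, 50.0), (105.0, 50.0), (50.0, 50.0), (-20.0, 50.0)]
context = [p for p in neighbors if in_buffer(p, tile, buffer=10.0)]
```

With the buffer points included, neighborhood queries near a tile edge see the same local geometry they would in a single merged cloud, which is what eliminates the edge artifacts.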

πŸ”§ Preprocessing Module​

Data cleaning and normalization before feature computation.

Outlier Removal​

```python
from ign_lidar.preprocessing import remove_outliers

# Statistical outlier removal
points = remove_outliers(
    points,
    method="statistical",
    nb_neighbors=20,
    std_ratio=2.0
)
```

Methods:

  • statistical: Statistical Outlier Removal (SOR)
  • radius: Radius-based outlier removal
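
For illustration, here is a toy 2D version of the radius method using only the standard library; the library itself operates on 3D point clouds with spatial indexing, and the parameter names here are illustrative:

```python
# Toy radius-based outlier removal: keep a point only if it has at
# least `min_neighbors` other points within `radius`. A real
# implementation would use a KD-tree instead of the O(n^2) scan below.
import math

def remove_radius_outliers(points, radius=2.0, min_neighbors=1):
    kept = []
    for i, p in enumerate(points):
        n = sum(1 for j, q in enumerate(points)
                if i != j and math.dist(p, q) <= radius)
        if n >= min_neighbors:
            kept.append(p)
    return kept

pts = [(0, 0), (1, 0), (0, 1), (50, 50)]  # last point is isolated
clean = remove_radius_outliers(pts, radius=2.0, min_neighbors=1)
```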

Ground Normalization​

```python
from ign_lidar.preprocessing import normalize_ground

# Normalize heights to ground level
points = normalize_ground(
    points,
    max_distance=5.0,  # Max search distance
    resolution=1.0     # Grid resolution
)
```

Filtering​

```python
from ign_lidar.preprocessing import filter_by_classification

# Keep only building-related classes
points = filter_by_classification(
    points,
    keep_classes=[2, 3, 4, 5, 6]  # Ground, low/med/high veg, building
)
```

πŸ’Ύ IO Module​

File operations, metadata, and multi-tile workflows.

LAZ Handler​

```python
from ign_lidar.io import read_laz_file, write_laz_file

# Read LAZ file
points, colors, features = read_laz_file("tile.laz")

# Write enriched LAZ with features
write_laz_file(
    "enriched_tile.laz",
    points,
    colors,
    features=features  # NEW in v2.0
)
```

Metadata Manager​

Tracks processing provenance:

```json
{
  "version": "2.0.1",
  "timestamp": "2025-10-08T12:00:00Z",
  "input_files": ["tile_1234_5678.laz"],
  "processing": {
    "preset": "balanced",
    "features": ["geometric", "rgb", "ndvi"],
    "preprocessing": ["outliers", "ground_norm"]
  },
  "statistics": {
    "total_points": 17234567,
    "patches_generated": 423
  }
}
```
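
Such a record can be produced with nothing but the standard library. This sketch is illustrative; the actual `MetadataManager` API and record schema may differ:

```python
# Sketch of writing and reading a provenance record like the one above.
# The write_metadata helper is hypothetical, not the library's API.
import json
from datetime import datetime, timezone

def write_metadata(path, input_files, preset, stats):
    record = {
        "version": "2.0.1",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input_files": input_files,
        "processing": {"preset": preset},
        "statistics": stats,
    }
    with open(path, "w") as f:
        json.dump(record, f, indent=2)
    return record

meta = write_metadata("metadata.json", ["tile_1234_5678.laz"],
                      "balanced", {"total_points": 17234567})
with open("metadata.json") as f:
    loaded = json.load(f)
```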

Stitching​

Multi-tile workflows:

```python
from ign_lidar.io.stitching import TileStitcher

stitcher = TileStitcher(
    tile_dir="tiles/",
    output_dir="stitched/",
    buffer=10.0
)

stitcher.process_all()
```

βš™οΈ Config Module​

Hydra-based configuration management.

Configuration Hierarchy​

```yaml
# config.yaml (root)
defaults:
  - preset: balanced
  - processor: gpu
  - features: standard
  - preprocess: standard

input_dir: "data/"
output_dir: "output/"
num_workers: 4
```

Presets​

Pre-configured workflows:

| Preset | Speed | Quality | Features | Use Case |
| --- | --- | --- | --- | --- |
| `fast` | ⚑⚑⚑ | ⭐ | Basic | Quick testing |
| `balanced` | ⚑⚑ | ⭐⭐⭐ | Standard | Production (recommended) |
| `quality` | ⚑ | ⭐⭐⭐⭐ | Full | High-quality datasets |
| `ultra` | 🐒 | ⭐⭐⭐⭐⭐ | All + boundary | Research, seamless output |

Dynamic Configuration​

Override any parameter:

```bash
ign-lidar-hd process \
  preset=balanced \
  processor=gpu \
  features.use_rgb=true \
  features.compute_ndvi=true \
  num_workers=8
```

πŸ—‚οΈ Datasets Module​

PyTorch datasets for multiple ML architectures.

MultiArchDataset​

Architecture-agnostic dataset loader:

```python
from ign_lidar.datasets import MultiArchDataset

dataset = MultiArchDataset(
    data_dir="output/patches/",
    architecture="pointnet++",  # or octree, transformer, sparse_conv
    transform=None,
    augment=True
)

# Use with PyTorch DataLoader
from torch.utils.data import DataLoader

loader = DataLoader(dataset, batch_size=32, shuffle=True)
```

Supported Architectures:

  1. PointNet++ - Hierarchical point cloud learning
  2. Octree - Spatial partitioning
  3. Transformer - Attention-based
  4. Sparse Conv - 3D sparse convolutions

πŸš€ GPU Acceleration​

Optimized GPU processing across modules.

GPU Pipeline​

Performance Benefits:

  • ⚑ 10-50x faster feature computation
  • ⚑ 24x faster RGB augmentation
  • ⚑ 90%+ GPU utilization
  • ⚑ Chunked processing for large tiles

GPU Configuration​

```bash
# Enable GPU
ign-lidar-hd process processor=gpu

# GPU with chunking (for large tiles)
ign-lidar-hd process \
  processor=gpu \
  features.gpu_chunk_size=1000000
```

πŸ“Š Performance Characteristics​

Processing Speed (v2.0)​

| Preset | CPU (tiles/min) | GPU (tiles/min) | Speedup | Time per Tile |
| --- | --- | --- | --- | --- |
| `fast` | ~3-4 | ~10-15 | 3-4x | 5-10 min |
| `balanced` | ~2-3 | ~6-9 | 3x | 15-20 min |
| `quality` | ~1-2 | ~3-5 | 3x | 30-45 min |
| `ultra` | ~0.5-1 | ~1.5-2 | 2-3x | 60+ min |

Memory Usage​

Memory Optimization:

  • βœ… Automatic chunking for large tiles
  • βœ… Streaming processing option
  • βœ… Memory-aware worker scaling

Output Size​

| Output Type | Size (per tile) | Compression | Use Case |
| --- | --- | --- | --- |
| RAW LAZ | 50-200 MB | High | Original data |
| Enriched LAZ | 80-300 MB (+60%) | High | Visualization, analysis |
| Patches (NPZ) | 10-50 MB per file | Medium | ML training |
| Full Output | 100-400 MB | Mixed | Complete workflow |

πŸŽ›οΈ Configuration System (Hydra)​

Hierarchical Composition​

Configuration Precedence​

  1. Base defaults - Built-in optimal defaults
  2. Preset selection - Choose workflow preset
  3. Config files - Project-specific YAML files
  4. Command-line overrides - Immediate parameter changes

Example:

```bash
# Uses the balanced preset, then applies command-line overrides
# (processor, features.use_rgb, num_workers)
ign-lidar-hd process \
  preset=balanced \
  processor=gpu \
  features.use_rgb=true \
  num_workers=8
```

Key Parameters​

| Category | Parameters | Default (balanced) | Impact |
| --- | --- | --- | --- |
| Input/Output | `input_dir`, `output_dir`, `output` | `output=patches` | Data flow |
| Performance | `processor`, `num_workers` | `cpu`, `4` | Speed |
| Features | `features`, `use_rgb`, `compute_ndvi` | `standard` | Quality |
| Preprocess | `preprocess`, `remove_outliers` | `standard` | Data cleaning |
| Dataset | `patch_size`, `architecture` | `50m`, `pointnet++` | ML format |
| Boundary | `boundary_aware`, `boundary_buffer` | `false`, `10.0` | Edge handling |
| Stitching | `stitching`, `tile_overlap` | `none`, `0.0` | Multi-tile |

πŸ”Œ Extension Points​

The modular v2.0 architecture supports extensive customization:

1. Custom Feature Extractors​

```python
from ign_lidar.features import FeatureComputer

class CustomFeatureComputer(FeatureComputer):
    def compute_custom_feature(self, points):
        # Your custom feature logic
        return custom_features

    def compute(self, points, colors):
        # Call parent for standard features
        features = super().compute(points, colors)

        # Add custom features
        features['custom'] = self.compute_custom_feature(points)
        return features
```

2. Custom Preprocessing​

```python
from ign_lidar.preprocessing import BasePreprocessor

class CustomPreprocessor(BasePreprocessor):
    def preprocess(self, points):
        # Your custom preprocessing
        return processed_points
```

3. Custom Dataset Formats​

```python
from ign_lidar.datasets import BaseDataset

class CustomArchDataset(BaseDataset):
    def __init__(self, data_dir, **kwargs):
        super().__init__(data_dir, **kwargs)

    def __getitem__(self, idx):
        # Load and format data for your architecture
        return data, labels
```

4. Processing Hooks​

```python
from ign_lidar.core import LiDARProcessor

class CustomProcessor(LiDARProcessor):
    def post_feature_hook(self, points, features):
        # Custom logic after feature computation
        return points, features

    def pre_patch_hook(self, points):
        # Custom logic before patch generation
        return points
```

5. Custom Configuration​

```yaml
# configs/custom/my_workflow.yaml
defaults:
  - /preset: quality
  - /processor: gpu
  - _self_

# Custom parameters
custom_param: true
my_threshold: 0.85

# Override preset values
features:
  use_rgb: true
  compute_ndvi: true
  custom_feature: true # Your extension
```

πŸ—οΈ Design Principles​

The v2.0 architecture follows these principles:

1. Modularity​

  • Clear separation of concerns
  • Independent, testable modules
  • Minimal inter-module dependencies

2. Composability​

  • Mix and match presets and configs
  • Hierarchical configuration
  • Override any parameter

3. Extensibility​

  • Plugin architecture for features
  • Custom preprocessing pipelines
  • Multiple dataset formats

4. Performance​

  • GPU acceleration throughout
  • Parallel processing
  • Memory-efficient chunking

5. Reliability​

  • Comprehensive error handling
  • Automatic recovery
  • Detailed logging

6. Usability​

  • Sensible defaults
  • Progressive disclosure
  • Backward compatibility

πŸ”„ Evolution from v1.x​

| Aspect | v1.x | v2.0 | Improvement |
| --- | --- | --- | --- |
| Structure | Flat, monolithic | Modular, 6 modules | Better organization |
| CLI | Legacy only | Hydra + Legacy | Modern + backward compat |
| Pipeline | Multi-step | Unified single-step | Simpler workflow |
| Config | Command args | Hydra hierarchical | More flexible |
| Features | Per-tile | Boundary-aware option | No edge artifacts |
| GPU | Basic | Chunked + optimized | 90%+ utilization |
| Extensibility | Limited | Plugin architecture | Easy customization |
| Testing | Limited | Comprehensive suite | Higher quality |

πŸ“š Next Steps​


The modular v2.0 architecture provides the flexibility and performance needed for production LiDAR processing workflows. πŸš€