
Tile Stitching for Multi-Tile Datasets

Seamlessly combine multiple LiDAR tiles into unified training datasets with automatic neighbor detection and consistency management.


🎯 What is Tile Stitching?​

Tile stitching enables processing multiple adjacent LiDAR tiles as a single cohesive dataset, automatically handling:

  • Neighbor tile detection
  • Coordinate system consistency
  • Cross-tile patch sampling
  • Metadata unification
  • Dataset-level statistics

🚀 Quick Start

Basic Stitching (V5 Configuration)​

# Process multiple tiles as unified dataset
ign-lidar-hd process \
  input_dir=data/raw/ \
  output_dir=output/ \
  stitching.enabled=true

With Boundary-Aware Features​

# Combine stitching with boundary-aware processing
ign-lidar-hd process \
  input_dir=data/raw/ \
  output_dir=output/ \
  stitching.enabled=true \
  processor.patch_overlap=0.1 \
  features.boundary_aware=true \
  features.buffer_size=5.0

📊 How It Works

1. Tile Discovery​

2. Neighbor Detection​

The system automatically detects tile relationships:

Tile Grid (1000 m tiles):
┌─────┬─────┬─────┐
│ A-1 │ A-2 │ A-3 │  Tile naming: X_Y coordinates
├─────┼─────┼─────┤
│ B-1 │ B-2 │ B-3 │  Example: 1234_5678.laz
├─────┼─────┼─────┤
│ C-1 │ C-2 │ C-3 │
└─────┴─────┴─────┘

Each tile knows its 8 potential neighbors:
N, S, E, W, NE, NW, SE, SW
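
As a rough illustration of grid-based neighbor detection, the sketch below derives the eight candidate neighbors from an X_Y.laz file name and keeps those that are present. It is not the library's implementation: `parse_tile_name` and `find_neighbors` are hypothetical helpers, and the sketch assumes that X and Y are integer tile indices without zero padding and that Y increases toward the north.

```python
import re

# Offsets (dx, dy) in tile units for the eight compass directions.
# Assumption: Y increases toward the north.
NEIGHBOR_OFFSETS = {
    "N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0),
    "NE": (1, 1), "NW": (-1, 1), "SE": (1, -1), "SW": (-1, -1),
}

TILE_NAME = re.compile(r"^(\d+)_(\d+)\.laz$")


def parse_tile_name(filename):
    """Return the (x, y) grid index encoded in an X_Y.laz file name."""
    match = TILE_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected tile name: {filename!r}")
    return int(match.group(1)), int(match.group(2))


def find_neighbors(filename, available):
    """Map each direction to the neighboring file that exists in `available`."""
    x, y = parse_tile_name(filename)
    present = set(available)
    neighbors = {}
    for direction, (dx, dy) in NEIGHBOR_OFFSETS.items():
        candidate = f"{x + dx}_{y + dy}.laz"  # no zero padding assumed
        if candidate in present:
            neighbors[direction] = candidate
    return neighbors


tiles = ["1234_5678.laz", "1234_5679.laz", "1235_5678.laz", "1235_5679.laz"]
print(find_neighbors("1234_5678.laz", tiles))
```

For the corner tile 1234_5678.laz in this 2×2 example, only the N, E, and NE neighbors exist.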

3. Unified Processing​

# Unified dataset creation
1. Load all tiles in directory
2. Compute global statistics (mean, std, etc.)
3. Apply consistent normalization
4. Generate patches respecting tile boundaries
5. Create unified metadata
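
Steps 2 and 3 can be pictured with a small, dependency-free sketch: statistics are accumulated over every tile first, and the same mean and standard deviation are then applied to each tile. The function names and the toy elevation values are hypothetical and only illustrate the idea of shared normalization, not the tool's actual code.

```python
import math


def global_mean_std(tiles):
    """Compute one mean and standard deviation over the values of all tiles."""
    count = 0
    total = 0.0
    total_sq = 0.0
    for values in tiles.values():
        for v in values:
            count += 1
            total += v
            total_sq += v * v
    mean = total / count
    # Population variance; clamp tiny negative values caused by rounding.
    variance = max(total_sq / count - mean * mean, 0.0)
    return mean, math.sqrt(variance)


def normalize(values, mean, std):
    """Standardize values with the shared statistics."""
    scale = std if std > 0 else 1.0  # avoid division by zero
    return [(v - mean) / scale for v in values]


# Toy per-tile elevations in meters (hypothetical data).
tiles = {
    "1234_5678": [10.0, 12.0, 14.0],
    "1234_5679": [20.0, 22.0, 24.0],
}
mean, std = global_mean_std(tiles)
normalized = {name: normalize(values, mean, std) for name, values in tiles.items()}
print(mean, std)
```

Because both tiles share one mean and standard deviation, the offset between them survives normalization; per-tile normalization would have mapped both tiles to the same values.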

⚙️ Configuration (V5)

Complete V5 Configuration​

# config.yaml (V5)
defaults:
  - base/processor
  - base/features
  - base/data_sources
  - base/output
  - base/monitoring
  - _self_

# Tile stitching configuration
stitching:
  enabled: true

  # Tile grid settings
  auto_detect_neighbors: true
  tile_size: 1000.0            # Expected tile size in meters
  naming_pattern: "{x}_{y}"    # Tile naming convention

  # Buffer settings
  buffer_size: 0.0             # No buffer by default
  overlap_handling: "average"  # How to handle overlaps: average, first, last

  # Processing options
  unified_normalization: true  # Use global statistics
  cross_tile_features: false   # Compute features across tile boundaries

# Processor overlap (for boundary-aware processing)
processor:
  patch_size: 100.0
  patch_overlap: 0.1           # 10% overlap between patches

# Boundary-aware features (optional)
features:
  boundary_aware: false        # Enable cross-tile feature computation
  buffer_size: 5.0             # Buffer zone for boundary features (meters)

# Output settings
output:
  save_stitched_metadata: true # Save tile relationship metadata
  include_tile_id: true        # Include source tile ID in patches

Key Parameters​

| Parameter | Type | Default | Description |
|---|---|---|---|
| stitching.enabled | bool | false | Enable tile stitching |
| stitching.tile_size | float | 1000.0 | Expected tile size in meters |
| stitching.overlap_tolerance | float | 10 | Allowed overlap in meters |
| stitching.min_points | int | 1000 | Minimum points per tile |
| stitching.neighbor_search | string | auto | Neighbor detection method |

Neighbor Search Methods​

# Automatic detection (recommended)
stitching.neighbor_search=auto

# Grid-based (faster, assumes regular grid)
stitching.neighbor_search=grid

# Distance-based (flexible, handles irregular layouts)
stitching.neighbor_search=distance

📁 Output Structure

Without Stitching (Default)​

output/
├── patches/
│   ├── tile_1234_5678/
│   │   ├── patch_0000.npy
│   │   ├── patch_0001.npy
│   │   └── ...
│   ├── tile_1234_5679/
│   │   ├── patch_0000.npy
│   │   └── ...
│   └── ...
└── metadata/
    ├── tile_1234_5678.json
    └── tile_1234_5679.json

With Stitching (Unified)​

output/
├── patches/
│   ├── patch_0000.npy    ← All tiles combined
│   ├── patch_0001.npy
│   ├── ...
│   └── patch_9999.npy
├── metadata.json         ← Unified metadata
├── tile_index.json       ← Tile→patch mapping
└── stitching_info.json   ← Neighbor relationships

🎯 Use Cases​

1. Regional Datasets​

Process an entire region as one dataset:

# Download Paris city center
ign-lidar-hd download \
  bbox="2.3,48.85,2.4,48.9" \
  output_dir=data/paris/

# Process as unified dataset
ign-lidar-hd process \
  input_dir=data/paris/ \
  output_dir=output/paris_dataset/ \
  stitching.enabled=true \
  features=full

2. Building-Spanning Datasets​

Handle buildings that span multiple tiles:

# Large building dataset
ign-lidar-hd process \
  input_dir=data/buildings/ \
  output_dir=output/buildings_dataset/ \
  stitching.enabled=true \
  features.boundary_aware=true \
  target_class=building

3. Continuous Landscapes​

Process forests, coastlines, or other continuous features:

# Forest landscape
ign-lidar-hd process \
  input_dir=data/forest/ \
  output_dir=output/forest_dataset/ \
  stitching.enabled=true \
  features.boundary_aware=true \
  target_class=vegetation

🔧 Advanced Usage

Python API​

from ign_lidar.core import LiDARProcessor, TileStitcher
from ign_lidar.io import TileManager

# Initialize stitcher
stitcher = TileStitcher(
    tile_size=1000.0,
    overlap_tolerance=10.0,
    neighbor_search="auto",
)

# Load and analyze tiles
tile_manager = TileManager(input_dir="data/raw/")
tiles = tile_manager.load_all()

# Build tile grid
grid = stitcher.build_grid(tiles)
print(f"Grid: {grid.shape}, {len(tiles)} tiles")

# Get neighbors for a tile
tile = tiles[0]
neighbors = stitcher.get_neighbors(tile, grid)
print(f"Tile {tile.name} has {len(neighbors)} neighbors")

# Process with stitching
processor = LiDARProcessor(
    stitching_enabled=True,
    boundary_aware=True,
)
dataset = processor.process(tiles)

Custom Tile Layouts​

Handle non-standard tile arrangements:

from ign_lidar.core import TileStitcher

# Define custom tile positions
tile_positions = {
    "tile_A": (0, 0),
    "tile_B": (1000, 0),
    "tile_C": (0, 1000),
    "tile_D": (1200, 500),  # Irregular position
}

# Create stitcher with custom layout
stitcher = TileStitcher(
    neighbor_search="distance",
    max_neighbor_distance=1500,
)

grid = stitcher.build_grid_from_positions(tile_positions)
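
To make the effect of a distance threshold concrete, here is a standalone sketch that applies plain Euclidean distance to the positions above. `neighbors_within` is a hypothetical helper, not part of `ign_lidar`, and the sketch assumes every position refers to the same reference point of its tile (for example the lower-left corner), in meters.

```python
import math

tile_positions = {
    "tile_A": (0, 0),
    "tile_B": (1000, 0),
    "tile_C": (0, 1000),
    "tile_D": (1200, 500),  # Irregular position
}


def neighbors_within(positions, max_distance):
    """Return, for each tile, the sorted names of tiles within max_distance."""
    result = {name: [] for name in positions}
    for name, (x, y) in positions.items():
        for other, (ox, oy) in positions.items():
            if other != name and math.hypot(ox - x, oy - y) <= max_distance:
                result[name].append(other)
        result[name].sort()
    return result


for name, found in neighbors_within(tile_positions, 1500).items():
    print(name, found)
```

With a 1500 m threshold every tile in this layout is a neighbor of every other tile, since the largest separation (tile_B to tile_C, about 1414 m) is still below the limit. Lowering the threshold to 1350 m would drop that diagonal pair while keeping tile_D connected to all three.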

📈 Performance Considerations

Memory Usage​

Memory = n_tiles × tile_memory + stitching_overhead

Example (10 tiles, 100 MB each):
Without stitching: 100 MB (1 tile at a time)
With stitching: 1,200 MB (all tiles + overhead)
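
The arithmetic behind this example can be reproduced with a few lines of Python. The 200 MB overhead is not stated explicitly; it is inferred here from the example (1,200 MB total minus 10 × 100 MB of tile data), and `peak_memory_mb` is a hypothetical helper.

```python
def peak_memory_mb(n_tiles, tile_memory_mb, stitching_overhead_mb, stitching=True):
    """Estimate peak memory in MB using the formula above."""
    if not stitching:
        # Tiles are processed one at a time, so only one is resident.
        return tile_memory_mb
    return n_tiles * tile_memory_mb + stitching_overhead_mb


# Overhead of 200 MB is an assumption inferred from the example figures.
print(peak_memory_mb(10, 100, 200, stitching=False))
print(peak_memory_mb(10, 100, 200, stitching=True))
```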

Processing Time​

| Mode | Time | Memory | Output Quality |
|---|---|---|---|
| No stitching | 1.0x | Low | Per-tile |
| Stitching only | 1.1x | Medium | Unified |
| + Boundary-aware | 1.3x | High | Best |

Optimization Strategies​

# 1. Process in batches
stitching.batch_size=5 # Process 5 tiles at a time

# 2. Reduce buffer for boundary-aware
features.buffer_size=3.0 # Smaller buffer = less memory

# 3. Use grid-based neighbor search
stitching.neighbor_search=grid # Faster for regular grids

# 4. Filter small tiles
stitching.min_points=5000 # Skip tiles with <5k points
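
One way to picture what a batch size means for memory is the chunking sketch below. It only shows how a tile list splits into groups of at most `batch_size`; how the tool itself treats neighbors that fall into different batches is not described here, and `batched` is a hypothetical helper.

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Twelve hypothetical tile names split into batches of five.
tile_names = [f"tile_{i:02d}" for i in range(12)]
batches = list(batched(tile_names, 5))
print([len(batch) for batch in batches])
```

With 100 MB tiles, each batch of five would hold roughly 500 MB of tile data at a time instead of 1,200 MB for all twelve.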

📊 Stitching Metadata

tile_index.json​

Maps patches back to source tiles:

{
  "patch_0000.npy": {
    "source_tiles": ["tile_1234_5678"],
    "bbox": [2.3, 48.85, 2.301, 48.851],
    "num_points": 2048
  },
  "patch_0001.npy": {
    "source_tiles": ["tile_1234_5678", "tile_1234_5679"],
    "bbox": [2.3005, 48.85, 2.3015, 48.851],
    "num_points": 2048,
    "crosses_boundary": true
  }
}
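
A short sketch of how this index might be consumed: it lists the patches flagged as crossing a boundary and inverts the mapping from patches to tiles. The sample is embedded as a string so the snippet runs on its own; the helper names are hypothetical, and the `crosses_boundary` key is treated as optional because it is absent from the single-tile entry in the sample.

```python
import json

# Sample copied from the tile_index.json example.
TILE_INDEX_JSON = """
{
  "patch_0000.npy": {
    "source_tiles": ["tile_1234_5678"],
    "bbox": [2.3, 48.85, 2.301, 48.851],
    "num_points": 2048
  },
  "patch_0001.npy": {
    "source_tiles": ["tile_1234_5678", "tile_1234_5679"],
    "bbox": [2.3005, 48.85, 2.3015, 48.851],
    "num_points": 2048,
    "crosses_boundary": true
  }
}
"""


def boundary_patches(index):
    """Names of patches flagged as crossing a tile boundary."""
    return sorted(
        name for name, entry in index.items()
        if entry.get("crosses_boundary", False)  # missing flag counts as False
    )


def patches_by_tile(index):
    """Invert the index: source tile -> sorted list of patch names."""
    by_tile = {}
    for name, entry in index.items():
        for tile in entry["source_tiles"]:
            by_tile.setdefault(tile, []).append(name)
    return {tile: sorted(names) for tile, names in by_tile.items()}


tile_index = json.loads(TILE_INDEX_JSON)
print(boundary_patches(tile_index))
print(patches_by_tile(tile_index))
```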

stitching_info.json​

Documents tile relationships:

{
  "tiles": {
    "tile_1234_5678": {
      "neighbors": {
        "north": "tile_1234_5679",
        "east": "tile_1235_5678",
        "northeast": "tile_1235_5679"
      },
      "position": [1234000, 5678000],
      "num_points": 1234567
    }
  },
  "grid": {
    "shape": [3, 3],
    "tile_size": 1000.0,
    "total_tiles": 9
  }
}
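
Neighbor relationships should normally be symmetric: if tile A lists B to its north, B should list A to its south. The sketch below checks that property on data shaped like the example. Only the keys "north", "east", and "northeast" appear in the sample; the remaining direction names are assumed to follow the same pattern, and `asymmetric_links` is a hypothetical helper.

```python
# Direction names other than north, east, and northeast are assumptions.
OPPOSITE = {
    "north": "south", "south": "north",
    "east": "west", "west": "east",
    "northeast": "southwest", "southwest": "northeast",
    "northwest": "southeast", "southeast": "northwest",
}


def asymmetric_links(info):
    """Return (tile, direction, neighbor) triples lacking a matching back-link."""
    tiles = info["tiles"]
    problems = []
    for name, data in tiles.items():
        for direction, neighbor in data["neighbors"].items():
            other = tiles.get(neighbor)
            if other is None:
                continue  # neighbor is not described in the file; nothing to compare
            if other["neighbors"].get(OPPOSITE[direction]) != name:
                problems.append((name, direction, neighbor))
    return problems


# Toy data: the second tile is missing its "south" back-link.
info = {
    "tiles": {
        "tile_1234_5678": {"neighbors": {"north": "tile_1234_5679"}},
        "tile_1234_5679": {"neighbors": {}},
    }
}
print(asymmetric_links(info))
```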

✅ Best Practices

Data Preparation​

  1. Consistent naming: Use standard IGN naming (X_Y.laz)
  2. Complete coverage: Ensure no missing tiles in region
  3. Same CRS: All tiles must use same coordinate system
  4. Same format: Consistent LAZ version and point format
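
The first two points can be checked before processing with a small script. This sketch flags file names that do not match X_Y.laz and lists tiles missing from the rectangular extent of the tiles that are present. `check_tiles` is a hypothetical helper; it assumes integer tile indices without zero padding, and a region that is not rectangular will legitimately report gaps.

```python
import re

TILE_NAME = re.compile(r"^(\d+)_(\d+)\.laz$")


def check_tiles(filenames):
    """Return (badly_named, missing) for a list of file names."""
    coords = set()
    badly_named = []
    for name in filenames:
        match = TILE_NAME.match(name)
        if match:
            coords.add((int(match.group(1)), int(match.group(2))))
        else:
            badly_named.append(name)

    missing = []
    if coords:
        xs = [x for x, _ in coords]
        ys = [y for _, y in coords]
        # Walk the bounding rectangle of the tiles that were found.
        for x in range(min(xs), max(xs) + 1):
            for y in range(min(ys), max(ys) + 1):
                if (x, y) not in coords:
                    missing.append(f"{x}_{y}.laz")
    return badly_named, missing


files = ["1234_5678.laz", "1235_5678.laz", "1234_5679.laz", "notes.txt"]
badly_named, missing = check_tiles(files)
print("Badly named:", badly_named)
print("Missing:", missing)
```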

Configuration​

# Recommended for most cases
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  stitching.enabled=true \
  stitching.neighbor_search=auto \
  features.boundary_aware=true \
  features.buffer_size=5.0

Quality Checks​

# Verify stitching quality
import json

# Load stitching info
with open("output/stitching_info.json") as f:
    info = json.load(f)

# Check tile connectivity
for tile, data in info["tiles"].items():
    n_neighbors = len(data["neighbors"])
    print(f"{tile}: {n_neighbors} neighbors")

# Expected on a complete regular grid: interior tiles have 8 neighbors,
# edge tiles have 5, and corner tiles have 3

🐛 Troubleshooting

Missing Neighbors​

# Error: Expected neighbor not found
# Solution: Check tile naming and coverage

# List tiles
ls -la data/raw/*.laz

# Expected pattern: XXXX_YYYY.laz (for example 1234_5678.laz)

Memory Errors​

# Out of memory error
# Solution: Process in batches

stitching.batch_size=3 # Smaller batches

Coordinate Mismatches​

# Error: Tiles don't align
# Solution: Verify CRS consistency

# Check CRS for all tiles
for f in data/raw/*.laz; do
  pdal info --metadata "$f" | grep "srs"
done

Slow Processing​

# Use grid-based search for regular layouts
stitching.neighbor_search=grid

# Or disable boundary-aware processing
features.boundary_aware=false

🎓 Complete Example

Regional Building Classification​

# 1. Download region
ign-lidar-hd download \
  bbox="2.35,48.86,2.37,48.88" \
  output_dir=data/le_marais/

# 2. Process with full stitching
ign-lidar-hd process \
  input_dir=data/le_marais/ \
  output_dir=output/le_marais_dataset/ \
  stitching.enabled=true \
  stitching.neighbor_search=auto \
  features.boundary_aware=true \
  features.buffer_size=5.0 \
  features=full \
  target_class=building \
  preprocess=aggressive

# 3. Verify output (plain ls avoids counting the "total" line that ls -l prints)
ls output/le_marais_dataset/patches/ | wc -l
jq '.grid' output/le_marais_dataset/stitching_info.json

# 4. Train model on unified dataset
python train.py \
  --data output/le_marais_dataset/ \
  --architecture pointnet++ \
  --epochs 100


Next Steps: