# Multi-Architecture Dataset Support

Generate datasets optimized for different deep learning architectures from a single processing pipeline.


## 🎯 What is Multi-Architecture Support?

Multi-architecture support enables you to create datasets tailored for different ML architectures (PointNet++, Octree-based networks, Transformers, Sparse CNNs) without reprocessing the raw LiDAR data.

### Supported Architectures

| Architecture | Format | Best For | Memory |
| --- | --- | --- | --- |
| PointNet++ | Raw points | General purpose | Medium |
| Octree-based | Octree | Large-scale scenes | Low |
| Transformer | Point tokens | High accuracy | High |
| Sparse Convolution | Voxel grid | Fast inference | Medium |

## 🚀 Quick Start

### Generate for a Specific Architecture

```bash
# PointNet++ (default)
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  architecture=pointnet++

# Octree-based networks
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  architecture=octree

# Transformer networks
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  architecture=transformer

# Sparse convolutional networks
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  architecture=sparse_conv
```

### Generate Multiple Formats

```bash
# Create datasets for multiple architectures
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  architecture=all
```

## 📊 Architecture Details

### PointNet++ (Default)

**Format:** Raw point clouds with per-point features

```python
# Output format
{
    'points': np.ndarray,    # (N, 3) XYZ coordinates
    'features': np.ndarray,  # (N, F) per-point features
    'labels': np.ndarray,    # (N,) per-point labels
    'normals': np.ndarray    # (N, 3) normal vectors
}
```
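
To sanity-check a generated patch, a quick look with NumPy is usually enough. The snippet below is a minimal sketch that assumes patches are saved as `.npz` archives with the keys listed above; the file name `patch_0001.npz` is hypothetical:

```python
import numpy as np

# Load one patch (hypothetical path; adjust to your output layout).
patch = np.load("output/patches/patch_0001.npz")

points = patch["points"]      # (N, 3) XYZ coordinates
features = patch["features"]  # (N, F) per-point features
labels = patch["labels"]      # (N,) per-point labels

print(points.shape, features.shape, labels.shape)
```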

**Configuration:**

```bash
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  architecture=pointnet++ \
  architecture.num_points=2048 \
  architecture.use_normals=true \
  architecture.sampling=fps  # Farthest Point Sampling
```
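
For intuition about `architecture.sampling=fps`, here is a plain-NumPy sketch of farthest point sampling. It illustrates the algorithm only and is not the library's implementation:

```python
import numpy as np

def farthest_point_sampling(points: np.ndarray, num_samples: int) -> np.ndarray:
    """Greedily pick points that maximize coverage (illustrative only)."""
    n = points.shape[0]
    selected = np.zeros(num_samples, dtype=np.int64)
    min_dist = np.full(n, np.inf)  # squared distance to nearest selected point
    selected[0] = np.random.randint(n)
    for i in range(1, num_samples):
        diff = points - points[selected[i - 1]]
        min_dist = np.minimum(min_dist, np.einsum("ij,ij->i", diff, diff))
        selected[i] = np.argmax(min_dist)  # farthest from all selected so far
    return selected

pts = np.random.rand(10000, 3)
idx = farthest_point_sampling(pts, 2048)  # matches num_points=2048 above
sampled = pts[idx]
```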

**Best for:**

- General-purpose point cloud classification
- Building detection and segmentation
- Moderate-sized datasets

### Octree-Based

**Format:** Hierarchical octree structure

```python
# Output format
{
    'octree': OctreeNode,  # Hierarchical structure
    'depth': int,          # Maximum depth
    'features': Dict,      # Features per node
    'labels': Dict         # Labels per node
}
```

**Configuration:**

```bash
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  architecture=octree \
  architecture.max_depth=8 \
  architecture.min_points_per_node=10 \
  architecture.full_depth=5
```
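
To see how `max_depth` and `min_points_per_node` interact, the following is a self-contained octree construction sketch. The `Node` class is hypothetical and unrelated to the library's `OctreeNode`:

```python
import numpy as np

class Node:
    """Illustrative octree node (not the library's OctreeNode)."""
    def __init__(self, indices, depth):
        self.indices = indices  # indices of the points inside this cell
        self.depth = depth
        self.children = []

def build_octree(points, indices, center, half, depth,
                 max_depth=8, min_points=10):
    node = Node(indices, depth)
    # Stop splitting at max_depth or when the cell is sparse enough.
    if depth == max_depth or len(indices) <= min_points:
        return node
    # Assign each point to one of 8 octants relative to the cell center.
    octant = ((points[indices] > center) * np.array([1, 2, 4])).sum(axis=1)
    for o in range(8):
        sub = indices[octant == o]
        if len(sub) == 0:
            continue
        offset = (np.array([o & 1, (o >> 1) & 1, (o >> 2) & 1]) - 0.5) * half
        node.children.append(
            build_octree(points, sub, center + offset, half / 2,
                         depth + 1, max_depth, min_points))
    return node

pts = np.random.rand(5000, 3)
root = build_octree(pts, np.arange(len(pts)), np.full(3, 0.5), 0.5, 0)
```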

**Best for:**

- Large-scale urban scenes
- Memory-efficient processing
- Multi-scale analysis

### Transformer

**Format:** Point tokens with positional encoding

```python
# Output format
{
    'tokens': np.ndarray,          # (N, D) point tokens
    'positions': np.ndarray,       # (N, 3) positions
    'attention_mask': np.ndarray,  # (N, N) attention mask
    'labels': np.ndarray           # (N,) labels
}
```

**Configuration:**

```bash
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  architecture=transformer \
  architecture.num_tokens=1024 \
  architecture.token_dim=256 \
  architecture.positional_encoding=learned
```
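
As a rough illustration of `positional_encoding=learned`, a small MLP can map raw XYZ positions to `token_dim`-dimensional embeddings that are added to the point tokens. This PyTorch sketch shows the general idea, not the library's actual module:

```python
import torch
import torch.nn as nn

class LearnedPositionalEncoding(nn.Module):
    """Hypothetical learned positional encoding over XYZ coordinates."""
    def __init__(self, token_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, token_dim),
        )

    def forward(self, positions: torch.Tensor) -> torch.Tensor:
        # positions: (B, N, 3) -> embeddings: (B, N, token_dim)
        return self.mlp(positions)

enc = LearnedPositionalEncoding(token_dim=256)
positions = torch.rand(4, 1024, 3)  # batch of 4 patches
tokens = torch.zeros(4, 1024, 256)  # placeholder point tokens
tokens = tokens + enc(positions)    # add positional information
```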

**Best for:**

- High-accuracy requirements
- Complex scene understanding
- Sufficient GPU memory available

### Sparse Convolutional

**Format:** Voxelized point cloud with sparse tensors

```python
# Output format
{
    'voxels': np.ndarray,          # (V, max_points, 3) points grouped per voxel
    'voxel_features': np.ndarray,  # (V, F) voxel-level features
    'coordinates': np.ndarray,     # (V, 3) voxel grid coordinates
    'labels': np.ndarray           # (V,) voxel labels
}
```

**Configuration:**

```bash
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  architecture=sparse_conv \
  architecture.voxel_size=0.25 \
  architecture.max_points_per_voxel=32 \
  architecture.max_voxels=20000
```
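
The three parameters above fully determine the voxelization. Here is a plain-NumPy sketch that mirrors them; `voxelize` is a hypothetical helper, not the library's implementation:

```python
import numpy as np

def voxelize(points, voxel_size=0.25, max_points_per_voxel=32, max_voxels=20000):
    """Illustrative voxelization matching the parameters above."""
    coords = np.floor(points / voxel_size).astype(np.int32)
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    uniq = uniq[:max_voxels]  # drop voxels beyond the cap
    voxels = np.zeros((len(uniq), max_points_per_voxel, 3), dtype=np.float32)
    counts = np.zeros(len(uniq), dtype=np.int32)
    for p, v in zip(points, inverse):
        if v < len(uniq) and counts[v] < max_points_per_voxel:
            voxels[v, counts[v]] = p  # keep at most max_points_per_voxel points
            counts[v] += 1
    return voxels, uniq, counts

pts = np.random.rand(20000, 3) * 50.0  # synthetic 50 m x 50 m patch
voxels, coordinates, counts = voxelize(pts)
```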

**Best for:**

- Fast inference
- Real-time applications
- Regular grid structures

## 🎯 Use Cases

### Research & Experimentation

Compare architectures on the same dataset:

```bash
# Generate all formats
ign-lidar-hd process \
  input_dir=data/buildings/ \
  output_dir=output/multi_arch/ \
  architecture=all \
  features=full

# Results in:
# output/multi_arch/pointnet++/
# output/multi_arch/octree/
# output/multi_arch/transformer/
# output/multi_arch/sparse_conv/
```

### Production Pipeline

Optimize for a specific deployment target:

```bash
# Fast inference for mobile/edge
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  architecture=sparse_conv \
  architecture.voxel_size=0.5

# High accuracy for cloud processing
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  architecture=transformer \
  architecture.token_dim=512
```

## 🔧 Advanced Configuration

### Custom Architecture Parameters

```yaml
# config/custom_arch.yaml
architecture:
  name: pointnet++
  num_points: 4096
  use_normals: true
  use_colors: true
  sampling: fps
  fps_ratio: 0.25
  ball_query_radius: 0.5
  ball_query_samples: 32
  feature_dimensions: [64, 128, 256, 512]
```

```bash
# Use the custom config
ign-lidar-hd process \
  input_dir=data/ \
  output_dir=output/ \
  --config-name custom_arch
```

### Python API

```python
from ign_lidar.datasets import (
    PointNetPlusDataset,
    OctreeDataset,
    TransformerDataset,
    SparseConvDataset
)

# PointNet++ dataset
dataset = PointNetPlusDataset(
    data_dir="output/patches/",
    num_points=2048,
    use_normals=True,
    augment=True
)

# Octree dataset
octree_dataset = OctreeDataset(
    data_dir="output/patches/",
    max_depth=8,
    min_points=10
)

# Transformer dataset
transformer_dataset = TransformerDataset(
    data_dir="output/patches/",
    num_tokens=1024,
    token_dim=256
)

# Sparse Conv dataset
sparse_dataset = SparseConvDataset(
    data_dir="output/patches/",
    voxel_size=0.25,
    max_voxels=20000
)

# Use with PyTorch DataLoader
from torch.utils.data import DataLoader

loader = DataLoader(
    dataset,
    batch_size=32,
    shuffle=True,
    num_workers=4
)
```
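
Batches can then be consumed as with any PyTorch dataset. The loop below is a sketch that assumes each batch is a dict of tensors keyed like the output format shown earlier:

```python
# Iterate over batches (keys are assumed to match the patch format above).
for batch in loader:
    points = batch["points"]  # (B, N, 3)
    labels = batch["labels"]  # (B, N)
    # forward pass and loss computation go here
    break
```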

## 📈 Performance Comparison

### Processing Time

All values are relative to the PointNet++ baseline.

| Architecture | Time per Tile | Disk Usage | Memory Usage |
| --- | --- | --- | --- |
| PointNet++ | 1.0x | 100% | 100% |
| Octree | 1.3x | 60% | 70% |
| Transformer | 1.2x | 120% | 150% |
| Sparse Conv | 1.4x | 80% | 90% |

### Training Speed

| Architecture | Samples/sec | GPU Memory | Inference Speed |
| --- | --- | --- | --- |
| PointNet++ | 100 | 6 GB | 10 ms |
| Octree | 80 | 4 GB | 8 ms |
| Transformer | 50 | 12 GB | 15 ms |
| Sparse Conv | 120 | 5 GB | 5 ms |

### Accuracy Comparison

Based on a building classification benchmark:

| Architecture | IoU | Precision | Recall | F1 |
| --- | --- | --- | --- | --- |
| PointNet++ | 0.85 | 0.88 | 0.90 | 0.89 |
| Octree | 0.83 | 0.86 | 0.88 | 0.87 |
| Transformer | 0.89 | 0.91 | 0.93 | 0.92 |
| Sparse Conv | 0.86 | 0.89 | 0.90 | 0.89 |

## ✅ Best Practices

### Choosing an Architecture

**Use PointNet++ when:**

- Starting a new project
- General-purpose classification
- Moderate dataset size (<1M points)
- Standard accuracy requirements

**Use Octree when:**

- Processing very large scenes
- Limited memory available
- Need multi-scale features
- Hierarchical reasoning is important

**Use Transformer when:**

- Maximum accuracy needed
- Sufficient GPU memory (12+ GB)
- Complex scene understanding
- Can afford longer training

**Use Sparse Conv when:**

- Fast inference critical
- Deploying to edge devices
- Real-time processing needed
- Regular grid structure present
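
If you want these criteria as a starting point in code, the hypothetical helper below encodes the same decision logic; adjust the thresholds to your own constraints:

```python
def pick_architecture(gpu_memory_gb: float, realtime: bool, large_scene: bool) -> str:
    """Hypothetical helper encoding the selection criteria above."""
    if realtime:
        return "sparse_conv"   # fast inference, edge-friendly
    if large_scene or gpu_memory_gb < 6:
        return "octree"        # memory-efficient, multi-scale
    if gpu_memory_gb >= 12:
        return "transformer"   # maximum accuracy, costly training
    return "pointnet++"        # sensible general-purpose default

print(pick_architecture(gpu_memory_gb=8, realtime=False, large_scene=False))
```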

### Data Augmentation

Different architectures benefit from different augmentations:

```python
# PointNet++: standard point cloud augmentations
augmentations = [
    'random_rotation',
    'random_jitter',
    'random_scaling'
]

# Octree: preserve hierarchy
augmentations = [
    'random_rotation_90',  # Preserve grid alignment
    'random_flip'
]

# Transformer: token-level augmentations
augmentations = [
    'token_dropout',
    'random_masking',
    'feature_mixing'
]

# Sparse Conv: voxel-aware augmentations
augmentations = [
    'random_rotation_90',  # Grid-aligned
    'voxel_dropout',
    'cutmix'
]
```
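
For reference, the PointNet++-style augmentations named above take only a few lines of NumPy. These are illustrative sketches; the dataset classes presumably apply their own versions when `augment=True`:

```python
import numpy as np

def random_rotation(points):
    # Rotate around the vertical (Z) axis by a random angle.
    theta = np.random.uniform(0, 2 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ rot.T

def random_jitter(points, sigma=0.01, clip=0.05):
    # Add small, clipped Gaussian noise to every coordinate.
    return points + np.clip(sigma * np.random.randn(*points.shape), -clip, clip)

def random_scaling(points, low=0.9, high=1.1):
    # Scale the whole cloud by a single random factor.
    return points * np.random.uniform(low, high)

pts = np.random.rand(2048, 3)
pts = random_scaling(random_jitter(random_rotation(pts)))
```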

## 🎓 Complete Example

### Multi-Architecture Experiment

```bash
# 1. Generate datasets for all architectures
ign-lidar-hd process \
  input_dir=data/buildings/ \
  output_dir=output/experiment/ \
  architecture=all \
  features=full \
  target_class=building

# 2. Train models
for arch in pointnet++ octree transformer sparse_conv; do
  python train.py \
    --data output/experiment/$arch/ \
    --architecture $arch \
    --epochs 100 \
    --output models/$arch/
done

# 3. Evaluate
python evaluate.py \
  --models models/ \
  --test_data data/test/ \
  --output results.csv

# 4. Compare results
python plot_comparison.py --results results.csv
```

πŸ› Troubleshooting​

### Out of Memory

```bash
# Reduce points/tokens
architecture.num_points=1024  # PointNet++
architecture.num_tokens=512   # Transformer

# Or use a memory-efficient architecture
architecture=octree
```

### Slow Processing

```bash
# Use a faster architecture
architecture=sparse_conv

# Or reduce complexity
architecture.max_depth=6     # Octree
architecture.voxel_size=0.5  # Sparse Conv
```

### Low Accuracy

```bash
# Increase model capacity
architecture.num_points=4096          # PointNet++
architecture.token_dim=512            # Transformer
architecture.max_points_per_voxel=64  # Sparse Conv

# Or use a high-accuracy architecture
architecture=transformer
```


Next Steps: