SIMSHIFT

What is `SIMSHIFT`?

SIMSHIFT is a benchmark designed to evaluate Unsupervised Domain Adaptation (UDA) methods for neural surrogates of physical simulations. It targets real-world industrial scenarios and provides predefined distribution shifts across parameter configurations in mesh-based PDE simulations.

The library contains dataloaders, baseline models, unsupervised domain adaptation algorithms, and model selection strategies.

Note

For more details on SIMSHIFT, check out our preprint.

Datasets

SIMSHIFT includes four practical datasets with predefined distribution shifts. All datasets are publicly hosted on Hugging Face.

Hot Rolling: a metal slab plastically deformed into a sheet metal product.
Sheet Metal Forming: a metal sheet, supported at the ends and center, deformed by a holder and a punch.
Electric Motor: a structural FEM simulation of a rotor in electric machinery, subjected to mechanical loading at burst speed.
Heatsink: a CFD simulation of the thermal performance of heat sinks, which are commonly used in electronic cooling applications.

Dataset	Origin	Samples	Avg. # nodes	Varied simulation params	Dim	Size (GB)
Rolling	Metallurgy	5,000	508	4	2D	3.1
Forming	Manufacturing	4,000	9,080	4	2D	19
Motor	Machinery	3,195	4,846	15	2D	15
Heatsink	Electronics	512	4,443,114	4	3D	520

Models

SIMSHIFT implements several machine learning models that are commonly used in AI for simulation.

PointNet [Qi et al., 2017] integrates global context by aggregating local features from all input points into a shared global representation.
GraphSAGE [Hamilton et al., 2017] is a graph neural network that captures local information via message passing. It is well suited to complex meshes but can be computationally expensive.
Transolver [Wu et al., 2024] is a state-of-the-art Transformer-based model that uses Physics-Attention to capture complex geometries and long-range interactions.
UPT [Alkin et al., 2024] is a state-of-the-art neural operator focused on scalability. It represents fields in a latent space and directly learns latent dynamics.
GINO [Li et al., 2023] is a state-of-the-art neural operator that operates on a regular latent grid in the frequency domain to capture global interactions.
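
To make the global aggregation used by PointNet concrete, here is a minimal sketch in plain Python. It is not the SIMSHIFT implementation: the function names `shared_layer` and `pointnet_features` and the toy weights are hypothetical. A single layer is applied to every point with shared weights, the per-point features are max-pooled into one global vector, and that vector is concatenated back onto each point's local features.

```python
def shared_layer(point, weights, bias):
    """Apply one linear layer + ReLU to a single point.

    The same weights are reused for every point (hypothetical toy layer).
    """
    return [max(0.0, sum(w * x for w, x in zip(row, point)) + b)
            for row, b in zip(weights, bias)]


def pointnet_features(points, weights, bias):
    """Return (per-point features with global context, global feature)."""
    local = [shared_layer(p, weights, bias) for p in points]
    # Max pooling over points is symmetric, so the global feature
    # does not depend on the order of the input points.
    global_feat = [max(column) for column in zip(*local)]
    # Give every point access to the global context by concatenation.
    per_point = [loc + global_feat for loc in local]
    return per_point, global_feat


# Toy 2D point cloud and a 2 -> 3 layer (illustrative values only).
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.0, -1.0]

per_point, global_feat = pointnet_features(points, weights, bias)
```

Because the pooling step is a maximum over points, reordering `points` leaves `global_feat` unchanged, which is what makes this kind of aggregation usable on unordered mesh nodes.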

UDA Methods

SIMSHIFT implements several unsupervised domain adaptation (UDA) methods to address distribution shifts between source and target simulation domains.

Correlation Alignment (DeepCORAL) [Sun and Saenko, 2016] aligns the covariance of the source and target feature representations.
Central Moment Discrepancy (CMD) [Zellinger et al., 2017] aligns central moments (mean, variance, skewness, etc.) of the source and target feature representations.
Domain-Adversarial Neural Network (DANN) [Ganin et al., 2016] introduces a domain classifier and adversarial loss that encourage the feature encoder to learn domain-invariant features.
DARE-GRAM [Nejjar et al., 2023] aligns a selected low-rank subspace of the pseudo-inverse Gram matrices of source and target features.
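
The two moment-matching losses above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the SIMSHIFT code: the function names are hypothetical, features are plain lists of rows (one row per sample), and the CMD variant assumes features lie in [0, 1] so that the interval scaling factors of the original formulation equal 1. A practical implementation would use a tensor library so the losses are differentiable.

```python
import math


def mean_vec(X):
    """Per-feature mean of X (rows = samples, columns = features)."""
    return [sum(col) / len(X) for col in zip(*X)]


def covariance(X):
    """Unbiased feature covariance matrix of X."""
    n, mu = len(X), mean_vec(X)
    centered = [[x - m for x, m in zip(row, mu)] for row in X]
    d = len(mu)
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def coral_loss(Xs, Xt):
    """Squared Frobenius distance between covariances, scaled by 1 / (4 d^2)."""
    Cs, Ct = covariance(Xs), covariance(Xt)
    d = len(Cs)
    sq = sum((Cs[i][j] - Ct[i][j]) ** 2 for i in range(d) for j in range(d))
    return sq / (4 * d * d)


def central_moment(X, k):
    """Per-feature k-th central moment of X."""
    mu = mean_vec(X)
    return [sum((row[j] - mu[j]) ** k for row in X) / len(X)
            for j in range(len(mu))]


def l2(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cmd_loss(Xs, Xt, K=3):
    """Mean difference plus differences of central moments up to order K.

    Assumes features in [0, 1], so no interval scaling is applied.
    """
    loss = l2(mean_vec(Xs), mean_vec(Xt))
    for k in range(2, K + 1):
        loss += l2(central_moment(Xs, k), central_moment(Xt, k))
    return loss


# Toy source and target features (illustrative values only).
source = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.9]]
target = [[0.6, 0.1], [0.8, 0.2], [0.9, 0.7]]
# A pure mean shift of the source features.
shifted = [[x + 0.05 for x in row] for row in source]

print(coral_loss(source, target), cmd_loss(source, target))
print(coral_loss(source, shifted), cmd_loss(source, shifted))
```

The `shifted` example shows a difference between the two losses: a pure mean shift leaves the covariance unchanged, so the CORAL term stays at (numerically) zero, while CMD penalizes it through its first-order term.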

Model Selection

SIMSHIFT supports several model selection strategies for the unsupervised setting (no labels in the target domain).

Deep Embedded Validation (DEV) [You et al., 2019] estimates the target-domain risk by reweighting source validation losses with density ratios computed in feature space, and uses a control variate to reduce the variance of that estimate.
Importance Weighted Validation (IWV) [Sugiyama et al., 2007] estimates model performance on the target domain by reweighting the source validation loss with learned importance weights between the source and target feature distributions.
Source Best (SB) chooses the model with the lowest average validation loss on the source domain. This is a naive baseline that assumes source domain performance correlates with target performance.
Target Best (TB) selects the best model per sample using ground-truth target losses (oracle). This method is not available during real-world deployment but serves as an upper bound for model selection performance.
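
As a rough illustration of importance-weighted validation, the sketch below reweights per-sample source validation losses by an estimated density ratio. It is not the SIMSHIFT implementation. It assumes the weights come from a domain classifier that outputs, for each source validation sample, the probability of belonging to the target domain (strictly below 1); the function names and the numbers are hypothetical.

```python
def importance_weights(p_target, n_source, n_target):
    """Density-ratio estimate from domain classifier outputs.

    w(x) = (n_source / n_target) * p(target | x) / p(source | x),
    assuming every probability is strictly below 1.
    """
    ratio = n_source / n_target
    return [ratio * p / (1.0 - p) for p in p_target]


def iwv_risk(source_val_losses, weights):
    """Importance-weighted mean of the source validation losses."""
    weighted = [w * loss for w, loss in zip(weights, source_val_losses)]
    return sum(weighted) / len(source_val_losses)


# Hypothetical classifier outputs and losses for three validation samples.
p_target = [0.2, 0.5, 0.8]
losses = [0.10, 0.20, 0.40]

weights = importance_weights(p_target, n_source=100, n_target=100)
risk = iwv_risk(losses, weights)
plain_mean = sum(losses) / len(losses)
print(risk, plain_mean)
```

Samples that look like target data receive larger weights, so here the weighted estimate is higher than the plain source mean because the highest loss falls on the most target-like sample.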

Each model selection algorithm in SIMSHIFT returns a weight vector over candidate models and can be plugged into ensemble evaluation or winner-takes-all prediction modes.
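
The sketch below shows how such a weight vector might be consumed in the two prediction modes. The actual SIMSHIFT interface is not shown in this section, so the function names `ensemble_predict` and `winner_takes_all`, and the toy predictions, are hypothetical.

```python
def ensemble_predict(predictions, weights):
    """Weighted average of candidate predictions.

    predictions: one list of outputs per candidate model.
    weights: one non-negative weight per candidate model.
    """
    total = sum(weights)
    n_outputs = len(predictions[0])
    return [sum(w * pred[i] for w, pred in zip(weights, predictions)) / total
            for i in range(n_outputs)]


def winner_takes_all(predictions, weights):
    """Return the prediction of the single highest-weighted model."""
    best = max(range(len(weights)), key=lambda m: weights[m])
    return predictions[best]


# Three candidate models predicting two output values each (toy numbers).
predictions = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [0.25, 0.75, 0.0]

print(ensemble_predict(predictions, weights))   # [2.5, 3.5]
print(winner_takes_all(predictions, weights))   # [3.0, 4.0]
```

With a one-hot weight vector the two modes coincide, which is why a single interface can serve strategies such as Source Best as well as weighted ensembles.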

Getting Started