
torchgeo.models

Aurora

torchgeo.models.aurora_swin_unet(weights=None, *args, **kwargs)[source]

Aurora model.

If you use this model in your research, please cite the following paper:

This model requires the following additional library to be installed:

New in version 0.8.

Parameters:
  • weights (torchgeo.models.aurora.Aurora_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to aurora.Aurora

  • **kwargs (Any) – Additional keyword arguments to pass to aurora.Aurora

Returns:

An Aurora model.

Return type:

Module
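
A minimal construction sketch (the weight name comes from the Aurora_Weights rows in the Atmospheric table below; building the input batches expected by the underlying aurora package is not shown here):

>>> from torchgeo.models import Aurora_Weights, aurora_swin_unet
>>> # assumes the additional Aurora library noted above is installed
>>> model = aurora_swin_unet(weights=Aurora_Weights.HRES_T0_PRETRAINED_AURORA)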

class torchgeo.models.Aurora_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Aurora weights.

If you use this model in your research, please cite the following paper:

New in version 0.8.

ChangeStar

class torchgeo.models.ChangeStar(dense_feature_extractor, seg_classifier, changemixin, inference_mode='t1t2')[source]

Bases: Module

The base class of the network architecture of ChangeStar.

ChangeStar is composed of an arbitrary segmentation model and a ChangeMixin module. It is mainly used for binary or multi-class change detection under bitemporal or single-temporal supervision. Because it reuses the segmentation architecture, advanced dense prediction networks (e.g., semantic segmentation architectures) can be integrated into change detection with little modification.

For multi-class change detection, semantic change prediction can be inferred by a binary change prediction from the ChangeMixin module and two semantic predictions from the Segmentation model.

If you use this model in your research, please cite the following paper:

__init__(dense_feature_extractor, seg_classifier, changemixin, inference_mode='t1t2')[source]

Initializes a new ChangeStar model.

Parameters:
  • dense_feature_extractor (Module) – module for dense feature extraction, typically a semantic segmentation model without semantic segmentation head.

  • seg_classifier (Module) – semantic segmentation head, typically a convolutional layer followed by an upsampling layer.

  • changemixin (ChangeMixin) – torchgeo.models.ChangeMixin module

  • inference_mode (str) – name of inference mode: 't1t2' | 't2t1' | 'mean'. 't1t2': concatenate bitemporal features in the order t1->t2; 't2t1': concatenate bitemporal features in the order t2->t1; 'mean': the weighted mean of the outputs of 't1t2' and 't2t1'

forward(x)[source]

Forward pass of the model.

Parameters:

x (Tensor) – a bitemporal input tensor of shape [B, T, C, H, W]

Returns:

a dictionary containing bitemporal semantic segmentation logit and binary change detection logit/probability

Return type:

dict[str, torch.Tensor]

class torchgeo.models.ChangeStarFarSeg(backbone='resnet50', classes=1, backbone_pretrained=True)[source]

Bases: ChangeStar

The network architecture of ChangeStar(FarSeg).

ChangeStar(FarSeg) is composed of a FarSeg model and a ChangeMixin module.

If you use this model in your research, please cite the following paper:

__init__(backbone='resnet50', classes=1, backbone_pretrained=True)[source]

Initializes a new ChangeStarFarSeg model.

Parameters:
  • backbone (str) – name of ResNet backbone

  • classes (int) – number of output segmentation classes

  • backbone_pretrained (bool) – whether to use pretrained weight for backbone
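
A minimal usage sketch; the backbone choice and tensor sizes below are illustrative, and the returned dictionary follows ChangeStar.forward() above:

>>> import torch
>>> from torchgeo.models import ChangeStarFarSeg
>>> model = ChangeStarFarSeg(backbone='resnet18', classes=2, backbone_pretrained=False)
>>> x = torch.randn(1, 2, 3, 256, 256)  # bitemporal input of shape [B, T, C, H, W]
>>> out = model(x)  # dict of bitemporal segmentation and binary change outputs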

class torchgeo.models.ChangeMixin(in_channels=256, inner_channels=16, num_convs=4, scale_factor=4.0)[source]

Bases: Module

This module enables any segmentation model to detect binary change.

The common usage is to attach this module on a segmentation model without the classification head.

If you use this model in your research, please cite the following paper:

__init__(in_channels=256, inner_channels=16, num_convs=4, scale_factor=4.0)[source]

Initializes a new ChangeMixin module.

Parameters:
  • in_channels (int) – sum of channels of bitemporal feature maps

  • inner_channels (int) – number of channels of inner feature maps

  • num_convs (int) – number of convolution blocks

  • scale_factor (float) – upsampling factor applied to the output

forward(bi_feature)[source]

Forward pass of the model.

Parameters:

bi_feature (Tensor) – input bitemporal feature maps of shape [b, t, c, h, w]

Returns:

a list of bidirectional output predictions (t1->t2 and t2->t1)

Return type:

list[torch.Tensor]
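
A minimal sketch, assuming two 128-channel temporal feature maps so that their concatenation matches the default in_channels=256:

>>> import torch
>>> from torchgeo.models import ChangeMixin
>>> module = ChangeMixin(in_channels=256, inner_channels=16, num_convs=4, scale_factor=4.0)
>>> bi_feature = torch.randn(1, 2, 128, 32, 32)  # [b, t, c, h, w]; 2 * 128 = in_channels
>>> t1t2, t2t1 = module(bi_feature)  # bidirectional change predictions, upsampled by scale_factor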

Copernicus-FM

class torchgeo.models.CopernicusFM(img_size=224, patch_size=16, drop_rate=0.0, embed_dim=1024, depth=24, num_heads=16, hyper_dim=128, num_classes=0, global_pool=True, mlp_ratio=4.0, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)[source]

Bases: Module

CopernicusFM: VisionTransformer backbone.

Example

1. Spectral Mode (Using Wavelength and Bandwidth):

>>> model = CopernicusFM()
>>> x = torch.randn(1, 4, 224, 224) # input image
>>> metadata = torch.full((1, 4), float('nan')) # [lon (degree), lat (degree), delta_time (days since 1970/1/1), patch_token_area (km^2)], assume unknown
>>> wavelengths = [490, 560, 665, 842] # wavelength (nm): B,G,R,NIR (Sentinel 2)
>>> bandwidths = [65, 35, 30, 115] # bandwidth (nm): B,G,R,NIR (Sentinel 2)
>>> kernel_size = 16 # expected patch size
>>> input_mode = 'spectral'
>>> logit = model(x, metadata, wavelengths=wavelengths, bandwidths=bandwidths, input_mode=input_mode, kernel_size=kernel_size)
>>> print(logit.shape)

2. Variable Mode (Using language embedding):

>>> model = CopernicusFM()
>>> varname = 'Sentinel 5P Nitrogen Dioxide' # variable name (as input to a LLM for language embed)
>>> x = torch.randn(1, 1, 56, 56) # input image
>>> metadata = torch.full((1, 4), float('nan')) # [lon (degree), lat (degree), delta_time (days since 1970/1/1), patch_token_area (km^2)], assume unknown
>>> language_embed = torch.randn(2048) # language embedding: encode varname with a LLM (e.g. Llama)
>>> kernel_size = 4 # expected patch size
>>> input_mode = 'variable'
>>> logit = model(x, metadata, language_embed=language_embed, input_mode=input_mode, kernel_size=kernel_size)
>>> print(logit.shape)
__init__(img_size=224, patch_size=16, drop_rate=0.0, embed_dim=1024, depth=24, num_heads=16, hyper_dim=128, num_classes=0, global_pool=True, mlp_ratio=4.0, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)[source]

Initialize a new CopernicusFM instance.

Parameters:
  • img_size (int) – Input image size.

  • patch_size (int) – Patch size.

  • drop_rate (float) – Head dropout rate.

  • embed_dim (int) – Transformer embedding dimension.

  • depth (int) – Depth of transformer.

  • num_heads (int) – Number of attention heads.

  • hyper_dim (int) – Dimensions of dynamic weight generator.

  • num_classes (int) – Number of classes for classification head.

  • global_pool (bool) – Whether or not to perform global pooling.

  • mlp_ratio (float) – Ratio of MLP hidden dim to embedding dim.

  • norm_layer (type[torch.nn.modules.module.Module]) – Normalization layer.

get_coord_pos_embed(lons, lats, embed_dim)[source]

Geospatial coordinate position embedding.

Parameters:
  • lons (Tensor) – Longitudes (x).

  • lats (Tensor) – Latitudes (y).

  • embed_dim (int) – Embedding dimension.

Returns:

Coordinate position embedding.

Return type:

Tensor

get_area_pos_embed(areas, embed_dim)[source]

Geospatial area position embedding.

Parameters:
  • areas (Tensor) – Spatial areas.

  • embed_dim (int) – Embedding dimension.

Returns:

Area position embedding.

Return type:

Tensor

get_time_pos_embed(times, embed_dim)[source]

Geotemporal position embedding.

Parameters:
  • times (Tensor) – Timestamps.

  • embed_dim (int) – Embedding dimension.

Returns:

Temporal position embedding.

Return type:

Tensor

forward_features(x, metadata, wavelengths=None, bandwidths=None, language_embed=None, input_mode='spectral', kernel_size=None)[source]

Forward pass of the feature embedding layer.

Parameters:
  • x (Tensor) – Input mini-batch.

  • metadata (Tensor) – Longitudes (degree), latitudes (degree), times (days since 1970/1/1), and areas (km^2) of each patch. Use NaN for unknown metadata.

  • wavelengths (collections.abc.Sequence[float] | None) – Wavelengths of each spectral band (nm). Only used if input_mode==’spectral’.

  • bandwidths (collections.abc.Sequence[float] | None) – Bandwidths in nm. Only used if input_mode==’spectral’.

  • language_embed (torch.Tensor | None) – Language embedding tensor from Llama 3.2 1B (length 2048). Only used if input_mode==’variable’.

  • input_mode (Literal['spectral', 'variable']) – One of ‘spectral’ or ‘variable’.

  • kernel_size (int | None) – If provided and differs from the initialized kernel size, the generated patch embed kernel weights are resized accordingly.

Returns:

Output mini-batch.

Return type:

Tensor

forward_head(x, pre_logits=False)[source]

Forward pass of the attention head.

Parameters:
  • x (Tensor) – Input mini-batch.

  • pre_logits (bool) – Whether or not to return the layer before logits are computed.

Returns:

Output mini-batch.

Return type:

Tensor

forward(x, metadata, wavelengths=None, bandwidths=None, language_embed=None, input_mode='spectral', kernel_size=None)[source]

Forward pass of the model.

Parameters:
  • x (Tensor) – Input mini-batch.

  • metadata (Tensor) – Longitudes (degree), latitudes (degree), times (days since 1970/1/1), and areas (km^2) of each patch. Use NaN for unknown metadata.

  • wavelengths (collections.abc.Sequence[float] | None) – Wavelengths of each spectral band (nm). Only used if input_mode==’spectral’.

  • bandwidths (collections.abc.Sequence[float] | None) – Bandwidths in nm. Only used if input_mode==’spectral’.

  • language_embed (torch.Tensor | None) – Language embedding tensor from Llama 3.2 1B (length 2048). Only used if input_mode==’variable’.

  • input_mode (Literal['spectral', 'variable']) – One of ‘spectral’ or ‘variable’.

  • kernel_size (int | None) – If provided and differs from the initialized kernel size, the generated patch embed kernel weights are resized accordingly.

Returns:

Output mini-batch.

Return type:

Tensor

torchgeo.models.copernicusfm_base(weights=None, *args, **kwargs)[source]

CopernicusFM vit-base model.

If you use this model in your research, please cite the following paper:

New in version 0.7.

Parameters:
  • weights (CopernicusFM_Base_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to CopernicusFM.

  • **kwargs (Any) – Additional keyword arguments to pass to CopernicusFM.

Returns:

A CopernicusFM base model.

Return type:

CopernicusFM
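
A minimal construction sketch; the spectral and variable forward modes shown in the CopernicusFM class example above apply unchanged:

>>> from torchgeo.models import CopernicusFM_Base_Weights, copernicusfm_base
>>> model = copernicusfm_base(weights=CopernicusFM_Base_Weights.CopernicusFM_ViT)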

class torchgeo.models.CopernicusFM_Base_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Copernicus-FM-base weights.

CROMA

class torchgeo.models.CROMA(modalities=['sar', 'optical'], encoder_dim=768, encoder_depth=12, num_heads=16, patch_size=8, image_size=120)[source]

Bases: Module

Pretrained CROMA model.

Corresponds to the pretrained CROMA model found in the CROMA repository:

If you use this model in your research, please cite the following paper:

__init__(modalities=['sar', 'optical'], encoder_dim=768, encoder_depth=12, num_heads=16, patch_size=8, image_size=120)[source]

Initialize the CROMA model.

Parameters:
  • modalities (Sequence[str]) – List of modalities used during forward pass, list can contain ‘sar’, ‘optical’, or both.

  • encoder_dim (int) – Dimension of the encoder.

  • encoder_depth (int) – Depth of the encoder.

  • num_heads (int) – Number of heads for the multi-head attention, should be power of 2.

  • patch_size (int) – Size of the patches.

  • image_size (int) – Size of the input images, CROMA was trained on 120x120 images, must be a multiple of 8.

Raises:

AssertionError – If any arguments are not valid.

forward(x_sar=None, x_optical=None)[source]

Forward pass of the CROMA model.

Parameters:
  • x_sar (torch.Tensor | None) – Input mini-batch of SAR images [B, 2, H, W].

  • x_optical (torch.Tensor | None) – Input mini-batch of optical images [B, 12, H, W].

torchgeo.models.croma_base(weights=None, *args, **kwargs)[source]

CROMA base model.

If you use this model in your research, please cite the following paper:

New in version 0.7.

Parameters:
  • weights (torchgeo.models.croma.CROMABase_Weights | None) – Pretrained weights to load.

  • *args (Any) – Additional arguments to pass to CROMA.

  • **kwargs (Any) – Additional keyword arguments to pass to CROMA.

Returns:

CROMA base model.

Return type:

CROMA
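
A minimal usage sketch with random inputs; the input shapes follow CROMA.forward() above, and how you unpack the returned encodings depends on the modalities you pass:

>>> import torch
>>> from torchgeo.models import CROMABase_Weights, croma_base
>>> model = croma_base(weights=CROMABase_Weights.CROMA_VIT)
>>> x_sar = torch.randn(1, 2, 120, 120)       # [B, 2, H, W] SAR
>>> x_optical = torch.randn(1, 12, 120, 120)  # [B, 12, H, W] optical
>>> out = model(x_sar=x_sar, x_optical=x_optical)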

torchgeo.models.croma_large(weights=None, *args, **kwargs)[source]

CROMA large model.

If you use this model in your research, please cite the following paper:

New in version 0.7.

Parameters:
  • weights (CROMALarge_Weights | None) – Pretrained weights to load.

  • *args (Any) – Additional arguments to pass to CROMA.

  • **kwargs (Any) – Additional keyword arguments to pass to CROMA.

Returns:

CROMA large model.

Return type:

CROMA

class torchgeo.models.CROMABase_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

CROMA base model weights.

New in version 0.7.

class torchgeo.models.CROMALarge_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

CROMA large model weights.

New in version 0.7.

DOFA

class torchgeo.models.DOFA(img_size=224, patch_size=16, drop_rate=0.0, embed_dim=1024, depth=24, num_heads=16, dynamic_embed_dim=128, num_classes=45, global_pool=True, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]

Bases: Module

Dynamic One-For-All (DOFA) model.

Reference implementation:

If you use this model in your research, please cite the following paper:

New in version 0.6.

__init__(img_size=224, patch_size=16, drop_rate=0.0, embed_dim=1024, depth=24, num_heads=16, dynamic_embed_dim=128, num_classes=45, global_pool=True, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]

Initialize a new DOFA instance.

Parameters:
  • img_size (int) – Input image size.

  • patch_size (int) – Patch size.

  • drop_rate (float) – Head dropout rate.

  • embed_dim (int) – Transformer embedding dimension.

  • depth (int) – Depth of transformer.

  • num_heads (int) – Number of attention heads.

  • dynamic_embed_dim (int) – Dimensions of dynamic weight generator.

  • num_classes (int) – Number of classes for classification head.

  • global_pool (bool) – Whether or not to perform global pooling.

  • mlp_ratio (float) – Ratio of MLP hidden dim to embedding dim.

  • norm_layer (type[torch.nn.modules.module.Module]) – Normalization layer.

forward_features(x, wavelengths)[source]

Forward pass of the feature embedding layer.

Parameters:
  • x (Tensor) – Input mini-batch.

  • wavelengths (list[float]) – Wavelengths of each spectral band (μm).

Returns:

Output mini-batch.

Return type:

Tensor

forward_head(x, pre_logits=False)[source]

Forward pass of the attention head.

Parameters:
  • x (Tensor) – Input mini-batch.

  • pre_logits (bool) – Whether or not to return the layer before logits are computed.

Returns:

Output mini-batch.

Return type:

Tensor

forward(x, wavelengths)[source]

Forward pass of the model.

Parameters:
  • x (Tensor) – Input mini-batch.

  • wavelengths (list[float]) – Wavelengths of each spectral band (μm).

Returns:

Output mini-batch.

Return type:

Tensor

torchgeo.models.dofa_small_patch16_224(*args, **kwargs)[source]

Dynamic One-For-All (DOFA) small patch size 16 model.

If you use this model in your research, please cite the following paper:

New in version 0.6.

Parameters:
  • *args (Any) – Additional arguments to pass to DOFA.

  • **kwargs (Any) – Additional keyword arguments to pass to DOFA.

Returns:

A DOFA small 16 model.

Return type:

DOFA

torchgeo.models.dofa_base_patch16_224(weights=None, *args, **kwargs)[source]

Dynamic One-For-All (DOFA) base patch size 16 model.

If you use this model in your research, please cite the following paper:

New in version 0.6.

Parameters:
  • weights (DOFABase16_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to DOFA.

  • **kwargs (Any) – Additional keyword arguments to pass to DOFA.

Returns:

A DOFA base 16 model.

Return type:

DOFA
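
A minimal usage sketch; the four bands and their wavelengths (in μm) are illustrative and correspond to Sentinel-2 B/G/R/NIR:

>>> import torch
>>> from torchgeo.models import DOFABase16_Weights, dofa_base_patch16_224
>>> model = dofa_base_patch16_224(weights=DOFABase16_Weights.DOFA_MAE)
>>> x = torch.randn(1, 4, 224, 224)           # B, G, R, NIR image
>>> wavelengths = [0.49, 0.56, 0.665, 0.842]  # wavelength of each band (μm)
>>> out = model(x, wavelengths)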

torchgeo.models.dofa_large_patch16_224(weights=None, *args, **kwargs)[source]

Dynamic One-For-All (DOFA) large patch size 16 model.

If you use this model in your research, please cite the following paper:

New in version 0.6.

Parameters:
  • weights (DOFALarge16_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to DOFA.

  • **kwargs (Any) – Additional keyword arguments to pass to DOFA.

Returns:

A DOFA large 16 model.

Return type:

DOFA

torchgeo.models.dofa_huge_patch14_224(*args, **kwargs)[source]

Dynamic One-For-All (DOFA) huge patch size 14 model.

If you use this model in your research, please cite the following paper:

New in version 0.6.

Parameters:
  • *args (Any) – Additional arguments to pass to DOFA.

  • **kwargs (Any) – Additional keyword arguments to pass to DOFA.

Returns:

A DOFA huge 14 model.

Return type:

DOFA

class torchgeo.models.DOFABase16_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Dynamic One-For-All (DOFA) base patch size 16 weights.

New in version 0.6.

class torchgeo.models.DOFALarge16_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Dynamic One-For-All (DOFA) large patch size 16 weights.

New in version 0.6.

EarthLoc

class torchgeo.models.EarthLoc(in_channels=3, image_size=320, desc_dim=4096, backbone='resnet50', pretrained=True)[source]

Bases: Module

EarthLoc model for generating feature descriptors from satellite imagery.

Adapted from https://github.com/gmberton/EarthLoc. Copyright (c) 2024 Gabriele Berton

If you use this model in your research, please cite the following paper:

New in version 0.8.

__init__(in_channels=3, image_size=320, desc_dim=4096, backbone='resnet50', pretrained=True)[source]

Initialize the EarthLoc model.

Parameters:
  • in_channels (int) – Number of input channels in the images (default: 3 for RGB).

  • image_size (int) – Size of the input images (assumed square).

  • desc_dim (int) – Dimension of the final output feature descriptor.

  • backbone (str) – Backbone model to use for feature extraction (default: “resnet50”).

  • pretrained (bool) – Whether to use pre-trained weights for the backbone model.

forward(x)[source]

Forward pass of the EarthLoc model.

Parameters:

x (Tensor) – Input tensor of shape (b, c, h, w).

Returns:

Output feature descriptor tensor of shape (b, desc_dim).

Return type:

Tensor

torchgeo.models.earthloc(weights=None, *args, **kwargs)[source]

EarthLoc model.

If you use this model in your research, please cite the following paper:

New in version 0.8.

Parameters:
  • weights (EarthLoc_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to EarthLoc.

  • **kwargs (Any) – Additional keyword arguments to pass to EarthLoc.

Returns:

An EarthLoc model.

Return type:

EarthLoc
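
A minimal usage sketch with a random RGB input at the default image size:

>>> import torch
>>> from torchgeo.models import EarthLoc_Weights, earthloc
>>> model = earthloc(weights=EarthLoc_Weights.SENTINEL2_RESNET50)
>>> x = torch.randn(1, 3, 320, 320)
>>> desc = model(x)  # feature descriptor of shape (1, desc_dim)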

class torchgeo.models.EarthLoc_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

EarthLoc weights.

FarSeg

class torchgeo.models.FarSeg(backbone='resnet50', classes=16, backbone_pretrained=True)[source]

Bases: Module

Foreground-Aware Relation Network (FarSeg).

This model can be used for binary- or multi-class object segmentation, such as building, road, ship, and airplane segmentation. It can be also extended as a change detection model. It features a foreground-scene relation module to model the relation between scene embedding, object context, and object feature, thus improving the discrimination of object feature representation.

If you use this model in your research, please cite the following paper:

__init__(backbone='resnet50', classes=16, backbone_pretrained=True)[source]

Initialize a new FarSeg model.

Parameters:
  • backbone (str) – name of ResNet backbone, one of [“resnet18”, “resnet34”, “resnet50”, “resnet101”]

  • classes (int) – number of output segmentation classes

  • backbone_pretrained (bool) – whether to use pretrained weight for backbone

forward(x)[source]

Forward pass of the model.

Parameters:

x (Tensor) – input image

Returns:

output prediction

Return type:

Tensor
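
A minimal usage sketch; the backbone and input size below are illustrative:

>>> import torch
>>> from torchgeo.models import FarSeg
>>> model = FarSeg(backbone='resnet18', classes=16, backbone_pretrained=False)
>>> x = torch.randn(1, 3, 512, 512)
>>> y = model(x)  # segmentation logits with one channel per class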

Fully-convolutional Network

class torchgeo.models.FCN(in_channels, classes, num_filters=64)[source]

Bases: Module

A simple 5-layer FCN with leaky ReLUs and ‘same’ padding.

__init__(in_channels, classes, num_filters=64)[source]

Initializes the 5 layer FCN model.

Parameters:
  • in_channels (int) – Number of input channels that the model will expect

  • classes (int) – Number of filters in the final layer

  • num_filters (int) – Number of filters in each convolutional layer

forward(x)[source]

Forward pass of the model.
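
A minimal usage sketch; because the convolutions use ‘same’ padding, the output is expected to keep the input height and width:

>>> import torch
>>> from torchgeo.models import FCN
>>> model = FCN(in_channels=3, classes=5, num_filters=64)
>>> x = torch.randn(1, 3, 128, 128)
>>> y = model(x)  # (1, 5, 128, 128)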

FC Siamese Networks

class torchgeo.models.FCSiamConc(*args, **kwargs)[source]

Bases: SegmentationModel

Fully-convolutional Siamese Concatenation (FC-Siam-conc).

If you use this model in your research, please cite the following paper:

__init__(encoder_name='resnet34', encoder_depth=5, encoder_weights='imagenet', decoder_use_batchnorm='batchnorm', decoder_channels=(256, 128, 64, 32, 16), decoder_attention_type=None, in_channels=3, classes=1, activation=None)[source]

Initialize a new FCSiamConc model.

Parameters:
  • encoder_name (str) – Name of the classification model that will be used as an encoder (a.k.a backbone) to extract features of different spatial resolution

  • encoder_depth (int) – A number of stages used in encoder in range [3, 5]. Each stage generates features two times smaller in spatial dimensions than the previous one (e.g. for depth 0 we will have features with shapes [(N, C, H, W)], for depth 1 - [(N, C, H // 2, W // 2)] and so on). Default is 5

  • encoder_weights (str | None) – One of None (random initialization), “imagenet” (pre-training on ImageNet) and other pretrained weights (see table with available weights for each encoder_name)

  • decoder_channels (Sequence[int]) – List of integers which specify in_channels parameter for convolutions used in decoder. Length of the list should be the same as encoder_depth

  • decoder_use_batchnorm (bool | str | dict[str, Any]) –

    Specifies normalization between Conv2D and activation. Accepts the following types:

    • True: Defaults to “batchnorm”.

    • False: No normalization (nn.Identity).

    • str: Specifies normalization type using default parameters. Available values: “batchnorm”, “identity”, “layernorm”, “instancenorm”, “inplace”.

    • dict: Fully customizable normalization settings. Structure: {"type": <norm_type>, **kwargs}, where "type" corresponds to one of the normalization types listed above, and the remaining kwargs are passed directly to the normalization layer as defined in the PyTorch documentation.

      Example: decoder_use_batchnorm={"type": "layernorm", "eps": 1e-2}

  • decoder_attention_type (str | None) – Attention module used in decoder of the model. Available options are None and scse. SCSE paper https://arxiv.org/abs/1808.08127

  • in_channels (int) – A number of input channels for the model, default is 3 (RGB images)

  • classes (int) – A number of classes for output mask (or you can think as a number of channels of output mask)

  • activation (str | collections.abc.Callable[[torch.Tensor], torch.Tensor] | None) – An activation function to apply after the final convolution layer. Available options are “sigmoid”, “softmax”, “logsoftmax”, “tanh”, “identity”, callable and None. Default is None

forward(x)[source]

Forward pass of the model.

Parameters:

x (Tensor) – input images of shape (b, t, c, h, w)

Returns:

predicted change masks of size (b, classes, h, w)

Return type:

Tensor

class torchgeo.models.FCSiamDiff(*args, **kwargs)[source]

Bases: Unet

Fully-convolutional Siamese Difference (FC-Siam-diff).

If you use this model in your research, please cite the following paper:

__init__(*args, **kwargs)[source]

Initialize a new FCSiamDiff model.

Parameters:
  • *args (Any) – Additional arguments passed to Unet

  • **kwargs (Any) – Additional keyword arguments passed to Unet

forward(x)[source]

Forward pass of the model.

Parameters:

x (Tensor) – input images of shape (b, t, c, h, w)

Returns:

predicted change masks of size (b, classes, h, w)

Return type:

Tensor
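
A minimal usage sketch; the encoder choice and tensor sizes are illustrative, and the keyword arguments are forwarded to the underlying Unet:

>>> import torch
>>> from torchgeo.models import FCSiamDiff
>>> model = FCSiamDiff(encoder_name='resnet18', encoder_weights=None, in_channels=3, classes=2)
>>> x = torch.randn(1, 2, 3, 256, 256)  # (b, t, c, h, w) bitemporal pair
>>> masks = model(x)                    # (b, classes, h, w) change logits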

L-TAE

class torchgeo.models.LTAE(in_channels=128, n_head=16, d_k=8, n_neurons=(256, 128), dropout=0.2, d_model=256, T=1000, len_max_seq=24, positions=None)[source]

Bases: Module

Lightweight Temporal Attention Encoder (L-TAE).

This model implements a lightweight temporal attention encoder that processes time series data using a multi-head attention mechanism. It is designed to efficiently encode temporal sequences into fixed-length embeddings.

If you use this model in your research, please cite the following paper:

New in version 0.8.

__init__(in_channels=128, n_head=16, d_k=8, n_neurons=(256, 128), dropout=0.2, d_model=256, T=1000, len_max_seq=24, positions=None)[source]

Sequence-to-embedding encoder.

Parameters:
  • in_channels (int) – Number of channels of the input embeddings

  • n_head (int) – Number of attention heads

  • d_k (int) – Dimension of the key and query vectors

  • n_neurons (Sequence[int]) – Defines the dimensions of the successive feature spaces of the MLP that processes the concatenated outputs of the attention heads

  • dropout (float) – dropout

  • T (int) – Period to use for the positional encoding

  • len_max_seq (int) – Maximum sequence length, used to pre-compute the positional encoding table

  • positions (collections.abc.Sequence[int] | None) – List of temporal positions to use instead of position in the sequence

  • d_model (int | None) – If specified, the input tensors will first be processed by a fully connected layer to project them into a feature space of dimension d_model

forward(x)[source]

Forward pass of the model.

Parameters:

x (Tensor) – Input tensor of shape (batch_size, seq_len, in_channels)

Returns:

Output tensor of shape (batch_size, n_neurons[-1])

Return type:

Tensor
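
A minimal usage sketch with the default architecture; the batch size and sequence length are illustrative (the sequence length must not exceed len_max_seq):

>>> import torch
>>> from torchgeo.models import LTAE
>>> model = LTAE(in_channels=128, len_max_seq=24)
>>> x = torch.randn(2, 24, 128)  # (batch_size, seq_len, in_channels)
>>> out = model(x)               # (batch_size, n_neurons[-1]) == (2, 128)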

MOSAIKS

class torchgeo.models.MOSAIKS(dataset, in_channels=3, features=4096, kernel_size=4, bias=-1.0, seed=None)[source]

Bases: RCF

MOSAIKS RCF model with the recommended parameters defined in the paper.

If you use this model in your research, please cite the following paper:

Note

This Module is not trainable. It is only used as a feature extractor.

New in version 0.8.

__init__(dataset, in_channels=3, features=4096, kernel_size=4, bias=-1.0, seed=None)[source]

Initializes the MOSAIKS model.

Parameters:
  • dataset (NonGeoDataset) – a NonGeoDataset to sample from

  • in_channels (int) – number of input channels

  • features (int) – number of features to compute, must be divisible by 2

  • kernel_size (int) – size of the kernel used to compute the RCFs

  • bias (float) – bias of the convolutional layer

  • seed (int | None) – random seed used to initialize the convolutional layer

class torchgeo.models.RCF(in_channels=4, features=16, kernel_size=3, bias=-1.0, seed=None, mode='gaussian', dataset=None)[source]

Bases: Module

This model extracts random convolutional features (RCFs) from its input.

RCFs are used in the Multi-task Observation using Satellite Imagery & Kitchen Sinks (MOSAIKS) method proposed in “A generalizable and accessible approach to machine learning with global satellite imagery”.

This class can operate in two modes, “gaussian” and “empirical”. In “gaussian” mode, the filters will be sampled from a Gaussian distribution, while in “empirical” mode, the filters will be sampled from a dataset.

If you use this model in your research, please cite the following paper:

Note

This Module is not trainable. It is only used as a feature extractor.

__init__(in_channels=4, features=16, kernel_size=3, bias=-1.0, seed=None, mode='gaussian', dataset=None)[source]

Initializes the RCF model.

This is a static model that serves to extract fixed length feature vectors from input patches.

New in version 0.2: The seed parameter.

New in version 0.5: The mode and dataset parameters.

Parameters:
  • in_channels (int) – number of input channels

  • features (int) – number of features to compute, must be divisible by 2

  • kernel_size (int) – size of the kernel used to compute the RCFs

  • bias (float) – bias of the convolutional layer

  • seed (int | None) – random seed used to initialize the convolutional layer

  • mode (str) – “empirical” or “gaussian”

  • dataset (torchgeo.datasets.geo.NonGeoDataset | None) – a NonGeoDataset to sample from when mode is “empirical”

forward(x)[source]

Forward pass of the RCF model.

Parameters:

x (Tensor) – a tensor with shape (B, C, H, W)

Returns:

a tensor of size (B, self.num_features)

Return type:

Tensor
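
A minimal usage sketch in “gaussian” mode (no dataset needed); “empirical” mode additionally requires a NonGeoDataset to sample filters from:

>>> import torch
>>> from torchgeo.models import RCF
>>> model = RCF(in_channels=4, features=16, kernel_size=3, bias=-1.0, mode='gaussian')
>>> x = torch.randn(2, 4, 64, 64)
>>> feats = model(x)  # (2, 16) fixed-length random convolutional features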

ResNet

torchgeo.models.resnet18(weights=None, *args, **kwargs)[source]

ResNet-18 model.

If you use this model in your research, please cite the following paper:

New in version 0.4.

Parameters:
  • weights (ResNet18_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to timm.create_model().

  • **kwargs (Any) – Additional keyword arguments to pass to timm.create_model().

Returns:

A ResNet-18 model.

Return type:

Module

torchgeo.models.resnet50(weights=None, *args, **kwargs)[source]

ResNet-50 model.

If you use this model in your research, please cite the following paper:

Changed in version 0.4: Switched to multi-weight support API.

Parameters:
  • weights (ResNet50_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to timm.create_model().

  • **kwargs (Any) – Additional keyword arguments to pass to timm.create_model().

Returns:

A ResNet-50 model.

Return type:

Module

torchgeo.models.resnet152(weights=None, *args, **kwargs)[source]

ResNet-152 model.

If you use this model in your research, please cite the following paper:

New in version 0.6.

Parameters:
  • weights (ResNet152_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to timm.create_model().

  • **kwargs (Any) – Additional keyword arguments to pass to timm.create_model().

Returns:

A ResNet-152 model.

Return type:

Module

class torchgeo.models.ResNet18_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

ResNet-18 weights.

For timm resnet18 implementation.

New in version 0.4.

class torchgeo.models.ResNet50_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

ResNet-50 weights.

For timm resnet50 implementation.

New in version 0.4.

class torchgeo.models.ResNet152_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

ResNet-152 weights.

For timm resnet152 implementation.

New in version 0.6.
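
A minimal construction sketch; the chosen weights expect 13-band Sentinel-2 input (see the Sentinel-2 table below), and the weights are downloaded on first use:

>>> from torchgeo.models import ResNet50_Weights, resnet50
>>> model = resnet50(weights=ResNet50_Weights.SENTINEL2_ALL_MOCO)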

Scale-MAE

torchgeo.models.ScaleMAE(res=1.0, *args, **kwargs)[source]

Custom Vision Transformer for Scale-MAE with GSD positional embeddings.

This is a ViT encoder only model of the Scale-MAE architecture with GSD positional embeddings.

If you use this model in your research, please cite the following paper:

class torchgeo.models.ScaleMAELarge16_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Scale-MAE Large patch size 16 weights.

New in version 0.6.

Swin Transformer

torchgeo.models.swin_v2_t(weights=None, *args, **kwargs)[source]

Swin Transformer v2 tiny model.

If you use this model in your research, please cite the following paper:

New in version 0.6.

Parameters:
  • weights (torchgeo.models.swin.Swin_V2_T_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to torchvision.models.swin_transformer.SwinTransformer.

  • **kwargs (Any) – Additional keyword arguments to pass to torchvision.models.swin_transformer.SwinTransformer.

Returns:

A Swin Transformer Tiny model.

Return type:

SwinTransformer

torchgeo.models.swin_v2_b(weights=None, *args, **kwargs)[source]

Swin Transformer v2 base model.

If you use this model in your research, please cite the following paper:

New in version 0.6.

Parameters:
  • weights (torchgeo.models.swin.Swin_V2_B_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to torchvision.models.swin_transformer.SwinTransformer.

  • **kwargs (Any) – Additional keyword arguments to pass to torchvision.models.swin_transformer.SwinTransformer.

Returns:

A Swin Transformer Base model.

Return type:

SwinTransformer

class torchgeo.models.Swin_V2_T_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Swin Transformer v2 Tiny weights.

For torchvision swin_v2_t implementation.

New in version 0.6.

class torchgeo.models.Swin_V2_B_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Swin Transformer v2 Base weights.

For torchvision swin_v2_b implementation.

New in version 0.6.

Panopticon

class torchgeo.models.Panopticon(attn_dim=2304, embed_dim=768, img_size=224)[source]

Bases: Module

Panopticon ViT-Base Foundation Model.

New in version 0.7.

__init__(attn_dim=2304, embed_dim=768, img_size=224)[source]

Initialize a Panopticon model.

Parameters:
  • attn_dim (int) – Dimension of channel attention.

  • embed_dim (int) – Embedding dimension of backbone.

  • img_size (int) – Image size. Panopticon can be initialized with any image size, but the image size is fixed after initialization. For optimal performance, we recommend using the same image size as during training. For the published weights, this is 224.

forward(x_dict)[source]

Forward pass of the model including forward pass through the head.

Parameters:

x_dict (dict[str, torch.Tensor]) –

Dictionary with keys:

Returns:

Embeddings.

Return type:

Tensor

torchgeo.models.panopticon_vitb14(weights=None, img_size=224, **kwargs)[source]

Panopticon ViT-Base model.

Panopticon can handle arbitrary combinations of optical channels and SAR. It can be initialized with any image size, but the image size is fixed after initialization; we recommend 224 in alignment with the pretraining. For more information on how to use the model, see https://github.com/Panopticon-FM/panopticon?tab=readme-ov-file#using-panopticon.

If you use this model in your research, please cite the following paper:

New in version 0.7.

Returns:

The Panopticon ViT-Base model with the published weights loaded.

Return type:

Panopticon
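
A minimal construction sketch; the forward pass expects the dictionary input described for Panopticon.forward() above (see the upstream README linked earlier for the exact keys):

>>> from torchgeo.models import Panopticon_Weights, panopticon_vitb14
>>> model = panopticon_vitb14(weights=Panopticon_Weights.VIT_BASE14, img_size=224)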

class torchgeo.models.Panopticon_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Panopticon weights.

New in version 0.7.

U-Net

torchgeo.models.unet(weights=None, classes=None, *args, **kwargs)[source]

U-Net model.

If you use this model in your research, please cite the following paper:

New in version 0.8.

Parameters:
  • weights (torchgeo.models.unet.Unet_Weights | None) – Pre-trained model weights to use.

  • classes (int | None) – Number of output classes. If not specified, the number of classes will be inferred from the weights.

  • *args (Any) – Additional arguments to pass to segmentation_models_pytorch.create_model

  • **kwargs (Any) – Additional keyword arguments to pass to segmentation_models_pytorch.create_model

Returns:

A U-Net model.

Return type:

Unet
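
A minimal usage sketch; the chosen weights expect 8-channel input (see the Sentinel-2 table below), and the number of output classes is inferred from the weights:

>>> import torch
>>> from torchgeo.models import Unet_Weights, unet
>>> model = unet(weights=Unet_Weights.SENTINEL2_2CLASS_FTW)
>>> x = torch.randn(1, 8, 256, 256)
>>> y = model(x)  # per-class segmentation logits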

class torchgeo.models.Unet_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

U-Net weights.

For smp Unet implementation.

New in version 0.8.

Vision Transformer

torchgeo.models.vit_small_patch16_224(weights=None, *args, **kwargs)[source]

Vision Transformer (ViT) small patch size 16 model.

If you use this model in your research, please cite the following paper:

New in version 0.4.

Parameters:
  • weights (ViTSmall16_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to timm.create_model().

  • **kwargs (Any) – Additional keyword arguments to pass to timm.create_model().

Returns:

A ViT small 16 model.

Return type:

Module
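
A minimal construction sketch; the same pattern applies to the other ViT builders below, and the chosen weights expect 13-band Sentinel-2 input (see the Sentinel-2 table below):

>>> from torchgeo.models import ViTSmall16_Weights, vit_small_patch16_224
>>> model = vit_small_patch16_224(weights=ViTSmall16_Weights.SENTINEL2_ALL_DINO)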

torchgeo.models.vit_base_patch16_224(weights=None, *args, **kwargs)[source]

Vision Transformer (ViT) base patch size 16 model.

If you use this model in your research, please cite the following paper:

New in version 0.7.

Parameters:
  • weights (ViTBase16_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to timm.create_model().

  • **kwargs (Any) – Additional keyword arguments to pass to timm.create_model().

Returns:

A ViT base 16 model.

Return type:

Module

torchgeo.models.vit_large_patch16_224(weights=None, *args, **kwargs)[source]

Vision Transformer (ViT) large patch size 16 model.

If you use this model in your research, please cite the following paper:

New in version 0.7.

Parameters:
  • weights (ViTLarge16_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to timm.create_model().

  • **kwargs (Any) – Additional keyword arguments to pass to timm.create_model().

Returns:

A ViT large 16 model.

Return type:

Module

torchgeo.models.vit_huge_patch14_224(weights=None, *args, **kwargs)[source]

Vision Transformer (ViT) huge patch size 14 model.

If you use this model in your research, please cite the following paper:

New in version 0.7.

Parameters:
  • weights (ViTHuge14_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to timm.create_model().

  • **kwargs (Any) – Additional keyword arguments to pass to timm.create_model().

Returns:

A ViT huge 14 model.

Return type:

Module

torchgeo.models.vit_small_patch14_dinov2(weights=None, *args, **kwargs)[source]

Vision Transformer (ViT) small patch size 14 model for DINOv2.

If you use this model in your research, please cite the following paper:

New in version 0.7.

Parameters:
  • weights (ViTSmall14_DINOv2_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to timm.create_model().

  • **kwargs (Any) – Additional keyword arguments to pass to timm.create_model().

Returns:

A DINOv2 ViT small 14 model.

Return type:

Module

torchgeo.models.vit_base_patch14_dinov2(weights=None, *args, **kwargs)[source]

Vision Transformer (ViT) base patch size 14 model for DINOv2.

If you use this model in your research, please cite the following paper:

New in version 0.7.

Parameters:
  • weights (ViTBase14_DINOv2_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to timm.create_model().

  • **kwargs (Any) – Additional keyword arguments to pass to timm.create_model().

Returns:

A DINOv2 ViT base 14 model.

Return type:

Module

class torchgeo.models.ViTSmall16_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Vision Transformer Small Patch Size 16 weights.

For timm vit_small_patch16_224 implementation.

New in version 0.4.

class torchgeo.models.ViTBase16_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Vision Transformer Base Patch Size 16 weights.

For timm vit_base_patch16_224 implementation.

New in version 0.7.

class torchgeo.models.ViTLarge16_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Vision Transformer Large Patch Size 16 weights.

For timm vit_large_patch16_224 implementation.

New in version 0.7.

class torchgeo.models.ViTHuge14_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Vision Transformer Huge Patch Size 14 weights.

For timm vit_huge_patch14_224 implementation.

New in version 0.7.

class torchgeo.models.ViTSmall14_DINOv2_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Vision Transformer Small Patch Size 14 (DINOv2) weights.

For timm vit_small_patch14_dinov2 implementation.

New in version 0.7.

class torchgeo.models.ViTBase14_DINOv2_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

Vision Transformer Base Patch Size 14 (DINOv2) weights.

For timm vit_base_patch14_dinov2 implementation.

New in version 0.7.

YOLO

torchgeo.models.yolo(weights=None, *args, **kwargs)[source]

YOLO model.

New in version 0.8.

Parameters:
  • weights (YOLO_Weights | None) – Pre-trained model weights to use.

  • *args (Any) – Additional arguments to pass to ultralytics.YOLO.

  • **kwargs (Any) – Additional keyword arguments to pass to ultralytics.YOLO.

Returns:

An ultralytics.YOLO model.

Raises:

DependencyNotFoundError – If ultralytics is not installed.

Return type:

Module
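
A minimal construction sketch, assuming the optional ultralytics dependency is installed:

>>> from torchgeo.models import YOLO_Weights, yolo
>>> model = yolo(weights=YOLO_Weights.DELINEATE_ANYTHING)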

class torchgeo.models.YOLO_Weights(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: WeightsEnum

YOLO weights.

For ultralytics YOLO implementation.

New in version 0.8.

Utility Functions

torchgeo.models.get_model(name, *args, **kwargs)[source]

Get an instantiated model from its name.

New in version 0.4.

Parameters:
  • name (str) – Name of the model.

  • *args (Any) – Additional arguments passed to the model builder method.

  • **kwargs (Any) – Additional keyword arguments passed to the model builder method.

Returns:

An instantiated model.

Return type:

Module

torchgeo.models.get_model_weights(name)[source]

Get the weights enum class associated with a given model.

New in version 0.4.

Parameters:

name (collections.abc.Callable[[...], torch.nn.modules.module.Module] | str) – Model builder function or the name under which it is registered.

Returns:

The weights enum class associated with the model.

Return type:

WeightsEnum

torchgeo.models.get_weight(name)[source]

Get the weights enum value by its full name.

New in version 0.4.

Parameters:

name (str) – Name of the weight enum entry.

Returns:

The requested weight enum.

Raises:

ValueError – If name is not a valid WeightsEnum.

Return type:

WeightsEnum

torchgeo.models.list_models()[source]

List the registered models.

New in version 0.4.

Returns:

A list of registered models.

Return type:

list[str]
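
A minimal sketch tying the utility functions together; the model and weight names are taken from the tables below:

>>> from torchgeo.models import get_model, get_weight, list_models
>>> 'resnet18' in list_models()
True
>>> weights = get_weight('ResNet18_Weights.SENTINEL2_ALL_MOCO')  # full enum name as a string
>>> model = get_model('resnet18', weights=weights)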

Pretrained Weights

TorchGeo provides a number of pre-trained models and backbones, allowing you to perform transfer learning on small datasets without training a new model from scratch or relying on ImageNet weights. Depending on the satellite/sensor your data comes from, choose the pre-trained weights below with the best performance metrics for your task.

Sensor-Agnostic

These weights can be used with imagery from any satellite/sensor. In addition to the usual performance metrics, there are columns indicating dynamic spatial (resolution), temporal (time span), and/or spectral (wavelength) support, gained either from the training data (implicit) or from the model architecture (explicit).

Weight

Source

Citation

License

Spatial

Temporal

Spectral

m-bigearthnet

m-forestnet

m-brick-kiln

m-pv4ger

m-so2sat

m-eurosat

m-pv4ger-seg

m-nz-cattle

m-NeonTree

m-cashew-plant

m-SA-crop

m-chesapeake

CopernicusFM_Base_Weights.CopernicusFM_ViT

link

link

CC-BY-4.0

explicit

explicit

explicit

CROMABase_Weights.CROMA_VIT

link

link

CC-BY-4.0

implicit

implicit

CROMALarge_Weights.CROMA_VIT

link

link

CC-BY-4.0

implicit

implicit

DOFABase16_Weights.DOFA_MAE

link

link

CC-BY-4.0

implicit

explicit

65.7

50.9

95.8

96.9

55.1

93.9

94.5

81.4

58.8

51.5

33.0

65.3

DOFALarge16_Weights.DOFA_MAE

link

link

CC-BY-4.0

implicit

explicit

67.5

54.6

96.9

97.3

60.1

97.1

95.0

81.8

59.4

56.9

32.1

66.3

Panopticon_Weights.VIT_BASE14

link

link

Apache-2.0

implicit

explicit

56.3

96.7

96.4

61.7

96.4

95.2

92.6

79.6

59.3

52.6

78.1

ResNet50_Weights.FMOW_RGB_GASSL

link

link

implicit

ScaleMAE_ViTLarge16_Weights.FMOW_RGB_SCALEMAE

link

link

CC-BY-NC-4.0

explicit

YOLO_Weights.CORE_DINO

link

CC-BY-NC-3.0

implicit

YOLO_Weights.DELINEATE_ANYTHING

link

link

AGPL-3.0

implicit

YOLO_Weights.DELINEATE_ANYTHING_SMALL

link

link

AGPL-3.0

implicit

Landsat

| Weight | Landsat | Channels | Source | Citation | License | NLCD (Acc) | NLCD (mIoU) | CDL (Acc) | CDL (mIoU) |
|---|---|---|---|---|---|---|---|---|---|
| ResNet18_Weights.LANDSAT_TM_TOA_MOCO | 4–5 | 7 | link | link | CC0-1.0 | 67.65 | 51.11 | 68.70 | 52.32 |
| ResNet18_Weights.LANDSAT_TM_TOA_SIMCLR | 4–5 | 7 | link | link | CC0-1.0 | 60.86 | 43.74 | 61.94 | 44.86 |
| ResNet50_Weights.LANDSAT_TM_TOA_MOCO | 4–5 | 7 | link | link | CC0-1.0 | 68.75 | 53.28 | 69.45 | 53.20 |
| ResNet50_Weights.LANDSAT_TM_TOA_SIMCLR | 4–5 | 7 | link | link | CC0-1.0 | 62.05 | 44.98 | 62.80 | 45.77 |
| ViTSmall16_Weights.LANDSAT_TM_TOA_MOCO | 4–5 | 7 | link | link | CC0-1.0 | 67.17 | 50.57 | 67.60 | 51.07 |
| ViTSmall16_Weights.LANDSAT_TM_TOA_SIMCLR | 4–5 | 7 | link | link | CC0-1.0 | 66.82 | 50.17 | 66.92 | 50.28 |
| ResNet18_Weights.LANDSAT_ETM_TOA_MOCO | 7 | 9 | link | link | CC0-1.0 | 65.22 | 48.39 | 62.84 | 45.81 |
| ResNet18_Weights.LANDSAT_ETM_TOA_SIMCLR | 7 | 9 | link | link | CC0-1.0 | 58.76 | 41.60 | 56.47 | 39.34 |
| ResNet50_Weights.LANDSAT_ETM_TOA_MOCO | 7 | 9 | link | link | CC0-1.0 | 66.60 | 49.92 | 64.12 | 47.19 |
| ResNet50_Weights.LANDSAT_ETM_TOA_SIMCLR | 7 | 9 | link | link | CC0-1.0 | 57.17 | 40.02 | 54.95 | 37.88 |
| ViTSmall16_Weights.LANDSAT_ETM_TOA_MOCO | 7 | 9 | link | link | CC0-1.0 | 63.75 | 46.79 | 60.88 | 43.70 |
| ViTSmall16_Weights.LANDSAT_ETM_TOA_SIMCLR | 7 | 9 | link | link | CC0-1.0 | 63.33 | 46.34 | 59.06 | 41.91 |
| ResNet18_Weights.LANDSAT_ETM_SR_MOCO | 7 | 6 | link | link | CC0-1.0 | 64.18 | 47.25 | 67.30 | 50.71 |
| ResNet18_Weights.LANDSAT_ETM_SR_SIMCLR | 7 | 6 | link | link | CC0-1.0 | 57.26 | 40.11 | 54.42 | 37.48 |
| ResNet50_Weights.LANDSAT_ETM_SR_MOCO | 7 | 6 | link | link | CC0-1.0 | 64.37 | 47.46 | 62.35 | 45.30 |
| ResNet50_Weights.LANDSAT_ETM_SR_SIMCLR | 7 | 6 | link | link | CC0-1.0 | 57.79 | 40.64 | 55.69 | 38.59 |
| ViTSmall16_Weights.LANDSAT_ETM_SR_MOCO | 7 | 6 | link | link | CC0-1.0 | 64.09 | 47.21 | 52.37 | 35.48 |
| ViTSmall16_Weights.LANDSAT_ETM_SR_SIMCLR | 7 | 6 | link | link | CC0-1.0 | 63.99 | 47.05 | 53.17 | 36.21 |
| ResNet18_Weights.LANDSAT_OLI_TIRS_TOA_MOCO | 8–9 | 11 | link | link | CC0-1.0 | 67.82 | 51.30 | 65.74 | 48.96 |
| ResNet18_Weights.LANDSAT_OLI_TIRS_TOA_SIMCLR | 8–9 | 11 | link | link | CC0-1.0 | 62.14 | 45.08 | 60.01 | 42.86 |
| ResNet50_Weights.LANDSAT_OLI_TIRS_TOA_MOCO | 8–9 | 11 | link | link | CC0-1.0 | 69.17 | 52.87 | 67.29 | 50.70 |
| ResNet50_Weights.LANDSAT_OLI_TIRS_TOA_SIMCLR | 8–9 | 11 | link | link | CC0-1.0 | 64.66 | 47.78 | 62.08 | 45.01 |
| ViTSmall16_Weights.LANDSAT_OLI_TIRS_TOA_MOCO | 8–9 | 11 | link | link | CC0-1.0 | 67.11 | 50.49 | 64.62 | 47.73 |
| ViTSmall16_Weights.LANDSAT_OLI_TIRS_TOA_SIMCLR | 8–9 | 11 | link | link | CC0-1.0 | 66.12 | 49.39 | 63.88 | 46.94 |
| ResNet18_Weights.LANDSAT_OLI_SR_MOCO | 8–9 | 7 | link | link | CC0-1.0 | 67.01 | 50.39 | 68.05 | 51.57 |
| ResNet18_Weights.LANDSAT_OLI_SR_SIMCLR | 8–9 | 7 | link | link | CC0-1.0 | 59.93 | 42.79 | 57.44 | 40.30 |
| ResNet50_Weights.LANDSAT_OLI_SR_MOCO | 8–9 | 7 | link | link | CC0-1.0 | 67.44 | 50.88 | 65.96 | 49.21 |
| ResNet50_Weights.LANDSAT_OLI_SR_SIMCLR | 8–9 | 7 | link | link | CC0-1.0 | 63.65 | 46.68 | 60.01 | 43.17 |
| ViTSmall16_Weights.LANDSAT_OLI_SR_MOCO | 8–9 | 7 | link | link | CC0-1.0 | 66.81 | 50.16 | 64.17 | 47.24 |
| ViTSmall16_Weights.LANDSAT_OLI_SR_SIMCLR | 8–9 | 7 | link | link | CC0-1.0 | 65.04 | 48.20 | 62.61 | 45.46 |
| Swin_V2_B_Weights.LANDSAT_SI_SATLAS | 8–9 | 11 | link | link | ODC-BY | | | | |
| Swin_V2_B_Weights.LANDSAT_MI_SATLAS | 8–9 | 11 | link | link | ODC-BY | | | | |

NAIP

| Weight | Channels | Source | Citation | License |
|---|---|---|---|---|
| Swin_V2_B_Weights.NAIP_RGB_MI_SATLAS | 3 | link | link | ODC-BY |
| Swin_V2_B_Weights.NAIP_RGB_SI_SATLAS | 3 | link | link | ODC-BY |

Sentinel-1

| Weight | Channels | Source | Citation | License |
|---|---|---|---|---|
| ResNet50_Weights.SENTINEL1_GRD_CLOSP | 2 | link | link | OpenRAIL |
| ResNet50_Weights.SENTINEL1_GRD_DECUR | 2 | link | link | Apache-2.0 |
| ResNet50_Weights.SENTINEL1_GRD_GEOCLOSP | 2 | link | link | OpenRAIL |
| ResNet50_Weights.SENTINEL1_GRD_MOCO | 2 | link | link | CC-BY-4.0 |
| ResNet50_Weights.SENTINEL1_GRD_SOFTCON | 2 | link | link | CC-BY-4.0 |
| ViTSmall16_Weights.SENTINEL1_GRD_CLOSP | 2 | link | link | OpenRAIL |
| ViTSmall16_Weights.SENTINEL1_GRD_MAE | 2 | link | link | CC-BY-4.0 |
| ViTSmall16_Weights.SENTINEL1_GRD_FGMAE | 2 | link | link | CC-BY-4.0 |
| ViTBase16_Weights.SENTINEL1_GRD_MAE | 2 | link | link | CC-BY-4.0 |
| ViTBase16_Weights.SENTINEL1_GRD_FGMAE | 2 | link | link | CC-BY-4.0 |
| ViTLarge16_Weights.SENTINEL1_GRD_CLOSP | 2 | link | link | OpenRAIL |
| ViTLarge16_Weights.SENTINEL1_GRD_MAE | 2 | link | link | CC-BY-4.0 |
| ViTLarge16_Weights.SENTINEL1_GRD_FGMAE | 2 | link | link | CC-BY-4.0 |
| ViTHuge14_Weights.SENTINEL1_GRD_MAE | 2 | link | link | CC-BY-4.0 |
| ViTHuge14_Weights.SENTINEL1_GRD_FGMAE | 2 | link | link | CC-BY-4.0 |
| ViTSmall14_DINOv2_Weights.SENTINEL1_GRD_SOFTCON | 2 | link | link | CC-BY-4.0 |
| ViTBase14_DINOv2_Weights.SENTINEL1_GRD_SOFTCON | 2 | link | link | CC-BY-4.0 |
| Swin_V2_B_Weights.SENTINEL1_MI_SATLAS | 2 | link | link | ODC-BY |
| Swin_V2_B_Weights.SENTINEL1_SI_SATLAS | 2 | link | link | ODC-BY |

Sentinel-2

Weight

Channels

Source

Citation

License

BigEarthNet

EuroSAT

So2Sat

OSCD

ResNet18_Weights.SENTINEL2_ALL_MOCO

13

link

link

CC-BY-4.0

ResNet18_Weights.SENTINEL2_RGB_MOCO

3

link

link

CC-BY-4.0

ResNet18_Weights.SENTINEL2_RGB_SECO

3

link

link

Apache-2.0

87.27

93.14

46.94

ResNet50_Weights.SENTINEL2_ALL_CLOSP

13

link

link

OpenRAIL

ResNet50_Weights.SENTINEL2_ALL_DECUR

13

link

link

Apache-2.0

ResNet50_Weights.SENTINEL2_ALL_DINO

13

link

link

CC-BY-4.0

90.7

99.1

63.6

ResNet50_Weights.SENTINEL2_ALL_GEOCLOSP

13

link

link

OpenRAIL

ResNet50_Weights.SENTINEL2_ALL_MOCO

13

link

link

CC-BY-4.0

91.8

99.1

60.9

ResNet50_Weights.SENTINEL2_ALL_SOFTCON

13

link

link

CC-BY-4.0

ResNet50_Weights.SENTINEL2_ALL_SECO_ECO

12

link

link

MIT

ResNet50_Weights.SENTINEL2_ALL_NDVI_SECO_ECO

9

link

link

MIT

ResNet50_Weights.SENTINEL2_MI_MS_SATLAS

9

link

link

ODC-BY

ResNet50_Weights.SENTINEL2_MI_RGB_SATLAS

3

link

link

ODC-BY

ResNet50_Weights.SENTINEL2_SI_MS_SATLAS

9

link

link

ODC-BY

ResNet50_Weights.SENTINEL2_SI_RGB_SATLAS

3

link

link

ODC-BY

ResNet50_Weights.SENTINEL2_RGB_MOCO

3

link

link

CC-BY-4.0

ResNet50_Weights.SENTINEL2_RGB_SECO

3

link

link

Apache-2.0

87.81

ResNet152_Weights.SENTINEL2_MI_MS_SATLAS

9

link

link

ODC-BY

ResNet152_Weights.SENTINEL2_MI_RGB_SATLAS

3

link

link

ODC-BY

ResNet152_Weights.SENTINEL2_SI_MS_SATLAS

9

link

link

ODC-BY

ResNet152_Weights.SENTINEL2_SI_RGB_SATLAS

3

link

link

ODC-BY

ViTSmall16_Weights.SENTINEL2_ALL_CLOSP

13

link

link

OpenRAIL

ViTSmall16_Weights.SENTINEL2_ALL_DINO

13

link

link

CC-BY-4.0

90.5

99.0

62.2

ViTSmall16_Weights.SENTINEL2_ALL_MOCO

13

link

link

CC-BY-4.0

89.9

98.6

61.6

ViTSmall16_Weights.SENTINEL2_ALL_MAE

13

link

link

CC-BY-4.0

ViTSmall16_Weights.SENTINEL2_ALL_FGMAE

13

link

link

CC-BY-4.0

ViTBase16_Weights.SENTINEL2_ALL_MAE

13

link

link

CC-BY-4.0

ViTBase16_Weights.SENTINEL2_ALL_FGMAE

13

link

link

CC-BY-4.0

ViTLarge16_Weights.SENTINEL2_ALL_CLOSP

13

link

link

OpenRAIL

ViTLarge16_Weights.SENTINEL2_ALL_MAE

13

link

link

CC-BY-4.0

ViTLarge16_Weights.SENTINEL2_ALL_FGMAE

13

link

link

CC-BY-4.0

ViTHuge14_Weights.SENTINEL2_ALL_MAE

13

link

link

CC-BY-4.0

ViTHuge14_Weights.SENTINEL2_ALL_FGMAE

13

link

link

CC-BY-4.0

ViTSmall14_DINOv2_Weights.SENTINEL2_ALL_SOFTCON

13

link

link

CC-BY-4.0

ViTBase14_DINOv2_Weights.SENTINEL2_ALL_SOFTCON

13

link

link

CC-BY-4.0

Swin_V2_T_Weights.SENTINEL2_MI_MS_SATLAS

9

link

link

ODC-BY

Swin_V2_T_Weights.SENTINEL2_MI_RGB_SATLAS

3

link

link

ODC-BY

Swin_V2_T_Weights.SENTINEL2_SI_MS_SATLAS

9

link

link

ODC-BY

Swin_V2_T_Weights.SENTINEL2_SI_RGB_SATLAS

3

link

link

ODC-BY

Swin_V2_B_Weights.SENTINEL2_MI_MS_SATLAS

9

link

link

ODC-BY

Swin_V2_B_Weights.SENTINEL2_MI_RGB_SATLAS

3

link

link

ODC-BY

Swin_V2_B_Weights.SENTINEL2_SI_MS_SATLAS

9

link

link

ODC-BY

Swin_V2_B_Weights.SENTINEL2_SI_RGB_SATLAS

3

link

link

ODC-BY

Unet_Weights.SENTINEL2_2CLASS_FTW

8

link

link

CC-BY-4.0

Unet_Weights.SENTINEL2_2CLASS_NC_FTW

8

link

link

non-commercial

Unet_Weights.SENTINEL2_3CLASS_FTW

8

link

link

CC-BY-4.0

Unet_Weights.SENTINEL2_3CLASS_NC_FTW

8

link

link

non-commercial

EarthLoc_Weights.SENTINEL2_RESNET50

3

link

link

MIT

YOLO_Weights.SENTINEL2_RGB_MARINE_VESSEL_DETECTION

3

link

link

AGPL-3.0

Atmospheric

N = Nowcasting, MWF = Medium-Range Weather Forecasting, S2S = Subseasonal to Seasonal, DS = Decadal Scale

| Weight | Sensor | Task | Source | Citation | License |
|---|---|---|---|---|---|
| Aurora_Weights.HRES_T0_PRETRAINED_AURORA | HRES-T0 | MWF | link | link | MIT |
| Aurora_Weights.HRES_T0_PRETRAINED_12HR_AURORA | HRES-T0 | MWF | link | link | MIT |
| Aurora_Weights.HRES_T0_PRETRAINED_SMALL_AURORA | HRES-T0 | MWF | link | link | MIT |
| Aurora_Weights.HRES_T0_AURORA | HRES-T0 | MWF | link | link | MIT |
| Aurora_Weights.HRES_T0_HIGH_RES_AURORA | HRES-T0 | MWF | link | link | MIT |
| Aurora_Weights.HRES_CAMS_AIR_POLLUTION_AURORA | HRES-CAMS | MWF | link | link | MIT |
| Aurora_Weights.HRES_WAM0_WAVE_AURORA | HRES-WAM0 | MWF | link | link | MIT |
