Prithvi-EO-2.0 image reconstruction demo

Prithvi-EO-2.0 is the second-generation EO foundation model developed by the IBM and NASA team. The temporal ViT is trained on 4.2M Harmonized Landsat Sentinel-2 (HLS) samples with four timestamps each, using the Masked AutoEncoder (MAE) learning strategy. The model applies spatial and temporal attention across multiple patches and timestamps. Additionally, temporal and location information is added to the model input via embeddings. More information about the model is available here.

This demo showcases image reconstruction over one to four timestamps. The model randomly masks out a proportion of the input patches and reconstructs them based on the unmasked portion of the images. The reconstructed patches are then merged with the visible unmasked patches. We recommend submitting images between 224 and ~1000 pixels in size for faster processing. Images larger than 224x224 are processed using a sliding window approach, which can lead to artefacts between patches.
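The sliding window approach mentioned above can be sketched as follows. This is a hypothetical illustration, not the demo's actual implementation: the window size of 224 matches the model's input, while the stride of 196 (i.e. a small overlap between windows) is an assumption.

```python
# Sketch of sliding-window tiling for images larger than 224x224.
# The stride (overlap) is an assumption; the demo's actual value may differ.
def window_offsets(size: int, window: int = 224, stride: int = 196) -> list[int]:
    """Start offsets along one image axis; the last window is clamped
    to the image border so the full extent is covered."""
    if size <= window:
        return [0]
    offsets = list(range(0, size - window, stride))
    offsets.append(size - window)  # final window flush with the edge
    return offsets

# A 500-pixel axis is covered by windows starting at 0, 196 and 276;
# the seams between neighbouring windows are where artefacts can appear.
print(window_offsets(500))
```

Each 224x224 window is reconstructed independently, which is why visible seams can occur where adjacent windows meet.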

The user needs to provide HLS GeoTIFF images with the following channels in reflectance units: Blue, Green, Red, Narrow NIR, SWIR 1, SWIR 2. Optionally, the location information is extracted from the GeoTIFF files, while the temporal information can be provided in the filename in the format <date>T<time> or <year><julian day>T<time> (HLS format). Some example images are provided at the end of this page.
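The two timestamp formats above can be parsed with Python's standard `datetime` module. This is a minimal sketch of how such a filename fragment could be interpreted (the function name and error handling are illustrative, not part of the demo):

```python
from datetime import datetime

def parse_timestamp(stem: str) -> datetime:
    """Parse a <date>T<time> fragment (e.g. '20240115T103000') or an
    HLS-style <year><julian day>T<time> fragment (e.g. '2024015T103000')."""
    date_part, _, _ = stem.partition("T")
    if len(date_part) == 8:      # calendar date: YYYYMMDD
        return datetime.strptime(stem, "%Y%m%dT%H%M%S")
    if len(date_part) == 7:      # HLS format: YYYYDDD (Julian day of year)
        return datetime.strptime(stem, "%Y%jT%H%M%S")
    raise ValueError(f"Unrecognized timestamp format: {stem}")

# Both fragments below refer to the same acquisition time:
print(parse_timestamp("20240115T103000"))  # 2024-01-15 10:30:00
print(parse_timestamp("2024015T103000"))   # 2024-01-15 10:30:00
```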

Input time series

Masked images

Reconstructed images*

* The reconstructed images include the ground truth unmasked patches.

Examples