Should Land Masking Be Applied for GeoFM Tuning Datasets?

Author

Tylar

Published

December 28, 2025

This post compares seagrass classification results from Prithvi-EO using land-masked versus non-land-masked spectral files.

Spectral Images

Spectral images are prepared with the Google Earth Engine (GEE) script users/tylarmurray/prithvi:prithvi_s2_median_tif_collection1.

Line 20 of that script (shown below) was removed to create the “unmasked” product.

   .map(function(img){return img.clip(roi.bb);})
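For readers working outside Earth Engine, the effect of this clip step can be approximated in NumPy. In Earth Engine, `clip` masks the data not covered by the geometry. The sketch below mimics that on a small array; it assumes the clip geometry is what excludes land, and the function name `clip_to_region` and the toy arrays are hypothetical, not part of the GEE script.

```python
import numpy as np


def clip_to_region(image, inside, nodata=np.nan):
    """Return a copy of `image` (bands, rows, cols) with pixels outside
    the region replaced by `nodata`.

    `inside` is a boolean (rows, cols) array: True where the pixel falls
    within the region of interest. Hypothetical helper for illustration.
    """
    # Broadcast the 2-D region mask across every band.
    return np.where(inside[np.newaxis, :, :], image, nodata)


# Toy 2-band, 3x3 "image" and a region that excludes three pixels.
image = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)
inside = np.array([
    [True, True, False],
    [True, False, False],
    [True, True, True],
])

clipped = clip_to_region(image, inside)    # the "masked" product
unmasked = image                           # what removing the clip step leaves
```

Dropping the clip step corresponds to using `image` directly, so every pixel, including those outside the region, keeps its spectral values.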

The resulting multi-season median composites were built from the following set of images.

For 2020:

High Attenuation (Jan-Apr) - Image count: 72
Transitional (May-early Jul) - Image count: 44
Peak Visibility (Late Jul-Oct) - Image count: 67

And for 2024:

High Attenuation (Jan-Apr) - Image count: 74
Transitional (May-early Jul) - Image count: 49
Peak Visibility (Late Jul-Oct) - Image count: 72

Tuning Patch-Sets

Tuning patches were extracted from each 2023-2025 median image and the SIMM seagrass map.

(base) tylar@tylar-laptop:~/repos/nasa-prithvi-wetlands$ python scripts/extract_tuning_patches_seagrass_2class_sentinel2.py 
Clearing existing output directory: data/output/tuning_patches

============================================================
CRS mismatch detected: EPSG:4326 vs EPSG:3746
Reprojecting mask to match spectral image...
Mask reprojected to match spectral image: (1306, 1585)

Extracting 224x224 patches with stride 224
Number of spectral bands: 24
Will extract up to 35 patches from this shard

Shard complete: 22 valid patches extracted

============================================================
Extraction complete!
============================================================
Spectral files processed: 1
Valid patches saved: 22
Patches skipped: 13
Output directory: data/output/tuning_patches

Organizing patches into training and validation sets...

Dataset split created:
Training samples: 18
  - Files in: data/output/tuning_patches/training_chips
  - List file: training_data.txt
Validation samples: 4
  - Files in: data/output/tuning_patches/validation_chips
  - List file: validation_data.txt

Chip naming format:
  - Spectral: chip_XXXXX_merged.tif
  - Mask: chip_XXXXX.mask.tif

Cleaning up temporary directories...
  - Removed: data/output/tuning_patches/spectral
  - Removed: data/output/tuning_patches/masks
✓ Cleanup complete!

✓ Patch extraction complete! Ready for Prithvi fine-tuning.

NOTE: With only 18 training and 4 validation samples, this patch set may be too small for tuning to be effective.
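The counts in the log above can be reproduced with a little arithmetic. The sketch below is not taken from the extraction script: the function names are hypothetical, and the 0.8 training fraction is an assumption chosen because it matches the 18/4 split reported in the log.

```python
def count_patches(height, width, patch=224, stride=224):
    """Number of full patch windows that fit in a height x width raster."""
    if height < patch or width < patch:
        return 0
    rows = (height - patch) // stride + 1
    cols = (width - patch) // stride + 1
    return rows * cols


def split_counts(n_valid, train_fraction=0.8):
    """Split a patch count into (training, validation) counts.

    train_fraction=0.8 is an assumed value, not read from the script.
    """
    n_train = round(n_valid * train_fraction)
    return n_train, n_valid - n_train


# Shape of the reprojected mask reported in the log: (1306, 1585).
total = count_patches(1306, 1585)      # 5 x 7 grid -> 35 candidate patches
valid = 22                             # valid patches reported in the log
skipped = total - valid                # 13, matching "Patches skipped: 13"
n_train, n_val = split_counts(valid)   # (18, 4)

print(total, skipped, n_train, n_val)  # prints: 35 13 18 4
```

Because the stride equals the patch size, the windows do not overlap, so a single 1306 x 1585 scene can yield at most 35 patches.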

Research Notebook Changes

The following line is adjusted to use either the unmasked or the masked product:

UNSEEN_IMAGE_FNAME = "stAndrews_seasonal_s2_stack_2019_to_2021_unmasked.tif"

Results

With Landmask: seagrass-classification-with-landmask

seagrass-classification-unmasked.png

Without Landmask: seagrass-classification-without-landmask