- Convert raw Band 3 (green) and Band 11 (SWIR) images from JPEG 2000 to TIFF format, set NoData values to zero, convert from integer to float data type, and resample Band 11 from 20 m to 10 m resolution. [gdal-cli-multiplex]
- Compute the Modified Normalized Difference Water Index, MNDWI = (green - SWIR) / (green + SWIR), using the SWIR band resampled from its original 20 m resolution to 10 m to match the green band. [gdal-cli-multiplex]
- Apply a threshold (MNDWI > 0.25) to segment water from non-water [ENVI_ColorSliceClassification, https://gbdxdocs.digitalglobe.com/docs/envi-color-slice-classification]
- Remove speckling noise from classification [ENVI_ClassificationSmoothing, https://gbdxdocs.digitalglobe.com/docs/envi-classification-smoothing]
- Further de-noise classification [ENVI_ClassificationAggregation, https://gbdxdocs.digitalglobe.com/docs/envi-classification-aggregation]
- Convert the cleaned classification raster to shapefile format [ENVI_ClassificationToShapefile, https://gbdxdocs.digitalglobe.com/docs/envi-classification-to-shapefile]
- Simplify the water vectors, keeping only those larger than 5,000 m² [simplify-polygon]
- Assign the coordinate reference system (WGS 84) [gdal-cli-multiplex]
- Prepare vectors for Vector Services [IngestShpToVectorServices]
- Upload vectors to Vector Services S3 location [StageDataToS3]
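
The index and threshold steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the gdal-cli-multiplex or ENVI task implementation: the function names, the nearest-neighbour resampling method, and the sample pixel values are all assumptions made for the example. Only the NoData value of zero, the 20 m to 10 m resampling, and the 0.25 threshold come from the workflow.

```python
import math

NODATA = 0.0       # NoData value set during the conversion step
THRESHOLD = 0.25   # a pixel is classed as water if MNDWI > 0.25


def upsample_2x(band):
    """Nearest-neighbour 2x upsampling (20 m -> 10 m); the method is an assumption."""
    out = []
    for row in band:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def mndwi_pixel(green, swir):
    """Return (green - SWIR) / (green + SWIR), or NaN for NoData pixels."""
    if green == NODATA or swir == NODATA or green + swir == 0:
        return math.nan
    return (green - swir) / (green + swir)


def mndwi(green, swir_10m):
    """Compute MNDWI for two rasters (nested lists) of the same shape."""
    return [[mndwi_pixel(g, s) for g, s in zip(g_row, s_row)]
            for g_row, s_row in zip(green, swir_10m)]


def water_mask(index):
    """Threshold the index; NaN > THRESHOLD is False, so NoData is non-water."""
    return [[value > THRESHOLD for value in row] for row in index]


# Hypothetical sample: a 2x2 green band at 10 m and a 1x1 SWIR band at 20 m.
green = [[800.0, 200.0],
         [0.0, 600.0]]      # the 0.0 is a NoData pixel
swir_20m = [[200.0]]

index = mndwi(green, upsample_2x(swir_20m))
mask = water_mask(index)
print(mask)  # [[True, False], [False, True]]
```

Note that NoData pixels fall into the non-water class here, so regions without usable data produce no waterbodies.
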
- We used images collected on different dates and potentially different seasons to achieve cloud-free coverage. This implies that adjacent images may not match at boundaries, and overlapping images between years may not match.
- A single MNDWI threshold may not extract waterbodies consistently across all images.
- Thin features may be more likely to be removed by the minimum-area filter.
- Waterbody vectors must be dissolved (or otherwise processed) before the absolute number or total area of waterbodies can be used. Contiguous waterbodies may be divided internally, inflating the number of waterbodies. Also, because of the overlapping tiling scheme of Sentinel-2 imagery, the overlapping portions are likely to include duplicates of the same waterbody (possibly observed on different dates), inflating both the total number and the total area of waterbodies.
- The footprint geometries for Sentinel-2 images returned from Vector Services reflect the tile geometry, but may not accurately describe the area containing usable data. We noticed images that contained null data across large areas, which would result in no mapped waterbodies in those regions.
- The aggregated sum of areas also includes duplicate waterbodies at geohash boundaries, inflating those absolute values.
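
To illustrate why counts and areas are inflated before dissolving, the sketch below compares naive totals with dissolved totals. It is a simplified raster analogue of a vector dissolve, not the processing used here: each waterbody is represented as a hypothetical set of 10 m grid cells on a shared grid, and the `dissolve` function and sample data are assumptions made for the example.

```python
CELL_AREA_M2 = 100  # one 10 m x 10 m cell


def dissolve(bodies):
    """Merge cell sets that overlap or touch (4-connectivity) into single bodies."""
    remaining = set().union(*bodies)   # duplicated cells collapse here
    merged = []
    while remaining:
        stack = [remaining.pop()]
        body = set(stack)
        while stack:                   # flood fill over neighbouring cells
            r, c = stack.pop()
            for cell in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if cell in remaining:
                    remaining.remove(cell)
                    body.add(cell)
                    stack.append(cell)
        merged.append(body)
    return merged


# Hypothetical input: one lake split internally in tile A, seen again in the
# overlapping tile B, plus a separate pond.
bodies = [
    {(0, 0), (1, 0)},                  # tile A, lake part 1
    {(0, 1), (1, 1)},                  # tile A, lake part 2
    {(0, 1), (1, 1), (0, 2), (1, 2)},  # tile B, same lake (overlap)
    {(5, 5)},                          # tile B, pond
]

naive_count = len(bodies)
naive_area = sum(len(b) for b in bodies) * CELL_AREA_M2

dissolved = dissolve(bodies)
dissolved_count = len(dissolved)
dissolved_area = sum(len(b) for b in dissolved) * CELL_AREA_M2

print(naive_count, naive_area)          # 4 900
print(dissolved_count, dissolved_area)  # 2 700
```

In this toy case the naive totals report four waterbodies covering 900 m², while the dissolved result has two waterbodies covering 700 m².
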