Feedback Request: Evolving the APEx Data Guidelines for STAC Alignment and Tooling Integration

As part of our ongoing efforts to enhance data discoverability, interoperability, and reuse across the APEx tools, we are reviewing and evolving the APEx Data Provider Guidelines. This update aims to ensure closer alignment with STAC best practices and to support seamless integration with existing tooling used across APEx and ESA projects.

We are seeking your feedback on:

  • STAC compliance and extensions: Which STAC fields, extensions, or conventions should be prioritized to best represent data collections and items?
  • Tooling integration: Which STAC metadata fields and extensions are required to support seamless integration into existing tools (APEx Geospatial Explorer, STAC Browser, …) and processing frameworks (e.g. openEO, OGC API Processes, …)?
  • Implementation guidance: What examples, templates, or validation tools would help teams apply these guidelines effectively?

Your input will help shape the next iteration of the APEx Data Provider Guidelines, ensuring they remain practical, interoperable, and consistent with community standards.

With respect to compatibility with CDSE openEO, these are some key requirements, in order of importance:

  • Use band common metadata in STAC 1.1 to name bands available in the assets
  • Use consistent asset keys throughout the collection
  • Collections should be homogeneous: each item has the same assets
  • Use the projection extension to declare asset/item EPSG codes, shape and bounding box in raster coordinates
  • Declare asset data type, nodata value, and optionally scale and offset per band
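To make the list above a bit more concrete, here is a minimal Python sketch of one asset that declares these fields, plus a small check for missing ones. The field names follow STAC 1.1 common metadata (`bands`, `data_type`, `nodata`) and the projection extension v2 (`proj:code`, `proj:shape`, `proj:bbox`). The href, band name, values and the helper `missing_openeo_fields` are made up for illustration, scale and offset are left out, and this is not the check that CDSE openEO itself performs.

```python
# Hypothetical asset: href, band name and all values are placeholders.
asset = {
    "href": "https://example.org/data/tile_N00E006_map.tif",
    "type": "image/tiff; application=geotiff; profile=cloud-optimized",
    "roles": ["data"],
    # STAC 1.1 common metadata: one entry per band.
    "bands": [{"name": "MAP", "data_type": "uint8", "nodata": 0}],
    # Projection extension (v2 field names); these may also sit in item properties.
    "proj:code": "EPSG:4326",
    "proj:shape": [36000, 36000],        # rows, columns
    "proj:bbox": [6.0, 0.0, 9.0, 3.0],   # in the CRS given by proj:code
}

REQUIRED_ASSET_FIELDS = ("bands", "proj:code", "proj:shape", "proj:bbox")
REQUIRED_BAND_FIELDS = ("name", "data_type", "nodata")


def missing_openeo_fields(asset: dict) -> list[str]:
    """Return the fields from the list above that this asset does not declare."""
    missing = [f for f in REQUIRED_ASSET_FIELDS if f not in asset]
    for i, band in enumerate(asset.get("bands", [])):
        missing += [f"bands[{i}].{f}" for f in REQUIRED_BAND_FIELDS if f not in band]
    return missing


print(missing_openeo_fields(asset))  # []
```

A check like this only looks at the asset itself, so projection fields declared at item level would need to be merged in first.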

Optional, but still useful:

  • Use item_assets at collection level, to give a summary of all properties
  • Declare legends for categorical data
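The points about consistent asset keys and `item_assets` can be checked mechanically. The sketch below compares each item's asset keys with the keys declared in the collection's `item_assets`; the function name `inconsistent_items` and the sample collection and items are hypothetical and only serve as an illustration.

```python
def inconsistent_items(collection: dict, items: list[dict]) -> dict[str, set[str]]:
    """Map item id -> asset keys that differ from the collection's item_assets."""
    expected = set(collection.get("item_assets", {}))
    report = {}
    for item in items:
        # Symmetric difference: keys missing from the item plus unexpected keys.
        diff = expected ^ set(item.get("assets", {}))
        if diff:
            report[item["id"]] = diff
    return report


# Hypothetical data: the second item uses a differently spelled asset key.
collection = {"id": "demo", "item_assets": {"MAP": {"roles": ["data"]}}}
items = [
    {"id": "tile-1", "assets": {"MAP": {"href": "https://example.org/1.tif"}}},
    {"id": "tile-2", "assets": {"map": {"href": "https://example.org/2.tif"}}},
]
print(inconsistent_items(collection, items))
```

Here only `tile-2` is reported, because its key `map` does not match the declared `MAP`.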

I also noticed that some of the STAC APIs use relative paths to their assets. For example: ESA PRR - WORLDCOVER_10M_2020_V1. In this case the asset contains a relative path:

"ESA_WORLDCOVER_10M_MAP": {
      "href": "/d/ESA_WORLDCOVER_10M_2020_V1/2020/12/31/ESA_WorldCover_10m_2020_v100_N00E006/ESA_WorldCover_10m_2020_v100_N00E006_Map.tif",
      "type": "image/tiff",
      "roles": [
         "data"
      ],
      "title": "ESA_WORLDCOVER_10M_MAP"
},

This seems to be causing issues, as external tools need logic to translate these relative paths into absolute paths whenever they want to use the corresponding asset. For example, in the case of the APEx STAC browser, the download URL is pointing to:

https://browser.apex.esa.int/d/ESA_WORLDCOVER_10M_2020_V1/2020/12/31/ESA_WorldCover_10m_2020_v100_N00E006/ESA_WorldCover_10m_2020_v100_N00E006_Map.tif

which isn’t working. I believe we are experiencing the same issue when trying to use such a collection in openEO.
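For reference, this is roughly the logic a client would need: resolve each href against the URL the item was fetched from (its `self` link) rather than against the page the client runs on. It is only a sketch using Python's standard `urllib.parse.urljoin`. The host `catalogue.example.org`, the item id and the helper `absolute_asset_hrefs` are placeholders, not the real PRR endpoint or any existing client code.

```python
from urllib.parse import urljoin


def absolute_asset_hrefs(item: dict, fallback_base: str) -> dict[str, str]:
    """Resolve each asset href against the item's 'self' link, if present."""
    base = next(
        (link["href"] for link in item.get("links", []) if link.get("rel") == "self"),
        fallback_base,  # used only when the item has no 'self' link
    )
    return {key: urljoin(base, asset["href"]) for key, asset in item["assets"].items()}


# Hypothetical item: the 'self' link host is a placeholder.
item = {
    "id": "ESA_WorldCover_10m_2020_v100_N00E006",
    "links": [{
        "rel": "self",
        "href": "https://catalogue.example.org/stac/collections/"
                "ESA_WORLDCOVER_10M_2020_V1/items/ESA_WorldCover_10m_2020_v100_N00E006",
    }],
    "assets": {
        "ESA_WORLDCOVER_10M_MAP": {
            "href": "/d/ESA_WORLDCOVER_10M_2020_V1/2020/12/31/"
                    "ESA_WorldCover_10m_2020_v100_N00E006/"
                    "ESA_WorldCover_10m_2020_v100_N00E006_Map.tif",
        }
    },
}
print(absolute_asset_hrefs(item, "https://catalogue.example.org/"))
```

Because the href starts with `/`, it is resolved against the host of the base URL, so the result depends entirely on which base the client picks.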


For CDSE openEO, we got the same question many times, so a first version of minimal requirements + some background is now here:
https://documentation.dataspace.copernicus.eu/APIs/openEO/openeo-backend/docs/load_stac.html

We also contributed to these EOEPCA guidelines:

This is also in preparation at STAC PSC level. It seems to be more generic than what we need, but it is still very relevant because it handles more explicitly certain details that we often take for granted:


Considering this from the perspective of the APEx Geospatial Explorer, the important factor is being able to interpret the data that is available.

For example, if a COG that represents categorical data contains values 10, 20, 30, 40 etc., rather than having to consult project documentation to understand what these refer to, it would be extremely useful to describe this using the STAC Classification Extension (stac-extensions/classification on GitHub). Using it at the Item level for model data outputs, as in the example examples/item-model-classes.json in that repository, would be particularly valuable.

This describes what the data is and provides a colour hint on how it should be rendered.

This would allow the Configuration Builder for the GE to read the STAC record directly and add these details to the configuration file, so that the data is rendered exactly as the data publisher intended and the legend describes the data as intended.

Just to clarify: whilst the classification extension does not make them mandatory, title and color_hint on the class object are especially helpful.
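As a sketch of what this could look like in practice, the snippet below declares `classification:classes` on an asset and derives legend entries from it. The class names, titles and colours are illustrative placeholders for the values 10 to 40 mentioned above, and `legend_entries` is a hypothetical helper, not the actual Configuration Builder code.

```python
# Hypothetical asset with classification:classes; names and colours are examples.
asset = {
    "href": "https://example.org/data/landcover.tif",
    "roles": ["data"],
    "classification:classes": [
        {"value": 10, "name": "tree_cover", "title": "Tree cover", "color_hint": "006400"},
        {"value": 20, "name": "shrubland", "title": "Shrubland", "color_hint": "FFBB22"},
        {"value": 30, "name": "grassland", "title": "Grassland", "color_hint": "FFFF4C"},
        {"value": 40, "name": "cropland"},  # no title or color_hint given
    ],
}


def legend_entries(asset: dict, default_color: str = "808080") -> list[tuple]:
    """Build (value, label, '#RRGGBB') tuples from classification:classes."""
    entries = []
    for cls in asset.get("classification:classes", []):
        # Prefer the human-readable title, then the name, then the raw value.
        label = cls.get("title") or cls.get("name") or str(cls["value"])
        color = "#" + cls.get("color_hint", default_color)
        entries.append((cls["value"], label, color))
    return entries


for entry in legend_entries(asset):
    print(entry)
```

The last class shows why `title` and `color_hint` matter: without them the tool has to fall back to the machine name and an arbitrary default colour.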

Hi @bram.janssen ,

I understand that this has to do with incorrect interpretation or implementation of the STAC spec by clients, which however also seems to affect certain STAC browser versions (e.g. it affects version 4.0.0-rc1 but not 3.3.4).

This is an important issue to monitor and address as it is highly undesirable to have to frequently change the metadata.

Paulo

My main points about this are:

  • the guidelines should clearly refer to an agreed STAC profile that works well (I’m a bit wary of writing compatible, aligned or saying it is “the same”) with other relevant initiatives (PRR, EOEPCA, EARTH-CODE, OSC), both for collections and for items. For the PRR, the relevant reference for collection metadata is the PRR Collections Specifications; for items, you can consider the following example (there should soon be an online reference):

{
  "type": "Feature",
  "stac_version": "1.0.0",
  "stac_extensions": [
    "https://stac-extensions.github.io/alternate-assets/v1.1.0/schema.json",
    "https://stac-extensions.github.io/file/v2.1.0/schema.json"
  ],
  "id": "S3A_OPER_AUX_GNSSRD_POD__20171212T193142_V20160223T235943_20160224T225600",
  "properties": {
    "start_datetime": "2015-05-19T12:00:00.000000Z",
    "end_datetime": "2015-05-19T12:00:00.000000Z"
  },
  "assets": {
    "PRODUCT": {  → at least one asset of role "data" is mandatory
      "href": "link to product/S3A_OPER_AUX_GNSSRD_POD__20171212T193142_V20160223T235943_20160224T225600.tgz",
      "title": "Product",
      "type": "image/tiff; application=geotiff; profile=cloud-optimized",
      "role": "data",
      "file:checksum": "90e4021044a8995dd50b6657a037a7839304535b",
      "file:size": 153600
    },
    "QUICKLOOK": {
      "href": "link to product/S3A_OPER_AUX_GNSSRD_POD__20171212T193142_V20160223T235943_20160224T225600.png",
      "title": "Product",
      "type": "image/tiff; application=geotiff; profile=cloud-optimized",
      "role": "quicklook",
      "file:checksum": "90e4021044a8995dd50b6657a037a7839304535b",
      "file:size": 153600
    }
  }
}

This requires “start_datetime” and “end_datetime” instead of the single datetime (for openEO). For the “PRODUCT” assets, “type”, “role” and “file:size” are mandatory, “title” defaults to “Product”, and “file:checksum” is optional. The quicklook asset is optional.
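These item rules could be turned into a small check along the following lines. This is only a sketch of the requirements as described here, with a made-up function name `prr_item_problems`; it is not an official PRR validator. Note that the example writes `role` as a single string, whereas core STAC defines `roles` as an array, so the sketch accepts either form.

```python
def asset_roles(asset: dict) -> list[str]:
    """Return the asset's roles, accepting 'roles' (array) or 'role' (string)."""
    roles = asset.get("roles", asset.get("role", []))
    return [roles] if isinstance(roles, str) else list(roles)


def prr_item_problems(item: dict) -> list[str]:
    """List which of the item requirements described above are not met."""
    problems = []
    props = item.get("properties", {})
    for field in ("start_datetime", "end_datetime"):
        if field not in props:
            problems.append(f"properties.{field} is missing")

    data_assets = {
        key: asset for key, asset in item.get("assets", {}).items()
        if "data" in asset_roles(asset)
    }
    if not data_assets:
        problems.append("no asset with role 'data'")
    for key, asset in data_assets.items():
        for field in ("type", "file:size"):  # file:checksum and title are optional
            if field not in asset:
                problems.append(f"assets.{key}.{field} is missing")
    return problems


# Hypothetical item with placeholder href and values.
item = {
    "properties": {
        "start_datetime": "2015-05-19T12:00:00Z",
        "end_datetime": "2015-05-19T12:00:00Z",
    },
    "assets": {"PRODUCT": {"href": "https://example.org/p.tif", "role": "data",
                           "type": "image/tiff", "file:size": 153600}},
}
print(prr_item_problems(item))  # []
```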
