Crowdsourced Bathymetry Frequently Asked Questions

Terminology:

What is “Crowdsourced Bathymetry”?

The International Hydrographic Organization (IHO) publication B-12, Guidance to Crowdsourced Bathymetry (Edition 3.0.0), defines crowdsourced bathymetry (CSB) as the collection and sharing of depth measurements from vessels, using standard navigation instruments, while engaged in routine maritime operations.

Where can I find more information on the IHO and its involvement in Crowdsourced Bathymetry programs?

See the IHO’s CSB web page and the Crowdsourced Bathymetry tab on the IHO Data Centre for Digital Bathymetry (DCDB) page.

What is the DCDB?

The International Hydrographic Organization Data Centre for Digital Bathymetry (DCDB) was established in 1990 to steward the worldwide collection of bathymetric data. The DCDB archives and shares, freely and without restrictions, depth data contributed by mariners.

It is hosted by the U.S. National Oceanic and Atmospheric Administration (NOAA) on behalf of the IHO Member States.

What is a “Trusted Node”?

An approved organization or individual that systematically receives CSB data collected by vessels or other platforms and delivers them to the IHO DCDB.

What is a unique_id?

An identifier supplied by the trusted node which (usually) uniquely identifies the provider and the platform (a.k.a. ship) that contributed the data. The characters preceding the first hyphen (-) identify the trusted node, while the remaining characters are typically consistent for each contributing vessel throughout its service life. Platform names are not unique across providers, and sometimes not even within a single provider. See B-12 - IHO Guidelines for Crowdsourced Bathymetry, maintained by the IHO’s Crowdsourced Bathymetry Working Group.
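
As a minimal sketch of the prefix rule described above, the following splits a unique_id at its first hyphen. The function name and the sample value are made up for illustration; they are not part of any DCDB tool or real identifier.

```python
def split_unique_id(unique_id: str) -> tuple[str, str]:
    """Split a unique_id into (trusted-node prefix, remainder) at the first hyphen."""
    prefix, sep, remainder = unique_id.partition("-")
    if not sep or not prefix:
        # No hyphen (or nothing before it): the prefix rule cannot be applied.
        raise ValueError(f"no trusted-node prefix found in {unique_id!r}")
    return prefix, remainder


# "EXAMPLE-1234-abcd" is a hypothetical value used only for illustration.
node, vessel_part = split_unique_id("EXAMPLE-1234-abcd")
print(node, vessel_part)
```

Only the first hyphen is treated as a separator, so any later hyphens stay in the vessel part.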

What is the NODD?

The NOAA Open Data Dissemination Program provides public access to NOAA's open data on commercial cloud platforms through public-private partnerships.

What is the relationship between the NODD and the AWS Open Data Registry?

The Open Data Registry is a catalog maintained by Amazon Web Services (AWS). Among other datasets, it includes a reference to the Crowdsourced Bathymetry archive hosted by the NOAA Open Data Dissemination Program. See the Open Data Registry web page for more information.

What is the difference between a “platform”, “vessel”, and “ship”?

At this time, the terms platform, vessel, and ship are used interchangeably.

Data Characteristics:

Are all CSB data submitted to the IHO Data Centre available online?

No. Only data collected in international waters, or in the waters of coastal states that responded positively to the IHO’s request to indicate their position on the public sharing of CSB data collected within waters subject to their national jurisdiction, are available online. Additionally, individual data files that contain invalid geographic coordinates or lack valid timestamps are excluded from public access.
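
If you want to screen your own data before contributing, the sketch below shows one plausible reading of "invalid geographic coordinates" and "valid timestamps". The exact criteria applied by the DCDB are not stated in this FAQ, so the coordinate ranges, the ISO 8601 timestamp assumption, and both function names are assumptions.

```python
from datetime import datetime


def has_valid_position(lon: float, lat: float) -> bool:
    """True if longitude and latitude fall inside the usual geographic ranges."""
    return -180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0


def has_valid_timestamp(text: str) -> bool:
    """True if the text parses as an ISO 8601 date-time (assumed format)."""
    try:
        # Older Python versions do not accept a trailing "Z", so map it to +00:00.
        datetime.fromisoformat(text.strip().replace("Z", "+00:00"))
    except ValueError:
        return False
    return True


print(has_valid_position(-70.5, 42.1), has_valid_timestamp("2023-08-01T12:00:00Z"))
```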

What is the CSB API and what is the difference between data downloaded using it and the files in the NODD S3 Bucket?

The CSB data are available in two different organizational schemes. In the first, data are stored in the individual files provided by the Trusted Node and are organized by date. In the second, data are organized into a single virtual collection in which the distinction between the original submission files is no longer relevant. The data in this virtual collection can only be accessed via the CSB data access API, which allows export of data filtered by geographic area of interest and by attributes such as platform, trusted node name, and date. Additionally, data delivered via the API include the archive date attribute but do not include the FILE_UUID attribute found in the individual CSB files.

How can I tell when new data are added to the CSB archive?

You can use standard AWS tools and SDKs to subscribe to the Amazon Simple Notification Service (SNS) topic and receive notifications when new files are added. By subscribing to these notifications, you can configure your own email notifications or trigger processing in your own AWS account.
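
As a sketch of the "trigger processing" case, the code below extracts the bucket name and object key from a notification. It assumes the topic relays standard Amazon S3 event notifications inside the JSON envelope that SNS delivers to HTTP(S) or SQS subscribers; this FAQ does not state the message format, so check the documentation in the S3 bucket. The function name, bucket name, and key are made up for illustration.

```python
import json
from urllib.parse import unquote_plus


def new_object_keys(sns_envelope: str) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for objects created, per an SNS-wrapped S3 event."""
    envelope = json.loads(sns_envelope)
    # SNS carries the original S3 event as a JSON string in the "Message" field.
    s3_event = json.loads(envelope["Message"])
    found = []
    for record in s3_event.get("Records", []):
        if record.get("eventName", "").startswith("ObjectCreated"):
            bucket = record["s3"]["bucket"]["name"]
            # Object keys in S3 event notifications are URL-encoded.
            key = unquote_plus(record["s3"]["object"]["key"])
            found.append((bucket, key))
    return found


# Hypothetical notification, built here only to exercise the function.
event = {"Records": [{"eventName": "ObjectCreated:Put",
                      "s3": {"bucket": {"name": "example-bucket"},
                             "object": {"key": "2023/08/08/example.csv"}}}]}
print(new_object_keys(json.dumps({"Type": "Notification", "Message": json.dumps(event)})))
```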

Where can I find information about the format and content of the CSB files?

Files are comma-separated value text files with one bathymetric sounding (i.e. record) per line. Each file contains data from a single provider and platform, and the files vary greatly in size, from a few kilobytes to several hundred megabytes. Each record contains the following fields:

See documentation in the S3 bucket for more information.
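
Because files can reach several hundred megabytes, it may help to read them one record at a time rather than loading a whole file into memory. The sketch below assumes each file begins with a header row naming its columns. UNIQUE_ID and FILE_UUID are mentioned in this FAQ; the other column names and all sample values are placeholders, so consult the documentation in the S3 bucket for the actual field list.

```python
import csv
import io


def read_soundings(text_stream):
    """Yield one dict per sounding, keyed by the column names in the header row."""
    yield from csv.DictReader(text_stream)


# Hypothetical two-line file; column names beyond UNIQUE_ID and FILE_UUID are placeholders.
sample = io.StringIO(
    "UNIQUE_ID,FILE_UUID,LON,LAT,DEPTH,TIME\n"
    "EXAMPLE-0001,00000000-0000-0000-0000-000000000000,-70.5,42.1,35.2,2023-08-01T12:00:00Z\n"
)
for row in read_soundings(sample):
    print(row["UNIQUE_ID"], float(row["DEPTH"]))
```

With a real file, pass an open file object, for example one opened with open(path, newline=""), in place of the in-memory sample.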

Is the UNIQUE_ID value unique for a vessel across Trusted Nodes?

The UNIQUE_ID value is assigned by the trusted node, and its assignment is intentionally outside the scope of the DCDB. Although the UNIQUE_ID contains a prefix identifying the trusted node that provided the data, nothing beyond the data collection guidance ensures that a trusted node does not use the same value for multiple vessels, or that a given vessel is not assigned different UNIQUE_ID values over time. However, if the same vessel were to contribute via different trusted nodes, it would have different UNIQUE_ID values, even if they differed only in the prefix component.

Is the UNIQUE_ID unique for a journey?

No. The UNIQUE_ID identifies the contributing platform, not an individual journey.

What is the PROVIDER field?

The PROVIDER field uniquely identifies the Trusted Node which submitted the data to the DCDB.

How often are the data updated in the archive?

Data are received at the DCDB at a time of the Trusted Node’s choosing, at intervals ranging from daily to annually. Once received, they are processed, uploaded to the S3 bucket, and made available via the API, usually within a few days.

What license or restrictions apply to the use of these data?

Regardless of whether the data are provided to the IHO DCDB by a Trusted Node or an individual, the data are dedicated to the public domain in accordance with the “Creative Commons Zero” universal public domain dedication (CC0 1.0). The IHO DCDB intends to publicly release the Trusted Node’s data in its original form under the CC0 public domain dedication via the IHO DCDB Viewer.

Data Access:

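What are the different ways that I can access the data?

The data can be accessed in two ways: as the individual files hosted in the NODD S3 bucket, and through the CSB data access API, which exports data filtered by geographic area of interest and by attributes. File-level metadata can be obtained via the IHO DCDB Viewer.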

Where can I find the metadata corresponding to the files hosted in the S3 bucket?

Metadata is not available with the NODD-hosted files.  File-level metadata can be obtained via the IHO DCDB Viewer.

How can I download all the data for a specific vessel?

The API can be used to download data for a specific vessel using the unique_id or platform attribute, but note that the latter is not guaranteed to be unique. Files hosted in the S3 bucket are named with a convention which includes the unique_id, but it is the responsibility of the requestor to automate the identification and download of the files associated with the platform of interest.
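
For the S3 route, one way to automate the identification step is to list the object keys and keep those whose file name contains the unique_id of interest. The sketch below shows only the filtering. The function name and the sample keys are made up, and the real naming convention should be taken from the documentation in the S3 bucket.

```python
from pathlib import PurePosixPath


def keys_for_unique_id(keys, unique_id):
    """Return the object keys whose file name (last path component) contains unique_id."""
    return [key for key in keys if unique_id in PurePosixPath(key).name]


# Hypothetical keys; they do not reflect the actual file naming convention.
listing = [
    "2023/08/08/20230808_EXAMPLE-0001_data.csv",
    "2023/08/08/20230808_EXAMPLE-0002_data.csv",
]
print(keys_for_unique_id(listing, "EXAMPLE-0001"))
```

Plain substring matching can also match a longer identifier that contains the shorter one, so a stricter pattern may be needed once the naming convention is known.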

How can I download all the data for a specific journey?

A cruise or journey in this context is defined as a series of soundings from a single platform without a significant spatial or temporal gap between successive points. Currently there is no easy way to locate such groups of data using either the API or the files hosted in the S3 bucket. In some cases a cruise is contained within a single file, but in others it spans multiple files, or a single file contains multiple cruises. If you know when a cruise started and stopped, you can easily download data for that date range, but reconstructing the cruise from those data requires further processing on your part.
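
As a rough sketch of that processing, the code below groups the timestamps of one platform's soundings into candidate cruises by starting a new group whenever the time gap exceeds a threshold. The six-hour threshold and the function name are arbitrary assumptions, and only the temporal gap is considered; a spatial check would need a distance calculation as well.

```python
from datetime import datetime, timedelta


def split_by_time_gap(timestamps, max_gap=timedelta(hours=6)):
    """Group timestamps into lists, starting a new list after any gap longer than max_gap."""
    segments = []
    for ts in sorted(timestamps):
        if segments and ts - segments[-1][-1] <= max_gap:
            segments[-1].append(ts)   # continues the current candidate cruise
        else:
            segments.append([ts])     # first point, or gap too large: start a new one
    return segments


# Hypothetical sounding times: three close together, then two more after a long gap.
start = datetime(2023, 8, 1)
times = [start + timedelta(hours=h) for h in (0, 1, 2, 20, 21)]
print([len(segment) for segment in split_by_time_gap(times)])
```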

How can I retrieve the metadata corresponding to a given cruise?

Data cannot be directly associated with a cruise, nor is cruise-level metadata currently available online. File-level metadata (which may correspond to a cruise in some cases) can be obtained via the IHO DCDB Viewer.

Can I browse the data that are in the archive?

Yes, you can use the AWS S3 Explorer to browse the CSB files which are publicly available. They are organized by date of contribution and appear in the Explorer as subdirectories of year, month, and day. You can also use standard S3 tools and SDKs, both from AWS and third parties, to explore the files. See the documentation for information.
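
When exploring with an SDK, the year, month, and day layout lends itself to listing by key prefix. The sketch below builds such a prefix for a given day. Zero-padded month and day values and the optional root prefix are assumptions, so confirm the actual layout in the S3 Explorer before relying on it.

```python
from datetime import date


def day_prefix(day: date, root: str = "") -> str:
    """Build a year/month/day key prefix; 'root' is a placeholder for any leading path."""
    return f"{root}{day.year:04d}/{day.month:02d}/{day.day:02d}/"


print(day_prefix(date(2023, 8, 8)))
```

The resulting string could then be supplied as the Prefix parameter of the S3 ListObjectsV2 operation.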

last updated Aug 8, 2023