NetCDF data objects
On this page, we describe the use of the NetCDF data object which can be used to create exchange items for NetCDF files. This data object is not specific to particular wrappers but can be used for many NetCDF files as long as the format of the variables matches one of the examples below when reading observations, model results or even updating states or boundary conditions. The NetCDF data object is very useful since many different model software packages use NetCDF files for storing and reading data.
In org.openda.exchange.dataobjects.NetcdfDataObject
, the
implementation of NetCDF data objects can be found. It supports many
different variable formats, such as
Scalar data, which in the NetCDF variables have 2 dimensions one for time and one for stations;
Three-dimensional scalar data, which in the NetCDF variables have 3 dimensions one for time, one for stations, and one for the vertical layers;
Gridded data, which in the NetCDF variables have 3 dimensions one for time, one for \(y\) coordinates, and one for \(x\) coordinates.
These formats will be described in more detail below.
Note that the current implementation only applies to files that follow the CF metadata conventions.
If the NetCDF file does not, then its own data object should be implemented in org.openda.exchange.dataobjects.NetcdfDataObject
.
Scalar data
The simplest form of scalar data has two dimensions, one for time and one for location. This will create exchange items for each measurement location in each observation variable. The corresponding id is a combination of the variable name with the location name which can be any string. The exchange item will contain the data for all the time steps. As an example, we create the exchange items
24.waterlevel
, 25.waterlevel
, 26.waterlevel
, 27.waterlevel
,
28.waterlevel
, 29.waterlevel
by using the following variables waterlevel
and data
:
float waterlevel(time=3, stations=6);
:standard_name = "water_surface_height_above_reference_datum detection_minimum";
:long_name = "waterlevel";
:units = "m";
:_FillValue = -9999.0f; // float
:coordinates = "lat lon";
:cell_methods = "time: maximum";
data:
station_id = "26", "27", "28", "24", "29", "25"
with corresponding XML config:
<dataObject className="org.openda.exchange.dataobjects.NetcdfDataObject">
<file>scalar.nc</file>
<id>dataObjectId</id>
<arg>true</arg>
<arg>false</arg>
</dataObject>
The two arguments for lazy reading and writing listed in this configuration will be explained below.
Three-dimensional scalar data
For scalar data that have multiple layers an extra key=value argument needs to be provided:
layerDimensionName=<name>
. The size of that
dimension will determine the amount of layer-specific exchange items
that will be created.
When this argument is provided the id of the
exchange items will be
<variableName>.<locationName>.layer<layerIndex>
.
Examples of layer-specific values are concentration or temperature measurements
at different water depths or atmospheric heights.
Suppose we have the next variables temperature
and data
:
double temperature(time=49, stations=3, laydim=20);
:coordinates = "station_x_coordinate station_y_coordinate station_id zcoordinate_c";
:units = "degC";
:geometry = "station_geom";
:_FillValue = -999.0; // double
:standard_name = "sea_water_temperature";
data:
station_id = "station01", "station02", "station03"
This will result in 60 exchange items with ids like:station0i.temperature.layerj
, where i=1,2,3, j=0,…,19.
The corresponding XML config looks as follows (note that the layerDimensionName=laydim
argument refers to the snippet laydim=20
above):
<dataObject className="org.openda.exchange.dataobjects.NetcdfDataObject">
<file>scalarLayers.nc</file>
<id>dataObjectId</id>
<arg>true</arg>
<arg>false</arg>
<arg>layerDimensionName=laydim</arg>
</dataObject>
Gridded data
Gridded data variables are used to describe model results for a computation on a grid \((x,y)\). They are supported with three dimensions containing time, \(x\) and \(y\). In this case, 1 exchange item will be created for the whole variable with the variable id as a name.
Example of a gridded variable:
double p(time=49, Y=16, X=27);
:coordinates = "Time Y X";
:long_name = "Atmospheric Pressure";
:standard_name = "air_pressure";
:units = "Pa";
with corresponding XML config (same as for simple scalar):
<dataObject className="org.openda.exchange.dataobjects.NetcdfDataObject">
<file>grid.nc</file>
<id>dataObjectId</id>
<arg>true</arg>
<arg>false</arg>
</dataObject>
Extra arguments
In this section, we describe some extra arguments that can be used in the NetCDF data object.
Boolean: Lazy reading and writing
The first two possible extra arguments are boolean values for lazy reading and lazy writing. When lazy reading is set to true, the data will only be read from the NetCDF file when the data is needed instead of when initializing the data object. When lazy writing is set to true, the data will be written when the data object is closed instead of directly when the data is being changed by OpenDA. Depending on how much data will be read and or written, how many times this will happen and available memory, users can choose which settings will be most suitable.
These boolean arguments always have to be specified as the first two <arg>
elements in the XML config for the NetCDF data object:
<dataObject className="org.openda.exchange.dataobjects.NetcdfDataObject">
<file>scalar.nc</file>
<id>dataObjectId</id>
<arg>true</arg>
<arg>false</arg>
</dataObject>
Key-value pair: requiredExchangeItemId
To prevent a long list of arguments with a specific order,
key-value pair arguments have been introduced. Any number of
requiredExchangeItemId=<ID>
arguments can be supplied which limits
the exchange items being created to the ones supplied. This can save a
lot of memory and performance for large NetCDF files.
If we have the following variables waterlevel
and data
:
float waterlevel(time=3, stations=6);
:standard_name =
"water_surface_height_above_reference_datum detection_minimum";
:long_name = "waterlevel";
:units = "m";
:_FillValue = -9999.0f; // float
:coordinates = "lat lon";
:cell_methods = "time: maximum";
data:
station_id = "26", "27", "28", "24", "29", "25"
it is possible to limit the creation of exchange items by adding three extra
arguments: requiredExchangeItemId=24.waterlevel
,
requiredExchangeItemId=26.waterlevel
, and
requiredExchangeItemId=27.waterlevel
. This way only the specified
exchange items will be created.
The XML config for this will look as follows:
<dataObject className="org.openda.exchange.dataobjects.NetCDFDataObject">
<file>scalar.nc</file>
<id>dataObjectId</id>
<arg>true</arg>
<arg>false</arg>
<arg>requiredExchangeItemId=24.waterlevel</arg>
<arg>requiredExchangeItemId=26.waterlevel</arg>
<arg>requiredExchangeItemId=27.waterlevel</arg>
</dataObject>
Key-value pair: allowTimeIndependentItems
The argument allowTimeIndependentItems=true/false
determines whether
time-independent exchange items should be created. Time-independent
exchange items can be created for variables that do not depend on a time
dimension. Default is false.
Key-value pairs: stationIdVarName
and stationDimensionVarName
The arguments stationIdVarName=<varName>
and stationDimensionVarName=<varName>
can be used to overrule the default names for the station id and station dimension variables.
The default names are station_id
and stations
respectively.
The XML config for overruling stationDimensionVarName
with station
(this is needed for D-Flow FM 2024.3 and newer) will look as follows:
<dataObject className="org.openda.exchange.dataobjects.NetCDFDataObject">
<file>scalar.nc</file>
<id>dataObjectId</id>
<arg>true</arg>
<arg>false</arg>
<arg>stationDimensionVarName=station</arg>
</dataObject>