Ancillary Functions

buffer_min_overlap

Buffer a rectangular geometry to a minimum overlap with a second geometry.

buffer_time

Buffer a time range.

check_scene_consistency

Check the consistency of a scene selection.

check_spacing

Check whether the spacing fits into the MGRS tile boundaries.

combine_polygons

Combine polygon vector objects into one.

compute_hash

Compute the (multi)hash of a file using the specified algorithm.

datamask

Create data masks for a given image file.

date_to_utc

Convert a date object to a UTC date string or datetime object.

defaultdict_to_dict

Convert a (nested) defaultdict to a regular dictionary.

generate_unique_id

Returns a unique product identifier as a hexadecimal string.

get_kml

Download the Sentinel-2 MGRS grid KML file.

get_max_ext

Gets the maximum extent from a list of geometries.

get_tmp_name

Get the name of a temporary file with defined suffix.

group_by_attr

Group items based on a key function.

group_by_time

Group scenes by their acquisition time difference.

pixel_size_degrees

Convert a pixel size from meters to degrees.

vrt_add_overviews

Add overviews to an existing VRT file.

cesard.ancillary.buffer_min_overlap(geom1, geom2, percent=1, step=None)[source]

Buffer a rectangular geometry to a minimum overlap with a second geometry. The geometry is iteratively buffered until the minimum overlap is reached. If the overlap of the input geometries is already larger than the defined threshold, a copy of the original geometry is returned.

Parameters:
  • geom1 (Vector) – the geometry to be buffered

  • geom2 (Vector) – the reference geometry to intersect with

  • percent (int | float) – the minimum overlap in percent of geom1

  • step (int | float | None) – the buffering step size. If None, the step size is 0.1 % of the average rectangle corner length.

Return type:

Vector
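
The iterative buffering described above can be illustrated with plain axis-aligned rectangles. This is only a minimal sketch and not the cesard implementation: the tuple layout (xmin, ymin, xmax, ymax), the helper names, the interpretation of the default step as 0.1 % of the average side length, and the max_iter guard are all assumptions made for illustration.

```python
def overlap_percent(rect1, rect2):
    """Overlap of two (xmin, ymin, xmax, ymax) rectangles in percent of rect1's area."""
    width = min(rect1[2], rect2[2]) - max(rect1[0], rect2[0])
    height = min(rect1[3], rect2[3]) - max(rect1[1], rect2[1])
    if width <= 0 or height <= 0:
        return 0.0
    area1 = (rect1[2] - rect1[0]) * (rect1[3] - rect1[1])
    return 100.0 * width * height / area1


def buffer_rect_min_overlap(rect1, rect2, percent=1, step=None, max_iter=10000):
    """Grow rect1 on all sides until it overlaps rect2 by at least `percent`."""
    if step is None:
        # assumption: 0.1 % of the average side length of rect1
        avg_side = ((rect1[2] - rect1[0]) + (rect1[3] - rect1[1])) / 2
        step = 0.001 * avg_side
    xmin, ymin, xmax, ymax = rect1
    for _ in range(max_iter):
        # return as soon as the threshold is met (immediately if already met)
        if overlap_percent((xmin, ymin, xmax, ymax), rect2) >= percent:
            return (xmin, ymin, xmax, ymax)
        xmin, ymin, xmax, ymax = xmin - step, ymin - step, xmax + step, ymax + step
    # hypothetical guard: the threshold may be unreachable, e.g. for a tiny rect2
    raise RuntimeError('minimum overlap not reached')
```

If the rectangles already overlap by more than the threshold, the sketch returns the input extent unchanged, mirroring the documented behavior of returning a copy of the original geometry.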

cesard.ancillary.buffer_time(start, stop, as_datetime=False, str_format='%Y%m%dT%H%M%S', **kwargs)[source]

Buffer a time range.

Parameters:
  • start (str) – the start time date object to convert; timezone-unaware dates are interpreted as UTC.

  • stop (str) – the stop time date object to convert; timezone-unaware dates are interpreted as UTC.

  • as_datetime (bool) – return datetime objects instead of strings?

  • str_format (str) – the output string format (ignored if as_datetime is True)

  • kwargs – time arguments passed to datetime.timedelta()

Return type:

tuple[str | datetime, str | datetime]

Returns:

the buffered start and stop time as string or datetime object
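
The general idea can be sketched with the standard library alone. The following is a hedged illustration, not the cesard code: it assumes that the inputs are strings in str_format, that the buffer is subtracted from the start time and added to the stop time, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def buffer_time_sketch(start, stop, as_datetime=False,
                       str_format='%Y%m%dT%H%M%S', **kwargs):
    """Widen the range [start, stop] by timedelta(**kwargs) on both sides."""
    delta = timedelta(**kwargs)

    def parse(value):
        # timezone-unaware input is interpreted as UTC
        return datetime.strptime(value, str_format).replace(tzinfo=timezone.utc)

    start_buf = parse(start) - delta
    stop_buf = parse(stop) + delta
    if as_datetime:
        return start_buf, stop_buf
    return start_buf.strftime(str_format), stop_buf.strftime(str_format)
```

Any keyword accepted by datetime.timedelta (e.g. seconds=5 or minutes=1) can be passed as the buffer.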

cesard.ancillary.check_scene_consistency(scenes)[source]

Check the consistency of a scene selection. The following pyroSAR object attributes must be the same:

  • sensor

  • acquisition_mode

  • product

  • frameNumber (data take ID for Sentinel-1)

Parameters:

scenes (list[str | ID]) – the scene selection

Raises:

RuntimeError

Return type:

None
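
A check of this kind can be sketched as follows. This is an illustration under assumptions, not the cesard implementation: the scenes are represented by simple stand-in objects instead of pyroSAR ID objects, the function name is hypothetical, and the error message wording is invented.

```python
from types import SimpleNamespace


def check_consistency_sketch(scenes, attributes=('sensor', 'acquisition_mode',
                                                 'product', 'frameNumber')):
    """Raise a RuntimeError if the scenes differ in any of the attributes."""
    for attr in attributes:
        values = {getattr(scene, attr) for scene in scenes}
        if len(values) > 1:
            found = sorted(str(v) for v in values)
            raise RuntimeError(f"scenes differ in attribute '{attr}': {found}")


# hypothetical stand-ins for scene metadata objects
scene_a = SimpleNamespace(sensor='S1A', acquisition_mode='IW',
                          product='SLC', frameNumber='03A2F1')
scene_b = SimpleNamespace(sensor='S1A', acquisition_mode='IW',
                          product='SLC', frameNumber='03A2F1')
check_consistency_sketch([scene_a, scene_b])  # passes silently
```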

cesard.ancillary.check_spacing(spacing)[source]

Check whether the spacing fits into the MGRS tile boundaries.

Parameters:

spacing (int | float) – the target pixel spacing in meters

Return type:

None

cesard.ancillary.combine_polygons(vector, crs=4326, multipolygon=False, layer_name='combined')[source]

Combine polygon vector objects into one. The output is a single vector object with the polygons either stored in separate features or combined into a single multipolygon geometry.

Parameters:
  • vector (Vector | list[Vector]) – the input vector object(s). Providing a single object makes sense only when multipolygon=True.

  • crs (int | str) – the target CRS. Default: EPSG:4326

  • multipolygon (bool) – combine all polygons into one multipolygon? Default False: write each polygon into a separate feature.

  • layer_name (str) – the layer name of the output vector object.

Return type:

Vector

Returns:

the combined vector object

cesard.ancillary.compute_hash(file_path, algorithm='sha256', chunk_size=8192, multihash_encode=True)[source]

Compute the (multi)hash of a file using the specified algorithm.

Parameters:
  • file_path (str) – Path to the file.

  • algorithm (str) – Hash algorithm to use (default is ‘sha256’).

  • chunk_size (int) – Size of chunks to read from the file in bytes (default is 8192).

  • multihash_encode (bool) – Encode the hash according to the multihash specification? Default is True. If True, the hash generated by hashlib is wrapped using multiformats.multihash.wrap().

Return type:

str

Returns:

the hexadecimal hash string of the file.
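
Chunked hashing with an optional multihash prefix can be sketched with hashlib only. This is a hedged illustration rather than the cesard code: the function name is hypothetical, and instead of calling the multiformats package the sketch prepends the multihash header by hand, which it only knows for sha256 (function code 0x12 followed by the digest length).

```python
import hashlib


def hash_file_sketch(file_path, algorithm='sha256', chunk_size=8192,
                     multihash_encode=True):
    """Hash a file in chunks and return the result as a hexadecimal string."""
    hasher = hashlib.new(algorithm)
    with open(file_path, 'rb') as f:
        # read fixed-size chunks so that large files never sit fully in memory
        for chunk in iter(lambda: f.read(chunk_size), b''):
            hasher.update(chunk)
    digest = hasher.digest()
    if multihash_encode:
        if algorithm != 'sha256':
            raise ValueError('this sketch only knows the multihash prefix for sha256')
        # multihash layout: <hash function code><digest length><digest>;
        # sha2-256 has the code 0x12 and a 32-byte (0x20) digest
        digest = bytes([0x12, len(digest)]) + digest
    return digest.hex()
```

With multihash encoding, a sha256 result therefore starts with the characters 1220, followed by the plain hexadecimal digest.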

cesard.ancillary.datamask(measurement, dm_ras, dm_vec)[source]

Create data masks for a given image file. The raster data mask is not a simple mask of nodata values. Instead, a boundary vector geometry (a single polygon) containing all valid pixels is created, saved as dm_vec, and then rasterized. In this case dm_vec is returned. If the input image contains only nodata values, no raster data mask is created and an empty dummy vector mask is written. In this case the function returns None.

Parameters:
  • measurement (str) – the binary image file

  • dm_ras (str) – the name of the raster data mask

  • dm_vec (str) – the name of the vector data mask

Return type:

str | None

Returns:

dm_vec if the vector data mask contains a geometry or None otherwise

cesard.ancillary.date_to_utc(date, as_datetime=False, str_format='%Y%m%dT%H%M%S')[source]

Convert a date object to a UTC date string or datetime object.

Parameters:
  • date (str | datetime | None) – the date object to convert; timezone-unaware dates are interpreted as UTC.

  • as_datetime (bool) – return a datetime object instead of a string?

  • str_format (str) – the output string format (ignored if as_datetime is True)

Return type:

str | datetime | None

Returns:

the date string or datetime object in UTC time zone
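
The conversion rules can be sketched with the standard library. This is an illustration under assumptions, not the cesard implementation: string input is assumed to follow str_format (the actual function may accept other string formats), and the function name is hypothetical.

```python
from datetime import datetime, timezone


def to_utc_sketch(date, as_datetime=False, str_format='%Y%m%dT%H%M%S'):
    """Return `date` in UTC as a string or a timezone-aware datetime object."""
    if date is None:
        return None
    if isinstance(date, str):
        # assumption: strings follow str_format
        date = datetime.strptime(date, str_format)
    if date.tzinfo is None:
        # timezone-unaware dates are interpreted as UTC
        date = date.replace(tzinfo=timezone.utc)
    else:
        date = date.astimezone(timezone.utc)
    return date if as_datetime else date.strftime(str_format)
```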

cesard.ancillary.defaultdict_to_dict(d)[source]

Convert a (nested) defaultdict to a regular dictionary.

Parameters:

d (defaultdict) – the defaultdict to convert

Return type:

dict

Returns:

the converted dictionary
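
A recursive conversion of this kind can be written in a few lines. The following is a minimal sketch with a hypothetical function name, not necessarily how cesard implements it.

```python
from collections import defaultdict


def to_plain_dict(d):
    """Recursively turn (nested) defaultdicts into regular dictionaries."""
    if isinstance(d, dict):
        # defaultdict is a dict subclass, so this branch covers both types
        return {key: to_plain_dict(value) for key, value in d.items()}
    return d


nested = defaultdict(lambda: defaultdict(list))
nested['a']['b'].append(1)
plain = to_plain_dict(nested)
```

Unlike the defaultdict, the result raises a KeyError for missing keys instead of silently creating them.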

cesard.ancillary.generate_unique_id(encoded_str, length=4)[source]

Returns a unique product identifier as a hexadecimal string. The CRC-16 algorithm used to compute the unique identifier is CRC-CCITT (0xFFFF). The resulting CRC value is truncated to the number of hexadecimal characters specified by the length argument.

Parameters:
  • encoded_str (bytes) – The byte string from which the unique ID is generated. A regular string must be encoded first, e.g. 'abc'.encode().

  • length (int) – The desired length of the output string in hexadecimal characters (max: 4). Values higher than 4 will be capped at 4, since CRC-16 only produces 16 bits.

Return type:

str

Returns:

The unique product identifier (upper-case hexadecimal string).
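
The CRC-CCITT (0xFFFF) variant uses the polynomial 0x1021 with the initial value 0xFFFF. The sketch below implements it bit by bit. It is an illustration only: the function names are hypothetical, and keeping the leading characters when truncating is an assumption, since the text above does not say which characters are kept.

```python
def crc16_ccitt_false(data, crc=0xFFFF):
    """CRC-16 with polynomial 0x1021 and initial value 0xFFFF, processed MSB-first."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def unique_id_sketch(encoded_str, length=4):
    """Return the CRC as an upper-case hexadecimal string of up to 4 characters."""
    length = min(length, 4)  # CRC-16 only yields 16 bits, i.e. 4 hex characters
    # assumption: truncation keeps the leading characters
    return f'{crc16_ccitt_false(encoded_str):04X}'[:length]


print(unique_id_sketch('123456789'.encode()))  # prints 29B1, the standard check value
```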

cesard.ancillary.get_kml()[source]

Download the Sentinel-2 MGRS grid KML file. The target folder is ~/cesard.

Return type:

str

Returns:

the path to the KML file

cesard.ancillary.get_max_ext(geometries, buffer=None, crs=None)[source]

Gets the maximum extent from a list of geometries.

Parameters:
  • geometries (list[Vector]) – List of Vector geometries.

  • buffer (float | None) – The buffer in units of the geometries’ CRS to add to the extent.

  • crs (str | int | None) – The target CRS of the extent. If None (default) the extent is expressed in the CRS of the input geometries.

Return type:

dict[str, float]

Returns:

The maximum extent of the selected Vector geometries including the chosen buffer.
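
The extent computation itself can be sketched on plain dictionaries. This is an illustration under assumptions and not the cesard code: the extents are assumed to be dictionaries with the keys xmin, xmax, ymin and ymax, reprojection to a target CRS is left out, and the function name is hypothetical.

```python
def max_extent_sketch(extents, buffer=None):
    """Merge extent dictionaries into their common bounding box."""
    ext = {
        'xmin': min(e['xmin'] for e in extents),
        'xmax': max(e['xmax'] for e in extents),
        'ymin': min(e['ymin'] for e in extents),
        'ymax': max(e['ymax'] for e in extents),
    }
    if buffer is not None:
        # grow the box by the buffer on all four sides
        ext['xmin'] -= buffer
        ext['xmax'] += buffer
        ext['ymin'] -= buffer
        ext['ymax'] += buffer
    return ext
```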

cesard.ancillary.get_tmp_name(suffix)[source]

Get the name of a temporary file with defined suffix. Files are placed in a subdirectory ‘cesard’ of the regular temporary directory so the latter is not flooded with too many files in case they are not properly deleted.

Parameters:

suffix (str) – the file suffix/extension, e.g. ‘.tif’

Return type:

str

Returns:

the temporary file name
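
One way to obtain such a name with the standard library is sketched below. This is a hedged illustration, not the cesard implementation: the function name and the subdir parameter are hypothetical, and the sketch reserves a unique name via tempfile.mkstemp and then removes the empty file so that only the name remains.

```python
import os
import tempfile


def tmp_name_sketch(suffix, subdir='cesard'):
    """Return a unique file name inside a subdirectory of the temporary directory."""
    tmpdir = os.path.join(tempfile.gettempdir(), subdir)
    os.makedirs(tmpdir, exist_ok=True)
    # mkstemp creates the file; close and delete it to keep just the name
    fd, name = tempfile.mkstemp(suffix=suffix, dir=tmpdir)
    os.close(fd)
    os.remove(name)
    return name
```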

cesard.ancillary.group_by_attr(items, key_fn)[source]

Group items based on a key function.

Parameters:
  • items – the items to group

  • key_fn – a function that returns the grouping key of an item

Return type:

List[List[T]]

Returns:

A list of groups, where each group is a list of items with the same key.

Example

>>> list_in = ['abc', 'axy', 'brt', 'btk']
>>> print(group_by_attr(list_in, lambda x: x[0]))
[['abc', 'axy'], ['brt', 'btk']]
>>> list_in = [{'a': 1}, {'a': 2}, {'a': 1}, {'a': 2}]
>>> print(group_by_attr(list_in, lambda x: x['a']))
[[{'a': 1}, {'a': 1}], [{'a': 2}, {'a': 2}]]

cesard.ancillary.group_by_time(scenes, time=3)[source]

Group scenes by their acquisition time difference.

Parameters:
  • scenes (list[ID | str]) – a list of image names

  • time (int | float) – a time difference in seconds by which to group the scenes. The default of 3 seconds incorporates the overlap between SLCs.

Return type:

list[list[ID]]

Returns:

a list of sub-lists containing the file names of the grouped scenes
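
The grouping logic can be illustrated with plain time stamps. The sketch below is not the cesard implementation: it assumes that each scene is represented by a (start, stop) tuple of datetime objects and that a scene joins the current group if it starts no later than `time` seconds after the previous scene stops. The function name is hypothetical.

```python
from datetime import datetime, timedelta


def group_by_gap_sketch(scenes, time=3):
    """Group (start, stop) tuples whose gap to the previous scene is at most `time` seconds."""
    groups = []
    for scene in sorted(scenes, key=lambda s: s[0]):
        if groups:
            previous_stop = groups[-1][-1][1]
            # a negative gap means that the two scenes overlap in time
            if (scene[0] - previous_stop).total_seconds() <= time:
                groups[-1].append(scene)
                continue
        groups.append([scene])
    return groups
```

Consecutively acquired scenes of one data take thus end up in the same group, while scenes separated by a longer gap start a new group.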

cesard.ancillary.pixel_size_degrees(lon, lat, xres, yres)[source]

Convert a pixel size from meters to degrees.

Parameters:
  • lon (float) – longitude in degrees

  • lat (float) – latitude in degrees

  • xres (float) – x resolution in meters

  • yres (float) – y resolution in meters

Return type:

tuple[float, float]

Returns:

the x and y resolution in degrees

See also

pyproj.Geod.fwd
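
For a rough plausibility check, the conversion can be approximated on a sphere without pyproj. The sketch below deliberately swaps the geodesic computation for a simple spherical approximation, so its results will differ slightly from those of the function documented here. The function name and the Earth radius of 6371 km are assumptions. The longitude is not needed in this approximation.

```python
import math


def pixel_size_degrees_approx(lat, xres, yres, radius=6371000.0):
    """Approximate a pixel size in meters as degrees on a spherical Earth."""
    meters_per_degree = math.pi * radius / 180
    yres_deg = yres / meters_per_degree
    # a degree of longitude shrinks with the cosine of the latitude
    xres_deg = xres / (meters_per_degree * math.cos(math.radians(lat)))
    return xres_deg, yres_deg
```

At the equator both resolutions are equal for square pixels. Towards the poles the x resolution in degrees grows, and the approximation breaks down at the poles themselves, where the cosine becomes zero.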

cesard.ancillary.vrt_add_overviews(vrt, overviews, resampling='AVERAGE')[source]

Add overviews to an existing VRT file. Existing overviews will be overwritten.

Parameters:
  • vrt (str) – the VRT file

  • overviews (list[int]) – the overview levels

  • resampling (str) – the overview resampling method

Return type:

None