geoips.data_manipulations package#

Submodules#

geoips.data_manipulations.conversions module#

Routines for converting between units.

geoips.data_manipulations.conversions.unit_conversion(data_array, input_units=None, output_units=None)[source]#

Convert array in units ‘input_units’ to units ‘output_units’.

Parameters:
  • data_array (ndarray) – numpy.ndarray or numpy.ma.MaskedArray of data values to be converted

  • input_units (str, optional) – Units of input data array, defaults to None

  • output_units (str, optional) – Units of output data array, defaults to None

Returns:

numpy.ma.MaskedArray with units converted from ‘input_units’ to ‘output_units’.

Return type:

MaskedArray

geoips.data_manipulations.corrections module#

Apply min/max values, normalize, and invert data arrays.

geoips.data_manipulations.corrections.apply_data_range(data, min_val=None, max_val=None, min_outbounds='crop', max_outbounds='crop', norm=True, inverse=False)[source]#

Apply minimum and maximum values to an array of data.

Normalize, invert, and handle out of bounds data as requested.

Parameters:
  • data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the data range will be applied.

  • min_val (float, default=None) –

    • The minimum bound to be applied to the input data as a scalar.

    • If None, use data.min().

  • max_val (float, default=None) –

    • The maximum bound to be applied to the input data as a scalar.

    • If None, use data.max().

  • min_outbounds (str, default='crop') –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to min_val

  • max_outbounds (str, default='crop') –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to max_val

  • norm (bool, default=True) –

    Boolean flag indicating whether to normalize (True) or not (False).

    • If True, returned data will be in the range from 0 to 1.

    • If False, returned data will be in the range min_val to max_val.

  • inverse (bool, default=False) –

    Boolean flag indicating whether to invert data (True) or not (False).

    • If True, returned data will be inverted

    • If False, returned data will not be inverted

Returns:

numpy.ndarray or numpy.ma.MaskedArray: the input data array with values above ‘max_val’ or below ‘min_val’ retained, cropped, or masked.

Return type:

numpy.ndarray

geoips.data_manipulations.corrections.apply_gamma(data_array, gamma)[source]#

Apply gamma correction to all values in the data array.

Gamma correction applied as: data_array ** (1.0 / float(gamma))

Parameters:
  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data array to which gamma will be applied

  • gamma (float) – gamma correction value

Returns:

numpy.ndarray, or numpy.ma.MaskedArray if data_array was a MaskedArray, with gamma correction applied as data_array ** (1.0 / float(gamma)).

Return type:

numpy.ndarray
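The formula above can be tried directly with NumPy. This is a minimal sketch and not the GeoIPS implementation: gamma_sketch is a hypothetical stand-in that only reproduces the stated expression, and the sample values are made up.

```python
import numpy as np


def gamma_sketch(data_array, gamma):
    """Hypothetical stand-in: raise each value to the power 1 / gamma."""
    return data_array ** (1.0 / float(gamma))


# Made-up sample data; the last value is masked and should stay masked.
data = np.ma.array([0.0, 0.25, 1.0, 4.0], mask=[False, False, False, True])

# A gamma of 2 is equivalent to taking the square root of each value.
corrected = gamma_sketch(data, 2.0)
```

Because the expression is a plain power operation, a MaskedArray input keeps its mask in the result.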

geoips.data_manipulations.corrections.apply_maximum_value(data, max_val, outbounds)[source]#

Apply maximum value to an array of data.

Parameters:
  • data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the maximum value will be applied.

  • max_val (float) – The maximum bound to be applied to the input data as a scalar.

  • outbounds (str) –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to max_val.

Returns:

numpy.ndarray or numpy.ma.MaskedArray: the input data array with values above ‘max_val’ retained, cropped, or masked as requested.

Return type:

numpy.ndarray
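The three out-of-bounds modes can be illustrated with plain NumPy. This is a sketch under assumptions, not the GeoIPS source: max_value_sketch is a hypothetical helper, and the handling of unknown mode strings is a choice made for this example only.

```python
import numpy as np


def max_value_sketch(data, max_val, outbounds):
    """Hypothetical stand-in for the retain / mask / crop behaviour."""
    if outbounds == "retain":
        # Leave every pixel untouched, including those above max_val.
        return data
    if outbounds == "mask":
        # Mask pixels strictly greater than max_val.
        return np.ma.masked_greater(data, max_val)
    if outbounds == "crop":
        # Replace pixels above max_val with max_val itself.
        return np.minimum(data, max_val)
    # Assumption for this sketch: reject anything else.
    raise ValueError(f"Unknown outbounds value: {outbounds!r}")


data = np.array([1.0, 5.0, 9.0])
retained = max_value_sketch(data, 6.0, "retain")
masked = max_value_sketch(data, 6.0, "mask")
cropped = max_value_sketch(data, 6.0, "crop")
```

The minimum-value case is symmetric: swap the comparison direction and use the lower bound instead.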

geoips.data_manipulations.corrections.apply_minimum_value(data, min_val, outbounds)[source]#

Apply minimum value to an array of data.

Parameters:
  • data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the minimum value will be applied.

  • min_val (float) – The minimum bound to be applied to the input data as a scalar.

  • outbounds (str) –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to min_val.

Returns:

numpy.ndarray or numpy.ma.MaskedArray: the input data array with values below ‘min_val’ retained, cropped, or masked as requested.

Return type:

numpy.ndarray

geoips.data_manipulations.corrections.apply_offset(data_array, offset)[source]#

Apply offset to all values in data_array.

Offset applied as: data_array + offset

Parameters:
  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to which offset will be applied.

  • offset (float) – requested offset.

Returns:

numpy.ndarray or numpy.ma.MaskedArray: the input data array with the offset applied as data_array + offset.

Return type:

numpy.ndarray

geoips.data_manipulations.corrections.apply_scale_factor(data_array, scale_factor)[source]#

Apply scale factor to all values in data_array.

Scale factor applied as: data_array * scale_factor

Parameters:
  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be scaled

  • scale_factor (float) – requested scale factor

Returns:

numpy.ndarray or numpy.ma.MaskedArray: the input data array with the scale factor applied as data_array * scale_factor.

Return type:

numpy.ndarray

geoips.data_manipulations.corrections.apply_solar_zenith_correction(data_array, sunzen_array)[source]#

Apply solar zenith angle correction to all values in data_array.

Solar zenith correction applied as: data / cos(sunzen)

Parameters:
  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be corrected

  • sunzen_array (numpy.ndarray or numpy.ma.MaskedArray) – solar zenith angles of the same shape as the data array.

Returns:

numpy.ndarray, or numpy.ma.MaskedArray if the original data_array was a MaskedArray, with each value in data_array divided by cos(sunzen).

Return type:

numpy.ndarray
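The correction formula can be sketched in a few lines of NumPy. This is not the GeoIPS implementation: solar_zenith_sketch is a hypothetical helper, and it assumes the zenith angles are given in degrees, which the text above does not state explicitly.

```python
import numpy as np


def solar_zenith_sketch(data_array, sunzen_array):
    """Hypothetical stand-in: divide each value by cos(solar zenith)."""
    # Assumption: angles are in degrees, so convert to radians first.
    return data_array / np.cos(np.deg2rad(sunzen_array))


# Made-up reflectance-like values with three different sun angles.
data = np.array([0.5, 0.5, 0.5])
sunzen = np.array([0.0, 60.0, 80.0])
corrected = solar_zenith_sketch(data, sunzen)
```

Since cos(sunzen) approaches zero near 90 degrees, the corrected values grow very large close to the terminator, so high zenith angles are worth masking before or after the division.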

geoips.data_manipulations.corrections.invert_data_range(data, min_val=None, max_val=None)[source]#

Invert the data range of an array of data.

Parameters:
  • data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the data range will be applied.

  • min_val (float, optional) – The minimum bound to be applied to the input data as a scalar. By default None, which results in data.min().

  • max_val (float, optional) – The maximum bound to be applied to the input data as a scalar. By default None, which results in data.max().

Returns:

numpy.ndarray or numpy.ma.MaskedArray: the input data array with values inverted.

Return type:

numpy.ndarray
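The text above does not spell out the inversion formula. One common definition, used here purely as an assumption, reflects each value about the midpoint of the range so that min_val maps to max_val and vice versa. invert_sketch is a hypothetical helper, not the GeoIPS function.

```python
import numpy as np


def invert_sketch(data, min_val=None, max_val=None):
    """Hypothetical stand-in: flip values within [min_val, max_val]."""
    if min_val is None:
        min_val = data.min()
    if max_val is None:
        max_val = data.max()
    # Assumed formula: the lowest value becomes the highest and vice versa.
    return max_val - (data - min_val)


data = np.array([0.0, 2.0, 10.0])
inverted = invert_sketch(data)
```

With the default bounds taken from the data, the smallest and largest values simply trade places.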

geoips.data_manipulations.corrections.mask_day(data_array, sunzen_array, max_zenith=90)[source]#

Mask where solar zenith angle is less than the maximum specified value.

Mask all pixels within the data array where the solar zenith angle is less than the maximum specified value.

Parameters:
  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be masked

  • sunzen_array (numpy.ndarray or numpy.ma.MaskedArray) – array of solar zenith angles, of the same shape as the data array

  • max_zenith (float, optional) – Mask all locations in data_array where sunzen_array is less than max_zenith, by default 90

Returns:

Data array with all locations corresponding to a solar zenith angle less than max_zenith masked.

Return type:

numpy.ma.MaskedArray

geoips.data_manipulations.corrections.mask_night(data_array, sunzen_array, min_zenith=90)[source]#

Mask where solar zenith angle is greater than the minimum specified value.

Mask all pixels within the data array where the solar zenith angle is greater than the minimum specified value.

Parameters:
  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be masked.

  • sunzen_array (numpy.ndarray or numpy.ma.MaskedArray) – array of solar zenith angles, same shape as the data array.

  • min_zenith (float, optional) – Mask all locations in data_array where sunzen_array is greater than min_zenith, by default 90.

Returns:

Data array with all locations corresponding to a solar zenith angle greater than min_zenith masked.

Return type:

numpy.ma.MaskedArray
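The day and night masking rules described for mask_day and mask_night can be sketched with numpy.ma.masked_where. The helpers below are hypothetical stand-ins that follow the strict "less than" and "greater than" wording of the descriptions; the real functions may differ in detail.

```python
import numpy as np


def mask_day_sketch(data_array, sunzen_array, max_zenith=90):
    """Hypothetical stand-in: mask pixels where zenith < max_zenith."""
    return np.ma.masked_where(sunzen_array < max_zenith, data_array)


def mask_night_sketch(data_array, sunzen_array, min_zenith=90):
    """Hypothetical stand-in: mask pixels where zenith > min_zenith."""
    return np.ma.masked_where(sunzen_array > min_zenith, data_array)


data = np.array([1.0, 2.0, 3.0])
sunzen = np.array([30.0, 90.0, 120.0])  # day, terminator, night

night_only = mask_day_sketch(data, sunzen)    # daytime pixel masked
day_only = mask_night_sketch(data, sunzen)    # nighttime pixel masked
```

With strict comparisons and the default threshold of 90, a pixel at exactly 90 degrees survives both masks.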

geoips.data_manipulations.corrections.normalize(data, min_val=None, max_val=None, min_bounds='crop', max_bounds='crop')[source]#

Normalize data array with min_val and max_val to range 0 to 1.

Default to cropping outside requested data range.

Parameters:
  • data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the data range will be applied.

  • min_val (float, default=None) –

    • The minimum bound to be applied to the input data as a scalar.

    • If None, use data.min().

  • max_val (float, default=None) –

    • The maximum bound to be applied to the input data as a scalar.

    • If None, use data.max().

  • min_bounds (str, default='crop') –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to min_val

  • max_bounds (str, default='crop') –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to max_val

Returns:

numpy.ndarray or numpy.ma.MaskedArray: the input data array normalized between 0 and 1, with values above ‘max_val’ or below ‘min_val’ retained, cropped, or masked.

Return type:

numpy.ndarray
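The default behaviour (crop on both sides, then scale to the range 0 to 1) can be sketched as follows. normalize_sketch is a hypothetical helper that covers only the 'crop' case and assumes a simple linear rescaling; the values are made up.

```python
import numpy as np


def normalize_sketch(data, min_val=None, max_val=None):
    """Hypothetical stand-in: crop to [min_val, max_val], scale to 0..1."""
    if min_val is None:
        min_val = data.min()
    if max_val is None:
        max_val = data.max()
    # 'crop' on both sides: clamp out-of-range values to the bounds.
    cropped = np.clip(data, min_val, max_val)
    # Assumed linear rescaling so min_val maps to 0 and max_val to 1.
    return (cropped - min_val) / (max_val - min_val)


data = np.array([180.0, 200.0, 250.0, 300.0, 320.0])
normalized = normalize_sketch(data, min_val=200.0, max_val=300.0)
```

Values outside the requested range end up at exactly 0 or 1, which is what makes cropping a convenient default before applying a colormap.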

geoips.data_manipulations.info module#

Introspection functions on data arrays.

geoips.data_manipulations.info.percent_not_nan(data_array)[source]#

Determine percent of a numpy.ndarray that is not NaN values.

Parameters:

data_array (numpy.ndarray) – Final processed array from which to determine coverage, invalid values specified by “numpy.nan”.

Returns:

percent of input data array that is not numpy.nan.

Return type:

float
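A coverage check of this kind can be sketched with numpy.isnan. percent_not_nan_sketch is a hypothetical helper, and it assumes the result is expressed on a 0 to 100 scale, which the description above implies but does not state.

```python
import numpy as np


def percent_not_nan_sketch(data_array):
    """Hypothetical stand-in: percent (0-100) of elements that are not NaN."""
    valid = np.count_nonzero(~np.isnan(data_array))
    return 100.0 * valid / data_array.size


data = np.array([1.0, np.nan, 3.0, np.nan])
coverage = percent_not_nan_sketch(data)
```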

geoips.data_manipulations.info.percent_unmasked(data_array)[source]#

Determine percent of a numpy.ma.MaskedArray that is not masked.

Parameters:

data_array (numpy.ma.MaskedArray) – Final processed array from which to determine coverage

Returns:

percent of input data array that is not masked.

Return type:

float
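For masked arrays the same idea can be sketched with MaskedArray.count(), which returns the number of unmasked elements. percent_unmasked_sketch is a hypothetical helper, and the 0 to 100 scale is an assumption.

```python
import numpy as np


def percent_unmasked_sketch(data_array):
    """Hypothetical stand-in: percent (0-100) of unmasked elements."""
    return 100.0 * data_array.count() / data_array.size


data = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[True, False, False, False])
coverage = percent_unmasked_sketch(data)
```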

geoips.data_manipulations.merge module#

Utilities for merging granules into a single data array.

These utilities can merge data from potentially different sources, spanning a variety of sensors and platforms, into a single final dataset.

geoips.data_manipulations.merge.daterange(start_date, end_date)[source]#

Check one day at a time.

If end_date - start_date is between 1 and 2 days, the difference in whole days is 1, and range(1) yields only 0. So 2 is added to the number of days when setting the range.
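The reasoning behind the extra 2 is easier to see with a concrete interval. The generator below is a sketch reconstructed from the description only, not the GeoIPS source; daterange_sketch and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def daterange_sketch(start_date, end_date):
    """Hypothetical stand-in: yield one datetime per day, padded by 2."""
    # timedelta.days truncates, so 1 day 2 hours gives days == 1.
    n_days = (end_date - start_date).days
    for offset in range(n_days + 2):
        yield start_date + timedelta(days=offset)


# The interval is just over one day long but touches three calendar days.
start = datetime(2024, 1, 1, 23, 0)
end = datetime(2024, 1, 3, 1, 0)
days = list(daterange_sketch(start, end))
```

Here the whole-day difference is 1, yet files could live under three different date directories, so three iterations are needed. For intervals that do not straddle midnight this way, the padding may yield a day past end_date.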

geoips.data_manipulations.merge.find_datafiles_in_range(sector_name, platform_name, source_name, min_time, max_time, basedir, product_name, every_min=True, verbose=False, time_format='%H%M', actual_datetime=None, single_match=False)[source]#

Find datafiles from a specified set of parameters.

Parameters:
  • sector_name (str) – Sector of interest

  • platform_name (str) – platform of interest

  • source_name (str) – Source of interest

  • min_time (datetime.datetime) – Minimum time to search

  • max_time (datetime.datetime) – Maximum time to search

  • basedir (str) – Base directory to search

  • product_name (str) – Product of interest

  • every_min (bool, optional) – Check every minute, by default True

  • verbose (bool, optional) – Print a lot of log output during the search, by default False

  • time_format (str, optional) – Format of time information in filenames, by default “%H%M”

  • actual_datetime (datetime.datetime, optional) – Actual datetime of the requested data, required if single_match is True, by default None

  • single_match (bool, optional) – Only return the closest matching file if True, else return all matching files, by default False

Returns:

List of all filenames matching the given parameters (list of length 1 if single_match is True, all matching files if single_match is False)

Return type:

list of str

geoips.data_manipulations.merge.get_matching_files(primary_sector_name, subsector_names, platforms, sources, max_time_diffs, basedir, merge_datetime, product_name, time_format='%H%M', buffer_mins=30, verbose=False, single_match=False)[source]#

Given the current set of parameters, find all matching files.

Given the current primary sector, and associated subsectors, platforms, and sources, find all matching files.

Parameters:
  • primary_sector_name (str) – The final sector that all data will be stitched into, e.g. ‘GlobalGlobal’

  • subsector_names (list of str) – List of all subsectors that will be merged into the final sector (potentially including the full primary_sector_name), e.g. [‘GlobalGlobal’, ‘GlobalAntarctic’, ‘GlobalArctic’]

  • platforms (list of str) – List of all desired platforms. platforms, sources, and max_time_diffs correspond to one another and should be the same length and in the same order.

  • sources (list of str) – List of all desired sources. platforms, sources, and max_time_diffs correspond to one another and should be the same length and in the same order.

  • max_time_diffs (list of int) – Minutes. List of allowed time diffs for given platform/source. Matches max_time_diff before the requested merge_datetime argument. platforms, sources, and max_time_diffs correspond to one another and should be the same length and in the same order.

  • basedir (str) – Base directory in which to look for the matching files.

  • merge_datetime (datetime) – Reference time for the merge. Files are matched up to max_time_diff minutes prior to merge_datetime

  • product_name (str) – product_name string found in matching files

  • time_format (str, optional) – Requested time format for filenames (strptime format string), by default ‘%H%M’

  • verbose (bool, optional) – Print a lot of log output during the search, by default False

  • single_match (bool, optional) – Only return the closest matching file if True, else return all matching files, by default False

Returns:

List of all filenames matching the given parameters (list of length 1 if single_match is True, all matching files if single_match is False)

Return type:

list of str
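The requirement that platforms, sources, and max_time_diffs line up element by element can be illustrated with zip. This sketch only derives the search window for each platform/source pair from merge_datetime; it does not call GeoIPS, and the platform and source names are placeholders rather than real identifiers. How buffer_mins enters the search is not described above, so it is left out.

```python
from datetime import datetime, timedelta

# Placeholder values: the three lists must be the same length and order.
platforms = ["platform_a", "platform_b"]
sources = ["source_a", "source_b"]
max_time_diffs = [30, 120]  # minutes allowed before merge_datetime

merge_datetime = datetime(2024, 1, 1, 12, 0)

# Map each (platform, source) pair to its (min_time, max_time) window.
windows = {}
for platform, source, max_diff in zip(platforms, sources, max_time_diffs):
    min_time = merge_datetime - timedelta(minutes=max_diff)
    windows[(platform, source)] = (min_time, merge_datetime)
```

Keeping the three lists in the same order is what lets a slowly updating source use a wider window than a frequently updating one.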

geoips.data_manipulations.merge.hourrange(start_date, end_date)[source]#

Check one hour at a time.

geoips.data_manipulations.merge.minrange(start_date, end_date)[source]#

Check one minute at a time.

Module contents#

geoips.data_manipulations init file.