geoips.data_manipulations package#


geoips.data_manipulations.conversions module#

Routines for converting between units.

geoips.data_manipulations.conversions.unit_conversion(data_array, input_units=None, output_units=None)[source]#

Convert array in units ‘input_units’ to units ‘output_units’.

  • data_array (ndarray) – numpy.ndarray or numpy.ma.MaskedArray of data values to be converted

  • input_units (str, optional) – Units of input data array, defaults to None

  • output_units (str, optional) – Units of output data array, defaults to None


Data array with units converted from ‘input_units’ to ‘output_units’.

Return type:

numpy.ndarray or numpy.ma.MaskedArray


geoips.data_manipulations.corrections module#

Apply min/max values, normalize, and invert data arrays.

geoips.data_manipulations.corrections.apply_data_range(data, min_val=None, max_val=None, min_outbounds='crop', max_outbounds='crop', norm=True, inverse=False)[source]#

Apply minimum and maximum values to an array of data.

Normalize, invert, and handle out of bounds data as requested.

  • data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the data range will be applied.

  • min_val (float, default=None) –

    • The minimum bound to be applied to the input data as a scalar.

    • If None, use data.min().

  • max_val (float, default=None) –

    • The maximum bound to be applied to the input data as a scalar.

    • If None, use data.max().

  • min_outbounds (str, default='crop') –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to min_val

  • max_outbounds (str, default='crop') –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to max_val

  • norm (bool, default=True) –

    Boolean flag indicating whether to normalize (True) or not (False).

    • If True, returned data will be in the range from 0 to 1.

    • If False, returned data will be in the range min_val to max_val.

  • inverse (bool, default=False) –

    Boolean flag indicating whether to invert data (True) or not (False).

    • If True, returned data will be inverted

    • If False, returned data will not be inverted


Input data array with values above ‘max_val’ or below ‘min_val’ retained, cropped, or masked.

Return type:

numpy.ndarray or numpy.ma.MaskedArray
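
The interaction of cropping, normalizing, and inverting described above can be sketched with plain NumPy. This is a minimal illustration of the default 'crop' behavior only, not the GeoIPS implementation: the function name sketch_apply_data_range is hypothetical, the order of operations is an assumption, and the un-normalized inversion formula (reflecting values within the range) is also an assumption.

```python
import numpy as np


def sketch_apply_data_range(data, min_val=None, max_val=None, norm=True, inverse=False):
    """Illustrative stand-in for the 'crop' path only (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    # Fall back to the data's own extremes when no bound is given.
    min_val = data.min() if min_val is None else min_val
    max_val = data.max() if max_val is None else max_val
    # 'crop': out-of-range values are set to the nearest bound.
    out = np.clip(data, min_val, max_val)
    if norm:
        # Rescale so min_val maps to 0 and max_val maps to 1.
        out = (out - min_val) / (max_val - min_val)
        if inverse:
            out = 1.0 - out
    elif inverse:
        # Assumed inversion for un-normalized data: reflect within the range.
        out = max_val - (out - min_val)
    return out


temps = np.array([180.0, 200.0, 250.0, 300.0, 320.0])
scaled = sketch_apply_data_range(temps, min_val=200.0, max_val=300.0)
inverted = sketch_apply_data_range(temps, min_val=200.0, max_val=300.0, inverse=True)
print(scaled)    # values cropped to [200, 300], then mapped onto [0, 1]
print(inverted)  # same values, with 0 and 1 swapped
```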


geoips.data_manipulations.corrections.apply_gamma(data_array, gamma)[source]#

Apply gamma correction to all values in the data array.

Gamma correction applied as: data_array ** (1.0 / float(gamma))

  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data array to which gamma will be applied

  • gamma (float) – gamma correction value


Data array with gamma correction applied as data_array ** (1.0 / float(gamma)). A MaskedArray is returned if data_array was a MaskedArray.

Return type:

numpy.ndarray or numpy.ma.MaskedArray
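
The gamma formula above can be tried directly on a masked array to see that masked pixels stay masked. The function name sketch_apply_gamma and the sample values are hypothetical; only the expression data_array ** (1.0 / float(gamma)) comes from the description.

```python
import numpy as np


def sketch_apply_gamma(data_array, gamma):
    # Same expression as in the description above.
    return data_array ** (1.0 / float(gamma))


# Hypothetical reflectances; the NaN pixel is masked before correction.
reflectance = np.ma.masked_invalid(np.array([0.0, 0.25, 1.0, np.nan]))
corrected = sketch_apply_gamma(reflectance, 2.0)
print(corrected)  # a gamma of 2 is a square root; the masked pixel stays masked
```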


geoips.data_manipulations.corrections.apply_maximum_value(data, max_val, outbounds)[source]#

Apply maximum value to an array of data.

  • data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the maximum value will be applied.

  • max_val (float) – The maximum bound to be applied to the input data as a scalar.

  • outbounds (str) –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to max_val.


Input data array with values above ‘max_val’ retained, cropped, or masked appropriately.

Return type:

numpy.ndarray or numpy.ma.MaskedArray
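
The three outbounds methods can be illustrated with standard NumPy calls. This is a sketch under the assumption that 'mask' and 'crop' behave like numpy.ma.masked_greater and numpy.minimum; the function name sketch_apply_maximum_value is hypothetical and the GeoIPS internals may differ.

```python
import numpy as np


def sketch_apply_maximum_value(data, max_val, outbounds):
    """Hypothetical stand-in showing the three documented outbounds methods."""
    if outbounds == "retain":
        # Keep all pixels as is.
        return data
    if outbounds == "mask":
        # Mask every pixel strictly above max_val.
        return np.ma.masked_greater(data, max_val)
    if outbounds == "crop":
        # Set every pixel above max_val to max_val.
        return np.minimum(data, max_val)
    raise ValueError(f"Unknown outbounds method: {outbounds!r}")


values = np.array([1.0, 5.0, 9.0])
retained = sketch_apply_maximum_value(values, 6.0, "retain")
masked = sketch_apply_maximum_value(values, 6.0, "mask")
cropped = sketch_apply_maximum_value(values, 6.0, "crop")
```

The minimum-value case is symmetric, with masked_less and numpy.maximum taking the place of the calls above.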


geoips.data_manipulations.corrections.apply_minimum_value(data, min_val, outbounds)[source]#

Apply minimum value to an array of data.

  • data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the minimum value will be applied.

  • min_val (float) – The minimum bound to be applied to the input data as a scalar.

  • outbounds (str) –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to min_val.


Input data array with values below ‘min_val’ retained, cropped, or masked appropriately.

Return type:

numpy.ndarray or numpy.ma.MaskedArray


geoips.data_manipulations.corrections.apply_offset(data_array, offset)[source]#

Apply offset to all values in data_array.

Offset applied as: data_array + offset

  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to which offset will be applied.

  • offset (float) – requested offset.


Input data array with offset applied as data_array + offset.

Return type:

numpy.ndarray or numpy.ma.MaskedArray


geoips.data_manipulations.corrections.apply_scale_factor(data_array, scale_factor)[source]#

Apply scale factor to all values in data_array.

Scale factor applied as: data_array * scale_factor

  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be scaled

  • scale_factor (float) – requested scale factor


Input data array with scale factor applied as data_array * scale_factor.

Return type:

numpy.ndarray or numpy.ma.MaskedArray
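
Scale factors and offsets are commonly used together to unpack integer counts into physical values. The sketch below applies the two documented expressions in sequence; the counts, the scale factor, and the offset are made-up values for illustration, and the order (scale first, then offset) is an assumption rather than something the functions above prescribe.

```python
import numpy as np

# Hypothetical packed integer counts and unpacking constants.
counts = np.array([0, 100, 200], dtype=np.int16)
scale_factor = 0.01
offset = 273.15

scaled = counts * scale_factor  # data_array * scale_factor
shifted = scaled + offset       # data_array + offset

# Multiplying an integer array by a Python float promotes it to floating point.
print(scaled.dtype, shifted)
```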


geoips.data_manipulations.corrections.apply_solar_zenith_correction(data_array, sunzen_array)[source]#

Apply solar zenith angle correction to all values in data_array.

Solar zenith correction applied as: data / cos(sunzen)

  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be corrected

  • sunzen_array (numpy.ndarray or numpy.ma.MaskedArray) – solar zenith angles of the same shape as the data array.


Data array with each value divided by cos(sunzen). A MaskedArray is returned if the original data_array was a MaskedArray.

Return type:

numpy.ndarray or numpy.ma.MaskedArray
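
A short numeric example shows how strongly the correction grows as the sun approaches the horizon. The sketch assumes the zenith angles are given in degrees, which the description above does not state; the function name sketch_solar_zenith_correction and the sample values are hypothetical.

```python
import numpy as np


def sketch_solar_zenith_correction(data_array, sunzen_array):
    # Assumption: sunzen_array is in degrees, so convert before taking the cosine.
    return data_array / np.cos(np.deg2rad(sunzen_array))


reflectance = np.array([0.5, 0.5, 0.5])
sunzen = np.array([0.0, 60.0, 80.0])
corrected = sketch_solar_zenith_correction(reflectance, sunzen)
print(corrected)  # unchanged at 0 degrees, doubled at 60 degrees, larger still at 80
```

Because cos(sunzen) approaches zero near 90 degrees, values close to the terminator grow without bound, which is one reason to mask such pixels before or after the correction.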


geoips.data_manipulations.corrections.invert_data_range(data, min_val=None, max_val=None)[source]#

Invert the data range of an array of data.

  • data (numpy.ndarray or numpy.ma.MaskedArray) – data values whose range will be inverted.

  • min_val (float, optional) – The minimum bound to be applied to the input data as a scalar, by default None, which results in data.min().

  • max_val (float, optional) – The maximum bound to be applied to the input data as a scalar, by default None, which results in data.max().


Input data array with values inverted.

Return type:

numpy.ndarray or numpy.ma.MaskedArray


geoips.data_manipulations.corrections.mask_day(data_array, sunzen_array, max_zenith=90)[source]#

Mask where solar zenith angle is less than the maximum specified value.

Mask all pixels within the data array where the solar zenith angle is less than the maximum specified value.

  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be masked

  • sunzen_array (numpy.ndarray) – numpy.ndarray or numpy.ma.MaskedArray of solar zenith angles, of the same shape as the data array

  • max_zenith (float, optional) – Mask all locations in data_array where sunzen_array is less than max_zenith, by default 90


Data array with all locations corresponding to a solar zenith angle less than max_zenith masked.

Return type:

numpy.ma.MaskedArray

geoips.data_manipulations.corrections.mask_night(data_array, sunzen_array, min_zenith=90)[source]#

Mask where solar zenith angle greater than the minimum specified value.

Mask all pixels within the data array where the solar zenith angle is greater than the minimum specified value.

  • data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be masked.

  • sunzen_array (numpy.ndarray or numpy.ma.MaskedArray) – array of solar zenith angles, same shape as the data array.

  • min_zenith (float, optional) – Mask all locations in data_array where sunzen_array is greater than min_zenith, by default 90.


Data array with all locations corresponding to a solar zenith angle greater than min_zenith masked.

Return type:

numpy.ma.MaskedArray
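
The day and night masks described above are complementary, and both can be sketched with numpy.ma.masked_where. The function names below are hypothetical stand-ins, and the strict comparisons follow the wording "less than" and "greater than" in the descriptions; the actual GeoIPS code may differ.

```python
import numpy as np


def sketch_mask_day(data_array, sunzen_array, max_zenith=90):
    # Mask daytime pixels: solar zenith angle strictly less than max_zenith.
    return np.ma.masked_where(sunzen_array < max_zenith, data_array)


def sketch_mask_night(data_array, sunzen_array, min_zenith=90):
    # Mask nighttime pixels: solar zenith angle strictly greater than min_zenith.
    return np.ma.masked_where(sunzen_array > min_zenith, data_array)


data = np.array([10.0, 20.0, 30.0])
sunzen = np.array([45.0, 90.0, 120.0])
night_only = sketch_mask_day(data, sunzen)
day_only = sketch_mask_night(data, sunzen)
print(night_only)  # the 45 degree pixel is masked
print(day_only)    # the 120 degree pixel is masked
```

With strict comparisons and the default thresholds, a pixel at exactly 90 degrees survives both masks.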

geoips.data_manipulations.corrections.normalize(data, min_val=None, max_val=None, min_bounds='crop', max_bounds='crop')[source]#

Normalize data array with min_val and max_val to range 0 to 1.

Default to cropping outside requested data range.

  • data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the data range will be applied.

  • min_val (float, default=None) –

    • The minimum bound to be applied to the input data as a scalar.

    • If None, use data.min().

  • max_val (float, default=None) –

    • The maximum bound to be applied to the input data as a scalar.

    • If None, use data.max().

  • min_bounds (str, default='crop') –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to min_val

  • max_bounds (str, default='crop') –

    Method to use when applying bounds as a string. Valid values are:

    • retain: keep all pixels as is

    • mask: mask all pixels that are out of range.

    • crop: set all out of range values to max_val


Input data array normalized between 0 and 1, with values above ‘max_val’ or below ‘min_val’ retained, cropped, or masked.

Return type:

numpy.ndarray or numpy.ma.MaskedArray

geoips.data_manipulations.info module#

Introspection functions on data arrays.

geoips.data_manipulations.info.percent_not_nan(data_array)[source]#

Determine percent of a numpy.ndarray that is not NaN values.


  • data_array (numpy.ndarray) – Final processed array from which to determine coverage, with invalid values specified by numpy.nan.


Percent of input data array that is not numpy.nan.

Return type:

float
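
A coverage percentage of this kind can be computed by counting non-NaN elements. The sketch below is one plausible formulation; the function name sketch_percent_not_nan is hypothetical and is not taken from the GeoIPS source.

```python
import numpy as np


def sketch_percent_not_nan(data_array):
    # Count the elements that are not NaN and express them as a percentage.
    valid = np.count_nonzero(~np.isnan(data_array))
    return 100.0 * valid / data_array.size


field = np.array([[1.0, np.nan], [3.0, 4.0]])
coverage = sketch_percent_not_nan(field)
print(coverage)  # 3 of 4 values are valid, so 75.0
```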


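geoips.data_manipulations.info.percent_unmasked(data_array)[source]#

Determine percent of a numpy.ma.MaskedArray that is not masked.


  • data_array (numpy.ma.MaskedArray) – Final processed array from which to determine coverage


Percent of input data array that is not masked.

Return type:

float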
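
For masked arrays, numpy.ma.count returns the number of unmasked elements, which gives the coverage directly. As with the NaN case, this is a sketch with a hypothetical function name, not the GeoIPS code.

```python
import numpy as np


def sketch_percent_unmasked(data_array):
    # numpy.ma.count returns the number of unmasked elements.
    return 100.0 * np.ma.count(data_array) / data_array.size


# Hypothetical field where negative values are treated as invalid.
field = np.ma.masked_less(np.array([5.0, -1.0, 7.0, -2.0, 9.0]), 0.0)
coverage = sketch_percent_unmasked(field)
print(coverage)  # 3 of 5 values are unmasked, so 60.0
```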


geoips.data_manipulations.merge module#

Utilities for merging granules into a single data array.

These utilities can apply to potentially different data sources - spanning a variety of sensors and platforms into a single final dataset.

geoips.data_manipulations.merge.daterange(start_date, end_date)[source]#

Check one day at a time.

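If end_date - start_date is between 1 and 2 days, the days attribute of the difference is 1, and range(1) yields only 0. So 2 is added to days when setting the range, so that every calendar day in the span is checked.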
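
The padding logic can be seen with a window that spans three calendar days but is less than two full days long. This is a minimal sketch reconstructed from the description above; the function name sketch_daterange is hypothetical, and it assumes the caller only cares about the calendar date of each yielded value.

```python
from datetime import datetime, timedelta


def sketch_daterange(start_date, end_date):
    # timedelta.days drops partial days, so pad the count by 2 to make sure
    # the final calendar day is included.
    for n in range((end_date - start_date).days + 2):
        yield start_date + timedelta(days=n)


start = datetime(2024, 1, 1, 22, 0)
end = datetime(2024, 1, 3, 4, 0)  # 1 day and 6 hours later, so .days == 1
days = [d.strftime("%Y%m%d") for d in sketch_daterange(start, end)]
print(days)  # ['20240101', '20240102', '20240103']
```

When the window is an exact number of days, the padding yields one day past end_date, so a caller would typically still filter candidate files by their actual timestamps.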

geoips.data_manipulations.merge.find_datafiles_in_range(sector_name, platform_name, source_name, min_time, max_time, basedir, product_name, every_min=True, verbose=False, time_format='%H%M', actual_datetime=None, single_match=False)[source]#

Find datafiles from a specified set of parameters.

  • sector_name (str) – Sector of interest

  • platform_name (str) – platform of interest

  • source_name (str) – Source of interest

  • min_time (datetime.datetime) – Minimum time to search

  • max_time (datetime.datetime) – Maximum time to search

  • basedir (str) – Base directory to search

  • product_name (str) – Product of interest

  • every_min (bool, optional) – Check every minute, by default True

  • verbose (bool, optional) – Print a lot of log output during the search, by default False

  • time_format (str, optional) – Format of time information in filenames, by default “%H%M”

  • actual_datetime (datetime.datetime, optional) – Actual datetime of the requested data, required if single_match is True, by default None

  • single_match (bool, optional) – Only return the closest matching file if True, else return all matching files, by default False


List of all filenames matching the given parameters (list of length 1 if single_match is True, all matching files if single_match is False)

Return type:

list of str

geoips.data_manipulations.merge.get_matching_files(primary_sector_name, subsector_names, platforms, sources, max_time_diffs, basedir, merge_datetime, product_name, time_format='%H%M', buffer_mins=30, verbose=False, single_match=False)[source]#

Given the current set of parameters, find all matching files.

Given the current primary sector, and associated subsectors, platforms, and sources, find all matching files.

  • primary_sector_name (str) – The final sector that all data will be stitched into, e.g. ‘GlobalGlobal’

  • subsector_names (list of str) – List of all subsectors that will be merged into the final sector (potentially including the full primary_sector_name), e.g. [‘GlobalGlobal’, ‘GlobalAntarctic’, ‘GlobalArctic’]

  • platforms (list of str) – List of all desired platforms. platforms, sources, and max_time_diffs correspond to one another and should be the same length and in the same order.

  • sources (list of str) – List of all desired sources. platforms, sources, and max_time_diffs correspond to one another and should be the same length and in the same order.

  • max_time_diffs (list of int) – Minutes. List of allowed time diffs for given platform/source. Matches max_time_diff before the requested merge_datetime argument. platforms, sources, and max_time_diffs correspond to one another and should be the same length and in the same order.

  • basedir (str) – Base directory in which to look for the matching files.

  • merge_datetime (datetime.datetime) – Attempt matching max_time_diff prior to merge_datetime

  • product_name (str) – product_name string found in matching files

  • time_format (str, optional) – Requested time format for filenames (strptime format string), by default ‘%H%M’

  • verbose (bool, optional) – Print a lot of log output during the search, by default False

  • single_match (bool, optional) – Only return the closest matching file if True, else return all matching files, by default False


List of all filenames matching the given parameters (list of length 1 if single_match is True, all matching files if single_match is False)

Return type:

list of str
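
Because platforms, sources, and max_time_diffs are parallel lists, it can help to check their lengths and pair them up before calling the function. The sketch below only shows that pairing and the resulting search window ending at merge_datetime; the platform and source names are hypothetical examples, the windows dictionary is illustrative, and buffer_mins is not modeled.

```python
from datetime import datetime, timedelta

# Hypothetical inputs: entry i of each list describes the same data stream.
platforms = ["goes-16", "himawari-8"]
sources = ["abi", "ahi"]
max_time_diffs = [30, 60]  # minutes before merge_datetime
merge_datetime = datetime(2024, 1, 1, 12, 0)

if not (len(platforms) == len(sources) == len(max_time_diffs)):
    raise ValueError("platforms, sources, and max_time_diffs must be the same length")

# Each stream is searched from (merge_datetime - max_time_diff) up to merge_datetime.
windows = {}
for platform, source, max_diff in zip(platforms, sources, max_time_diffs):
    min_time = merge_datetime - timedelta(minutes=max_diff)
    windows[(platform, source)] = (min_time, merge_datetime)

for key, (min_time, max_time) in windows.items():
    print(key, min_time.strftime("%H%M"), max_time.strftime("%H%M"))
```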

geoips.data_manipulations.merge.hourrange(start_date, end_date)[source]#

Check one hour at a time.

geoips.data_manipulations.merge.minrange(start_date, end_date)[source]#

Check one minute at a time.

Module contents#

geoips.data_manipulations init file.