geoips.data_manipulations package
Submodules
geoips.data_manipulations.conversions module
Routines for converting between units.
- geoips.data_manipulations.conversions.unit_conversion(data_array, input_units=None, output_units=None)
Convert array in units ‘input_units’ to units ‘output_units’.
- Parameters:
data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be converted
input_units (str, optional) – Units of input data array, defaults to None
output_units (str, optional) – Units of output data array, defaults to None
- Returns:
Data array with units converted from ‘input_units’ to ‘output_units’.
- Return type:
MaskedArray
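As a rough illustration of the calling pattern, the sketch below converts a masked array between two units while preserving the mask. The function name `convert_units`, the lookup table, and the unit strings "Kelvin" and "celsius" are assumptions made for this example; they are not taken from the geoips implementation, whose supported unit names are not listed here.

```python
import numpy as np

# Hypothetical conversion table. The unit names are illustrative only and
# are not the unit strings accepted by geoips.
CONVERSIONS = {
    ("Kelvin", "celsius"): lambda values: values - 273.15,
    ("celsius", "Kelvin"): lambda values: values + 273.15,
}


def convert_units(data_array, input_units=None, output_units=None):
    """Return a masked array converted from input_units to output_units."""
    data = np.ma.asarray(data_array)
    # With missing or identical units there is nothing to convert.
    if input_units is None or output_units is None or input_units == output_units:
        return data
    return CONVERSIONS[(input_units, output_units)](data)


# -999.0 marks missing data and is masked before conversion.
temps_kelvin = np.ma.masked_values([273.15, 300.0, -999.0], -999.0)
temps_celsius = convert_units(temps_kelvin, "Kelvin", "celsius")
```

Masked entries stay masked after the arithmetic, so missing values are not converted into misleading numbers.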
geoips.data_manipulations.corrections module
Apply min/max values, normalize, and invert data arrays.
- geoips.data_manipulations.corrections.apply_data_range(data, min_val=None, max_val=None, min_outbounds='crop', max_outbounds='crop', norm=True, inverse=False)
Apply minimum and maximum values to an array of data.
Normalize, invert, and handle out of bounds data as requested.
- Parameters:
data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the data range will be applied.
min_val (float, default=None) –
The minimum bound to be applied to the input data as a scalar.
If None, use data.min().
max_val (float, default=None) –
The maximum bound to be applied to the input data as a scalar.
If None, use data.max().
min_outbounds (str, default='crop') –
Method to use when applying bounds as a string. Valid values are:
retain: keep all pixels as is
mask: mask all pixels that are out of range.
crop: set all out of range values to min_val
max_outbounds (str, default='crop') –
Method to use when applying bounds as a string. Valid values are:
retain: keep all pixels as is
mask: mask all pixels that are out of range.
crop: set all out of range values to max_val
norm (bool, default=True) –
Boolean flag indicating whether to normalize (True) or not (False).
If True, returned data will be in the range from 0 to 1.
If False, returned data will be in the range min_val to max_val.
inverse (bool, default=False) –
Boolean flag indicating whether to invert data (True) or not (False).
If True, returned data will be inverted
If False, returned data will not be inverted
- Returns:
Input data array, as a numpy.ndarray or numpy.ma.MaskedArray, with values above ‘max_val’ or below ‘min_val’ retained, cropped, or masked.
- Return type:
numpy.ndarray
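The combined effect of the default 'crop' bounds with norm=True and inverse=True can be sketched in plain NumPy. This is a minimal illustration, not the geoips code: it assumes that inverting normalized data amounts to 1 - value, and the sample values are made up.

```python
import numpy as np

data = np.array([-5.0, 0.0, 25.0, 50.0, 120.0])
min_val, max_val = 0.0, 100.0

# 'crop' on both ends: out-of-range values are set to the nearest bound.
cropped = np.clip(data, min_val, max_val)

# norm=True: rescale linearly so min_val maps to 0 and max_val maps to 1.
normalized = (cropped - min_val) / (max_val - min_val)

# inverse=True (assumed behavior): flip the range so max_val maps to 0.
inverted = 1.0 - normalized
```

Here the two out-of-range inputs, -5 and 120, end up at the ends of the 0 to 1 range rather than outside it.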
- geoips.data_manipulations.corrections.apply_gamma(data_array, gamma)
Apply gamma correction to all values in the data array.
Gamma correction applied as: data_array ** (1.0 / float(gamma))
- Parameters:
data_array (numpy.ndarray or numpy.ma.MaskedArray) – data array to which gamma will be applied
gamma (float) – gamma correction value
- Returns:
Data array with gamma correction applied as data_array ** (1.0 / float(gamma)). The result is a numpy.ma.MaskedArray if data_array was a MaskedArray, otherwise a numpy.ndarray.
- Return type:
numpy.ndarray
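The formula above can be checked directly in NumPy. The input values are illustrative and assumed to be already normalized to the range 0 to 1.

```python
import numpy as np

normalized = np.array([0.0, 0.25, 1.0])
gamma = 2.0

# Same expression as in the description: data_array ** (1.0 / float(gamma)).
corrected = normalized ** (1.0 / float(gamma))
```

For data in the range 0 to 1, a gamma greater than 1 raises mid-range values (0.25 becomes 0.5 here) while leaving 0 and 1 unchanged.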
- geoips.data_manipulations.corrections.apply_maximum_value(data, max_val, outbounds)
Apply maximum value to an array of data.
- Parameters:
data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the maximum value will be applied.
max_val (float) – The maximum bound to be applied to the input data as a scalar.
outbounds (str) –
Method to use when applying bounds as a string. Valid values are:
retain: keep all pixels as is
mask: mask all pixels that are out of range.
crop: set all out of range values to max_val.
- Returns:
Input data array, as a numpy.ndarray or numpy.ma.MaskedArray, with values above ‘max_val’ retained, cropped, or masked appropriately.
- Return type:
numpy.ndarray
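The three outbounds options can be compared side by side with a small stand-in function. `apply_upper_bound` is a hypothetical helper written for this illustration; it mirrors the documented options but is not the geoips implementation.

```python
import numpy as np


def apply_upper_bound(data, max_val, outbounds):
    """Illustrative stand-in for an upper bound with three outbounds modes."""
    data = np.ma.asarray(data)
    if outbounds == "retain":
        # Keep all pixels as is.
        return data
    if outbounds == "mask":
        # Mask every pixel above max_val.
        return np.ma.masked_greater(data, max_val)
    if outbounds == "crop":
        # Set every pixel above max_val to max_val.
        return np.ma.where(data > max_val, max_val, data)
    raise ValueError(f"Unknown outbounds option: {outbounds}")


values = np.array([10.0, 50.0, 150.0])
retained = apply_upper_bound(values, 100.0, "retain")
masked = apply_upper_bound(values, 100.0, "mask")
cropped = apply_upper_bound(values, 100.0, "crop")
```

The same pattern applies to a lower bound by reversing the comparison.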
- geoips.data_manipulations.corrections.apply_minimum_value(data, min_val, outbounds)
Apply minimum value to an array of data.
- Parameters:
data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the minimum value will be applied.
min_val (float) – The minimum bound to be applied to the input data as a scalar.
outbounds (str) –
Method to use when applying bounds as a string. Valid values are:
retain: keep all pixels as is
mask: mask all pixels that are out of range.
crop: set all out of range values to min_val.
- Returns:
Input data array, as a numpy.ndarray or numpy.ma.MaskedArray, with values below ‘min_val’ retained, cropped, or masked appropriately.
- Return type:
numpy.ndarray
- geoips.data_manipulations.corrections.apply_offset(data_array, offset)
Apply offset to all values in data_array.
Offset applied as: data_array + offset
- Parameters:
data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to which offset will be applied.
offset (float) – requested offset.
- Returns:
Input data array, as a numpy.ndarray or numpy.ma.MaskedArray, with offset applied as data_array + offset.
- Return type:
numpy.ndarray
- geoips.data_manipulations.corrections.apply_scale_factor(data_array, scale_factor)
Apply scale factor to all values in data_array.
Scale factor applied as: data_array * scale_factor
- Parameters:
data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be scaled
scale_factor (float) – requested scale factor
- Returns:
Input data array, as a numpy.ndarray or numpy.ma.MaskedArray, with scale factor applied as data_array * scale_factor.
- Return type:
numpy.ndarray
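A scale factor and an offset are often applied together, for example when turning stored integer counts into physical values. The sketch below shows both expressions on a masked array; the counts, the fill value 65535, and the numeric factors are made up for illustration, and the order (scale first, then offset) is a choice of this example.

```python
import numpy as np

# Raw counts, with 65535 treated as a missing-data marker and masked.
counts = np.ma.masked_equal([0, 100, 200, 65535], 65535)
scale_factor = 0.01
offset = 150.0

# Scale factor applied as data_array * scale_factor.
scaled = counts * scale_factor

# Offset applied as data_array + offset.
physical = scaled + offset
```

Both operations leave the mask untouched, so the missing value stays masked in the result.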
- geoips.data_manipulations.corrections.apply_solar_zenith_correction(data_array, sunzen_array)
Apply solar zenith angle correction to all values in data_array.
Solar zenith correction applied as: data / cos(sunzen)
- Parameters:
data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be corrected
sunzen_array (numpy.ndarray or numpy.ma.MaskedArray) – solar zenith angles of the same shape as the data array.
- Returns:
Data array with each value divided by cos(sunzen). The result is a numpy.ma.MaskedArray if the original data_array was a MaskedArray, otherwise a numpy.ndarray.
- Return type:
numpy.ndarray
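The correction data / cos(sunzen) can be sketched as follows. This example assumes the solar zenith angles are given in degrees, which the description does not state, and uses made-up reflectance values.

```python
import numpy as np

reflectance = np.array([0.5, 0.5, 0.5])
sunzen_degrees = np.array([0.0, 60.0, 80.0])

# Assumption: angles are in degrees, so convert to radians for np.cos.
corrected = reflectance / np.cos(np.deg2rad(sunzen_degrees))
```

Because cos(sunzen) approaches zero as the angle nears 90 degrees, the corrected values grow rapidly close to the terminator.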
- geoips.data_manipulations.corrections.invert_data_range(data, min_val=None, max_val=None)
Invert the data range of an array of data.
- Parameters:
data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the data range will be applied.
min_val (float, optional) – The minimum bound to be applied to the input data as a scalar, by default None, which results in data.min().
max_val (float, optional) – The maximum bound to be applied to the input data as a scalar, by default None, which results in data.max().
- Returns:
Input data array, as a numpy.ndarray or numpy.ma.MaskedArray, with values inverted.
- Return type:
numpy.ndarray
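One common way to invert values within a range is to reflect them about the midpoint of min_val and max_val. The sketch below assumes that form, max_val - (data - min_val); the description above does not spell out the exact expression geoips uses.

```python
import numpy as np

data = np.array([0.0, 25.0, 100.0])
min_val, max_val = 0.0, 100.0

# Assumed inversion: min_val maps to max_val and max_val maps to min_val.
inverted = max_val - (data - min_val)
```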
- geoips.data_manipulations.corrections.mask_day(data_array, sunzen_array, max_zenith=90)
Mask where solar zenith angle is less than the maximum specified value.
Mask all pixels within the data array where the solar zenith angle is less than the maximum specified value.
- Parameters:
data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be masked
sunzen_array (numpy.ndarray or numpy.ma.MaskedArray) – array of solar zenith angles, same shape as the data array.
max_zenith (float, optional) – Mask all locations in data_array where sunzen_array is less than max_zenith, by default 90
- Returns:
Data array with all locations corresponding to a solar zenith angle less than max_zenith masked.
- Return type:
numpy.ma.MaskedArray
- geoips.data_manipulations.corrections.mask_night(data_array, sunzen_array, min_zenith=90)
Mask where solar zenith angle is greater than the minimum specified value.
Mask all pixels within the data array where the solar zenith angle is greater than the minimum specified value.
- Parameters:
data_array (numpy.ndarray or numpy.ma.MaskedArray) – data values to be masked.
sunzen_array (numpy.ndarray or numpy.ma.MaskedArray) – array of solar zenith angles, same shape as the data array.
min_zenith (float, optional) – Mask all locations in data_array where sunzen_array is greater than min_zenith, by default 90.
- Returns:
Data array with all locations corresponding to a solar zenith angle greater than min_zenith masked.
- Return type:
numpy.ma.MaskedArray
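The day and night masks described above can be sketched with numpy.ma.masked_where. This is an illustration of the documented conditions using made-up values and the default threshold of 90; it is not necessarily how geoips implements them.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])
sunzen = np.array([30.0, 85.0, 95.0, 120.0])

# Day masking: hide pixels where the solar zenith angle is below 90.
night_only = np.ma.masked_where(sunzen < 90, data)

# Night masking: hide pixels where the solar zenith angle is above 90.
day_only = np.ma.masked_where(sunzen > 90, data)
```

The two results are complementary here: each pixel is visible in exactly one of them.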
- geoips.data_manipulations.corrections.normalize(data, min_val=None, max_val=None, min_bounds='crop', max_bounds='crop')
Normalize data array with min_val and max_val to range 0 to 1.
Default to cropping outside requested data range.
- Parameters:
data (numpy.ndarray or numpy.ma.MaskedArray) – data values to which the data range will be applied.
min_val (float, default=None) –
The minimum bound to be applied to the input data as a scalar.
If None, use data.min().
max_val (float, default=None) –
The maximum bound to be applied to the input data as a scalar.
If None, use data.max().
min_bounds (str, default='crop') –
Method to use when applying bounds as a string. Valid values are:
retain: keep all pixels as is
mask: mask all pixels that are out of range.
crop: set all out of range values to min_val
max_bounds (str, default='crop') –
Method to use when applying bounds as a string. Valid values are:
retain: keep all pixels as is
mask: mask all pixels that are out of range.
crop: set all out of range values to max_val
- Returns:
Input data array, as a numpy.ndarray or numpy.ma.MaskedArray, normalized between 0 and 1, with values above ‘max_val’ or below ‘min_val’ retained, cropped, or masked.
- Return type:
numpy.ndarray
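When min_val and max_val are left as None, the bounds come from the data itself. The sketch below shows that case on a masked array using the usual linear rescaling (data - min_val) / (max_val - min_val); the formula is an assumption about the implementation, and the values are illustrative.

```python
import numpy as np

# NaN marks a missing value and is masked before taking min and max.
data = np.ma.masked_invalid([200.0, 250.0, np.nan, 300.0])

# Defaults: use the data's own minimum and maximum (masked values ignored).
min_val = data.min()
max_val = data.max()

normalized = (data - min_val) / (max_val - min_val)
```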
geoips.data_manipulations.info module
Introspection functions on data arrays.
- geoips.data_manipulations.info.percent_not_nan(data_array)
Determine percent of a numpy.ndarray that is not NaN values.
- Parameters:
data_array (numpy.ndarray) – Final processed array from which to determine coverage, invalid values specified by “numpy.nan”.
- Returns:
percent of input data array that is not numpy.nan.
- Return type:
float
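The coverage calculation can be sketched in a couple of lines of NumPy. The example assumes the result is expressed on a 0 to 100 scale, which the description implies but does not state explicitly.

```python
import numpy as np

data_array = np.array([[1.0, np.nan], [3.0, 4.0]])

# Count the entries that are not NaN and express them as a percentage.
valid_count = np.count_nonzero(~np.isnan(data_array))
percent_valid = 100.0 * valid_count / data_array.size
```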
- geoips.data_manipulations.info.percent_unmasked(data_array)
Determine percent of a numpy.ma.MaskedArray that is not masked.
- Parameters:
data_array (numpy.ma.MaskedArray) – Final processed array from which to determine coverage
- Returns:
percent of input data array that is not masked.
- Return type:
float
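For masked arrays the same idea relies on MaskedArray.count(), which returns the number of unmasked elements. As above, a 0 to 100 scale is assumed and the values are made up.

```python
import numpy as np

# Negative values are treated as invalid and masked.
data_array = np.ma.masked_less([5.0, -1.0, 7.0, -3.0, 9.0], 0.0)

# count() gives the number of unmasked entries; size includes masked ones.
percent_visible = 100.0 * data_array.count() / data_array.size
```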
geoips.data_manipulations.merge module
Utilities for merging granules into a single data array.
These utilities can merge data from potentially different sources, spanning a variety of sensors and platforms, into a single final dataset.
- geoips.data_manipulations.merge.daterange(start_date, end_date)
Check one day at a time.
If end_date - start_date is between 1 and 2 days, the difference in whole days is 1, and range(1) yields only 0. Adding 2 to the number of days makes the range cover every day in the span.
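The reasoning above can be made concrete with a small generator. `iter_days` is a hypothetical stand-in written from the description, not the geoips source; it yields one datetime per day and pads the day count by 2.

```python
from datetime import datetime, timedelta


def iter_days(start_date, end_date):
    """Yield one datetime per day from start_date, padded as described."""
    # timedelta.days drops partial days, so add 2 to cover the whole span.
    for day_offset in range((end_date - start_date).days + 2):
        yield start_date + timedelta(days=day_offset)


# A span of 1 day and 2 hours that touches three calendar days.
start = datetime(2020, 1, 1, 23, 0)
end = datetime(2020, 1, 3, 1, 0)
days = [day.strftime("%Y%m%d") for day in iter_days(start, end)]
```

Without the padding, range(1) would visit only the first day even though the span reaches into 3 January.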
- geoips.data_manipulations.merge.find_datafiles_in_range(sector_name, platform_name, source_name, min_time, max_time, basedir, product_name, every_min=True, verbose=False, time_format='%H%M', actual_datetime=None, single_match=False)
Find datafiles from a specified set of parameters.
- Parameters:
sector_name (str) – Sector of interest
platform_name (str) – platform of interest
source_name (str) – Source of interest
min_time (datetime.datetime) – Minimum time to search
max_time (datetime.datetime) – Maximum time to search
basedir (str) – Base directory to search
product_name (str) – Product of interest
every_min (bool, optional) – Check every minute, by default True
verbose (bool, optional) – Print a lot of log output during the search, by default False
time_format (str, optional) – Format of time information in filenames, by default “%H%M”
actual_datetime (datetime.datetime, optional) – Actual datetime of the requested data, required if single_match is True, by default None
single_match (bool, optional) – Only return the closest matching file if True, else return all matching files, by default False
- Returns:
List of all filenames matching the given parameters (list of length 1 if single_match is True, all matching files if single_match is False)
- Return type:
list of str
- geoips.data_manipulations.merge.get_matching_files(primary_sector_name, subsector_names, platforms, sources, max_time_diffs, basedir, merge_datetime, product_name, time_format='%H%M', buffer_mins=30, verbose=False, single_match=False)
Given the current set of parameters, find all matching files.
Given the current primary sector, and associated subsectors, platforms, and sources, find all matching files.
- Parameters:
primary_sector_name (str) – The final sector that all data will be stitched into, e.g. ‘GlobalGlobal’
subsector_names (list of str) – List of all subsectors that will be merged into the final sector (potentially including the full primary_sector_name), e.g. [‘GlobalGlobal’, ‘GlobalAntarctic’, ‘GlobalArctic’]
platforms (list of str) – List of all desired platforms. platforms, sources, and max_time_diffs correspond to one another and should be the same length and in the same order.
sources (list of str) – List of all desired sources. platforms, sources, and max_time_diffs correspond to one another and should be the same length and in the same order.
max_time_diffs (list of int) – Minutes. List of allowed time diffs for given platform/source. Matches max_time_diff before the requested merge_datetime argument. platforms, sources, and max_time_diffs correspond to one another and should be the same length and in the same order.
basedir (str) – Base directory in which to look for the matching files.
merge_datetime (datetime.datetime) – Attempt to match files within max_time_diff minutes prior to merge_datetime
product_name (str) – product_name string found in matching files
time_format (str, optional) – Requested time format for filenames (strptime format string), by default ‘%H%M’
verbose (bool, optional) – Print a lot of log output during the search, by default False
single_match (bool, optional) – Only return the closest matching file if True, else return all matching files, by default False
- Returns:
List of all filenames matching the given parameters (list of length 1 if single_match is True, all matching files if single_match is False)
- Return type:
list of str
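Because platforms, sources, and max_time_diffs are parallel lists, they can be walked together to derive one search window per platform/source pair. The sketch below only illustrates that pairing: the platform and source names are hypothetical, the window is assumed to end at merge_datetime, and buffer_mins is not modeled.

```python
from datetime import datetime, timedelta

# Hypothetical inputs; the three lists must have equal length and order.
platforms = ["platform-a", "platform-b"]
sources = ["sensor-x", "sensor-y"]
max_time_diffs = [60, 180]  # minutes
merge_datetime = datetime(2020, 1, 1, 12, 0)

search_windows = {}
for platform, source, max_diff in zip(platforms, sources, max_time_diffs):
    # Assumed window: from max_diff minutes before merge_datetime up to it.
    min_time = merge_datetime - timedelta(minutes=max_diff)
    search_windows[(platform, source)] = (min_time, merge_datetime)
```

If the lists differ in length, zip silently stops at the shortest one, so checking the lengths up front is a sensible precaution.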
Module contents
geoips.data_manipulations init file.