geoips.plugins.modules.procflows package#
Submodules#
geoips.plugins.modules.procflows.config_based module#
Processing workflow for config-based processing.
- geoips.plugins.modules.procflows.config_based.call(fnames, command_line_args=None)[source]#
Workflow for efficiently running all required outputs.
Includes all sectors and products specified in a YAML output config file.
- Parameters:
fnames (list) – List of strings specifying full paths to input file names to process
command_line_args (dict) – dictionary of command line arguments
- Returns:
0 for successful completion, non-zero for error (incorrect comparison, or failed run)
- Return type:
int
- geoips.plugins.modules.procflows.config_based.get_area_def_list_from_dict(area_defs)[source]#
Get a list of actual area_defs from full dictionary.
The input dictionary is the one returned from get_area_defs_from_available_sectors.
- geoips.plugins.modules.procflows.config_based.get_area_defs_from_available_sectors(available_sectors_dict, command_line_args, xobjs, variables)[source]#
Get all required area_defs for the given set of parameters.
Area definitions are determined from the YAML config parameters (config_dict), command_line_args, xobjs, and the required variables. Command line args override config specifications.
- Parameters:
available_sectors_dict (dict) – Dictionary of all requested sector_types (specified in YAML config)
command_line_args (dict) – Dictionary of command line arguments - any command line argument that is also a key in available_sectors_dict[<sector_type>] will replace the value in the available_sectors_dict[<sector_type>]
xobjs (dict) – Dictionary of xarray datasets, used in determining start/end time of data files for identifying dynamic sectors
variables (list) – List of required variables, for determining center coverage for TCs
- Returns:
Dictionary of required area_defs, with area_def.name as the dictionary keys. Based on YAML config-specified available_sectors, and command line args
- Return type:
dict
Notes
Each area_def.name key has one or more “sector_types” associated with it.
Each sector_type dictionary contains the actual “requested_sector_dict” from the YAML config, and the actual AreaDefinition object that was returned.
area_defs[area_def.name][sector_type]['requested_sector_dict']
area_defs[area_def.name][sector_type]['area_def']
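The nested layout described in the notes above can be sketched with plain dictionaries. This is an illustration only: `FakeAreaDef`, the sector names, the contents of each `requested_sector_dict`, and `flatten_area_defs` are hypothetical stand-ins, and the real `get_area_def_list_from_dict` may differ (for example in ordering or de-duplication).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeAreaDef:
    """Hypothetical stand-in for a pyresample AreaDefinition."""

    name: str


def flatten_area_defs(area_defs):
    """Collect every AreaDefinition-like object from the nested dictionary."""
    flat = []
    for sector_types in area_defs.values():
        for sector_info in sector_types.values():
            flat.append(sector_info["area_def"])
    return flat


# Layout: area_defs[area_def.name][sector_type] holds two keys,
# "requested_sector_dict" and "area_def". All values below are made up.
area_defs = {
    "global": {
        "static_global": {
            "requested_sector_dict": {"example_key": "example_value"},
            "area_def": FakeAreaDef("global"),
        },
    },
    "tc_example": {
        "tc_dynamic": {
            "requested_sector_dict": {"example_key": "example_value"},
            "area_def": FakeAreaDef("tc_example"),
        },
    },
}

area_def_list = flatten_area_defs(area_defs)
print([area_def.name for area_def in area_def_list])
```

Keeping the `requested_sector_dict` next to each `area_def` means the original YAML request remains available wherever the area definition is used.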
- geoips.plugins.modules.procflows.config_based.get_bg_xarray(sect_xarrays, area_def, prod_plugin, resampled_read=False, window_start_time=None, window_end_time=None)[source]#
Get background xarray.
- Parameters:
sect_xarrays (dict of xarray.Dataset) – dictionary of xarray Datasets to pull appropriate background xarray from. This may include multiple products / variables.
area_def (pyresample.AreaDefinition) – Spatial region required in the final xarray Datasets.
prod_plugin (ProductPlugin) – GeoIPS Product Plugin obtained through interfaces.products.get_plugin(“name”).
resampled_read (bool, default=False) – Specify whether a resampled read is required, needed for datatypes that will be read within “get_alg_xarray”
window_start_time (datetime.datetime, default=None) – If specified, sector temporally between window_start_time and window_end_time. hours_before_sector_time and hours_after_sector_time are ignored if window start/end time are set!
window_end_time (datetime.datetime, default=None) – If specified, sector temporally between window_start_time and window_end_time. hours_before_sector_time and hours_after_sector_time are ignored if window start/end time are set!
- Returns:
alg_xarray – xarray Dataset containing the data needed to produce the background for overlay imagery.
- Return type:
xarray.Dataset
- geoips.plugins.modules.procflows.config_based.get_config_dict(config_yaml_file)[source]#
Populate the full config dictionary from a given YAML config file.
Includes both sector and output specifications.
- Parameters:
config_yaml_file (str) – Full path to YAML config file, containing sector and output specifications. YAML config files support environment variables in entries flagged with !ENV
- Returns:
Dictionary of both sector and output specifications, as found in config_yaml_file. The output dictionary references the “sector_types” found in the available_sectors dictionary; each output_type requests a specific “sector_type” to be used for processing.
- Return type:
dict
- geoips.plugins.modules.procflows.config_based.get_required_outputs(config_dict, sector_type)[source]#
Get only the required outputs from the current sector_type.
- geoips.plugins.modules.procflows.config_based.get_resampled_read(config_dict, area_defs, area_def_id, sector_type, reader_plugin, reader_kwargs, fnames, variables)[source]#
Return dictionary of xarray datasets for a given area def.
The xarrays are resampled to area_def.
- geoips.plugins.modules.procflows.config_based.get_sectored_read(config_dict, area_defs, area_def_id, sector_type, reader_plugin, reader_kwargs, fnames, variables)[source]#
Return dictionary of xarray datasets for a given area def.
The xarrays are sectored to area_def.
- geoips.plugins.modules.procflows.config_based.get_variables_from_available_outputs_dict(available_outputs_dict, source_name, sector_types=None)[source]#
Get required variables for all outputs for a given “source_name”.
Outputs specified within the YAML config.
- Parameters:
available_outputs_dict (dict) – Dictionary of all requested output_types (specified in YAML config)
source_name (str) – Find all required variables for the passed “source_name”
sector_types (list, default=None) – If a list of sector_type strings is passed, only include output_types that require one of the passed “sector_types”
- Returns:
List of all required variables for all output products for the given source_name
- Return type:
list
- geoips.plugins.modules.procflows.config_based.initialize_final_products(final_products, cpath)[source]#
Initialize the final_products dictionary with cpath dict key if needed.
- Parameters:
final_products (dict) – Dictionary of final products, keyed by the final required “compare_path”. Products with no compare_path specified are stored with the key “no_comparison”.
cpath (str) – Key to add to final_products dictionary
- Returns:
Return final_products dictionary, updated with current “cpath” key: final_products[cpath][‘files’] = <list_of_files_in_given_cpath>
- Return type:
dict
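The behavior described above can be sketched as follows. This is a minimal illustration, assuming only the `final_products[cpath]['files']` layout stated in the return description; the function name, the example path, and the choice of an empty list as the initial value are assumptions, and the real function may store additional keys.

```python
def initialize_final_products_sketch(final_products, cpath):
    """Ensure final_products[cpath]["files"] exists, then return the dict."""
    if cpath not in final_products:
        # Assumption: a new key starts with an empty list of files.
        final_products[cpath] = {"files": []}
    return final_products


final_products = {}
final_products = initialize_final_products_sketch(final_products, "no_comparison")
# Hypothetical output path, for illustration only.
final_products["no_comparison"]["files"].append("/hypothetical/output/product.png")
# Calling again with the same key leaves the existing file list untouched.
final_products = initialize_final_products_sketch(final_products, "no_comparison")
print(final_products)
```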
- geoips.plugins.modules.procflows.config_based.is_required_sector_type(available_outputs_dict, sector_type)[source]#
Check if current sector is required for any outputs.
Check if a given sector_type is required for any currently requested output_types
- Parameters:
available_outputs_dict (dict) – Dictionary of all requested output_types (specified in YAML config)
sector_type (str) – Determine if any output_types require the currently requested “sector_type”
- Returns:
True if any output_types require the passed “sector_type”
False if no output_types require the passed “sector_type”
- Return type:
bool
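A check of this kind could look like the sketch below. It assumes, without confirmation from this page, that each output_type entry records its sector type under a key named `requested_sector_type`; that key, the function name, and the example entries are all hypothetical.

```python
def is_required_sector_type_sketch(available_outputs_dict, sector_type):
    """Return True if any output_type requests the given sector_type."""
    return any(
        # Assumed key name; the real config layout may differ.
        output.get("requested_sector_type") == sector_type
        for output in available_outputs_dict.values()
    )


# Hypothetical output_types, for illustration only.
available_outputs = {
    "tc_imagery": {"requested_sector_type": "tc_dynamic"},
    "global_imagery": {"requested_sector_type": "static_global"},
}

print(is_required_sector_type_sketch(available_outputs, "tc_dynamic"))
print(is_required_sector_type_sketch(available_outputs, "unsectored"))
```

A check like this lets a workflow skip reading and sectoring data for sector types that no requested output needs.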
- geoips.plugins.modules.procflows.config_based.process_unsectored_data_outputs(final_products, available_outputs_dict, available_sectors_dict, xobjs, variables, command_line_args=None, write_to_product_db=False, config_dict=None)[source]#
Process unsectored data output.
Loop through all possible outputs, identifying output types that require unsectored data output. Produce all required unsectored data output, update final_products dictionary accordingly, and return final_products dictionary with the new unsectored outputs.
- Parameters:
final_products (dict) – Dictionary of final products, keyed by the final required “compare_path”. Products with no compare_path specified are stored with the key “no_comparison”.
available_outputs_dict (dict) – Dictionary of all available output product specifications
available_sectors_dict (dict) – Dictionary of available sector types - we are looking for available sectors that contain the “unsectored” keyword.
xobjs (dict) – Dictionary of xarray datasets, for use in producing unsectored output formats
variables (list) – List of strings of required variables in the given product.
- Returns:
Return final_products dictionary, updated with current “cpath” key: final_products[cpath][‘files’] = <list_of_files_in_given_cpath>
- Return type:
dict
- geoips.plugins.modules.procflows.config_based.requires_bg(available_outputs_dict, sector_type)[source]#
Check if current sector requires background imagery.
Check if a given sector_type is requested for any product_types that also require background imagery.
- Parameters:
available_outputs_dict (dict) – Dictionary of all requested output_types (specified in YAML config)
sector_type (str) – sector_type to determine if any output_types that require background imagery also request the passed sector_type
- Returns:
True if any output_types that require background imagery require the passed “sector_type”
False if no output_types require both background imagery and the passed “sector_type”
- Return type:
bool
- geoips.plugins.modules.procflows.config_based.set_comparison_path(output_dict, product_name, output_type, command_line_args=None)[source]#
Replace variables specified by <varname> in compare_path.
- Parameters:
output_dict (dict) – Dictionary of output specifications, containing key “compare_path”
product_name (str) – Current requested product name, all instances of <product> in compare_path replaced with product_name argument
output_type (str) – Current requested output type, all instances of <output> in compare_path replaced with output_type argument
- Returns:
Return a single string with the fully specified comparison path for current product
- Return type:
str
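The placeholder substitution described above can be sketched with plain string replacement. The function name and the example path are hypothetical, the `command_line_args` parameter is omitted, and the real function may handle additional placeholders or overrides not shown here.

```python
def set_comparison_path_sketch(output_dict, product_name, output_type):
    """Fill the <product> and <output> placeholders in the compare_path."""
    compare_path = output_dict["compare_path"]
    compare_path = compare_path.replace("<product>", product_name)
    compare_path = compare_path.replace("<output>", output_type)
    return compare_path


# Hypothetical compare_path template, for illustration only.
output_dict = {"compare_path": "/hypothetical/compare/<product>_<output>"}
cpath = set_comparison_path_sketch(output_dict, "Infrared", "imagery_annotated")
print(cpath)
```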
geoips.plugins.modules.procflows.order_based module#
Processing workflow for order based data source processing.
- geoips.plugins.modules.procflows.order_based.call(workflow, fnames, command_line_args=None)[source]#
Run the order based procflow (OBP).
Process the specified input data files using the OBP in the order of steps listed in the workflow definition file.
- Parameters:
workflow (str) – The name of the workflow to process.
fnames (list of str) – List of filenames from which to read data.
command_line_args (list of str, None) – Command line arguments to pass to the workflow.
geoips.plugins.modules.procflows.single_source module#
Processing workflow for single data source processing.
- geoips.plugins.modules.procflows.single_source.add_attrs_from_area_def(final_xarray, source_xarray, area_def)[source]#
Add attributes from an area_def.
- geoips.plugins.modules.procflows.single_source.add_filename_extra_field(xarray_obj, field_name, field_value)[source]#
Add filename extra field.
- geoips.plugins.modules.procflows.single_source.apply_alg_after_interp(interp_xarray, area_def, alg_plugin, alg_args, prod_plugin, variables, processed_xarrays)[source]#
Apply algorithm after interpolation.
Note (MLS): the ability to pull from processed_xarrays when the algorithm was already applied still needs to be added here.
- geoips.plugins.modules.procflows.single_source.apply_alg_first(alg_plugin, alg_args, prod_plugin, curr_sect_xarrays, sect_xarrays, variables, variable_names, area_def)[source]#
Apply algorithm appropriately based on algorithm family.
Note (MLS): some of these algorithm families use curr_sect_xarrays and some use sect_xarrays; likewise, some use variables and some use variable_names. There is probably no reason for the difference, but the original functionality is maintained for now.
- geoips.plugins.modules.procflows.single_source.apply_alg_list_numpy_to_numpy(alg_xarray, alg_plugin, alg_args, prod_plugin, sect_xarrays, variables)[source]#
Apply list_numpy_to_numpy algorithm.
- geoips.plugins.modules.procflows.single_source.apply_alg_xarray_dict_area_def_to_numpy(alg_xarray, alg_plugin, alg_args, prod_plugin, sect_xarrays, area_def)[source]#
Apply xarray_dict_area_def_to_numpy algorithm.
- geoips.plugins.modules.procflows.single_source.apply_alg_xarray_dict_to_xarray(alg_plugin, alg_args, sect_xarrays)[source]#
Apply xarray_dict_to_xarray algorithm.
- geoips.plugins.modules.procflows.single_source.apply_alg_xarray_dict_to_xarray_dict(alg_plugin, alg_args, sect_xarrays)[source]#
Apply xarray_dict_to_xarray_dict algorithm.
- geoips.plugins.modules.procflows.single_source.apply_alg_xarray_to_numpy(alg_xarray, alg_plugin, alg_args, prod_plugin, sect_xarrays, variable_names)[source]#
Apply xarray_to_numpy algorithm.
- geoips.plugins.modules.procflows.single_source.apply_alg_xarray_to_xarray(alg_plugin, alg_args, prod_plugin, sect_xarrays, variable_names)[source]#
Apply xarray_to_xarray algorithm.
- geoips.plugins.modules.procflows.single_source.apply_interp_after_alg(alg_xarray, interp_plugin, interp_args, prod_plugin, area_def, processed_xarrays)[source]#
Apply interpolation after algorithm.
- geoips.plugins.modules.procflows.single_source.apply_interp_first(variables, curr_sect_xarrays, prod_plugin, datasets_for_vars, resampled_read, area_def, processed_xarrays)[source]#
Apply interpolation first.
For product types that involve interpolation before algorithm.
- geoips.plugins.modules.procflows.single_source.call(fnames, command_line_args=None)[source]#
Workflow for running products from a single data source.
- Parameters:
fnames (list) – List of strings specifying full paths to input file names to process
command_line_args (dict) – dictionary of command line arguments
- Returns:
Return list of strings specifying full paths to output products that were produced
- Return type:
list
See also
geoips.commandline.args
Complete list of available command line args.
- geoips.plugins.modules.procflows.single_source.combine_filename_extra_fields(source_xarray, dest_xarray)[source]#
Combine filename extra fields.
- geoips.plugins.modules.procflows.single_source.get_alg_and_interp_plugins(prod_plugin)[source]#
Get algorithm and interpolator plugins from prod_plugin definition.
- geoips.plugins.modules.procflows.single_source.get_alg_xarray(sect_xarrays, area_def, prod_plugin, processed_xarrays=None, resector=True, resampled_read=False, variable_names=None, window_start_time=None, window_end_time=None)[source]#
Get alg xarray.
- Parameters:
sect_xarrays (dict of xarray.Dataset) – dictionary of xarray Datasets to apply algorithm.
area_def (pyresample.AreaDefinition) – Spatial region required in the final xarray Datasets.
prod_plugin (ProductPlugin) – GeoIPS Product Plugin obtained through interfaces.products.get_plugin(“name”).
resector (bool, default=True) – Specify whether to resector the data prior to applying the algorithm.
resampled_read (bool, default=False) – Specify whether a resampled read is required, needed for datatypes that will be read within “get_alg_xarray”
variable_names (list of str) – List of variable names within xarray Datasets to include in the final sectored xarray Datasets
window_start_time (datetime.datetime, default=None) – If specified, sector temporally between window_start_time and window_end_time.
window_end_time (datetime.datetime, default=None) – If specified, sector temporally between window_start_time and window_end_time.
- Returns:
xarray Dataset containing the final data after interpolation, algorithm, resectoring, etc., have been applied.
- Return type:
xarray.Dataset
- geoips.plugins.modules.procflows.single_source.get_area_defs_from_command_line_args(command_line_args, xobjs, variables=None, filter_time=True)[source]#
Get area def from command line args.
- geoips.plugins.modules.procflows.single_source.get_filename(filename_formatter, prod_plugin=None, alg_xarray=None, area_def=None, supported_filenamer_types=None, output_dict=None, filename_formatter_kwargs=None)[source]#
Get filename.
- geoips.plugins.modules.procflows.single_source.get_interp_plugin_from_product(prod_plugin)[source]#
Get the interpolator plugin from the product spec.
Reassign interp_plugin based on the CURRENT sect_xarray. This allows re-defining interpolation for different datasets.
- geoips.plugins.modules.procflows.single_source.get_output_filenames(fname_formats, output_dict, prod_plugin, xarray_obj=None, area_def=None, supported_filenamer_types=None)[source]#
Get output filenames.
- geoips.plugins.modules.procflows.single_source.get_unique_dataset_key(area_def, xobj)[source]#
Get a unique id for xarray dataset.
- geoips.plugins.modules.procflows.single_source.output_all_metadata(output_dict, output_fnames, metadata_fnames, xarray_obj, area_def=None)[source]#
Output all metadata.
- geoips.plugins.modules.procflows.single_source.pad_area_definition(area_def, source_name=None, force_pad=False, x_scale_factor=1.5, y_scale_factor=1.5)[source]#
Pad area definition.
- geoips.plugins.modules.procflows.single_source.perform_interpolation(interp_plugin, area_def, sect_xarray, interp_xarray, interp_args, processed_xarrays)[source]#
Perform standard interpolation.
- geoips.plugins.modules.procflows.single_source.plot_data(output_dict, alg_xarray, area_def, prod_plugin, output_kwargs, fused_xarray_dict=None, no_output=False)[source]#
Plot data.
alg_xarray is used for filename formats, etc. If included, fused_xarray_dict is used for the output format call.
- geoips.plugins.modules.procflows.single_source.process_sectored_data_output(xobjs, variables, prod_plugin, output_dict, area_def=None)[source]#
Process sectored data output.
If current product family requires a sectored dictionary of xarrays, does not apply an algorithm, and DOES require an area definition, call ‘process_xarray_dict_to_output_format’, store the result in a list, and return it.
- geoips.plugins.modules.procflows.single_source.process_xarray_dict_to_output_format(xobjs, variables, prod_plugin, output_dict, area_def=None)[source]#
Process xarray dict to output format.
- geoips.plugins.modules.procflows.single_source.remove_unsupported_kwargs(module, requested_kwargs)[source]#
Remove unsupported keyword arguments.
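One common way to filter keyword arguments is to compare them against a callable's signature. The sketch below shows that general technique with the standard library `inspect` module. It is not taken from the GeoIPS source: the function name and `example_formatter` are hypothetical, and the sketch takes a callable directly, whereas the documented function takes a module.

```python
import inspect


def remove_unsupported_kwargs_sketch(func, requested_kwargs):
    """Return only the kwargs that func's signature accepts by keyword."""
    params = inspect.signature(func).parameters
    # If func accepts **kwargs, every requested keyword is supported.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(requested_kwargs)
    by_keyword = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    supported = {name for name, p in params.items() if p.kind in by_keyword}
    return {key: val for key, val in requested_kwargs.items() if key in supported}


def example_formatter(area_def, title=None, dpi=100):
    """Hypothetical callable standing in for a plugin entry point."""
    return area_def, title, dpi


kwargs = remove_unsupported_kwargs_sketch(
    example_formatter, {"title": "IR", "dpi": 300, "unknown_option": True}
)
print(kwargs)
```

Filtering this way lets one set of requested options be passed to several callables that each support only a subset of them.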
- geoips.plugins.modules.procflows.single_source.resector_xarrays(resector, sect_xarrays, area_def, variables, window_start_time, window_end_time)[source]#
Resector xarrays if requested.
- geoips.plugins.modules.procflows.single_source.select_variables_to_interp(interp_args, area_def, source_xarray, interp_xarray, processed_xarrays)[source]#
Select interpolation variables.
- geoips.plugins.modules.procflows.single_source.use_variable_from_current_dataset(varname, key, variables, sect_xarray, interp_xarray, resampled_read, datasets_for_vars)[source]#
Use the variable from the current dataset.
If a specific dataset was requested for the current variable, and this dataset was NOT requested via a resampled_read (in which case the native datasets won’t exist, only the resampled dataset), then use the appropriately requested dataset.
- geoips.plugins.modules.procflows.single_source.verify_area_def(area_defs, check_area_def, data_start_datetime, data_end_datetime, time_range_hours=3)[source]#
Verify current area definition is the closest to the actual data time.
When looping through multiple dynamic area definitions for a full data file that temporally covers more than one dynamic area_def, there is no way of knowing which dynamic area_def has the best coverage until AFTER we have actually sectored the data to the specific area_def.
Call this utility on the current area_def (check_area_def) for the sectored data file, plus the full list of area definitions (area_defs) that cover the FULL data file.
- Returns:
True if the current area definition is NOT dynamic
True if the current area definition IS dynamic and is the closest temporally to the sectored data.
False if the current area definition is removed when filtering the list of area definitions based on the actual sectored data time.
- Return type:
bool
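The three return cases above can be sketched as below. This is an illustration under stated assumptions, not the GeoIPS implementation: `FakeSector`, its `sector_time` field (with `None` marking a static sector), and the use of the midpoint of the data time range as the reference time are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class FakeSector:
    """Hypothetical stand-in for an area definition."""

    name: str
    sector_time: Optional[datetime] = None  # None marks a static sector


def verify_area_def_sketch(
    area_defs, check_area_def, data_start, data_end, time_range_hours=3
):
    """Return True if check_area_def is static or temporally closest to the data."""
    if check_area_def.sector_time is None:
        return True
    # Assumption: compare sector times against the midpoint of the data.
    data_mid = data_start + (data_end - data_start) / 2
    window = timedelta(hours=time_range_hours)
    candidates = [
        ad
        for ad in area_defs
        if ad.sector_time is not None and abs(ad.sector_time - data_mid) <= window
    ]
    if not candidates:
        return False
    closest = min(candidates, key=lambda ad: abs(ad.sector_time - data_mid))
    return closest == check_area_def


t09 = FakeSector("storm_0900", datetime(2020, 9, 18, 9, 0))
t12 = FakeSector("storm_1200", datetime(2020, 9, 18, 12, 0))
t15 = FakeSector("storm_1500", datetime(2020, 9, 18, 15, 0))
static = FakeSector("global")
sectors = [t09, t12, t15, static]
start, end = datetime(2020, 9, 18, 12, 0), datetime(2020, 9, 18, 12, 10)

print(verify_area_def_sketch(sectors, t12, start, end))
print(verify_area_def_sketch(sectors, t15, start, end))
```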
Module contents#
GeoIPS procflow init file.