sagemaker.mlops.workflow.emr_step

Step definitions for EMR steps in a workflow.

Classes

EMRStep(name, display_name, description, ...)

EMR step for workflow.

EMRStepConfig(jar[, args, main_class, ...])

Config for a Hadoop Jar step.

class sagemaker.mlops.workflow.emr_step.EMRStep(name: str, display_name: str, description: str, cluster_id: str, step_config: EMRStepConfig, depends_on: List[str | Step] | None = None, cache_config: CacheConfig | None = None, cluster_config: Dict[str, Any] | None = None, execution_role_arn: str | None = None)

Bases: Step

EMR step for workflow.

property arguments: Dict[str, Any] | List[Dict[str, Any]]

The arguments dict that is used to call AddJobFlowSteps.

NOTE: The AddJobFlowSteps request is not quite the argument list that the workflow needs. The Name attribute in AddJobFlowSteps cannot be passed; it is set at runtime. In addition, the EMR job input and output configuration must be included.
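The difference described in the note can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the helper `to_workflow_arguments` is hypothetical, the `ClusterId` and `StepConfig` key names are assumptions, and the cluster ID and S3 path are placeholders. It shows only the removal of `Name`; the job input and output configuration mentioned in the note is not modeled.

```python
from typing import Any, Dict


def to_workflow_arguments(cluster_id: str, emr_step: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: adapt an AddJobFlowSteps-style step for a workflow.

    The "ClusterId" and "StepConfig" keys are assumptions made for this sketch.
    """
    # Drop "Name", since the note says it cannot be passed and is set at runtime.
    step_config = {key: value for key, value in emr_step.items() if key != "Name"}
    return {"ClusterId": cluster_id, "StepConfig": step_config}


# A step entry shaped like one element of "Steps" in an AddJobFlowSteps request.
add_job_flow_step = {
    "Name": "example-step",
    "HadoopJarStep": {
        "Jar": "command-runner.jar",
        "Args": ["spark-submit", "s3://example-bucket/job.py"],
    },
}

workflow_arguments = to_workflow_arguments("j-EXAMPLE", add_job_flow_step)
print(workflow_arguments)
```

The original step dict is left unmodified, so the same definition could still be used for a direct AddJobFlowSteps call.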

property properties: Dict[str, Any] | List[Dict[str, Any]]

A Properties object representing the EMR DescribeStepResponse model.

to_request() -> Dict[str, Any] | List[Dict[str, Any]]

Returns the step's request dictionary, updated with the cache configuration.

class sagemaker.mlops.workflow.emr_step.EMRStepConfig(jar, args: List[str] | None = None, main_class: str | None = None, properties: List[dict] | None = None, output_args: dict[str, str] | None = None)

Bases: object

Config for a Hadoop Jar step.

to_request() -> Dict[str, Any] | List[Dict[str, Any]]

Convert EMRStepConfig object to request dict.
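As a rough illustration of what such a request dict might look like, the sketch below builds a payload shaped like the EMR `HadoopJarStepConfig` structure (`Jar`, `Args`, `MainClass`, `Properties`). The helper `hadoop_jar_step_request` is hypothetical and not part of the library. The wrapping `HadoopJarStep` key and the omission of unset fields are assumptions, so the actual output of `to_request()` may differ. The `output_args` parameter is not modeled here.

```python
from typing import Any, Dict, List, Optional


def hadoop_jar_step_request(
    jar: str,
    args: Optional[List[str]] = None,
    main_class: Optional[str] = None,
    properties: Optional[List[Dict[str, str]]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build a dict shaped like EMR's HadoopJarStepConfig."""
    jar_config: Dict[str, Any] = {"Jar": jar}
    # Assumption: optional fields are left out when they are not set.
    if args:
        jar_config["Args"] = list(args)
    if main_class:
        jar_config["MainClass"] = main_class
    if properties:
        jar_config["Properties"] = [dict(prop) for prop in properties]
    return {"HadoopJarStep": jar_config}


example_request = hadoop_jar_step_request(
    jar="command-runner.jar",
    args=["spark-submit", "s3://example-bucket/job.py"],  # placeholder path
    properties=[{"Key": "example.key", "Value": "example-value"}],
)
print(example_request)
```

Each entry in `properties` follows the EMR key-value form (`Key` and `Value`), and `main_class` is omitted in this example because it was not supplied.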