sagemaker.core.remote_function.spark_config
This module defines the Spark job configuration for remote functions.
Classes
| SparkConfig | Initializes the Spark configuration for a remote function |
| SparkConfigUtils | Utility class for Spark configurations |
- class sagemaker.core.remote_function.spark_config.SparkConfig(submit_jars: List[str] | None = None, submit_py_files: List[str] | None = None, submit_files: List[str] | None = None, configuration: List[Dict] | Dict | None = None, spark_event_logs_uri: str | None = None)
Bases: object
Initializes the Spark configuration for a remote function.
- submit_jars
A list of paths to the JAR files to submit to the Spark job. Each location can be a valid S3 URI or a local path to the JAR file. Defaults to None.
- Type:
Optional[List[str]]
- submit_py_files
A list of paths to the Python files to submit to the Spark job. Each location can be a valid S3 URI or a local path to the Python file. Defaults to None.
- Type:
Optional[List[str]]
- submit_files
A list of paths to the files to submit to the Spark job. Each location can be a valid S3 URI or a local path to the file. Defaults to None.
- Type:
Optional[List[str]]
- configuration
Configuration for Hadoop, Spark, or Hive, given as a list or dictionary of EMR-style classifications. See https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html
- Type:
list[dict] or dict
- spark_event_logs_uri
S3 path where Spark application events will be published.
- Type:
str
- configuration: List[Dict] | Dict | None
- spark_event_logs_uri: str | None
- submit_files: List[str] | None
- submit_jars: List[str] | None
- submit_py_files: List[str] | None
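The constructor arguments above can be combined as in the following minimal sketch. The bucket name, file paths, and Spark property values are placeholders chosen for illustration and do not come from this reference. The import is guarded so the snippet still runs where the SDK is not installed; in that case only the keyword arguments are built.

```python
# Sketch only: bucket names, paths, and property values are placeholders.
try:
    from sagemaker.core.remote_function.spark_config import SparkConfig
except ImportError:
    # SDK not installed; the keyword arguments below can still be inspected.
    SparkConfig = None

# One EMR-style classification that overrides a spark-defaults property.
emr_configuration = [
    {
        "Classification": "spark-defaults",
        "Properties": {"spark.executor.memory": "4g"},
    }
]

# Keyword names match the SparkConfig signature documented above.
spark_config_kwargs = {
    "submit_jars": ["s3://example-bucket/jars/custom-udfs.jar"],
    "submit_py_files": ["./helpers.py"],
    "submit_files": ["./lookup.csv"],
    "configuration": emr_configuration,
    "spark_event_logs_uri": "s3://example-bucket/spark-events/",
}

# Build the config object only when the class could be imported.
spark_config = (
    SparkConfig(**spark_config_kwargs) if SparkConfig is not None else None
)
```

Every argument is optional and defaults to None, so a configuration that only redirects event logs could pass `spark_event_logs_uri` alone.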
- class sagemaker.core.remote_function.spark_config.SparkConfigUtils
Bases: object
Utility class for Spark configurations.
- static validate_configuration(configuration: Dict)
Validates the user-provided Hadoop/Spark/Hive configuration.
This ensures that the list or dictionary the user provides will serialize to JSON matching the schema of EMR’s application configuration.
- Parameters:
configuration (Dict) – A dict that contains the configuration overrides to the default values. For more information, see https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html
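To make the expected shape concrete, the sketch below checks a configuration against the general layout of EMR application configuration objects: a "Classification" string, a "Properties" mapping, and an optional nested "Configurations" list. The helper `check_emr_configuration` is hypothetical and written only for illustration. It is not the implementation of `validate_configuration`, and it assumes that both "Classification" and "Properties" are required on every entry.

```python
from typing import Any, Dict, List, Union

EmrConfiguration = Union[List[Dict[str, Any]], Dict[str, Any]]


def check_emr_configuration(configuration: EmrConfiguration) -> None:
    """Hypothetical shape check; not the library's validate_configuration."""
    # Accept either a single classification dict or a list of them.
    entries = configuration if isinstance(configuration, list) else [configuration]
    for entry in entries:
        if not isinstance(entry, dict):
            raise ValueError(f"Expected a dict, got {type(entry).__name__}")
        # Assumption: every entry names the config file it overrides.
        if not isinstance(entry.get("Classification"), str):
            raise ValueError("Each entry needs a string 'Classification'")
        # Assumption: every entry carries a mapping of property overrides.
        if not isinstance(entry.get("Properties"), dict):
            raise ValueError("Each entry needs a dict 'Properties'")
        # Nested classifications, when present, follow the same shape.
        nested = entry.get("Configurations")
        if nested is not None:
            check_emr_configuration(nested)


valid_configuration = {
    "Classification": "spark-env",
    "Properties": {},
    "Configurations": [
        {"Classification": "export", "Properties": {"EXAMPLE_VAR": "1"}}
    ],
}
check_emr_configuration(valid_configuration)  # passes silently
```

Running a local check of this kind before submitting a job surfaces a malformed classification early, although the library's own validation remains authoritative.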