sagemaker.core.experiments.run

Contains the SageMaker Experiment Run class.

Functions

list_runs(experiment_name[, created_before, ...])

Return a list of Run objects matching the given criteria.

load_run([run_name, experiment_name, ...])

Load an existing run.

Classes

Run(experiment_name[, run_name, ...])

A collection of parameters, metrics, and artifacts to create an ML model.

SortByType(value)

The type of property by which to sort the list_runs results.

SortOrderType(value)

The type of order to sort the list or search results.

class sagemaker.core.experiments.run.Run(experiment_name: str, run_name: str | None = None, experiment_display_name: str | None = None, run_display_name: str | None = None, tags: List[Dict[str, str | PipelineVariable]] | Dict[str, str | PipelineVariable] | None = None, sagemaker_session: Session | None = None, artifact_bucket: str | None = None, artifact_prefix: str | None = None)

Bases: object

A collection of parameters, metrics, and artifacts to create an ML model.

close()

Persist any data saved locally.

property experiment_config: dict

Get experiment config from run attributes.

log_artifact(name: str, value: str, media_type: str | None = None, is_output: bool = True)

Record a single artifact for this run.

Overwrites any previous value recorded for the specified name.

Parameters:
  • name (str) – The name of the artifact.

  • value (str) – The value.

  • media_type (str) – The MediaType (MIME type) of the value (default: None).

  • is_output (bool) – Determines direction of association to the run. Defaults to True (output artifact). If set to False then represented as input association.

log_confusion_matrix(y_true: list | array, y_pred: list | array, title: str | None = None, is_output: bool = True)

Create and log a confusion matrix artifact.

The artifact is stored in S3 and represented as a lineage artifact with an association with the run.

You can view the artifact in the UI. If your job is created by a pipeline execution, you can view the artifact by selecting the corresponding step in the pipelines UI. See also SageMaker Pipelines.

This method requires the sklearn library.

Parameters:
  • y_true (list or array) – True labels. If labels are not binary then positive_label should be given.

  • y_pred (list or array) – Predicted labels.

  • title (str) – Title of the graph (default: None).

  • is_output (bool) – Determines direction of association to the run. Defaults to True (output artifact). If set to False then represented as input association.
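For orientation, a confusion matrix tallies how often each true label was predicted as each label. The sketch below computes those counts in plain Python. confusion_counts is a hypothetical helper introduced only for illustration; it is not part of the SDK and is not how log_confusion_matrix is implemented (that method relies on sklearn, as noted above).

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs. Hypothetical helper, not an SDK call."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))
    # Rows are true labels, columns are predicted labels.
    matrix = [[pairs[(t, p)] for p in labels] for t in labels]
    return labels, matrix


labels, matrix = confusion_counts([0, 1, 1, 0, 1], [0, 1, 0, 0, 1])
```

The same y_true and y_pred sequences are what would be passed to log_confusion_matrix on a run.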

log_file(file_path: str, name: str | None = None, media_type: str | None = None, is_output: bool | None = True, extra_args: dict | None = None)

Upload a file to S3 and store it as an input/output artifact in this run.

Parameters:
  • file_path (str) – The path of the local file to upload.

  • name (str) – The name of the artifact (default: None).

  • media_type (str) – The MediaType (MIME type) of the file. If not specified, this library will attempt to infer the media type from the file extension of file_path.

  • is_output (bool) – Determines direction of association to the run. Defaults to True (output artifact). If set to False then represented as input association.

  • extra_args (dict) – Optional extra arguments that may be passed to the upload operation. Similar to ExtraArgs parameter in S3 upload_file function. Please refer to the ExtraArgs parameter documentation here: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html#the-extraargs-parameter
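To make the media_type fallback concrete, the sketch below infers a MIME type from a file extension using the standard library's mimetypes module. Whether the SDK uses this module internally is an assumption; guess_media_type and its default value are hypothetical names chosen for illustration.

```python
import mimetypes


def guess_media_type(file_path, default="application/octet-stream"):
    """Infer a MIME type from the file extension (hypothetical helper).

    The default is an illustrative choice for files whose extension
    gives no hint; it is not documented SDK behavior.
    """
    media_type, _encoding = mimetypes.guess_type(file_path)
    return media_type or default


json_type = guess_media_type("metrics/report.json")
png_type = guess_media_type("plots/loss.png")
unknown_type = guess_media_type("checkpoint")  # no extension to infer from
```

When the extension is ambiguous or missing, passing media_type explicitly to log_file avoids relying on inference.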

log_metric(name: str, value: float, timestamp: datetime | None = None, step: int | None = None)

Record a custom scalar metric value for this run.

Note

This method is for manual custom metrics. For automatic metrics, see the enable_sagemaker_metrics parameter on the estimator class.

Parameters:
  • name (str) – The name of the metric.

  • value (float) – The value of the metric.

  • timestamp (datetime.datetime) – The timestamp of the metric. If not specified, the current UTC time will be used.

  • step (int) – The integer iteration number of the metric value (default: None).
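A typical use is to log one value per training step. The sketch below calls log_metric with the documented name, value, and step arguments. RecordingRun is a hypothetical stand-in that records calls locally so the example runs without AWS access; in real code the run would be a Run object. The metric name is also illustrative.

```python
from datetime import datetime, timezone


class RecordingRun:
    """Hypothetical stand-in for a Run; records calls instead of contacting AWS."""

    def __init__(self):
        self.metrics = []

    def log_metric(self, name, value, timestamp=None, step=None):
        # Mirror the documented default: current UTC time when no timestamp is given.
        if timestamp is None:
            timestamp = datetime.now(timezone.utc)
        self.metrics.append((name, float(value), timestamp, step))


def log_losses(run, losses, name="train_loss"):
    """Log one metric value per training step."""
    for step, loss in enumerate(losses):
        run.log_metric(name=name, value=loss, step=step)


run = RecordingRun()
log_losses(run, [0.9, 0.5, 0.3])
```

Because log_losses only depends on the log_metric signature, the same function can be handed a real run inside a with block.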

log_parameter(name: str, value: str | int | float)

Record a single parameter value for this run.

Overwrites any previous value recorded for the specified parameter name.

Parameters:
  • name (str) – The name of the parameter.

  • value (str or int or float) – The value of the parameter.

log_parameters(parameters: Dict[str, str | int | float])

Record a collection of parameter values for this run.

Parameters:

parameters (dict[str, str or int or float]) – The parameters to record.
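Since parameter values must be str, int, or float, nested configuration usually needs flattening before it is passed to log_parameters. The sketch below shows one way to do that. flatten_params is a hypothetical helper; the dotted key names and the choice to stringify other value types (lists, booleans, None) are illustrative assumptions, not SDK behavior.

```python
def flatten_params(config, prefix=""):
    """Flatten a nested dict into {dotted_name: str | int | float} (hypothetical helper)."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_params(value, prefix=f"{name}."))
        elif isinstance(value, (str, int, float)) and not isinstance(value, bool):
            flat[name] = value
        else:
            # Lists, booleans, None, etc. are stored as their string form.
            flat[name] = str(value)
    return flat


config = {"lr": 0.01, "optimizer": {"name": "adam", "betas": [0.9, 0.999]}, "epochs": 5}
flat = flatten_params(config)
# With a real run: run.log_parameters(flat)
```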

log_precision_recall(y_true: list | array, predicted_probabilities: list | array, positive_label: str | int | None = None, title: str | None = None, is_output: bool = True, no_skill: int | None = None)

Create and log a precision-recall graph artifact for the Studio UI to render.

The artifact is stored in S3 and represented as a lineage artifact with an association with the run.

You can view the artifact in the UI. If your job is created by a pipeline execution, you can view the artifact by selecting the corresponding step in the pipelines UI. See also SageMaker Pipelines.

This method requires the sklearn library.

Parameters:
  • y_true (list or array) – True labels. If labels are not binary then positive_label should be given.

  • predicted_probabilities (list or array) – Estimated/predicted probabilities.

  • positive_label (str or int) – Label of the positive class (default: None).

  • title (str) – Title of the graph (default: None).

  • is_output (bool) – Determines direction of association to the run. Defaults to True (output artifact). If set to False then represented as input association.

  • no_skill (int) – The precision threshold under which the classifier cannot discriminate between the classes and would predict a random class or a constant class in all cases (default: None).
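Each point on a precision-recall graph corresponds to one decision threshold applied to the predicted probabilities. The sketch below computes a single such point in plain Python to show how y_true, predicted_probabilities, and positive_label relate. precision_recall_at is a hypothetical helper, not the SDK's implementation, and returning a precision of 1.0 when nothing is predicted positive is a convention assumed here.

```python
def precision_recall_at(y_true, predicted_probabilities, threshold, positive_label=1):
    """Precision and recall at one threshold (hypothetical helper)."""
    tp = fp = fn = 0
    for label, prob in zip(y_true, predicted_probabilities):
        predicted_positive = prob >= threshold
        actual_positive = label == positive_label
        if predicted_positive and actual_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif actual_positive:
            fn += 1
    # Assumed convention: precision is 1.0 when there are no positive predictions.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


y_true = ["spam", "ham", "spam", "ham"]
probs = [0.9, 0.6, 0.4, 0.1]
strict = precision_recall_at(y_true, probs, threshold=0.5, positive_label="spam")
loose = precision_recall_at(y_true, probs, threshold=0.3, positive_label="spam")
```

Lowering the threshold raises recall at the cost of precision, which is the trade-off the logged graph visualizes.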

log_roc_curve(y_true: list | array, y_score: list | array, title: str | None = None, is_output: bool = True)

Create and log a receiver operating characteristic (ROC curve) artifact.

The artifact is stored in S3 and represented as a lineage artifact with an association with the run.

You can view the artifact in the UI. If your job is created by a pipeline execution, you can view the artifact by selecting the corresponding step in the pipelines UI. See also SageMaker Pipelines.

This method requires the sklearn library.

Parameters:
  • y_true (list or array) – True labels. If labels are not binary then positive_label should be given.

  • y_score (list or array) – Estimated/predicted probabilities.

  • title (str) – Title of the graph (default: None).

  • is_output (bool) – Determines direction of association to the run. Defaults to True (output artifact). If set to False then represented as input association.
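As a reminder of what the curve contains, an ROC curve plots the true positive rate against the false positive rate as the score threshold is lowered. The sketch below derives those points in plain Python. roc_points and its positive_label argument are hypothetical and only illustrate the inputs; the method itself delegates to sklearn, and its exact output format is not described here.

```python
def roc_points(y_true, y_score, positive_label=1):
    """Return (false positive rate, true positive rate) pairs (hypothetical helper)."""
    positives = sum(1 for label in y_true if label == positive_label)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("ROC needs at least one positive and one negative example")
    points = [(0.0, 0.0)]
    # Sweep thresholds from the highest score to the lowest.
    for threshold in sorted(set(y_score), reverse=True):
        tp = sum(1 for l, s in zip(y_true, y_score) if s >= threshold and l == positive_label)
        fp = sum(1 for l, s in zip(y_true, y_score) if s >= threshold and l != positive_label)
        points.append((fp / negatives, tp / positives))
    return points


points = roc_points([1, 0, 1, 0], [0.9, 0.6, 0.4, 0.1])
```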

class sagemaker.core.experiments.run.SortByType(value)

Bases: Enum

The type of property by which to sort the list_runs results.

CREATION_TIME = 'CreationTime'
NAME = 'Name'

class sagemaker.core.experiments.run.SortOrderType(value)

Bases: Enum

The type of order to sort the list or search results.

ASCENDING = 'Ascending'
DESCENDING = 'Descending'

sagemaker.core.experiments.run.list_runs(experiment_name: str, created_before: datetime | None = None, created_after: datetime | None = None, sagemaker_session: Session | None = None, max_results: int | None = None, next_token: str | None = None, sort_by: SortByType = SortByType.CREATION_TIME, sort_order: SortOrderType = SortOrderType.DESCENDING) → list

Return a list of Run objects matching the given criteria.

Parameters:
  • experiment_name (str) – Only Run objects related to the specified experiment are returned.

  • created_before (datetime.datetime) – Return Run objects created before this instant (default: None).

  • created_after (datetime.datetime) – Return Run objects created after this instant (default: None).

  • sagemaker_session (sagemaker.core.helper.session_helper.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • max_results (int) – Maximum number of Run objects to retrieve (default: None).

  • next_token (str) – Token for next page of results (default: None).

  • sort_by (SortByType) – The property to sort results by. One of NAME, CREATION_TIME (default: CREATION_TIME).

  • sort_order (SortOrderType) – One of ASCENDING, or DESCENDING (default: DESCENDING).

Returns:

A list of Run objects.

Return type:

list
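The time filters are plain datetime values, so a common pattern is to compute them relative to the current time. The sketch below builds keyword arguments for list_runs using only the parameter names documented above. recent_runs_query is a hypothetical helper, and the use of timezone-aware UTC datetimes is an assumption rather than a documented requirement.

```python
from datetime import datetime, timedelta, timezone


def recent_runs_query(experiment_name, days, max_results=None, now=None):
    """Build list_runs keyword arguments for runs created in the last `days` days."""
    now = now or datetime.now(timezone.utc)
    query = {
        "experiment_name": experiment_name,
        "created_after": now - timedelta(days=days),
    }
    if max_results is not None:
        query["max_results"] = max_results
    return query


query = recent_runs_query(
    "my-exp", days=7, max_results=20, now=datetime(2024, 1, 8, tzinfo=timezone.utc)
)
# With the SDK installed and AWS credentials configured:
#   runs = list_runs(**query)
```

Leaving sort_by and sort_order unset keeps the documented defaults, so the newest runs come first.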

sagemaker.core.experiments.run.load_run(run_name: str | None = None, experiment_name: str | None = None, sagemaker_session: Session | None = None, artifact_bucket: str | None = None, artifact_prefix: str | None = None, tags: List[Dict[str, str]] | None = None) → Run

Load an existing run.

In order to reuse an existing run to log extra data, load_run is recommended. It can be used in several ways:

  1. Use load_run by explicitly passing in run_name and experiment_name.

If run_name and experiment_name are passed in, they are honored over the default experiment config in the job environment or the run context (i.e. within the with block).

Note

Both run_name and experiment_name should be supplied to make this usage work. Otherwise, you may get a ValueError.

with load_run(experiment_name="my-exp", run_name="my-run") as run:
    run.log_metric(...)
    ...

  2. Use load_run in a job script without supplying run_name and experiment_name.

In this case, the default experiment config (specified when creating the job) is fetched from the job environment to load the run.

# In a job script
with load_run() as run:
    run.log_metric(...)
    ...

  3. Use load_run in a notebook within a run context (i.e. the with block) but without supplying run_name and experiment_name.

Each time a with Run(...) as run1: block is entered, the initialized run1 is tracked in the run context. Calling load_run() inside that with block then loads run1 from the context by default.

# In a notebook
with Run(experiment_name="my-exp", run_name="my-run", ...) as run1:
    run1.log_parameter(...)

    with load_run() as run2: # run2 is the same object as run1
        run2.log_metric(...)
        ...

Parameters:
  • run_name (str) – The name of the run to be loaded (default: None). If it is None, the RunName in the ExperimentConfig of the job will be fetched to load the run.

  • experiment_name (str) – The name of the experiment that the run to be loaded is associated with (default: None). Note: experiment_name must be supplied along with a valid run_name. Otherwise, it is ignored.

  • sagemaker_session (sagemaker.core.helper.session_helper.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • artifact_bucket (str) – The S3 bucket to upload the artifact to. If not specified, the default bucket defined in sagemaker_session will be used.

  • artifact_prefix (str) – The S3 key prefix used to generate the S3 path to upload the artifact to (default: “trial-component-artifacts”).

  • tags (List[Dict[str, str]]) – A list of tags to be used for all create calls, e.g. to create an experiment, a run group, etc. (default: None).

Returns:

The loaded Run object.

Return type:

Run
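The three usage patterns described above amount to a lookup order for where the run comes from. The sketch below models that order in plain Python so it can be read and run without AWS access. resolve_run_source is hypothetical and is not the SDK's code. Checking the run context before the job environment, the "ExperimentName" key, and the RuntimeError raised when nothing is found are all assumptions; the source only states that explicitly passed names take priority over both.

```python
def resolve_run_source(run_name=None, experiment_name=None, context_run=None, job_config=None):
    """Model where load_run would find its run (hypothetical helper).

    context_run: (experiment_name, run_name) of an enclosing `with Run(...)` block, if any.
    job_config: the job's ExperimentConfig as a dict, if running inside a job.
    """
    if run_name is not None:
        if experiment_name is None:
            raise ValueError("experiment_name must be supplied together with run_name")
        return ("explicit", experiment_name, run_name)
    # experiment_name without run_name is ignored, as documented above.
    if context_run is not None:
        return ("context",) + tuple(context_run)
    if job_config is not None:
        # Key names are assumed for illustration.
        return ("job", job_config["ExperimentName"], job_config["RunName"])
    raise RuntimeError("no run could be resolved")  # assumed failure mode


explicit = resolve_run_source("my-run", "my-exp", context_run=("ctx-exp", "ctx-run"))
from_context = resolve_run_source(experiment_name="ignored", context_run=("ctx-exp", "ctx-run"))
from_job = resolve_run_source(job_config={"ExperimentName": "job-exp", "RunName": "job-run"})
```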