sagemaker.core.local.data
Placeholder docstring
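Functions
get_batch_strategy_instance | Return an instance of sagemaker.local.data.BatchStrategy according to strategy.
get_data_source_instance | Return an instance of sagemaker.local.data.DataSource.
get_splitter_instance | Return an instance of sagemaker.local.data.Splitter.
Classes
BatchStrategy | Placeholder docstring
DataSource | Placeholder docstring
LineSplitter | Split records by new line.
LocalFileDataSource | Represents a data source within the local filesystem.
MultiRecordStrategy | Feed multiple records at a time for batch inference.
NoneSplitter | Does not split records, essentially reads the whole file.
RecordIOSplitter | Split using Amazon Recordio.
S3DataSource | Defines a data source given by a bucket and S3 prefix.
SingleRecordStrategy | Feed a single record at a time for batch inference.
Splitter | Placeholder docstring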
- class sagemaker.core.local.data.BatchStrategy(splitter)
Bases: object
Placeholder docstring
- abstract pad(file, size)
Group together as many records as possible to fit in the specified size.
- Parameters:
file (str) – file path to read the records from.
size (int) – maximum size in MB that each group of records must fit within. Passing 0 means unlimited size.
- Returns:
generator of records
- class sagemaker.core.local.data.DataSource
Bases: object
Placeholder docstring
- class sagemaker.core.local.data.LocalFileDataSource(root_path)
Bases: DataSource
Represents a data source within the local filesystem.
- class sagemaker.core.local.data.MultiRecordStrategy(splitter)
Bases: BatchStrategy
Feed multiple records at a time for batch inference.
Groups as many records as possible within the specified payload size.
- pad(file, size=6)
Group together as many records as possible to fit in the specified size.
- Parameters:
file (str) – file path to read the records from.
size (int) – maximum size in MB that each group of records must fit within. Passing 0 means unlimited size.
- Returns:
generator of records
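The grouping behaviour described above can be illustrated with a minimal standalone sketch. This is not the library's implementation: the helper name group_records is hypothetical, records are taken from an in-memory list rather than from a file and splitter, and the sketch assumes 1 MB is counted as 1024 * 1024 bytes.

```python
from typing import Iterable, Iterator

BYTES_PER_MB = 1024 * 1024  # assumption: 1 MB is counted as 1024 * 1024 bytes


def group_records(records: Iterable[str], size_mb: float = 6) -> Iterator[str]:
    """Greedily join consecutive records into groups of at most size_mb megabytes.

    Hypothetical helper for illustration only. size_mb == 0 means no limit.
    A single record larger than the limit is still emitted, alone in its group.
    """
    limit = size_mb * BYTES_PER_MB
    group, group_bytes = [], 0
    for record in records:
        record_bytes = len(record.encode("utf-8"))
        # Close the current group if adding this record would exceed the limit.
        if size_mb != 0 and group and group_bytes + record_bytes > limit:
            yield "".join(group)
            group, group_bytes = [], 0
        group.append(record)
        group_bytes += record_bytes
    if group:
        yield "".join(group)
```

With a limit of 4 bytes, three 2-byte records come out as two groups; with a limit of 0, they come out as a single group.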
- class sagemaker.core.local.data.NoneSplitter
Bases: Splitter
Does not split records, essentially reads the whole file.
- split(filename)
Split a file into records using a specific strategy.
For this NoneSplitter there is no actual split happening and the file is returned as a whole.
- Parameters:
filename (str) – path to the file to split
- Returns:
generator for the individual records that were split from the file
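To make the difference between not splitting and splitting by line concrete, here is a minimal sketch written as plain generator functions. The names split_none and split_lines are hypothetical and are not part of the module; the sketch also assumes text files, which the real splitters may handle differently.

```python
import os
import tempfile


def split_none(filename):
    """Yield the whole file as a single record (no actual split happens)."""
    with open(filename, "r") as f:
        yield f.read()


def split_lines(filename):
    """Yield one record per line, keeping the trailing newline."""
    with open(filename, "r") as f:
        for line in f:
            yield line


# Small demonstration on a temporary file.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "records.txt")
    with open(path, "w") as f:
        f.write("first\nsecond\n")
    whole = list(split_none(path))
    lines = list(split_lines(path))
```

Both functions are generators, matching the "generator for the individual records" return described above.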
- class sagemaker.core.local.data.RecordIOSplitter
Bases: Splitter
Split using Amazon Recordio.
Not useful for string content.
Note: This class depends on the deprecated sagemaker.core.amazon module and is no longer functional.
- split(file)
Split a file into records using a specific strategy.
This RecordIOSplitter splits the data into individual RecordIO records.
- Parameters:
file (str) – path to the file to split
- Returns:
generator for the individual records that were split from the file
- Raises:
NotImplementedError – This functionality has been removed due to the deprecation of the sagemaker.core.amazon module.
- class sagemaker.core.local.data.S3DataSource(bucket, prefix, sagemaker_session)
Bases: DataSource
Defines a data source given by a bucket and S3 prefix.
The contents will be downloaded and then processed as local data.
- class sagemaker.core.local.data.SingleRecordStrategy(splitter)
Bases: BatchStrategy
Feed a single record at a time for batch inference.
If a single record does not fit within the specified payload size, a RuntimeError is raised.
- pad(file, size=6)
Group together as many records as possible to fit in the specified size.
This SingleRecordStrategy will not group any record and will return them one by one as long as they are within the maximum size.
- Parameters:
file (str) – file path to read the records from.
size (int) – maximum size in MB that each record must fit within. Passing 0 means unlimited size.
- Returns:
generator of records
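The single-record behaviour can be sketched in the same spirit. This is an illustration under stated assumptions, not the library's code: the helper name one_record_per_payload is hypothetical, records come from an in-memory list, and 1 MB is assumed to be 1024 * 1024 bytes.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: 1 MB is counted as 1024 * 1024 bytes


def one_record_per_payload(records, size_mb=6):
    """Yield records one by one, rejecting any record larger than size_mb.

    Hypothetical helper for illustration only. size_mb == 0 means no limit.
    """
    limit = size_mb * BYTES_PER_MB
    for record in records:
        record_bytes = len(record.encode("utf-8"))
        if size_mb != 0 and record_bytes > limit:
            raise RuntimeError(
                f"Record of {record_bytes} bytes exceeds the {size_mb} MB limit"
            )
        yield record
```

Because this is a generator, the RuntimeError surfaces only when the oversized record is reached during iteration, not when the function is called.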
- sagemaker.core.local.data.get_batch_strategy_instance(strategy, splitter)
Return an instance of sagemaker.local.data.BatchStrategy according to strategy.
- Parameters:
strategy (str) – Either ‘SingleRecord’ or ‘MultiRecord’.
splitter (sagemaker.local.data.Splitter) – splitter to get the data from.
- Returns:
an instance of a BatchStrategy
- Return type:
sagemaker.local.data.BatchStrategy
- sagemaker.core.local.data.get_data_source_instance(data_source, sagemaker_session)
Return an instance of sagemaker.local.data.DataSource.
The instance can handle the provided data_source URI. data_source can be either a file:// or an s3:// URI.
- Parameters:
data_source (str) – a valid URI that points to a data source.
sagemaker_session (sagemaker.core.helper.session.Session) – a SageMaker Session to interact with S3 if required.
- Returns:
an Instance of a Data Source
- Return type:
sagemaker.local.data.DataSource
- Raises:
ValueError – If the URI scheme of data_source is neither file nor s3.
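The scheme check described above can be sketched with the standard library's urllib.parse. The function describe_data_source and its tuple return values are hypothetical and only illustrate the dispatch; the real function returns DataSource instances and may build paths differently.

```python
from urllib.parse import urlparse


def describe_data_source(data_source):
    """Classify a data source URI by scheme (illustrative sketch only)."""
    parsed = urlparse(data_source)
    if parsed.scheme == "file":
        # Local data: the location is whatever follows file://
        return ("local", parsed.netloc + parsed.path)
    if parsed.scheme == "s3":
        # S3 data: the netloc is the bucket, the path is the key prefix.
        return ("s3", parsed.netloc, parsed.path.strip("/"))
    raise ValueError(
        f"data_source must be a file:// or s3:// URI, got scheme {parsed.scheme!r}"
    )
```

For example, "s3://my-bucket/some/prefix" (a made-up URI) splits into the bucket my-bucket and the prefix some/prefix, while an http:// URI raises ValueError.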
- sagemaker.core.local.data.get_splitter_instance(split_type)
Return an instance of sagemaker.local.data.Splitter.
The instance returned is according to the specified split_type.
- Parameters:
split_type (str) – either ‘Line’ or ‘RecordIO’. Can be left as None to signal no data split will happen.
- Returns:
an instance of a Splitter
- Return type:
sagemaker.local.data.Splitter
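As a rough illustration of selecting a splitter from split_type, the sketch below returns plain functions instead of Splitter instances. The name choose_splitter is hypothetical, the functions operate on in-memory strings rather than files, and raising ValueError for an unrecognised split_type is an assumption that the documentation above does not confirm.

```python
def _split_none(text):
    """No split: the whole text is a single record."""
    return [text]


def _split_lines(text):
    """One record per line, keeping line endings."""
    return text.splitlines(keepends=True)


def _split_recordio(text):
    """Mirrors the documented state of RecordIO splitting: not functional."""
    raise NotImplementedError("RecordIO splitting is not available")


def choose_splitter(split_type=None):
    """Return a splitting function for split_type (illustrative sketch only)."""
    if split_type is None:
        return _split_none
    if split_type == "Line":
        return _split_lines
    if split_type == "RecordIO":
        return _split_recordio
    # Assumption: unknown values are rejected with ValueError.
    raise ValueError(f"Invalid split type: {split_type!r}")
```

Calling choose_splitter("Line")("a\nb\n") yields two records, while choose_splitter()("a\nb\n") yields the input as one record.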