sagemaker.mlops.feature_store.athena_query
Classes
AthenaQuery: Class to manage querying of feature store data with AWS Athena.
- class sagemaker.mlops.feature_store.athena_query.AthenaQuery(catalog: str, database: str, table_name: str, sagemaker_session: Session)
Bases:
object
Class to manage querying of feature store data with AWS Athena.
This class instantiates an AthenaQuery object that retrieves data from the feature store via standard SQL queries.
- catalog
Name of the data catalog.
- Type:
str
- database
Name of the database.
- Type:
str
- table_name
Name of the table.
- Type:
str
- as_dataframe(**kwargs) → DataFrame
Download the result of the current query and load it into a DataFrame.
- Parameters:
**kwargs (object) – keyword arguments passed through to pandas.read_csv, allowing finer control over how the result file is parsed. For more information, see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
- Returns:
A pandas DataFrame containing the query result.
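Because as_dataframe forwards its keyword arguments to pandas.read_csv, standard read_csv options such as usecols and dtype can be used to shape the result. The following is a minimal sketch, not part of the SDK: load_query_result, columns, and string_columns are hypothetical names, and it assumes the query passed in has already finished successfully.

```python
from typing import Any, Dict, Optional, Sequence


def load_query_result(
    query: Any,
    columns: Optional[Sequence[str]] = None,
    string_columns: Sequence[str] = (),
) -> Any:
    """Call query.as_dataframe() with a few pandas.read_csv options.

    `query` is assumed to be an AthenaQuery-like object whose query has
    already completed. This helper is hypothetical, not an SDK function.
    """
    read_csv_kwargs: Dict[str, Any] = {}
    if columns is not None:
        # usecols: load only the listed columns from the result file.
        read_csv_kwargs["usecols"] = list(columns)
    if string_columns:
        # dtype: keep identifier-like columns as strings so that values
        # such as "00123" are not parsed as integers.
        read_csv_kwargs["dtype"] = {name: str for name in string_columns}
    return query.as_dataframe(**read_csv_kwargs)
```

Any other pandas.read_csv keyword (for example parse_dates) could be passed in the same way by calling as_dataframe directly.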
- catalog: str
- database: str
- get_query_execution() → Dict[str, Any]
Get execution status of the current query.
- Returns:
Response dict from Athena.
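Since get_query_execution only reports the current status, a caller typically polls it until the query reaches a terminal state. The sketch below is a hypothetical helper (wait_for_query, poll_seconds, and timeout_seconds are not SDK names). It assumes the returned dict follows the shape of the Athena GetQueryExecution response, where the state is found under QueryExecution → Status → State.

```python
import time
from typing import Any, Dict

# Terminal states reported by Athena for a query execution.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(
    query: Any,
    poll_seconds: float = 2.0,
    timeout_seconds: float = 300.0,
) -> Dict[str, Any]:
    """Poll query.get_query_execution() until a terminal state is reached.

    `query` is assumed to be an AthenaQuery-like object on which run()
    has already been called. Returns the last response dict.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = query.get_query_execution()
        # Assumed response shape: Athena GetQueryExecution.
        state = response["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Query still in state {state!r} after timeout")
        time.sleep(poll_seconds)
```

The caller would then check whether the returned state is SUCCEEDED before downloading the result.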
- run(query_string: str, output_location: str, kms_key: str = None, workgroup: str = None) → str
Execute a SQL query given a query string, output location, and KMS key.
This method executes the SQL query using Athena, writes the results to output_location, and returns the execution ID of the query.
- Parameters:
query_string – SQL query string.
output_location – S3 URI where the query result is written.
kms_key – KMS key ID. If set, it is used to encrypt the query result file.
workgroup (str) – The name of the workgroup in which the query is being started.
- Returns:
Execution ID of the query.
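As an illustration of calling run, the sketch below builds a simple SELECT statement from the object's database and table_name attributes and submits it. run_sample_query and limit are hypothetical names introduced for this example, and the S3 URI in the usage note is a placeholder.

```python
from typing import Any


def run_sample_query(query: Any, output_location: str, limit: int = 10) -> str:
    """Submit a small SELECT against the feature group's table.

    `query` is assumed to be an AthenaQuery-like object exposing the
    `database` and `table_name` attributes and the `run` method
    documented above. Returns the execution ID that run() returns.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    # Double quotes delimit identifiers in Athena SELECT statements.
    query_string = (
        f'SELECT * FROM "{query.database}"."{query.table_name}" '
        f"LIMIT {int(limit)}"
    )
    return query.run(query_string=query_string, output_location=output_location)
```

A call might look like run_sample_query(query, "s3://my-bucket/athena-results/"), where the bucket name is a placeholder. The optional kms_key and workgroup arguments could be passed to run in the same way.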
- table_name: str