dataset_utilities Module

Utility methods for interacting with azureml.core.Dataset.

Functions

collect_usage_telemetry

collect_usage_telemetry(compute: Any, spark_context: Any, **kwargs: Any) -> None

Parameters

Name Description
compute
Required
spark_context
Required

convert_inputs

Convert the given datasets to trackable definitions.

convert_inputs(X: Any, y: Any, sample_weight: Any, X_valid: Any, y_valid: Any, sample_weight_valid: Any) -> Tuple[Any, Any, Any, Any, Any, Any]

Parameters

Name Description
X
Required

dataset representing X

y
Required

dataset representing y

sample_weight
Required

dataset representing the sample weight

X_valid
Required

dataset representing X_valid

y_valid
Required

dataset representing y_valid

sample_weight_valid
Required

dataset representing the validation sample weight

convert_inputs_dataset

Convert the given datasets to trackable definitions.

convert_inputs_dataset(*datasets: Any) -> Tuple[Any, ...]

Parameters

Name Description
datasets
Required

datasets to convert to trackable definitions

ensure_saved

ensure_saved(workspace: Workspace, **kwargs: Any) -> None

Parameters

Name Description
workspace
Required

get_dataset_from_mltable_data_json

Get a dataset from an MLTable data JSON object.

get_dataset_from_mltable_data_json(ws: Workspace, mltable_data_json_obj: Dict[str, Any], data_label: MLTableDataLabel) -> AbstractDataset | None

Parameters

Name Description
ws
Required

workspace to get dataset from

mltable_data_json_obj
Required

MLTable data JSON object

data_label
Required

label indicating which dataset to load from the MLTable data JSON

get_datasets_from_data_json

Get datasets from a data JSON object, which can be either an MLTable data JSON (with a URI) or a Dataprep JSON (with a dataset ID).

get_datasets_from_data_json(ws: Workspace, data_preparation_json: Dict[str, Any], data_labels: List[MLTableDataLabel]) -> Tuple[AbstractDataset | None, AbstractDataset | None, AbstractDataset | None]

Parameters

Name Description
ws
Required

workspace to get dataset from

data_preparation_json
Required

data json object

data_labels
Required

list of labels indicating dataset to load from data json
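
The signature above returns a three-element tuple in which any element may be None. The following is a minimal sketch of handling such a result; it does not call the SDK. The `DatasetTriple` alias and the helpers `populated_slots` and `require_slot` are hypothetical names introduced for illustration, and placeholder strings stand in for real dataset objects.

```python
from typing import Any, List, Optional, Tuple

# Stand-in alias (hypothetical): three slots, each holding a dataset object or None.
DatasetTriple = Tuple[Optional[Any], Optional[Any], Optional[Any]]


def populated_slots(datasets: DatasetTriple) -> List[int]:
    """Return the indices of the slots that actually hold a dataset."""
    return [i for i, d in enumerate(datasets) if d is not None]


def require_slot(datasets: DatasetTriple, index: int) -> Any:
    """Return the dataset at `index`, failing loudly if it was not loaded."""
    dataset = datasets[index]
    if dataset is None:
        raise ValueError(f"no dataset was loaded for slot {index}")
    return dataset


# Placeholder strings stand in for real dataset objects.
result: DatasetTriple = ("first-dataset", None, "third-dataset")
print(populated_slots(result))
```

Checking each slot explicitly keeps a missing dataset from surfacing later as an attribute error on None.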

get_datasets_from_dataprep_json

Get datasets from a Dataprep JSON object (with a dataset ID).

get_datasets_from_dataprep_json(ws: Workspace, dataprep_json: Dict[str, Any], data_labels: List[MLTableDataLabel]) -> Tuple[AbstractDataset | None, AbstractDataset | None, AbstractDataset | None]

Parameters

Name Description
ws
Required

workspace to get dataset from

dataprep_json
Required

Dataprep JSON object

data_labels
Required

list of labels indicating dataset to load from data json

get_datasets_from_mltable_data_json

Get datasets from an MLTable data JSON object (with a URI).

get_datasets_from_mltable_data_json(ws: Workspace, mltable_data_json_obj: Dict[str, Any], data_labels: List[MLTableDataLabel]) -> Tuple[AbstractDataset | None, AbstractDataset | None, AbstractDataset | None]

Parameters

Name Description
ws
Required

workspace to get dataset from

mltable_data_json_obj
Required

MLTable data JSON object

data_labels
Required

list of labels indicating dataset to load from data json

get_datasets_json

Get dataprep json.

get_datasets_json(training_data: Any | None = None, validation_data: Any | None = None, test_data: Any | None = None) -> str | None

Parameters

Name Description
training_data

Training data.

default value: None

validation_data

Validation data.

default value: None

test_data

Test data.

default value: None

Returns

Type Description
str | None

JSON string representation of a dict of Dataset objects
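
Because the signature declares a return type of `str | None`, callers may want to handle the None case before decoding. The following is a minimal sketch using only the standard library; `parse_datasets_json` is a hypothetical helper, and the payload keys shown are placeholders, since the actual JSON layout is not documented here.

```python
import json
from typing import Any, Dict, Optional


def parse_datasets_json(datasets_json: Optional[str]) -> Dict[str, Any]:
    """Decode the optional JSON string into a dict, treating None as empty."""
    if datasets_json is None:
        return {}
    parsed = json.loads(datasets_json)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object at the top level")
    return parsed


# Hypothetical payload; the real keys and values are assumptions.
example = json.dumps({"training_data": {"datasetId": "placeholder-id"}})
print(parse_datasets_json(example))
print(parse_datasets_json(None))
```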

is_dataset

Check to see if the given object is a dataset or dataset definition.

is_dataset(dataset: Any) -> bool

Parameters

Name Description
dataset
Required

object to check