datasets Package

Modules

datasets

Datasets used in MicrosoftML unittests.

image

Data about images.

Classes

DataSetIris

Iris dataset.

DataSetInfert

Infert dataset.

Topics

Sample dataset to show Light LDA transform examples in the API Reference section

Timeseries

Sample dataset to show Timeseries transform examples in the API Reference section

DataSetAirQuality

AirQuality dataset.

WikiDetox_Train

WikiDetox dataset train.

WikiDetox_Test

WikiDetox dataset test.

Generated_Twitter_Train

Manually generated Twitter training dataset.

Generated_Twitter_Test

Manually generated Twitter testing dataset.

Generated_Ticket_Train

Manually generated Flight ticket training dataset.

Generated_Ticket_Test

Manually generated Flight ticket testing dataset.

Uci_Train

UCI Adult dataset train.

Uci_Test

UCI Adult dataset test.

MSLTR_Train

MSLTR dataset train, sampled from https://www.microsoft.com/en-us/research/project/mslr/

MSLTR_Test

MSLTR dataset test, sampled from https://www.microsoft.com/en-us/research/project/mslr/

FS_Train

Manually created Flight Schedule training data.

FS_Test

Manually created Flight Schedule test data.

Functions

get_dataset

Returns a predefined dataset.

The name parameter selects the dataset. Options are: airquality, fstest, fstrain, gen_tickettest, gen_tickettrain, gen_twittertest, gen_twittertrain, infert, iris, msltrtest, msltrtrain, timeseries, topics, uciadult_test, uciadult_train, wiki_detox_test, wiki_detox_train.

Example:

  ```
  >>> from nimbusml.datasets import get_dataset
  >>> path = get_dataset('infert').as_filepath()
  >>> print(path)
  ...\nimbusml\datasets\data\gplv2\infert.csv
  ```

get_dataset(name)

Parameters

Name Description
name The name of the predefined dataset to return; one of the options listed in the description above.
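
Because get_dataset accepts only the names listed in its description, it can help to validate a name before calling it. The following is a minimal sketch, not part of the package: `DOCUMENTED_NAMES` and `dataset_path` are hypothetical names introduced here, and the tuple is copied by hand from the option list above, so it may drift from what an installed version actually ships.

```python
# Names copied from the "Options are" list in the get_dataset description.
# Hypothetical constant: not provided by nimbusml itself.
DOCUMENTED_NAMES = (
    "airquality", "fstest", "fstrain",
    "gen_tickettest", "gen_tickettrain",
    "gen_twittertest", "gen_twittertrain",
    "infert", "iris", "msltrtest", "msltrtrain",
    "timeseries", "topics",
    "uciadult_test", "uciadult_train",
    "wiki_detox_test", "wiki_detox_train",
)


def dataset_path(name):
    """Hypothetical helper: validate `name`, then return the dataset's file path."""
    if name not in DOCUMENTED_NAMES:
        raise ValueError(
            "unknown dataset %r; expected one of: %s"
            % (name, ", ".join(DOCUMENTED_NAMES))
        )
    # Imported lazily so the validation above works without nimbusml installed.
    from nimbusml.datasets import get_dataset
    return get_dataset(name).as_filepath()
```

With nimbusml installed, `dataset_path('infert')` should return the same path as the example above; an unlisted name raises `ValueError` before any import is attempted.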

available_datasets

Returns the list of available datasets.

available_datasets()
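
One possible use of available_datasets is to check which of the datasets a script needs are present before loading any of them. The sketch below assumes that available_datasets() returns a list of name strings, which this page does not state explicitly. `pick_available` and `main` are hypothetical helpers, and "mnist" is an illustrative name that is not among the documented options.

```python
def pick_available(wanted, available):
    """Split `wanted` names into (found, missing) against `available`."""
    available_set = set(available)
    found = [name for name in wanted if name in available_set]
    missing = [name for name in wanted if name not in available_set]
    return found, missing


def main():
    # Requires nimbusml. Assumes available_datasets() yields name strings.
    from nimbusml.datasets import available_datasets

    found, missing = pick_available(
        ["infert", "iris", "mnist"], available_datasets()
    )
    print("found:", found)
    print("missing:", missing)
```

Call `main()` in an environment where nimbusml is installed; `pick_available` has no dependencies and can be used with any iterable of names.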