DataType Class

Helper class for handling the proper manipulation of supported column types (int, bool, string, etc.). Currently used with MLTable.convert_column_types(...) and from_delimited_files(...) to specify which types columns are converted to. A type is selected with one of the DataType.to_*(...) methods.

Inheritance
builtins.object
DataType

Constructor

DataType()

Methods

to_bool

Configure conversion to bool. true_values and false_values must either both be None or both be non-empty lists of strings; otherwise an error is thrown.

to_datetime

Configure conversion to datetime.

to_float

Configure conversion to 64-bit float.

to_int

Configure conversion to 64-bit integer.

to_stream

Configure conversion to stream.

to_string

Configure conversion to string.

to_bool

Configure conversion to bool. true_values and false_values must either both be None or both be non-empty lists of strings; otherwise an error is thrown.

static to_bool(true_values: List[str] | None = None, false_values: List[str] | None = None, mismatch_as: str | None = None)

Parameters

true_values
list[str]
default value: None

List of values in the dataset to designate as True. For example, with ['1', 'yes'], the values '1' and 'yes' are converted to True. The true_values need to be present in the dataset; otherwise None is returned for values that are not present.

false_values
list[str]
default value: None

List of values in the dataset to designate as False. For example, with ['0', 'no'], the values '0' and 'no' are converted to False. The false_values need to be present in the dataset; otherwise None is returned for values that are not present.

mismatch_as
Optional[str]
default value: None

How to cast strings that are in neither true_values nor false_values: 'true' casts them all as True, 'false' casts them all as False, and 'error' raises an error instead of casting. Defaults to None, which is equivalent to 'error'.
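The rules for true_values, false_values, and mismatch_as can be modelled in plain Python. The sketch below is an illustration only, not the library's implementation: the helper names check_bool_config and cast_bool are hypothetical, and the use of ValueError is an assumption, since the exception type is not specified here.

```python
from typing import List, Optional


def check_bool_config(true_values: Optional[List[str]],
                      false_values: Optional[List[str]]) -> None:
    """Hypothetical check: both arguments are None, or both are non-empty lists of strings."""
    if true_values is None and false_values is None:
        return
    for name, values in (("true_values", true_values), ("false_values", false_values)):
        is_valid = (
            isinstance(values, list)
            and len(values) > 0
            and all(isinstance(v, str) for v in values)
        )
        if not is_valid:
            # Assumption: ValueError stands in for whatever error the library raises.
            raise ValueError(f"{name} must be a non-empty list of strings")


def cast_bool(value: str,
              true_values: List[str],
              false_values: List[str],
              mismatch_as: Optional[str] = None) -> bool:
    """Hypothetical model of casting one string according to the documented rules."""
    if value in true_values:
        return True
    if value in false_values:
        return False
    mode = mismatch_as or "error"  # None is documented as equivalent to 'error'
    if mode == "true":
        return True
    if mode == "false":
        return False
    if mode == "error":
        raise ValueError(f"{value!r} is in neither true_values nor false_values")
    raise ValueError(f"unknown mismatch_as value: {mode!r}")


check_bool_config(["1", "yes"], ["0", "no"])
print(cast_bool("yes", ["1", "yes"], ["0", "no"]))
print(cast_bool("maybe", ["1", "yes"], ["0", "no"], mismatch_as="false"))
```

With mismatch_as left at its default, the unmatched value 'maybe' raises an error in this model instead of being cast.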

to_datetime

Configure conversion to datetime.

static to_datetime(formats: str | List[str], date_constant: str | None = None)

Parameters

formats
str or list[str]
Required

Formats to try for datetime conversion. For example %d-%m-%Y for data in "day-month-year", and %Y-%m-%dT%H:%M:%S.%f for "combined date and time representation" according to ISO 8601.

  • %Y: Year with 4 digits

  • %y: Year with 2 digits

  • %m: Month in digits

  • %b: Month represented by its abbreviated name in 3 letters, like Aug

  • %B: Month represented by its full name, like August

  • %d: Day in digits

  • %H: Hour as represented in 24-hour clock time

  • %I: Hour as represented in 12-hour clock time

  • %M: Minute in 2 digits

  • %S: Second in 2 digits

  • %f: Microsecond

  • %p: AM/PM designator

  • %z: Timezone, for example: -0700

date_constant
Optional[str]
default value: None

If the column contains only time values, a date to apply to the resulting DateTime.
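A quick way to check a format string against sample values is Python's standard datetime.strptime, which uses the same directive letters as the list above. This is only a sanity check under that assumption: the parser behind to_datetime is separate and may differ in edge cases, and the sample strings below are made up for illustration.

```python
from datetime import datetime

# (sample text, format string) pairs; the samples are invented for illustration.
samples = [
    ("25-12-2023", "%d-%m-%Y"),                               # day-month-year
    ("2023-12-25T14:30:15.250000", "%Y-%m-%dT%H:%M:%S.%f"),   # ISO 8601 combined date and time
    ("25 Aug 23 02:30 PM", "%d %b %y %I:%M %p"),              # abbreviated month, 12-hour clock
    ("2023-12-25 14:30:15 -0700", "%Y-%m-%d %H:%M:%S %z"),    # with timezone offset
]

# strptime raises ValueError if the text does not match the format.
parsed = [datetime.strptime(text, fmt) for text, fmt in samples]

for (text, fmt), value in zip(samples, parsed):
    print(f"{text!r} with {fmt!r} -> {value.isoformat()}")
```

Formats that parse the sample data here are reasonable candidates to pass as the formats argument, either as a single string or as a list of strings to try.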

to_float

Configure conversion to 64-bit float.

static to_float()

to_int

Configure conversion to 64-bit integer.

static to_int()

to_stream

Configure conversion to stream.

static to_stream()

to_string

Configure conversion to string.

static to_string()