AutoCatalog.InferColumns Method

Definition

Overloads

InferColumns(String, ColumnInformation, Nullable<Char>, Nullable<Boolean>, Nullable<Boolean>, Boolean, Boolean)

Infers information about the columns of a dataset in a file located at path.

InferColumns(String, String, Nullable<Char>, Nullable<Boolean>, Nullable<Boolean>, Boolean, Boolean)

Infers information about the columns of a dataset in a file located at path.

InferColumns(String, UInt32, Boolean, Nullable<Char>, Nullable<Boolean>, Nullable<Boolean>, Boolean, Boolean)

Infers information about the columns of a dataset in a file located at path.

InferColumns(String, ColumnInformation, Nullable<Char>, Nullable<Boolean>, Nullable<Boolean>, Boolean, Boolean)

Infers information about the columns of a dataset in a file located at path.

public Microsoft.ML.AutoML.ColumnInferenceResults InferColumns (string path, Microsoft.ML.AutoML.ColumnInformation columnInformation, char? separatorChar = default, bool? allowQuoting = default, bool? allowSparse = default, bool trimWhitespace = false, bool groupColumns = true);
member this.InferColumns : string * Microsoft.ML.AutoML.ColumnInformation * Nullable<char> * Nullable<bool> * Nullable<bool> * bool * bool -> Microsoft.ML.AutoML.ColumnInferenceResults
Public Function InferColumns (path As String, columnInformation As ColumnInformation, Optional separatorChar As Nullable(Of Char) = Nothing, Optional allowQuoting As Nullable(Of Boolean) = Nothing, Optional allowSparse As Nullable(Of Boolean) = Nothing, Optional trimWhitespace As Boolean = false, Optional groupColumns As Boolean = true) As ColumnInferenceResults

Parameters

path
String

Path to a dataset file.

columnInformation
ColumnInformation

Column information for the dataset.

separatorChar
Nullable<Char>

The character used as separator between data elements in a row. If null, AutoML will try to infer this value.

allowQuoting
Nullable<Boolean>

Whether the file can contain columns defined by a quoted string. If null, AutoML will try to infer this value.

allowSparse
Nullable<Boolean>

Whether the file can contain numerical vectors in sparse format. If null, AutoML will try to infer this value.

trimWhitespace
Boolean

Whether trailing whitespace should be removed from dataset file lines.

groupColumns
Boolean

Whether to group together (when possible) original columns in the dataset file into vector columns in the resulting data structures. See TextLoader.Range for more information.

Returns

Information inferred about the columns in the provided dataset.

Remarks

Infers the name, data type, and purpose of each column. The returned ColumnInferenceResults exposes two properties: TextLoaderOptions and ColumnInformation. TextLoaderOptions can be used to instantiate a TextLoader, which can then be used to obtain an IDataView that can be fed into an AutoML experiment or used elsewhere in the ML.NET ecosystem (for example, in Fit(IDataView)). ColumnInformation contains the inferred purpose of each column in the dataset: for instance, whether the column is categorical, numeric, or text data, or whether it should be ignored. ColumnInformation can be inspected and modified (or kept as is) and then passed to an AutoML experiment.
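The following is a minimal sketch of this overload, not an official sample. It assumes the Microsoft.ML.AutoML package is referenced; the file name, the CSV contents, and the column names (Age, Income, City, Purchased) are hypothetical and exist only to keep the example self-contained. It supplies the purposes that are already known through ColumnInformation and leaves the rest to inference.

```csharp
using System;
using System.IO;
using Microsoft.ML;
using Microsoft.ML.AutoML;

// Hypothetical sample data, written to a temp file so the sketch is self-contained.
string path = Path.Combine(Path.GetTempPath(), "infercolumns-columninfo.csv");
File.WriteAllLines(path, new[]
{
    "Age,Income,City,Purchased",
    "34,52000,Seattle,true",
    "27,41000,Austin,false",
    "45,88000,Boston,true",
    "52,67000,Denver,false",
    "39,73000,Chicago,true",
});

var mlContext = new MLContext();

// State what is already known; the remaining column purposes are left to inference.
var knownColumns = new ColumnInformation { LabelColumnName = "Purchased" };
knownColumns.IgnoredColumnNames.Add("City");

ColumnInferenceResults result = mlContext.Auto().InferColumns(
    path, knownColumns, separatorChar: ',');

// Inspect the purposes assigned to each column.
ColumnInformation inferred = result.ColumnInformation;
Console.WriteLine($"Label:       {inferred.LabelColumnName}");
Console.WriteLine($"Numeric:     {string.Join(", ", inferred.NumericColumnNames)}");
Console.WriteLine($"Categorical: {string.Join(", ", inferred.CategoricalColumnNames)}");
Console.WriteLine($"Text:        {string.Join(", ", inferred.TextColumnNames)}");
Console.WriteLine($"Ignored:     {string.Join(", ", inferred.IgnoredColumnNames)}");
```

Because groupColumns defaults to true, adjacent columns may be reported under a single vector column name rather than their original header names.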

InferColumns(String, String, Nullable<Char>, Nullable<Boolean>, Nullable<Boolean>, Boolean, Boolean)

Infers information about the columns of a dataset in a file located at path.

public Microsoft.ML.AutoML.ColumnInferenceResults InferColumns (string path, string labelColumnName = "Label", char? separatorChar = default, bool? allowQuoting = default, bool? allowSparse = default, bool trimWhitespace = false, bool groupColumns = true);
member this.InferColumns : string * string * Nullable<char> * Nullable<bool> * Nullable<bool> * bool * bool -> Microsoft.ML.AutoML.ColumnInferenceResults
Public Function InferColumns (path As String, Optional labelColumnName As String = "Label", Optional separatorChar As Nullable(Of Char) = Nothing, Optional allowQuoting As Nullable(Of Boolean) = Nothing, Optional allowSparse As Nullable(Of Boolean) = Nothing, Optional trimWhitespace As Boolean = false, Optional groupColumns As Boolean = true) As ColumnInferenceResults

Parameters

path
String

Path to a dataset file.

labelColumnName
String

The name of the label column.

separatorChar
Nullable<Char>

The character used as separator between data elements in a row. If null, AutoML will try to infer this value.

allowQuoting
Nullable<Boolean>

Whether the file can contain columns defined by a quoted string. If null, AutoML will try to infer this value.

allowSparse
Nullable<Boolean>

Whether the file can contain numerical vectors in sparse format. If null, AutoML will try to infer this value.

trimWhitespace
Boolean

Whether trailing whitespace should be removed from dataset file lines.

groupColumns
Boolean

Whether to group together (when possible) original columns in the dataset file into vector columns in the resulting data structures. See TextLoader.Range for more information.

Returns

Information inferred about the columns in the provided dataset.

Remarks

Infers the name, data type, and purpose of each column. The returned ColumnInferenceResults exposes two properties: TextLoaderOptions and ColumnInformation. TextLoaderOptions can be used to instantiate a TextLoader, which can then be used to obtain an IDataView that can be fed into an AutoML experiment or used elsewhere in the ML.NET ecosystem (for example, in Fit(IDataView)). ColumnInformation contains the inferred purpose of each column in the dataset: for instance, whether the column is categorical, numeric, or text data, or whether it should be ignored. ColumnInformation can be inspected and modified (or kept as is) and then passed to an AutoML experiment.
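As an illustration of the flow described above (infer columns, build a TextLoader, obtain an IDataView), here is a short sketch. It assumes the Microsoft.ML.AutoML package is referenced; the file name, data, and the label column name "Purchased" are hypothetical.

```csharp
using System;
using System.IO;
using Microsoft.ML;
using Microsoft.ML.AutoML;
using Microsoft.ML.Data;

// Hypothetical sample data with a header row.
string path = Path.Combine(Path.GetTempPath(), "infercolumns-labelname.csv");
File.WriteAllLines(path, new[]
{
    "Age,Income,City,Purchased",
    "34,52000,Seattle,true",
    "27,41000,Austin,false",
    "45,88000,Boston,true",
    "52,67000,Denver,false",
    "39,73000,Chicago,true",
});

var mlContext = new MLContext();

// Identify the label by name; the separator is given explicitly here,
// but it could be left null so that AutoML tries to infer it.
ColumnInferenceResults inferred = mlContext.Auto().InferColumns(
    path, labelColumnName: "Purchased", separatorChar: ',');

// Use the inferred options to create a loader and read the file.
TextLoader loader = mlContext.Data.CreateTextLoader(inferred.TextLoaderOptions);
IDataView data = loader.Load(path);

// Print the resulting schema (column names and types).
foreach (DataViewSchema.Column column in data.Schema)
{
    Console.WriteLine($"{column.Name}: {column.Type}");
}
```

The resulting IDataView, together with inferred.ColumnInformation, is what an AutoML experiment would typically consume.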

InferColumns(String, UInt32, Boolean, Nullable<Char>, Nullable<Boolean>, Nullable<Boolean>, Boolean, Boolean)

Infers information about the columns of a dataset in a file located at path.

public Microsoft.ML.AutoML.ColumnInferenceResults InferColumns (string path, uint labelColumnIndex, bool hasHeader = false, char? separatorChar = default, bool? allowQuoting = default, bool? allowSparse = default, bool trimWhitespace = false, bool groupColumns = true);
member this.InferColumns : string * uint32 * bool * Nullable<char> * Nullable<bool> * Nullable<bool> * bool * bool -> Microsoft.ML.AutoML.ColumnInferenceResults
Public Function InferColumns (path As String, labelColumnIndex As UInteger, Optional hasHeader As Boolean = false, Optional separatorChar As Nullable(Of Char) = Nothing, Optional allowQuoting As Nullable(Of Boolean) = Nothing, Optional allowSparse As Nullable(Of Boolean) = Nothing, Optional trimWhitespace As Boolean = false, Optional groupColumns As Boolean = true) As ColumnInferenceResults

Parameters

path
String

Path to a dataset file.

labelColumnIndex
UInt32

Column index of the label column in the dataset.

hasHeader
Boolean

Whether the dataset file has a header row.

separatorChar
Nullable<Char>

The character used as separator between data elements in a row. If null, AutoML will try to infer this value.

allowQuoting
Nullable<Boolean>

Whether the file can contain columns defined by a quoted string. If null, AutoML will try to infer this value.

allowSparse
Nullable<Boolean>

Whether the file can contain numerical vectors in sparse format. If null, AutoML will try to infer this value.

trimWhitespace
Boolean

Whether trailing whitespace should be removed from dataset file lines.

groupColumns
Boolean

Whether to group together (when possible) original columns in the dataset file into vector columns in the resulting data structures. See TextLoader.Range for more information.

Returns

Information inferred about the columns in the provided dataset.

Remarks

Infers the name, data type, and purpose of each column. The returned ColumnInferenceResults exposes two properties: TextLoaderOptions and ColumnInformation. TextLoaderOptions can be used to instantiate a TextLoader, which can then be used to obtain an IDataView that can be fed into an AutoML experiment or used elsewhere in the ML.NET ecosystem (for example, in Fit(IDataView)). ColumnInformation contains the inferred purpose of each column in the dataset: for instance, whether the column is categorical, numeric, or text data, or whether it should be ignored. ColumnInformation can be inspected and modified (or kept as is) and then passed to an AutoML experiment.
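The sketch below shows how this overload might be used with a file that has no header row. It assumes the Microsoft.ML.AutoML package is referenced, and it assumes labelColumnIndex is zero-based (the text above does not say so explicitly). The file name and data are hypothetical. groupColumns is set to false so that each original column appears as its own entry in TextLoaderOptions.

```csharp
using System;
using System.IO;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.AutoML;
using Microsoft.ML.Data;

// Hypothetical sample data without a header row; the label is the last column.
string path = Path.Combine(Path.GetTempPath(), "infercolumns-labelindex.csv");
File.WriteAllLines(path, new[]
{
    "34,52000,Seattle,true",
    "27,41000,Austin,false",
    "45,88000,Boston,true",
    "52,67000,Denver,false",
    "39,73000,Chicago,true",
});

var mlContext = new MLContext();

// Assumption: index 3 refers to the fourth column (zero-based indexing).
ColumnInferenceResults byIndex = mlContext.Auto().InferColumns(
    path,
    labelColumnIndex: 3,
    hasHeader: false,
    separatorChar: ',',
    groupColumns: false);

Console.WriteLine($"Label column: {byIndex.ColumnInformation.LabelColumnName}");
Console.WriteLine($"Has header:   {byIndex.TextLoaderOptions.HasHeader}");

// List each inferred loader column with its data kind and source index range.
foreach (TextLoader.Column column in byIndex.TextLoaderOptions.Columns)
{
    string ranges = string.Join(",", column.Source.Select(r => $"{r.Min}-{r.Max}"));
    Console.WriteLine($"{column.Name} ({column.DataKind}) <- source columns {ranges}");
}
```

Running the same call with groupColumns left at its default of true is a quick way to see which columns would be merged into vector columns.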
