
TextLoader.Options Class


The settings for TextLoader.

public class TextLoader.Options
type TextLoader.Options = class
Public Class TextLoader.Options





Whether the input may include double-quoted values. This parameter is used to distinguish separator characters in an input value from actual separators. When true, separators within double quotes are treated as part of the input value. When false, all separators, even those within quotes, are treated as delimiting a new column.
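The difference between the two modes can be illustrated with a minimal sketch. This is not the loader's own parser: `QuotingSketch` and `SplitFields` are hypothetical names, and the code ignores escaped quotes inside quoted values. It only shows how a separator inside double quotes is treated in each mode.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, for illustration only; not part of ML.NET.
public static class QuotingSketch
{
    // Splits one line into fields. When allowQuoting is true, separators
    // inside double quotes stay part of the value and the quotes are dropped.
    // When false, quotes are ordinary characters and every separator splits.
    public static List<string> SplitFields(string line, char separator, bool allowQuoting)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in line)
        {
            if (allowQuoting && c == '"')
            {
                // Enter or leave a quoted value.
                inQuotes = !inQuotes;
            }
            else if (c == separator && !inQuotes)
            {
                // An actual separator: close the current field.
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString());
        return fields;
    }
}
```

With the line `a,"b,c",d` and `,` as the separator, the sketch returns three fields when quoting is allowed (`a`, `b,c`, `d`) and four fields when it is not (`a`, `"b`, `c"`, `d`).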


Whether the input may include sparse representations. For example, a row containing "5 2:6 4:3" means that there are 5 columns, and the only non-zero values are in columns 2 and 4, which have values 6 and 3, respectively. Column indices are zero-based, so columns 2 and 4 represent the 3rd and 5th columns. A column may also have dense values followed by sparse values represented in this fashion. For example, a row containing "1 2 5 2:6 4:3" represents two dense columns with values 1 and 2, followed by 5 sparsely represented columns with values 0, 0, 6, 0, and 3. The indices of the sparse columns start from 0, even though 0 represents the third column.

In addition, InputSize should be used when the number of sparse elements (5 in this example) is not present in each line. It should specify the total size, not just the size of the sparse part. However, indices of the sparse part are relative to where the sparse part begins. If InputSize is set to 7, the line "1 2 2:6 4:3" will be mapped to "1 2 0 0 6 0 3", but if set to 10, the same line will be mapped to "1 2 0 0 6 0 3 0 0 0".
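The mapping described above can be worked through in code. The following is a minimal sketch, not the loader's implementation: `SparseRowSketch` and `Expand` are hypothetical names, values are parsed as `float`, and tokens are assumed to be separated by spaces. When `inputSize` is null, the token just before the first `index:value` pair is read as the size of the sparse part.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only; not part of ML.NET.
public static class SparseRowSketch
{
    // Expands a line such as "1 2 5 2:6 4:3" into a dense row.
    // inputSize is the total column count (dense plus sparse part).
    public static float[] Expand(string line, int? inputSize = null)
    {
        string[] tokens = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Everything before the first "index:value" token is dense
        // (or, without inputSize, dense values followed by the sparse size).
        int firstSparse = Array.FindIndex(tokens, t => t.IndexOf(':') >= 0);
        if (firstSparse < 0)
            firstSparse = tokens.Length; // no sparse pairs: treat all tokens as dense

        int denseCount;
        int sparseSize;
        if (inputSize.HasValue)
        {
            denseCount = firstSparse;
            sparseSize = inputSize.Value - denseCount;
        }
        else if (firstSparse == tokens.Length)
        {
            denseCount = tokens.Length;
            sparseSize = 0;
        }
        else
        {
            if (firstSparse == 0)
                throw new FormatException("Sparse size is missing and no inputSize was given.");
            denseCount = firstSparse - 1;
            sparseSize = int.Parse(tokens[denseCount], CultureInfo.InvariantCulture);
        }

        // Unset entries of the sparse part stay at 0.
        var row = new float[denseCount + sparseSize];
        for (int i = 0; i < denseCount; i++)
            row[i] = float.Parse(tokens[i], CultureInfo.InvariantCulture);

        // Sparse indices are relative to where the sparse part begins.
        for (int i = firstSparse; i < tokens.Length; i++)
        {
            string[] pair = tokens[i].Split(':');
            int index = int.Parse(pair[0], CultureInfo.InvariantCulture);
            row[denseCount + index] = float.Parse(pair[1], CultureInfo.InvariantCulture);
        }

        return row;
    }
}
```

In this sketch, `Expand("1 2 5 2:6 4:3")` and `Expand("1 2 2:6 4:3", 7)` both produce the seven values 1, 2, 0, 0, 6, 0, 3, and `Expand("1 2 2:6 4:3", 10)` produces the same values followed by three more zeros.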


Specifies the input columns that should be mapped to IDataView columns.


The character that should be used as the decimal marker. Default value is '.'. Only '.' and ',' are allowed to be decimal markers.


Character to use to escape quotes inside quoted fields. It can't be a character used as a separator.


Whether the file has a header with feature names. When true, the loader will skip the first line when Load(IMultiStreamSource) is called. The sample can be used to infer slot name annotations if present.


File containing a header with feature names. If specified, the header defined in the data file is ignored regardless of HasHeader.


Number of source columns in the text data. By default, sparse rows are expected to contain their own size information.


Maximum number of rows to produce.


If true, missing real fields (i.e. double or single fields) will be loaded as NaN. If false, they'll be loaded as 0. Default is false. A field is considered "missing" if it's empty, if it only has whitespace, or if there are missing columns at the end of a given row.


If true, new line characters are acceptable inside a quoted field, and thus one field can have multiple lines of text inside it. If AllowQuoting is false, this option is ignored.


The characters that should be used as column separators.


Whether to remove trailing whitespace from lines.


Whether to use separate parsing threads.
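As a rough illustration of how these settings fit together, the sketch below builds a TextLoader.Options instance and loads a comma-separated file. It assumes a recent ML.NET release. The property names (`Separators`, `HasHeader`, `AllowQuoting`, `AllowSparse`, `TrimWhitespace`, `MissingRealsAsNaN`, `MaxRows`, `UseThreads`, `Columns`) are the ML.NET member names as I understand them and are not spelled out in the descriptions on this page. The wrapper class, the column layout (one label column followed by four feature columns), and the chosen values are hypothetical.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical wrapper, for illustration only.
public static class TextLoaderOptionsSketch
{
    public static IDataView LoadCsv(string path)
    {
        var mlContext = new MLContext();

        var options = new TextLoader.Options
        {
            Separators = new[] { ',' },   // column separators
            HasHeader = true,             // skip the first line on load
            AllowQuoting = true,          // separators inside double quotes stay in the value
            AllowSparse = false,          // rows are fully dense
            TrimWhitespace = true,        // drop trailing whitespace from lines
            MissingRealsAsNaN = true,     // empty float fields load as NaN instead of 0
            MaxRows = 1000,               // produce at most 1000 rows
            UseThreads = true,            // parse on separate threads
            Columns = new[]
            {
                // Assumed layout: column 0 is the label, columns 1 to 4 are features.
                new TextLoader.Column("Label", DataKind.Single, 0),
                new TextLoader.Column("Features", DataKind.Single, 1, 4),
            },
        };

        TextLoader loader = mlContext.Data.CreateTextLoader(options);
        return loader.Load(path);
    }
}
```

Settings that are not assigned keep their defaults, so in practice only the options that differ from the defaults need to be listed.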
