TextLoaderSaverCatalog.LoadFromTextFile Method
Definition
Overloads
| Overload | Description |
|---|---|
| LoadFromTextFile(DataOperationsCatalog, String, TextLoader+Options) | Loads an IDataView from a text file using TextLoader. Note that IDataView objects are lazy, so no actual loading happens here, just schema validation. |
| LoadFromTextFile(DataOperationsCatalog, String, TextLoader+Column[], Char, Boolean, Boolean, Boolean, Boolean) | Loads an IDataView from a text file using TextLoader. Note that IDataView objects are lazy, so no actual loading happens here, just schema validation. |
| LoadFromTextFile<TInput>(DataOperationsCatalog, String, TextLoader+Options) | Loads an IDataView from a text file using TextLoader. Note that IDataView objects are lazy, so no actual loading happens here, just schema validation. |
| LoadFromTextFile<TInput>(DataOperationsCatalog, String, Char, Boolean, Boolean, Boolean, Boolean) | Loads an IDataView from a text file using TextLoader. Note that IDataView objects are lazy, so no actual loading happens here, just schema validation. |
LoadFromTextFile(DataOperationsCatalog, String, TextLoader+Options)
Loads an IDataView from a text file using TextLoader. Note that IDataView objects are lazy, so no actual loading happens here, just schema validation.
public static Microsoft.ML.IDataView LoadFromTextFile (this Microsoft.ML.DataOperationsCatalog catalog, string path, Microsoft.ML.Data.TextLoader.Options options = default);
static member LoadFromTextFile : Microsoft.ML.DataOperationsCatalog * string * Microsoft.ML.Data.TextLoader.Options -> Microsoft.ML.IDataView
<Extension()>
Public Function LoadFromTextFile (catalog As DataOperationsCatalog, path As String, Optional options As TextLoader.Options = Nothing) As IDataView
Parameters
- catalog
- DataOperationsCatalog
The DataOperationsCatalog catalog.
- path
- String
Specifies a file or path of files from which to load.
- options
- TextLoader.Options
Defines the settings of the load operation.
Returns
The data view.
Examples
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.ML;

namespace Samples.Dynamic
{
    public static class SaveAndLoadFromText
    {
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for
            // exception tracking and logging, as a catalog of available operations
            // and as the source of randomness. Setting the seed to a fixed number
            // in this example to make outputs deterministic.
            var mlContext = new MLContext(seed: 0);

            // Create a list of training data points.
            var dataPoints = new List<DataPoint>()
            {
                new DataPoint(){ Label = 0, Features = 4},
                new DataPoint(){ Label = 0, Features = 5},
                new DataPoint(){ Label = 0, Features = 6},
                new DataPoint(){ Label = 1, Features = 8},
                new DataPoint(){ Label = 1, Features = 9},
            };

            // Convert the list of data points to an IDataView object, which is
            // consumable by ML.NET API.
            IDataView data = mlContext.Data.LoadFromEnumerable(dataPoints);

            // Create a FileStream object and write the IDataView to it as a text
            // file.
            using (FileStream stream = new FileStream("data.tsv", FileMode.Create))
                mlContext.Data.SaveAsText(data, stream);

            // Create an IDataView object by loading the text file.
            IDataView loadedData = mlContext.Data.LoadFromTextFile("data.tsv");

            // Inspect the data that is loaded from the previously saved text file.
            var loadedDataEnumerable = mlContext.Data
                .CreateEnumerable<DataPoint>(loadedData, reuseRowObject: false);

            foreach (DataPoint row in loadedDataEnumerable)
                Console.WriteLine($"{row.Label}, {row.Features}");

            // Preview of the loaded data.
            // 0, 4
            // 0, 5
            // 0, 6
            // 1, 8
            // 1, 9
        }

        // Example with label and feature values. A data set is a collection of such
        // examples.
        private class DataPoint
        {
            public float Label { get; set; }
            public float Features { get; set; }
        }
    }
}
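For a file that was not produced by SaveAsText, the load settings can be passed explicitly through TextLoader.Options. The following is a minimal sketch, not an official sample: the file name options-sample.csv, its contents, and the class name LoadWithOptionsSketch are assumptions made up for illustration.

```csharp
using System;
using System.IO;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class LoadWithOptionsSketch
{
    public static int Run()
    {
        // Hypothetical input file, written here only so the sketch is
        // self-contained: a header row followed by two numeric columns.
        File.WriteAllLines("options-sample.csv", new[]
        {
            "Label,Features",
            "0,4",
            "1,8",
        });

        var mlContext = new MLContext(seed: 0);

        // Describe the file layout: comma-separated, one header line,
        // and two single-precision columns at indices 0 and 1.
        var options = new TextLoader.Options
        {
            Separators = new[] { ',' },
            HasHeader = true,
            Columns = new[]
            {
                new TextLoader.Column("Label", DataKind.Single, 0),
                new TextLoader.Column("Features", DataKind.Single, 1),
            },
        };

        // Only the schema is established at this point; rows are read
        // lazily when the data view is enumerated.
        IDataView data = mlContext.Data.LoadFromTextFile(
            "options-sample.csv", options);

        // Print the name and type of each column in the schema.
        foreach (DataViewSchema.Column column in data.Schema)
            Console.WriteLine($"{column.Name}: {column.Type}");

        return data.Schema.Count;
    }
}
```

Calling LoadWithOptionsSketch.Run() returns the number of columns in the resulting schema.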
LoadFromTextFile(DataOperationsCatalog, String, TextLoader+Column[], Char, Boolean, Boolean, Boolean, Boolean)
Loads an IDataView from a text file using TextLoader. Note that IDataView objects are lazy, so no actual loading happens here, just schema validation.
public static Microsoft.ML.IDataView LoadFromTextFile (this Microsoft.ML.DataOperationsCatalog catalog, string path, Microsoft.ML.Data.TextLoader.Column[] columns, char separatorChar = '\t', bool hasHeader = false, bool allowQuoting = false, bool trimWhitespace = false, bool allowSparse = false);
static member LoadFromTextFile : Microsoft.ML.DataOperationsCatalog * string * Microsoft.ML.Data.TextLoader.Column[] * char * bool * bool * bool * bool -> Microsoft.ML.IDataView
<Extension()>
Public Function LoadFromTextFile (catalog As DataOperationsCatalog, path As String, columns As TextLoader.Column(), Optional separatorChar As Char = '\t', Optional hasHeader As Boolean = false, Optional allowQuoting As Boolean = false, Optional trimWhitespace As Boolean = false, Optional allowSparse As Boolean = false) As IDataView
Parameters
- catalog
- DataOperationsCatalog
The DataOperationsCatalog catalog.
- path
- String
The path to the file(s).
- columns
- TextLoader.Column[]
The columns of the schema.
- separatorChar
- Char
The character used as separator between data points in a row. By default the tab character is used as separator.
- hasHeader
- Boolean
Whether the file has a header. When true, the loader will skip the first line when Load(IMultiStreamSource) is called.
- allowQuoting
- Boolean
Whether the input may include double-quoted values. This parameter is used to distinguish separator characters in an input value from actual separators. When true, separators within double quotes are treated as part of the input value. When false, all separators, even those within quotes, are treated as delimiting a new column. It is also used to distinguish empty values from missing values. When true, missing values are denoted by consecutive separators and empty values by "". When false, empty values are denoted by consecutive separators and missing values by the default missing value for each type documented in DataKind.
- trimWhitespace
- Boolean
Whether to remove trailing whitespace from lines.
- allowSparse
- Boolean
Whether the input may include sparse representations. For example, a row containing "5 2:6 4:3" means that there are 5 columns, and the only non-zero values are in columns 2 and 4, which have values 6 and 3, respectively. Column indices are zero-based, so columns 2 and 4 represent the 3rd and 5th columns. A column may also have dense values followed by sparse values represented in this fashion. For example, a row containing "1 2 5 2:6 4:3" represents two dense columns with values 1 and 2, followed by 5 sparsely represented columns with values 0, 0, 6, 0, and 3. The indices of the sparse columns start from 0, even though index 0 here refers to the third column of the row.
Returns
The data view.
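To make the columns, separatorChar, hasHeader, and allowQuoting parameters concrete, here is a minimal sketch that loads a comma-separated file in which one value contains a comma inside double quotes. The file name columns-sample.csv, its contents, and the Person and LoadWithColumnsSketch types are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class LoadWithColumnsSketch
{
    // Hypothetical row type, used only to read the loaded values back.
    private class Person
    {
        public string Name { get; set; }
        public float Score { get; set; }
    }

    public static List<string> Run()
    {
        // Hypothetical input file. The first data row has a comma inside
        // a double-quoted value.
        File.WriteAllLines("columns-sample.csv", new[]
        {
            "Name,Score",
            "\"Smith, Jane\",3.5",
            "Lee,4",
        });

        var mlContext = new MLContext(seed: 0);

        // Declare the schema explicitly and describe the file layout.
        IDataView data = mlContext.Data.LoadFromTextFile(
            "columns-sample.csv",
            columns: new[]
            {
                new TextLoader.Column("Name", DataKind.String, 0),
                new TextLoader.Column("Score", DataKind.Single, 1),
            },
            separatorChar: ',',
            hasHeader: true,      // skip the "Name,Score" line
            allowQuoting: true);  // keep quoted commas inside the value

        // Enumerating the data view is what actually reads the rows.
        List<string> names = mlContext.Data
            .CreateEnumerable<Person>(data, reuseRowObject: false)
            .Select(person => person.Name)
            .ToList();

        foreach (string name in names)
            Console.WriteLine(name);

        return names;
    }
}
```

With allowQuoting set to false, the same first row would instead be split at the comma inside the quotes, as the parameter description states.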
LoadFromTextFile<TInput>(DataOperationsCatalog, String, TextLoader+Options)
Loads an IDataView from a text file using TextLoader. Note that IDataView objects are lazy, so no actual loading happens here, just schema validation.
public static Microsoft.ML.IDataView LoadFromTextFile<TInput> (this Microsoft.ML.DataOperationsCatalog catalog, string path, Microsoft.ML.Data.TextLoader.Options options);
static member LoadFromTextFile : Microsoft.ML.DataOperationsCatalog * string * Microsoft.ML.Data.TextLoader.Options -> Microsoft.ML.IDataView
<Extension()>
Public Function LoadFromTextFile(Of TInput) (catalog As DataOperationsCatalog, path As String, options As TextLoader.Options) As IDataView
Type Parameters
- TInput
Parameters
- catalog
- DataOperationsCatalog
The DataOperationsCatalog catalog.
- path
- String
Specifies a file or path of files from which to load.
- options
- TextLoader.Options
Defines the settings of the load operation. There is no need to specify the Columns field, as the columns will be inferred by this method.
Returns
The data view.
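As a sketch of how the generic overload can be used, the example below describes the columns on a TInput class with LoadColumn attributes and leaves Columns unset in the options. The HousingRow class, the file name housing-sample.csv, and the file contents are assumptions invented for this illustration.

```csharp
using System;
using System.IO;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input type. Each LoadColumn attribute gives the zero-based
// index of the file column that fills the property.
public class HousingRow
{
    [LoadColumn(0)]
    public float Size { get; set; }

    [LoadColumn(1)]
    public float Price { get; set; }
}

public static class LoadGenericWithOptionsSketch
{
    public static int Run()
    {
        // Hypothetical input file, written here only so the sketch is
        // self-contained.
        File.WriteAllLines("housing-sample.csv", new[]
        {
            "Size,Price",
            "1.1,1.2",
            "1.9,2.3",
        });

        var mlContext = new MLContext(seed: 0);

        // Columns is intentionally left unset; the schema comes from
        // the HousingRow type.
        var options = new TextLoader.Options
        {
            Separators = new[] { ',' },
            HasHeader = true,
        };

        IDataView data = mlContext.Data.LoadFromTextFile<HousingRow>(
            "housing-sample.csv", options);

        // Print the column names that were derived from HousingRow.
        foreach (DataViewSchema.Column column in data.Schema)
            Console.WriteLine(column.Name);

        return data.Schema.Count;
    }
}
```

Keeping the layout on the input type means the same class can also be passed to CreateEnumerable to read the rows back.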
LoadFromTextFile<TInput>(DataOperationsCatalog, String, Char, Boolean, Boolean, Boolean, Boolean)
Loads an IDataView from a text file using TextLoader. Note that IDataView objects are lazy, so no actual loading happens here, just schema validation.
public static Microsoft.ML.IDataView LoadFromTextFile<TInput> (this Microsoft.ML.DataOperationsCatalog catalog, string path, char separatorChar = '\t', bool hasHeader = false, bool allowQuoting = false, bool trimWhitespace = false, bool allowSparse = false);
static member LoadFromTextFile : Microsoft.ML.DataOperationsCatalog * string * char * bool * bool * bool * bool -> Microsoft.ML.IDataView
<Extension()>
Public Function LoadFromTextFile(Of TInput) (catalog As DataOperationsCatalog, path As String, Optional separatorChar As Char = '\t', Optional hasHeader As Boolean = false, Optional allowQuoting As Boolean = false, Optional trimWhitespace As Boolean = false, Optional allowSparse As Boolean = false) As IDataView
Type Parameters
- TInput
Parameters
- catalog
- DataOperationsCatalog
The DataOperationsCatalog catalog.
- path
- String
The path to the file(s).
- separatorChar
- Char
Column separator character. Default is '\t'.
- hasHeader
- Boolean
Whether the file has a header. When true, the loader will skip the first line when Load(IMultiStreamSource) is called.
- allowQuoting
- Boolean
Whether the input may include double-quoted values. This parameter is used to distinguish separator characters in an input value from actual separators. When true, separators within double quotes are treated as part of the input value. When false, all separators, even those within quotes, are treated as delimiting a new column. It is also used to distinguish empty values from missing values. When true, missing values are denoted by consecutive separators and empty values by "". When false, empty values are denoted by consecutive separators and missing values by the default missing value for each type documented in DataKind.
- trimWhitespace
- Boolean
Whether to remove trailing whitespace from lines.
- allowSparse
- Boolean
Whether the input may include sparse representations. For example, a row containing "5 2:6 4:3" means that there are 5 columns, and the only non-zero values are in columns 2 and 4, which have values 6 and 3, respectively. Column indices are zero-based, so columns 2 and 4 represent the 3rd and 5th columns. A column may also have dense values followed by sparse values represented in this fashion. For example, a row containing "1 2 5 2:6 4:3" represents two dense columns with values 1 and 2, followed by 5 sparsely represented columns with values 0, 0, 6, 0, and 3. The indices of the sparse columns start from 0, even though index 0 here refers to the third column of the row.
Returns
The data view.
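The sparse row format described for allowSparse can be worked through by hand. The following stand-alone sketch expands a space-separated sparse row into its dense values, following only the two examples in the parameter description. It is not ML.NET's parser: the SparseRowSketch class and its Expand method are hypothetical, and the sketch assumes the token just before the first index:value pair is the sparse column count.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class SparseRowSketch
{
    // Expands a row such as "1 2 5 2:6 4:3" into dense values.
    // Illustration of the documented format only, not ML.NET code.
    public static float[] Expand(string row)
    {
        string[] tokens = row.Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Position of the first "index:value" token, or -1 if none.
        int firstPair = Array.FindIndex(tokens, t => t.IndexOf(':') >= 0);

        // No sparse part: treat every token as a dense value.
        if (firstPair < 0)
            return tokens.Select(ParseValue).ToArray();

        if (firstPair == 0)
            throw new FormatException("Missing sparse column count.");

        // Assumption: the token before the first pair is the number of
        // sparse columns; everything before that token is dense.
        int denseCount = firstPair - 1;
        int sparseCount = int.Parse(
            tokens[denseCount], CultureInfo.InvariantCulture);

        // Sparse columns that are not listed keep the default value 0.
        var values = new float[denseCount + sparseCount];
        for (int i = 0; i < denseCount; i++)
            values[i] = ParseValue(tokens[i]);

        for (int i = firstPair; i < tokens.Length; i++)
        {
            string[] parts = tokens[i].Split(':');
            int index = int.Parse(parts[0], CultureInfo.InvariantCulture);
            if (index < 0 || index >= sparseCount)
                throw new FormatException("Sparse index out of range.");

            // Sparse indices restart at 0 after the dense columns.
            values[denseCount + index] = ParseValue(parts[1]);
        }

        return values;
    }

    private static float ParseValue(string text) =>
        float.Parse(text, CultureInfo.InvariantCulture);
}
```

For instance, SparseRowSketch.Expand("5 2:6 4:3") yields 0, 0, 6, 0, 3, and SparseRowSketch.Expand("1 2 5 2:6 4:3") yields 1, 2, 0, 0, 6, 0, 3, matching the two rows in the description.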