Convert to ARFF

Important

Support for Machine Learning Studio (classic) will end on 31 August 2024. We recommend you transition to Azure Machine Learning by that date.

Beginning 1 December 2021, you will not be able to create new Machine Learning Studio (classic) resources. Through 31 August 2024, you can continue to use the existing Machine Learning Studio (classic) resources.

ML Studio (classic) documentation is being retired and may not be updated in the future.

Converts data input to the Attribute-Relation File Format (ARFF) used by the Weka toolset.

Category: Data Format Conversions

Note

Applies to: Machine Learning Studio (classic) only

Similar drag-and-drop modules are available in Azure Machine Learning designer.

Module overview

This article describes how to use the Convert to ARFF module in Machine Learning Studio (classic) to convert datasets and results to the attribute-relation file format used by the Weka toolset. This format is known as ARFF.

The ARFF data specification for Weka supports multiple machine learning tasks, including data preprocessing, classification, and feature selection. In this format, data is organized by entities and their attributes, and is contained in a single text file. You can find details of the Weka file format in the Technical notes section.

In general, conversion to the Weka file format is required only if you want to use both Machine Learning and Weka, and intend to move your training data back and forth between them.

For more information about the Weka toolset, see this Wikipedia article: Weka (machine learning)

Warning

You cannot overwrite an existing ARFF file in Azure Storage.

How to use Convert to ARFF

  1. Add the Convert to ARFF module to your experiment. You can find this module in the Data Format Conversions category in Machine Learning Studio (classic).

  2. Connect it to any module that outputs a dataset.

  3. Run the experiment, or select the Convert to ARFF module and click Run selected.

Results

  • To create a copy of the data in a local folder, double-click the output of Convert to ARFF, and select the Download option.

    If you do not specify a folder, a default file name is applied and the file is saved in the local Downloads library.

Note

This module does not support export to Python or R code.

Examples

There are no examples specific to this format in the Azure AI Gallery. However, these experiments demonstrate other types of format conversion:

Technical notes

This section contains implementation details, tips, and answers to frequently asked questions.

Example of ARFF format

This section provides an example of how a typical dataset would look when converted to ARFF.

Typically, an ARFF data file consists of two sections: a header that defines the data source and schema, and a data section that contains the actual entities and their attributes.
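The two-section layout can be illustrated with a minimal Python sketch that splits ARFF text at the @DATA marker. The function name split_arff and the sample text are hypothetical and are not part of Machine Learning Studio (classic) or Weka. The sketch assumes a simple file in which comment lines start with % and the marker appears on its own line.

```python
def split_arff(text):
    """Split ARFF text into (header_lines, data_lines) at the @DATA marker."""
    header, data = [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines in both sections
        if not in_data and line.upper().startswith("@DATA"):
            in_data = True  # everything after this marker is data
            continue
        if in_data and line.startswith("%"):
            continue  # skip comment lines inside the data section
        (data if in_data else header).append(line)
    return header, data


# Hypothetical sample, shortened from the Iris example in this article.
sample = """% Source: Iris dataset, UCI
@RELATION iris
@ATTRIBUTE sepal_length NUMERIC
@ATTRIBUTE class {0, 1}
@DATA
5.1,0
"""

header, data = split_arff(sample)
print(len(header), "header lines;", len(data), "data rows")
```

Comment lines are kept in the header list because they often describe the data source.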

ARFF header

The header for an ARFF file defines the list of the attributes (in columns) and their data types. The header can also contain multiple comment lines that describe the data source or any other notes.

% Source: Iris dataset, UCI
% 0 = Iris-setosa, 1 = Iris-virginica
@RELATION iris
@ATTRIBUTE sepal_length NUMERIC
@ATTRIBUTE sepal_width NUMERIC
@ATTRIBUTE petal_length NUMERIC
@ATTRIBUTE petal_width NUMERIC
@ATTRIBUTE class {0, 1}
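To show how the attribute declarations map to a schema, the following Python sketch extracts attribute names and types from header lines. The names ATTRIBUTE_RE and parse_attributes are hypothetical. The sketch handles only unquoted attribute names with simple or nominal types, so it is an illustration under those assumptions rather than a complete ARFF parser.

```python
import re

# Matches lines such as "@ATTRIBUTE sepal_length NUMERIC" (case-insensitive).
# Assumption: attribute names are unquoted and contain no spaces.
ATTRIBUTE_RE = re.compile(r"^@ATTRIBUTE\s+(\S+)\s+(.+)$", re.IGNORECASE)


def parse_attributes(header_lines):
    """Return a list of (name, type) pairs; nominal types become lists of values."""
    attributes = []
    for line in header_lines:
        match = ATTRIBUTE_RE.match(line.strip())
        if not match:
            continue  # comments and @RELATION lines are skipped
        name, type_spec = match.group(1), match.group(2).strip()
        if type_spec.startswith("{") and type_spec.endswith("}"):
            # Nominal attribute: split the brace-enclosed value list.
            values = [v.strip() for v in type_spec[1:-1].split(",")]
            attributes.append((name, values))
        else:
            attributes.append((name, type_spec.upper()))
    return attributes


header_lines = [
    "% Source: Iris dataset, UCI",
    "@RELATION iris",
    "@ATTRIBUTE sepal_length NUMERIC",
    "@ATTRIBUTE class {0, 1}",
]
attributes = parse_attributes(header_lines)
print(attributes)
```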

Tip

If the dataset you are converting does not have column names, use the Edit Metadata module to add column names before converting to ARFF.

ARFF data

The data section consists of comma-separated values, and looks very much like a CSV file without column headings.

@DATA
5.1,3.5,1.4,0.2,0

For additional information about this file format, see the Weka Wiki page: ARFF (developer version).
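As a rough illustration of how the header and data sections fit together, the following Python sketch builds ARFF text from a small in-memory table. The function to_arff is hypothetical and does not reproduce how the Convert to ARFF module works internally. It assumes unquoted attribute names and values that contain no commas or spaces, and it writes missing values as ?, the ARFF marker for a missing value.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text from a relation name, (name, type) pairs, and data rows.

    A type given as a list or tuple is written as a nominal value set, e.g. {0, 1}.
    """
    lines = [f"@RELATION {relation}"]
    for name, type_spec in attributes:
        if isinstance(type_spec, (list, tuple)):
            type_spec = "{" + ", ".join(str(v) for v in type_spec) + "}"
        lines.append(f"@ATTRIBUTE {name} {type_spec}")
    lines.append("@DATA")
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("row length does not match attribute count")
        # None is written as "?", the ARFF missing-value marker.
        lines.append(",".join("?" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"


arff_text = to_arff(
    "iris",
    [
        ("sepal_length", "NUMERIC"),
        ("sepal_width", "NUMERIC"),
        ("petal_length", "NUMERIC"),
        ("petal_width", "NUMERIC"),
        ("class", [0, 1]),
    ],
    [(5.1, 3.5, 1.4, 0.2, 0)],
)
print(arff_text)
```

Values that contain commas, spaces, or quotes would need quoting, which this sketch omits.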

Current ARFF version

Machine Learning Studio (classic) saves ARFF files by using the ARFF 3.0 format.

Expected inputs

Name | Type | Description
Dataset | Data Table | Input dataset

Outputs

Name | Type | Description
Results dataset | Arff | Output dataset

See also

Data Format Conversions
A-Z Module List