rxSetVarInfo: Set Variable Information for .xdf File or Data Frame

Description

Set the variable information for an .xdf file, including variable names, descriptions, and value labels, or set attributes for variables in a data frame.

Usage

  rxSetVarInfo(varInfo, data)
  rxSetVarInfoXdf(varInfo, file)

Arguments

varInfo

list containing lists of variable information for variables in the XDF data source or data frame.

data

a data frame, a character string specifying the .xdf file, or an RxXdfData object.

file

character string specifying the .xdf file or an RxXdfData object.

Details

The list for each variable contained in varInfo can have the following elements:

* position: integer indicating the position of the variable in the data set. If present, it is used instead of the list name to identify the variable.

* newName: character string giving a new name for the variable.

* description: character string giving a description of the variable.

* low: numeric value specifying the low value used in on-the-fly factor conversions with the F() transformation. Ignored for data frames.

* high: numeric value specifying the high value used in on-the-fly factor conversions with the F() transformation. Ignored for data frames.

* levels: character vector of factor levels. This will re-label an existing factor, but not change the underlying values.

* valueInfoCodes: character vector of value codes for informational purposes only.

* valueInfoLabels: character vector of value labels the same length as valueInfoCodes, used for informational purposes only.

* tzone: character string giving the tzone (time zone) attribute for a POSIXct variable.
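
As a minimal sketch of the structure described above, the following builds a varInfo list using only base R. The element names are those listed above; the variable names (y, timestamp) and all values are hypothetical and chosen only for illustration. Passing the list to rxSetVarInfo requires RevoScaleR, so that call is shown only as a comment.

```r
# Hypothetical varInfo list: variable names and values are illustrative only
varInfo <- list(
  # Identified by list name: rename and describe a variable called "y"
  y = list(newName = "Rainfall",
           description = "Rainfall per Month"),
  # Unnamed entry: identified by position instead of by list name
  list(position = 2,
       description = "Month of Year",
       levels = month.abb),          # base R constant: "Jan", ..., "Dec"
  # Set the tzone attribute of a (hypothetical) POSIXct variable
  timestamp = list(tzone = "UTC")
)

# Inspect the nested structure
str(varInfo)

# With RevoScaleR loaded, the list would then be applied as in Usage:
# rxSetVarInfo(varInfo, data)
```

Each top-level entry describes one variable; an entry with a position element does not need a list name.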

Value

If the input data is a data frame, a data frame is returned containing variables with the new attributes. If the input data represents an .xdf file or composite file, an RxXdfData object representing the modified file is returned.

Note

rxSetVarInfo and rxSetVarInfoXdf do not change the underlying data, so they cannot be used to set the variable type or storage type, or to recode the levels of a factor variable. To recode factors, use rxFactors. rxSetVarInfo is not supported for single .xdf files in HDFS.
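
The distinction between changing metadata and changing data can be illustrated with a base R analogy. This sketch does not call RevoScaleR and is not a description of its internals; it only shows that relabeling factor levels, or resetting a tzone attribute, leaves the stored values untouched. All object names are hypothetical.

```r
# Relabeling a factor changes the labels, not the stored integer codes
month <- factor(c(0, 1, 2, 1, 0))
codesBefore <- as.integer(month)
levels(month) <- c("Jan", "Feb", "Mar")
codesAfter <- as.integer(month)
identical(codesBefore, codesAfter)   # the codes are unchanged

# Resetting tzone changes how a POSIXct value is displayed,
# not the underlying number of seconds since the epoch
stamp <- as.POSIXct("2020-01-01 12:00:00", tz = "UTC")
secondsBefore <- as.numeric(stamp)
attr(stamp, "tzone") <- "America/New_York"
secondsAfter <- as.numeric(stamp)
identical(secondsBefore, secondsAfter)
```

Recoding, by contrast, would change which observations map to which level, which is why it needs rxFactors rather than rxSetVarInfo.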

See Also

rxDataStep, rxFactors

Examples


 # Rename the factor levels for a variable

 # Create a sample data set
 # We will change the names of the factor levels, after the fact
 set.seed(100)
 myData1 <- data.frame(y = rnorm(30),
                       month = factor(c(0:11, as.integer(runif(18, 0, 12)))))

 # Plot the data
 rxLinePlot(y ~ month, type = "p", data = myData1)

 # Set the labels for the month variable
 monthLabels <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep",
                  "Oct", "Nov", "Dec")

 # Get the variable information for the data frame
 varInfo <- rxGetVarInfo(myData1)

 # Reset the names of the factor levels
 varInfo$month$levels <- monthLabels
 myData2 <- rxSetVarInfo(varInfo, myData1)

 # Plot the new data
 rxLinePlot(y ~ month, type = "p", data = myData2)

 # Redo, this time putting the data in an .xdf file
 # Specify name of output file
 tempFile <- file.path(tempdir(), "rxTempTestFile.xdf")

 # Convert the original data to an XDF file
 rxDataStep(myData1, outFile = tempFile, overwrite = TRUE)

 # Get the varInfo of the new XDF file
 varInfo <- rxGetVarInfo(tempFile)

 varInfo$month$levels <- monthLabels

 # Write the modified varInfo back to the file
 rxSetVarInfo(varInfo, tempFile)

 # Read the data and plot it, showing the new factor levels
 myData2 <- rxDataStep(inData = tempFile)
 rxLinePlot(y ~ month, type = "p", data = myData2)

 ## Change variable names and add descriptions
 varInfo <- list(y = list(newName = "Rainfall",
                          description = "Rainfall per Month"),
                 list(position = 2, description = "Month of Year"))
 rxSetVarInfo(varInfo, tempFile)
 rxGetVarInfo(tempFile)
 file.remove(tempFile)