rxOpen-methods: Managing RevoScaleR Data Source Objects
Description
These functions manage RevoScaleR data source objects.
Usage
rxOpen(src, mode = "r")
rxClose(src, mode = "r")
rxIsOpen(src, mode = "r")
rxReadNext(src)
rxWriteNext(from, to, ...)
Arguments
from
data frame object.
src
RxDataSource object.
to
RxDataSource object.
mode
character string specifying the mode (r
or w
) to open the file.
...
any other arguments to be passed on.
Value
For rxOpen
and rxClose
, a logical indicating whether the operation
was successful.
For rxIsOpen
, a logical indicating whether or not the RxDataSource is
open for the specified mode
.
For rxReadNext
, either a data frame or a list depending upon the value of
the returnDataFrame
property within src
.
Author(s)
Microsoft Corporation Microsoft Technical Support
See Also
Examples
ds <- RxXdfData(file.path(rxGetOption("sampleDataDir"), "claims.xdf"))
# ds contains only one block of data
rxOpen(ds) # must open the file before rxReadNext
rxReadNext(ds) # get the first block
rxReadNext(ds)
rxClose(ds)
# Use a data source to compute means by processing the data in chunks
# Data processing functions: for each chunk, compute sums of columns and
# number of rows, then update the results computed from previous chunks
processData <- function(dframe)
list(sumCols = colSums(dframe), numRows = nrow(dframe))
updateResults <- function(x, y)
list(sumCols = x$sumCols + y$sumCols, numRows = x$numRows + y$numRows)
# Create data source
censusWorkers <- file.path(rxGetOption("sampleDataDir"), "CensusWorkers.xdf")
ds <- RxXdfData(censusWorkers, varsToKeep = c("age", "incwage"),
blocksPerRead = 2)
# Process data and update results
rxOpen(ds)
resList <- processData(rxReadNext(ds))
while(TRUE)
{
df <- rxReadNext(ds)
if (nrow(df) == 0)
break
resList <- updateResults(resList, processData(df))
}
rxClose(ds)
# Compute the means of the variables from the accumulated results
varMeans <- resList$sumCols / resList$numRows
varMeans