| DataFrame-class {IRanges} | R Documentation |
The DataFrame class extends the DataTable virtual
class and supports the storage of any type of object (with length
and [ methods) as columns.
On the whole, the DataFrame behaves very similarly to
data.frame, in terms of construction, subsetting, splitting,
combining, etc. The most notable exception is that the row names are
optional. This means calling rownames(x) will return
NULL if there are no row names. It could instead return
seq_len(nrow(x)), but returning NULL signals to, for
example, combination functions that no row names are desired (row names are
often a luxury when dealing with large data).
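The optional row names described above can be illustrated with a minimal sketch. It assumes the IRanges package is installed and attached; the column names and values are hypothetical and chosen only for illustration.

```r
library(IRanges)  # assumed to be installed; makes DataFrame available

# Hypothetical data: no row names are supplied
x <- DataFrame(id = 1:3, label = c("a", "b", "c"))
rownames(x)   # NULL, because row names are optional

# Row names are only present when given explicitly via row.names
y <- DataFrame(id = 1:3, row.names = c("r1", "r2", "r3"))
rownames(y)   # the supplied row names
```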
As DataFrame derives from Sequence, it is
possible to set an annotation string. Also, another
DataFrame can hold metadata on the columns.
In the following code snippets, x is a DataFrame.
dim(x):
Get the length-two integer vector whose first and second elements
give the number of rows and columns, respectively.
dimnames(x), dimnames(x) <- value:
Get and set the two element list containing the row names
(character vector of length nrow(x) or NULL)
and the column names (character vector of length ncol(x)).
In the following code snippets, x is a DataFrame.
x[i,j,drop]: Behaves very similarly to the
[.data.frame method, except i can be a
logical Rle object and subsetting by matrix indices
is not supported. Indices containing NAs are also not
supported.
x[i,j] <- value: Behaves very similarly to the
[<-.data.frame method.
x[[i]]: Behaves very similarly to the
[[.data.frame method, except arguments j
and exact are not supported. Column name matching is
always exact. Subsetting by matrices is not supported.
x[[i]] <- value: Behaves very similarly to the
[[<-.data.frame method, except argument j
is not supported.
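As a rough sketch of the row subsetting described above, the following uses a logical Rle as the row index. It assumes the IRanges package is installed; the data and the variable names (x, keep, sub) are hypothetical.

```r
library(IRanges)  # assumed to be installed

# Hypothetical three-row DataFrame
x <- DataFrame(score = c(1L, 3L, 5L), counts = c(10L, 2L, 7L))

# Logical Rle encoding TRUE, TRUE, FALSE as runs
keep <- Rle(c(TRUE, FALSE), c(2L, 1L))

sub <- x[keep, ]   # keeps the first two rows
x[["score"]]       # column extraction by its exact name
```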
DataFrame(..., row.names = NULL):
Constructs a DataFrame in similar fashion to
data.frame. Each argument in ... is coerced to
a DataFrame and combined column-wise. No special effort is
expended to automatically determine the row names from the
arguments. The row names should be given in
row.names; otherwise, there are no row names. This is by
design, as row names are normally undesirable when data is large.
In the following code snippets, x is a DataFrame.
split(x, f, drop = FALSE):
Splits x into a CompressedSplitDataFrameList,
according to f, dropping elements corresponding to
unrepresented levels if drop is TRUE.
rbind(...): Creates a new DataFrame by
combining the rows of the DataFrame objects in
.... Very similar to rbind.data.frame, except
in the handling of row names. If all elements have row names, they
are concatenated and made unique. Otherwise, the result does not
have row names. Currently, factors are not handled well (their
levels are dropped). This is not a high priority until there is an
XFactor class.
cbind(...): Creates a new DataFrame by
combining the columns of the DataFrame objects in
.... Very similar to cbind.data.frame, except
row names, if any, are dropped. Consider the DataFrame constructor
as an alternative that allows one to specify row names.
In the following code snippets, data is a DataFrame.
aggregate(x, data, FUN, ..., subset, na.action = na.omit):
Aggregates the DataFrame data according to the
formula x and the aggregating
function FUN. See aggregate and its method
for formula.
as(from, "DataFrame"):
By default, constructs a new DataFrame with from as
its only column. If from is a matrix or
data.frame, all of its columns become columns in the new
DataFrame. If from is a list, each element becomes a
column, recycling as necessary. Note that for the DataFrame
to behave correctly, each column object must support element-wise
subsetting via the [ method and return the number of elements with
length. It is recommended to use the DataFrame
constructor, rather than this interface.
as.list(x): Coerces x, a DataFrame,
to a list.
as.data.frame(x, row.names=NULL, optional=FALSE):
Coerces x, a DataFrame, to a data.frame.
Each column is coerced to a data.frame, and the results are then
bound together column-wise. If row.names is NULL, the row names are
taken from x, if it has any; otherwise, they are
inferred by the data.frame constructor.
NOTE: conversion of x to a data.frame is not
supported if x contains any list, SimpleList,
or CompressedList columns.
as(from, "data.frame"): Coerces a DataFrame
to a data.frame by calling as.data.frame(from).
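The coercions listed above might be exercised as in the sketch below. It assumes the IRanges package is installed; the input objects and the variable names are hypothetical, and integer columns are used to keep the round trip simple.

```r
library(IRanges)  # assumed to be installed

# A list: each element becomes a column
lst <- list(a = 1:3, b = 4:6)
from_list <- as(lst, "DataFrame")

# A data.frame: all of its columns become columns of the DataFrame
from_df <- as(data.frame(a = 1:3, b = 4:6), "DataFrame")

# Back to a base data.frame
back <- as.data.frame(from_df)
```

As the text notes, the DataFrame constructor is the recommended interface; as() is shown here only to illustrate the documented behavior.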
Michael Lawrence
See also: DataTable, Sequence, and
RangedData, which makes heavy use of this class.
score <- c(1L, 3L, NA)
counts <- c(10L, 2L, NA)
row.names <- c("one", "two", "three")
df <- DataFrame(score) # single column
df[["score"]]
df <- DataFrame(score, row.names = row.names) # with row names
rownames(df)
df <- DataFrame(vals = score) # explicit naming
df[["vals"]]
# a data.frame
sw <- DataFrame(swiss)
as.data.frame(sw) # swiss, without row names
# now with row names
sw <- DataFrame(swiss, row.names = rownames(swiss))
as.data.frame(sw) # swiss
# subsetting
sw[] # identity subset
sw[,] # same
sw[NULL] # no columns
sw[,NULL] # no columns
sw[NULL,] # no rows
## select columns
sw[1:3]
sw[,1:3] # same as above
sw[,"Fertility"]
sw[,c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)]
## select rows and columns
sw[4:5, 1:3]
sw[1] # one-column DataFrame
## the same
sw[, 1, drop = FALSE]
sw[, 1] # an (unnamed) vector
sw[[1]] # the same
sw[["Fertility"]]
sw[["Fert"]] # should return 'NULL'
sw[1,] # a one-row DataFrame
sw[1,, drop=TRUE] # a list
## duplicate row, unique row names are created
sw[c(1, 1:2),]
## indexing by row names
sw["Courtelary",]
subsw <- sw[1:5,1:4]
subsw["C",] # partially matches
## row and column names
cn <- paste("X", seq_len(ncol(swiss)), sep = ".")
colnames(sw) <- cn
colnames(sw)
rn <- seq_len(nrow(sw))
rownames(sw) <- rn
rownames(sw)
## column replacement
df[["counts"]] <- counts
df[["counts"]]
df[[3]] <- score
df[["X"]]
df[[3]] <- NULL # deletion
## split
sw <- DataFrame(swiss)
swsplit <- split(sw, sw[["Education"]])
## rbind
do.call(rbind, as.list(swsplit))
## cbind
cbind(DataFrame(score), DataFrame(counts))