mwhich               package:bigmemory               R Documentation

_E_x_p_a_n_d_e_d "_w_h_i_c_h"-_l_i_k_e _f_u_n_c_t_i_o_n_a_l_i_t_y.

_D_e_s_c_r_i_p_t_i_o_n:

     Implements 'which'-like functionality for a 'big.matrix', with
     additional options for efficient comparisons executed in C++
     rather than R. It also works on regular numeric matrices, with no
     extra memory overhead.

_U_s_a_g_e:

     mwhich(x, cols, vals, comps, op = 'AND')

_A_r_g_u_m_e_n_t_s:

       x: a 'big.matrix' (or a numeric matrix; see below).

    cols: a vector of column indices or names.

    vals: a list (one component for each of 'cols') of vectors of
          length 1 or 2; length 1 is used to test equality (or
          inequality), while vectors of length 2 are used for checking
          values in the range ('-Inf' and 'Inf' are allowed). If a
          scalar or vector of length 2 is provided instead of a list,
          it will be replicated 'length(cols)' times.

   comps: a list of operators (one component for each of 'cols'),
          chosen from ''eq'', ''neq'', ''le'', ''lt'', ''ge'' and
          ''gt''. If a single operator is provided, it will be
          replicated 'length(cols)' times.

      op: the logical operator for combining the results of the
          individual tests, either ''AND'' (the default) or ''OR''.
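
     The replication rules for 'vals' and 'comps' described above can
     be illustrated with a minimal sketch. It assumes the bigmemory
     package is installed; the matrix 'y' and the names 'short',
     'long' and 'base_r' are illustrative only, and the base-R 'which'
     call is included purely for comparison.

```r
library(bigmemory)

# A small regular numeric matrix: column 1 holds 1..10, column 2 holds 11..20.
y <- matrix(as.numeric(1:30), 10, 3)

# A scalar 'vals' and a single 'comps' operator are replicated for both columns.
short <- mwhich(y, 1:2, 15, 'le', 'AND')

# The same test written out with one list component per column.
long <- mwhich(y, 1:2, list(15, 15), list('le', 'le'), 'AND')

# Base-R equivalent, shown only for comparison.
base_r <- which(y[, 1] <= 15 & y[, 2] <= 15)

print(short)
print(long)
print(base_r)
```

     All three calls should select the same rows; the explicit list
     form is only needed when the columns require different values or
     operators.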

_D_e_t_a_i_l_s:

     To improve performance and avoid creating massive vectors in R
     when doing comparisons, 'mwhich()' executes column-by-column
     comparisons against the specified values or ranges, combines the
     per-column results using the 'op' operator, and returns the
     indices of the rows that satisfy the combined test.  More advanced
     comparisons are then possible (and memory-efficient) in R by
     applying set operations ('union' and 'intersect', for example) to
     the results of multiple 'mwhich()' calls.
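
     As a hedged sketch of the set-operation approach, the query
     "(A >= 2 AND B <= 15) OR C == 30" mixes both operators, so it is
     split into two 'mwhich()' calls whose results are merged with
     'union'. The bigmemory package is assumed to be installed, and
     the names 'both', 'last' and 'rows' are illustrative.

```r
library(bigmemory)

# Columns: A = 1..10, B = 11..20, C = 21..30.
x <- as.big.matrix(matrix(1:30, 10, 3))
colnames(x) <- c("A", "B", "C")

# First call: A >= 2 AND B <= 15.
both <- mwhich(x, c("A", "B"), list(2, 15), list('ge', 'le'), 'AND')

# Second call: C == 30.
last <- mwhich(x, "C", 30, 'eq')

# Merge the two index vectors in R; only row indices are held in memory.
rows <- sort(union(both, last))
print(rows)
x[rows, ]
```

     Replacing 'union' with 'intersect' would instead require both
     sub-queries to hold for a row.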

     Note that 'NA' is a valid value in conjunction with ''eq'' or
     ''neq'', replacing traditional 'is.na()' calls.  Both '-Inf' and
     'Inf' can be used as range endpoints for one-sided inequalities.
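
     A minimal sketch of both points, assuming the bigmemory package
     is installed and using a small double-valued 'big.matrix'; the
     names 'na_rows' and 'high_rows' are illustrative.

```r
library(bigmemory)

# Columns: 1..5 and 6..10, stored as doubles.
x <- as.big.matrix(matrix(as.numeric(1:10), 5, 2))
x[2, 1] <- NA

# Rows where column 1 is NA, in place of which(is.na(...)) in base R.
na_rows <- mwhich(x, 1, NA, 'eq')

# One-sided inequality written as a range: column 2 in [8, Inf].
high_rows <- mwhich(x, 2, list(c(8, Inf)), list(c('ge', 'le')))

print(na_rows)
print(high_rows)
```

     Using '-Inf' as the lower endpoint gives the mirror-image test
     with no lower bound.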

     If 'mwhich()' is used with a regular numeric 'matrix', we access
     the data directly, so there is no memory overhead.  Interested
     developers might want to look at our code for this case, which
     uses a handy pointer trick in C++.

_V_a_l_u_e:

     a vector of row indices satisfying the criteria.

_A_u_t_h_o_r(_s):

     John W. Emerson and Michael J. Kane

_S_e_e _A_l_s_o:

     'big.matrix', 'which'

_E_x_a_m_p_l_e_s:

     x <- as.big.matrix(matrix(1:30, 10, 3))
     colnames(x) <- c("A", "B", "C")
     x[,]
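     # Column 1 in [2, 3] or column 2 strictly between 11 and 17:
     x[mwhich(x, 1:2, list(c(2,3), c(11,17)),
                        list(c('ge','le'), c('gt', 'lt')), 'OR'),]

     # The same tests, selected by column name and combined with 'AND':
     x[mwhich(x, c("A","B"), list(c(2,3), c(11,17)),
                        list(c('ge','le'), c('gt', 'lt')), 'AND'),]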

     # These should produce the same answer with a regular matrix:
     y <- matrix(1:30, 10, 3)
     y[mwhich(y, 1:2, list(c(2,3), c(11,17)),
                        list(c('ge','le'), c('gt', 'lt')), 'OR'),]

     # The negative index -3 excludes column 3, leaving columns 1 and 2:
     y[mwhich(y, -3, list(c(2,3), c(11,17)),
                        list(c('ge','le'), c('gt', 'lt')), 'AND'),]

     x[1,1] <- NA
     # Rows with an NA in column 1 or column 2:
     mwhich(x, 1:2, NA, 'eq', 'OR')
     # Rows with no NA in either column 1 or column 2:
     mwhich(x, 1:2, NA, 'neq', 'AND')

     # Column 1 equal to 4 and/or column 2 less than or equal to 16:
     mwhich(x, 1:2, list(4, 16), list('eq', 'le'), 'OR')
     mwhich(x, 1:2, list(4, 16), list('eq', 'le'), 'AND')

     # Column 2 less than or equal to 15:
     mwhich(x, 2, 15, 'le')

     # No NAs in either column, and column 2 strictly less than 15:
     mwhich(x, c(1:2,2), list(NA, NA, 15), list('neq', 'neq', 'lt'), 'AND')

     # Inf can be matched exactly and takes part in ordinary comparisons:
     x <- big.matrix(4, 2, init=1, type="double")
     x[1,1] <- Inf
     mwhich(x, 1, Inf, 'eq')
     mwhich(x, 1, 1, 'gt')
     mwhich(x, 1, 1, 'le')

