mwhich               package:bigmemory               R Documentation

_E_x_p_a_n_d_e_d "_w_h_i_c_h"-_l_i_k_e _f_u_n_c_t_i_o_n_a_l_i_t_y.

_D_e_s_c_r_i_p_t_i_o_n:

     Implements 'which'-like functionality for a 'big.matrix', with
     additional options for efficient comparisons executed in C++
     rather than R; also works for regular numeric matrices without the
     memory overhead.

_U_s_a_g_e:

     mwhich(x, cols, vals, comps, op = 'AND')

_A_r_g_u_m_e_n_t_s:

       x: a 'big.matrix' (or a numeric matrix; see below).

    cols: a vector of column indices or names.

    vals: a list (one component for each of 'cols') of vectors of
          length 1 or 2; a vector of length 1 is a single value to
          compare against (for equality, inequality, or a one-sided
          comparison), while a vector of length 2 gives the lower and
          upper bounds of a range ('-Inf' and 'Inf' are allowed). If
          'vals' is a scalar or a vector of length 2 rather than a
          list, it will be replicated 'length(cols)' times.

   comps: a list of operators (one component for each of 'cols'),
          chosen from ''eq'', ''neq'', ''le'', ''lt'', ''ge'' and
          ''gt''; a range in 'vals' takes a pair of operators, one for
          each bound.  If a single operator, it will be replicated
          'length(cols)' times.

      op: the logical operator for combining the results of the
          individual tests, either ''AND'' or ''OR''.
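
     As a minimal sketch of how a scalar 'vals' and a single 'comps'
     are recycled across 'cols', the two calls below are expected to
     select the same rows.  The names 'm', 'rows_short' and
     'rows_long' are illustrative only, and the sketch assumes the
     bigmemory package is installed.

```r
library(bigmemory)

# A small ordinary matrix keeps the example easy to check by hand.
m <- matrix(1:30, 10, 3)   # columns hold 1:10, 11:20 and 21:30

# Short form: the scalar 5 and the single operator 'ge' are
# recycled over both columns (column 1 >= 5 AND column 2 >= 5).
rows_short <- mwhich(m, 1:2, 5, 'ge', 'AND')

# Long form: one component of 'vals' and 'comps' per column.
rows_long <- mwhich(m, 1:2, list(5, 5), list('ge', 'ge'), 'AND')

# Both calls should select rows 5 through 10.
m[rows_short, ]
```

     Spelling out the lists is only needed when the columns are tested
     against different values or with different operators.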

_D_e_t_a_i_l_s:

     To avoid the creation of massive vectors in R when doing
     comparisons, 'mwhich()' compares each of the selected columns to
     the specified values or ranges in C++, combines the per-column
     results using the 'op' operator, and returns the indices of the
     rows that satisfy the combined test.  More advanced comparisons
     are then possible (and memory-efficient) in R by doing set
     operations ('union' and 'intersect', for example) on the results
     of multiple 'mwhich()' calls.
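
     The set-operation approach can be sketched as follows for a test
     that mixes 'AND' and 'OR' and therefore needs two calls.  The
     names 'm', 'a', 'b', 'both' and 'only_a' are illustrative only,
     and the sketch assumes the bigmemory package is installed.

```r
library(bigmemory)

m <- matrix(1:30, 10, 3)   # columns hold 1:10, 11:20 and 21:30

# Condition A: column 1 between 2 and 6 (inclusive).
a <- mwhich(m, 1, list(c(2, 6)), list(c('ge', 'le')))

# Condition B: column 2 strictly greater than 14, OR column 3 equal to 21.
b <- mwhich(m, 2:3, list(14, 21), list('gt', 'eq'), 'OR')

# Rows satisfying A AND B (rows 5 and 6 for this matrix).
both <- intersect(a, b)

# Rows satisfying A but not B (rows 2, 3 and 4 for this matrix).
only_a <- setdiff(a, b)
```

     Only the index vectors 'a' and 'b' are held in R; no logical
     vector with one element per row is created on the R side.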

     Note that 'NA' is a valid value in 'vals' in conjunction with
     ''eq'' or ''neq'', replacing traditional 'is.na()' calls.  Both
     '-Inf' and 'Inf' can be used for one-sided inequalities.
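
     A short sketch of both points, set against the base R idioms
     they stand in for.  The names 'm', 'x', 'na_rows' and 'low_rows'
     are illustrative only, and the sketch assumes the bigmemory
     package is installed.

```r
library(bigmemory)

# A double-valued matrix with one missing value in column 1.
m <- matrix(as.numeric(1:20), 10, 2)   # columns hold 1:10 and 11:20
m[3, 1] <- NA
x <- as.big.matrix(m)

# Rows where column 1 is missing; plays the role of which(is.na(...)).
na_rows <- mwhich(x, 1, NA, 'eq')
base_na <- which(is.na(m[, 1]))

# One-sided test written as a range with an infinite lower bound:
# column 2 at most 14.
low_rows <- mwhich(x, 2, list(c(-Inf, 14)), list(c('ge', 'le')))
base_low <- which(m[, 2] <= 14)
```

     For this matrix 'na_rows' should match 'base_na' (row 3) and
     'low_rows' should match 'base_low' (rows 1 to 4).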

     If 'mwhich()' is used with a regular numeric 'matrix', we access
     the data directly, so there is no memory overhead.  Interested
     developers might want to look at our code for this case, which
     uses a handy pointer trick in C++.

_V_a_l_u_e:

     a vector of row indices satisfying the criteria.

_A_u_t_h_o_r(_s):

     John W. Emerson and Michael J. Kane

_S_e_e _A_l_s_o:

     'big.matrix', 'which'

_E_x_a_m_p_l_e_s:

     x <- as.big.matrix(matrix(1:30, 10, 3))
     x[,]
     x[mwhich(x, 1:2, list(c(2,3), c(11,17)),
                        list(c('ge','le'), c('gt', 'lt')), 'OR'),]

     x[mwhich(x, 1:2, list(c(2,3), c(11,17)), 
                        list(c('ge','le'), c('gt', 'lt')), 'AND'),]

     # These should produce the same answer with a regular matrix:
     y <- matrix(1:30, 10, 3)
     y[mwhich(y, 1:2, list(c(2,3), c(11,17)),
                        list(c('ge','le'), c('gt', 'lt')), 'OR'),]

     y[mwhich(y, 1:2, list(c(2,3), c(11,17)),
                        list(c('ge','le'), c('gt', 'lt')), 'AND'),]

     x[1,1] <- NA
     mwhich(x, 1:2, NA, 'eq', 'OR')
     mwhich(x, 1:2, NA, 'neq', 'AND')

     # Column 1 equal to 4 and/or column 2 less than or equal to 16:
     mwhich(x, 1:2, list(4, 16), list('eq', 'le'), 'OR')
     mwhich(x, 1:2, list(4, 16), list('eq', 'le'), 'AND')

     # Column 2 less than or equal to 15:
     mwhich(x, 2, 15, 'le')

     # No NAs in either column, and column 2 strictly less than 15:
     mwhich(x, c(1:2,2), list(NA, NA, 15), list('neq', 'neq', 'lt'), 'AND')

     x <- big.matrix(4, 2, init=1, type="double")
     x[1,1] <- Inf
     mwhich(x, 1, Inf, 'eq')
     mwhich(x, 1, 1, 'gt')
     mwhich(x, 1, 1, 'le')

