scantwo                 package:qtl                 R Documentation

_T_w_o-_d_i_m_e_n_s_i_o_n_a_l _g_e_n_o_m_e _s_c_a_n _w_i_t_h _a _t_w_o-_Q_T_L _m_o_d_e_l

_D_e_s_c_r_i_p_t_i_o_n:

     Perform a two-dimensional genome scan with a two-QTL model, with
     possible allowance for covariates.

_U_s_a_g_e:

     scantwo(cross, chr, pheno.col=1, model=c("normal","binary"),
             method=c("em","imp","hk","mr","mr-imp","mr-argmax"),
             addcovar=NULL, intcovar=NULL, weights=NULL,
             use=c("all.obs", "complete.obs"), 
             incl.markers=FALSE, clean.output=FALSE,
             maxit=4000, tol=1e-4,
             verbose=TRUE, n.perm, perm.strata=NULL,
             assumeCondIndep=FALSE, batchsize=250)

_A_r_g_u_m_e_n_t_s:

   cross: An object of class 'cross'. See 'read.cross' for details.

     chr: Optional vector indicating the chromosomes for which LOD
          scores should be calculated.  This should be a vector of
          character strings referring to chromosomes by name; numeric
          values are converted to strings.  Prefix chromosome names
          with '-' to include all chromosomes except those named.  A
          logical (TRUE/FALSE) vector may also be used.
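
          As a hedged illustration of the 'chr' argument, the sketch
          below scans a subset of chromosomes in the 'fake.f2' data
          set that ships with the package.  The chromosome names and
          the step size are arbitrary choices made for the example.

```r
library(qtl)
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=10)

# scan only chromosomes 1 and 13, referred to by name
out.sub <- scantwo(fake.f2, chr=c("1", "13"), method="hk", verbose=FALSE)

# scan every chromosome except the X chromosome
out.noX <- scantwo(fake.f2, chr="-X", method="hk", verbose=FALSE)

# chromosomes actually present in each result (second component,
# first column: the chromosome identifier)
unique(as.character(out.sub[[2]][[1]]))
unique(as.character(out.noX[[2]][[1]]))
```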

pheno.col: Column number in the phenotype matrix which should be used
          as the phenotype.  This can be a vector of integers; for
          methods '"hk"' and '"imp"' this can be considerably faster
          than doing them one at a time.  One may also give character
          strings matching the phenotype names.  Finally, one may give
          a numeric vector of phenotype values, in which case its
          length must equal the number of individuals in the cross,
          and it must contain non-integers or values < 1 or > the
          number of phenotypes (so that it cannot be mistaken for a
          set of column numbers); this last case may be useful for
          studying transformations.
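
          The following sketch contrasts selecting a phenotype by
          column number with passing a transformed phenotype directly
          as a numeric vector.  The rank-based normal-score
          transformation and the variable names are assumptions made
          for illustration only.

```r
library(qtl)
data(fake.bc)
fake.bc <- calc.genoprob(fake.bc, step=10)

# phenotype chosen by column number
out.col <- scantwo(fake.bc, chr=c("2", "5"), pheno.col=1,
                   method="hk", verbose=FALSE)

# transformed phenotype passed as a numeric vector with one value
# per individual (normal scores are non-integer, so the vector is
# not confused with column numbers)
y <- pull.pheno(fake.bc, 1)
nscore <- qnorm((rank(y, na.last="keep") - 0.5) / sum(!is.na(y)))
out.ns <- scantwo(fake.bc, chr=c("2", "5"), pheno.col=nscore,
                  method="hk", verbose=FALSE)
```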

   model: The phenotypic model: the usual normal model or a model for
          binary traits.

  method: Indicates whether to use the EM algorithm, imputation,
          Haley-Knott regression, or marker regression.  Marker
          regression is performed either by dropping individuals with
          missing genotypes ('"mr"'), or by first filling in missing
          data using a single imputation ('"mr-imp"') or by the Viterbi
          algorithm ('"mr-argmax"').

addcovar: Additive covariates.

intcovar: Interactive covariates (interact with QTL genotype).

 weights: Optional weights of individuals.  Should be either NULL or a
          vector of length n.ind containing positive weights. Used only
          in the case 'model="normal"'.

     use: In the case that multiple phenotypes are selected to be
          scanned, this argument indicates whether to use all
          individuals, including those missing some phenotypes, or
          just those individuals that have data on all selected
          phenotypes.

incl.markers: If FALSE, do calculations only at points on an evenly
          spaced grid.  If 'calc.genoprob' or 'sim.geno' were run with
          'stepwidth="variable"', we force 'incl.markers=TRUE'.

clean.output: If TRUE, clean the output with 'clean.scantwo', replacing
          LOD scores for pairs of positions that are between markers
          with 0.  In permutations, this will be done for each
          permutation replicate.  This can be important for the case of
          'method="em"', as there can be difficulty with algorithm
          convergence in these regions.

   maxit: Maximum number of iterations; used only with method '"em"'.

     tol: Tolerance value for determining convergence; used only with
          method '"em"'.

 verbose: If TRUE, display information about the progress of
          calculations.  For method '"em"', if 'verbose' is an integer
          above 1, further details on the progress of the algorithm
          will be displayed.

  n.perm: If specified, a permutation test is performed rather than an
          analysis of the observed data.  This argument defines the
          number of permutation replicates.

perm.strata: If 'n.perm' > 0, this may be used to perform a stratified
          permutation test.  This should be a vector with the same
          number of individuals as in the cross data.  Unique values
          indicate the individual strata, and permutations will be
          performed within the strata.
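
          To make the idea of permuting within strata concrete, here
          is a minimal base-R sketch.  It is not the package's
          internal code; the toy 'strata' and 'pheno' vectors are
          hypothetical.

```r
set.seed(1)

# hypothetical strata (e.g. sex) and a toy phenotype
strata <- rep(c("F", "M"), times=c(4, 6))
pheno  <- seq_along(strata) / 10

# shuffle individual indices separately inside each stratum
perm.index <- seq_along(strata)
for (s in unique(strata)) {
  idx <- which(strata == s)
  perm.index[idx] <- idx[sample.int(length(idx))]
}

# permuted phenotypes: each value stays within its original stratum
perm.pheno <- pheno[perm.index]
```

          Indexing with 'sample.int' rather than calling 'sample' on
          the indices avoids the surprising behaviour of 'sample'
          when a stratum contains a single individual.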

assumeCondIndep: If TRUE, assume conditional independence of QTL
          genotypes given marker genotypes.  This is an approximation,
          but it may speed things up.

batchsize: The number of phenotypes (or permutations) to be run as a
          batch; used only for methods '"hk"' and '"imp"'.

_D_e_t_a_i_l_s:

     Standard interval mapping ('method="em"') and Haley-Knott
     regression ('method="hk"') require that multipoint genotype
     probabilities are first calculated using 'calc.genoprob'.  The
     imputation method uses the results of 'sim.geno'. 
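
     The preparation steps above might look as follows; the step
     size, number of imputations, error probability and chromosome
     subset are arbitrary values chosen for this sketch.

```r
library(qtl)
data(fake.f2)

# genotype probabilities, required for method="em" and method="hk"
fake.f2 <- calc.genoprob(fake.f2, step=10, error.prob=0.001)
out.hk <- scantwo(fake.f2, chr=c("1", "13"), method="hk", verbose=FALSE)

# imputed genotypes, required for method="imp"
fake.f2 <- sim.geno(fake.f2, step=10, n.draws=16, error.prob=0.001)
out.imp <- scantwo(fake.f2, chr=c("1", "13"), method="imp", verbose=FALSE)
```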

     The method '"em"' is standard interval mapping by the EM algorithm
     (Dempster et al. 1977; Lander and Botstein 1989).  Marker
     regression ('method="mr"') is simply linear regression of
     phenotypes on marker genotypes (individuals with missing
     genotypes are discarded). Haley-Knott regression ('method="hk"')
     uses the regression of phenotypes on multipoint genotype
     probabilities.  The imputation method ('method="imp"') uses the
     pseudomarker algorithm described by Sen and Churchill (2001).

     Individuals with missing phenotypes are dropped.

     In the presence of covariates, the full model is 

 y = m + b[q1] + b[q2] + b[q1 x q2] + A g + Z d[q1] + Z d[q2] + Z d[q1 x q2] + e

     where q1 and q2 are the unknown QTL genotypes at two locations,
     _A_ is a matrix of covariates, and _Z_ is a matrix of covariates
     that interact with QTL genotypes.  The columns of _Z_ are forced
     to be contained in the matrix _A_.

     The above full model is compared to the additive QTL model, 

         y = m + b[q1] + b[q2] + A g + Z d[q1] + Z d[q2] + e

     and also to the null model, with no QTL, 

                           y = m + A g + e
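
     The comparison of the three models can be sketched in base R for
     the simplified case where the two QTL genotypes are fully
     observed (in a real scan they are not).  The simulated data,
     effect sizes and the single additive covariate 'sex' are
     assumptions for illustration; no interactive covariates are
     included.  For a normal model, the LOD score is (n/2) times the
     log10 ratio of residual sums of squares.

```r
set.seed(42)
n   <- 200
q1  <- factor(sample(c("AA", "AB"), n, replace=TRUE))
q2  <- factor(sample(c("AA", "AB"), n, replace=TRUE))
sex <- rbinom(n, 1, 0.5)

# simulated phenotype with two main effects and an interaction
y <- 10 + 0.8*(q1 == "AB") + 0.5*(q2 == "AB") +
     0.6*(q1 == "AB")*(q2 == "AB") + 0.3*sex + rnorm(n)

fit.null <- lm(y ~ sex)            # y = m + A g + e
fit.add  <- lm(y ~ sex + q1 + q2)  # additive QTL model
fit.full <- lm(y ~ sex + q1 * q2)  # full model with q1 x q2

rss <- function(fit) sum(residuals(fit)^2)

# LOD of a model relative to the null model
lod <- function(fit) (n/2) * log10(rss(fit.null) / rss(fit))

lod.add  <- lod(fit.add)
lod.full <- lod(fit.full)
lod.int  <- lod.full - lod.add     # evidence for the interaction
```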


     In the case that 'n.perm' is specified, the R function 'scantwo'
     is called repeatedly.

_V_a_l_u_e:

     If 'n.perm' is missing, the function returns a list with class
     '"scantwo"' and containing three components.  The first component
     is a matrix of dimension [tot.pos x tot.pos]; the upper triangle
     contains the LOD scores for the additive model, and the lower
     triangle contains the LOD scores for the full model.  The diagonal
     contains the results of 'scanone'. The second component of the
     output is a data.frame indicating the locations at which the
     two-QTL LOD scores were calculated.  The first column is the
     chromosome identifier, the second column is the position in cM,
     the third column is a 1/0 indicator for ease in later pulling out
     only the equally spaced positions, and the fourth column indicates
     whether the position is on the X chromosome or not.  The final
     component is a version of the results of 'scanone' including sex
     and/or cross direction as additive covariates, which is needed for
     a proper calculation of conditional LOD scores. 
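
     As a sketch of how the components described above could be
     pulled apart, the code below accesses them by position and uses
     'upper.tri' and 'lower.tri' to separate the two sets of LOD
     scores.  The object names are illustrative.

```r
library(qtl)
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=10)
out <- scantwo(fake.f2, chr=c("1", "13"), method="hk", verbose=FALSE)

lod.mat <- out[[1]]   # first component: [tot.pos x tot.pos] matrix
pos.df  <- out[[2]]   # second component: positions of the LOD scores

lod.add  <- lod.mat[upper.tri(lod.mat)]  # additive-model LOD scores
lod.full <- lod.mat[lower.tri(lod.mat)]  # full-model LOD scores
lod.one  <- diag(lod.mat)                # single-QTL results

head(pos.df)
```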

     If 'n.perm' is specified, the function returns a list with six
     different LOD scores from each of the permutation replicates. 
     First, the maximum LOD score for the full model (two QTLs plus an
     interaction).  Second, for each pair of chromosomes, we take the
     difference between the full LOD and the maximum single-QTL LOD for
     those two chromosomes, and then maximize this across chromosome
     pairs.  Third, for each pair of chromosomes we take the difference
     between the maximum full LOD and the maximum additive LOD, and
     then maximize this across chromosome pairs.  Fourth, the maximum
     LOD score for the additive QTL model.  Fifth, for each pair of
     chromosomes, we take the difference between the additive LOD and
     the maximum single-QTL LOD for those two chromosomes, and then
     maximize this across chromosome pairs.  Finally, the maximum
     single-QTL LOD score (that is, from a single-QTL scan).  This
     last score is not used in 'summary.scantwo', but it is calculated
     at each permutation, so we include it for the sake of
     completeness.
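
     A small permutation run can be inspected as sketched below.  The
     number of replicates is kept far too low for real inference, and
     the 95th-percentile threshold is computed by hand only to show
     what the first component (the full-model maxima) contains.

```r
library(qtl)
data(fake.f2)
fake.f2 <- calc.genoprob(fake.f2, step=10)

set.seed(2718)
perms <- scantwo(fake.f2, chr=c("1", "13"), method="hk",
                 n.perm=20, verbose=FALSE)

length(perms)                     # six sets of permutation results

# maximum full-model LOD score from each permutation replicate
full.max <- as.numeric(perms[[1]])

# hand-computed 5% genome-wide threshold for the full model
thr.full <- quantile(full.max, 0.95)
```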

_X _c_h_r_o_m_o_s_o_m_e:

     The X chromosome must be treated specially in QTL mapping.

     As in 'scanone', if both males and females are included, male
     hemizygotes are allowed to be different from female homozygotes,
     and the null hypothesis must be changed in order to ensure that
     sex or cross-direction (pgm) differences in the phenotype do not
     result in spurious linkage to the X chromosome.  (See the help
     file for 'scanone'.)

_A_u_t_h_o_r(_s):

     Karl W Broman, kbroman@biostat.wisc.edu; Hao Wu

_R_e_f_e_r_e_n_c_e_s:

     Churchill, G. A. and Doerge, R. W. (1994) Empirical threshold
     values for quantitative trait mapping.  _Genetics_ *138*, 963-971.

     Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) Maximum
     likelihood from incomplete data via the EM algorithm.  _J. Roy.
     Statist. Soc._ B, *39*, 1-38.

     Haley, C. S. and Knott, S. A. (1992) A simple regression method
     for mapping quantitative trait loci in line crosses using flanking
     markers. _Heredity_ *69*, 315-324.

     Lander, E. S. and Botstein, D. (1989) Mapping Mendelian factors
     underlying quantitative traits using RFLP linkage maps. 
     _Genetics_ *121*, 185-199.

     Sen, Ś. and Churchill, G. A. (2001) A statistical framework for
     quantitative trait mapping.  _Genetics_ *159*, 371-387.

     Soller, M., Brody, T. and Genizi, A. (1976) On the power of
     experimental designs for the detection of linkage between marker
     loci and quantitative loci in crosses between inbred lines.
     _Theor. Appl. Genet._ *47*, 35-39.

_S_e_e _A_l_s_o:

     'plot.scantwo', 'summary.scantwo', 'scanone', 'max.scantwo',
     'summary.scantwoperm', 'c.scantwoperm'

_E_x_a_m_p_l_e_s:

     data(fake.f2)

     fake.f2 <- calc.genoprob(fake.f2, step=5)
     out.2dim <- scantwo(fake.f2, method="hk")
     plot(out.2dim)

     # permutations

     ## Not run:
     permo.2dim <- scantwo(fake.f2, method="hk", n.perm=1000)
     ## End(Not run)

     # requires permo.2dim from the permutation run above
     summary(permo.2dim, alpha=0.05)

     # summary with p-values
     summary(out.2dim, perms=permo.2dim, pvalues=TRUE,
             alphas=c(0.05, 0.10, 0.10, 0.05, 0.10))

     # covariates
     data(fake.bc)

     fake.bc <- calc.genoprob(fake.bc, step=10)

     ac <- pull.pheno(fake.bc, c("sex","age"))
     ic <- pull.pheno(fake.bc, "sex")

     out <- scantwo(fake.bc, method="hk", pheno.col=1,
                    addcovar=ac, intcovar=ic)
     plot(out)

