Simulate genetic data

Simulate genetic data from the same model used in the MALECOT inference step.

sim_data(n = 100, L = 24, K = 3, data_format = "biallelic",
  pop_col_on = TRUE, alleles = 2, lambda = 1,
  COI_model = "poisson", COI_max = 20, COI_manual = rep(-1, n),
  COI_mean = 3, COI_dispersion = 2, e1 = 0, e2 = 0,
  prop_missing = 0)

Arguments

n	the number of samples
L	the number of loci per sample
K	the number of subpopulations
data_format	whether to produce data in "biallelic" or "multiallelic" format. Note that if biallelic format is chosen then `alleles` is always set to 2
pop_col_on	TODO
alleles	the number of alleles at each locus. Can be a vector of length `L` specifying the number of alleles at each locus, or a single scalar value specifying the number of alleles at all loci
lambda	the shape parameter(s) of the prior on allele frequencies. This prior is Beta in the bi-allelic case, and Dirichlet in the multi-allelic case. `lambda` can be: (1) a single scalar value, in which case the same value is used for every allele and every locus (i.e. the prior is symmetric); (2) a vector of values, in which case the same vector is used for every locus (this only works if the same number of alleles applies at every locus); (3) a list of vectors specifying the shape parameter separately for each allele of each locus. The list must be of length `L`, and must contain vectors of length equal to the number of alleles at that locus
COI_model	the distribution from which COIs are drawn. Options include a uniform distribution (`"uniform"`), a Poisson distribution (`"poisson"`), or a negative binomial distribution (`"nb"`)
COI_max	the maximum allowed COI. Any COIs that are initially drawn larger than this value are set down to this value
COI_manual	option to override the MCMC and set the COI of one or more samples manually, in which case they are not updated. Vector of length `n` specifying the integer-valued COI of each sample, with -1 indicating that a sample should be estimated
COI_mean	the mean of the distribution from which COIs are drawn. Only applies under the Poisson and negative binomial models (under the uniform model the mean is `(COI_max+1)/2` by definition)
COI_dispersion	Only used under the negative binomial model. Defines how much larger the variance is than the mean. Must be > 1
e1	the probability of a true homozygote being incorrectly called as a heterozygote
e2	the probability of a true heterozygote being incorrectly called as a homozygote
prop_missing	the proportion of the data that is missing. Note that data are masked out at random, meaning in some rare cases (and when the proportion of missing data is large) an entire sample or locus can end up being masked out, which will throw an error when loaded into a project
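
The interaction between `COI_model`, `COI_mean`, `COI_dispersion` and `COI_max` can be sketched in base R. This is an illustration of the argument descriptions above, not the code used inside the package. The helper name `draw_coi` is hypothetical, and two points are assumptions rather than documented behaviour: that Poisson and negative binomial draws are shifted by one so that every COI is at least 1, and that `COI_dispersion` is the variance-to-mean ratio of the shifted count.

```r
# Illustrative sketch only; draw_coi() is a hypothetical helper,
# not part of the package.
draw_coi <- function(n, COI_model = "poisson", COI_max = 20,
                     COI_mean = 3, COI_dispersion = 2) {
  coi <- switch(COI_model,
    # uniform over 1..COI_max, so the mean is (COI_max + 1) / 2
    uniform = sample.int(COI_max, n, replace = TRUE),
    # ASSUMPTION: shift by one so that every COI is at least 1
    poisson = 1 + rpois(n, lambda = COI_mean - 1),
    nb = {
      stopifnot(COI_dispersion > 1)
      mu <- COI_mean - 1
      # rnbinom() has variance mu + mu^2 / size, so choosing
      # size = mu / (COI_dispersion - 1) gives variance = COI_dispersion * mu
      1 + rnbinom(n, size = mu / (COI_dispersion - 1), mu = mu)
    },
    stop("unknown COI_model: ", COI_model)
  )
  # any draw larger than COI_max is set down to COI_max
  pmin(coi, COI_max)
}

set.seed(1)
table(draw_coi(200, COI_model = "nb", COI_max = 6))
```

The algebra in the `nb` branch also shows why `COI_dispersion` must be greater than 1: a ratio of exactly 1 would require an infinite `size`, which is the Poisson limit.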

Details

TODO
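
As a rough guide to the three accepted forms of `lambda`, the sketch below expands each form into a list of per-locus shape vectors and then draws allele frequencies from the corresponding Dirichlet prior (which reduces to a Beta prior when a locus has two alleles). The helper names `expand_lambda` and `rdirichlet1` are hypothetical and are not part of the package; the internal implementation may differ.

```r
# Illustrative sketch only; both helpers are hypothetical names.

# Expand a scalar, vector or list 'lambda' into a list of length L
expand_lambda <- function(lambda, alleles, L) {
  alleles <- rep_len(alleles, L)  # a scalar 'alleles' applies to all loci
  if (is.list(lambda)) {
    # list: one shape vector per locus, one entry per allele
    stopifnot(length(lambda) == L, all(lengths(lambda) == alleles))
    return(lambda)
  }
  if (length(lambda) == 1) {
    # scalar: symmetric prior, same value for every allele and locus
    return(lapply(alleles, function(a) rep(lambda, a)))
  }
  # vector: only valid if every locus has the same number of alleles
  stopifnot(all(alleles == length(lambda)))
  rep(list(lambda), L)
}

# One Dirichlet draw via normalised gamma variates
rdirichlet1 <- function(shape) {
  g <- rgamma(length(shape), shape = shape, rate = 1)
  g / sum(g)
}

set.seed(1)
shapes <- expand_lambda(lambda = 1, alleles = c(2, 3, 4), L = 3)
freqs <- lapply(shapes, rdirichlet1)  # one frequency vector per locus
```

Each element of `freqs` sums to 1 and has one entry per allele at that locus.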

Examples

# TODO
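
A minimal call might look like the following. It uses only the arguments listed in the usage above; the argument values are arbitrary, and because the structure of the returned object is not described on this page, the sketch simply inspects it with `str()`. The call is guarded so that the snippet still runs where the package is not installed.

```r
# Hedged sketch: argument values are arbitrary illustrations.
sim <- NULL
if (requireNamespace("MALECOT", quietly = TRUE)) {
  set.seed(1)
  sim <- MALECOT::sim_data(n = 50, L = 12, K = 2,
                           data_format = "biallelic",
                           COI_model = "nb", COI_max = 10,
                           COI_mean = 3, COI_dispersion = 2,
                           e1 = 0.01, e2 = 0.01,
                           prop_missing = 0.05)
  # inspect the top level of whatever is returned
  str(sim, max.level = 1)
}
```

Keeping `prop_missing` small relative to `n` and `L` reduces the chance that an entire sample or locus is masked out.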
