The admixr package provides a convenient R interface to ADMIXTOOLS, a widely used software package for calculating admixture statistics and testing population admixture hypotheses.
A typical ADMIXTOOLS workflow often involves a combination of sed/awk/shell scripting and manual editing to create different configuration files. These are then passed as command-line arguments to one of the ADMIXTOOLS commands and control how a particular analysis is run. The output of such a computation is then usually redirected to another file, which the user needs to parse to extract values of interest, often using command-line utilities again or by manual copy-pasting. Finally, the extracted values are analysed in R, Excel or another program.
This workflow can be a little cumbersome, especially if one wants to explore many hypotheses involving different combinations of populations or data filtering strategies. Most importantly, it makes it difficult to follow the rules of best practice for reproducible science, especially given the need for manual intervention on the command-line or custom shell scripting to orchestrate more complex pipelines.
admixr makes it possible to perform all stages of an ADMIXTOOLS analysis entirely from R. It provides a set of convenient functions that completely remove the need for “low-level” configuration of individual ADMIXTOOLS programs, allowing users to focus on the analysis itself.
admixr is now published as an Application Note in the journal Bioinformatics. If you use it in your work, please cite the paper! You will be in the excellent company of papers that have used it to do amazing research. 🙂
You can try out admixr without installation directly in your browser! Simply click on the Binder launch badge, and after a short moment you will get a Binder RStudio cloud session running in your web browser. However, please note that Binder's computational resources are very limited, so you might run into issues if you try to run resource-intensive computations.
The package is available on CRAN. You can install it simply by running
install.packages("admixr")
from your R session. This is the recommended procedure for most users.
To install the development version from GitHub (which might be slightly ahead of the stable CRAN release in terms of new features and bugfixes), you need the R package devtools. You can run:
install.packages("devtools")
devtools::install_github("bodkan/admixr")
In order to use the admixr package, you need a working installation of ADMIXTOOLS. You can find installation instructions here.
Furthermore, you also need to make sure that R can find ADMIXTOOLS binaries on the $PATH. You can achieve this by specifying PATH=<path to the location of ADMIXTOOLS programs> in the .Renviron file in your home directory. If R cannot find ADMIXTOOLS utilities, you will get a warning upon loading library(admixr) in your R session.
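As a quick sanity check of the PATH setup described above, you can ask R directly which programs it can see. The following is only a minimal sketch using base R's Sys.which(); the list of program names is an illustrative assumption (these are ADMIXTOOLS command names, but your installation may provide a different set), and the variable names are made up for this example.

```r
# A few ADMIXTOOLS program names to look for (illustrative selection;
# adjust it to the programs you actually plan to use).
admixtools_programs <- c("qpDstat", "qp3Pop", "qpF4ratio", "qpAdm")

# Sys.which() returns a named character vector: the full path for each
# program found on the PATH, typically an empty string otherwise.
found <- Sys.which(admixtools_programs)

# Collect the names of programs that R could not locate.
not_found <- names(found)[is.na(found) | !nzchar(found)]

if (length(not_found) > 0) {
  message("Not found on PATH: ", paste(not_found, collapse = ", "))
} else {
  message("All listed ADMIXTOOLS programs were found.")
}
```

If some programs are reported as not found, check the PATH entry in your .Renviron file and restart the R session so that the file is read again.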
The code below is all that you need to perform ADMIXTOOLS analyses using this package! No shell scripting, no copy-pasting and no manual editing of text files. The only things you need are a working ADMIXTOOLS installation and a path to EIGENSTRAT data (a trio of ind/snp/geno files sharing a common prefix), which download_data() provides here.
library(admixr)
# download a small testing dataset to a temporary directory and process it for use in R
snp_data <- eigenstrat(download_data())
result <- d(
  W = c("French", "Sardinian"), X = "Yoruba", Y = "Vindija", Z = "Chimp",
  data = snp_data
)
result
Note that a single call to the d function generates all required intermediate config and population files, runs ADMIXTOOLS, parses its log output and returns the result as a data.frame object with the D statistics results. It does all of this behind the scenes, without the user having to deal with low-level technical details.
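Because the result is an ordinary data.frame, it can be filtered with base R. As a minimal sketch, the helper below (significant_rows is a hypothetical name, not part of admixr) keeps only the rows whose absolute Z score exceeds a threshold. It assumes that the returned table has a numeric column called Zscore; check names(result) against your own output. The default threshold of 3 is a common convention rather than a rule.

```r
# Keep only rows whose absolute Z score exceeds a threshold.
# Assumption: `df` has a numeric column named `Zscore`
# (verify with names(result) before relying on this).
significant_rows <- function(df, threshold = 3) {
  stopifnot(is.data.frame(df), "Zscore" %in% names(df))
  # which() drops rows with a missing Z score instead of returning NA rows
  df[which(abs(df$Zscore) > threshold), , drop = FALSE]
}
```

With the result object from the example above, this would be called as significant_rows(result).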
Recently, a new R package called ADMIXTOOLS 2 appeared on the horizon, offering a re-implementation of several features of the original ADMIXTOOLS suite of command-line programs.
The admixr project is not related to that initiative at all. It is not a precursor to it, nor is it, the way I see it, superseded by it. I have never used ADMIXTOOLS 2 myself, but from the looks of it, it seems to offer some very interesting features for fitting complex admixture graphs (something I'm not personally interested in, which is why early efforts to implement this in admixr were eventually abandoned).
The bottom line is this: as long as the original ADMIXTOOLS continues to be developed and maintained, admixr remains relevant and useful and will continue to be supported. ADMIXTOOLS might have a smaller set of features than ADMIXTOOLS 2, but the features it provides are extremely stable. ADMIXTOOLS is one of the most battle-tested pieces of software in population genetics. If you're happy with the set of features it provides and if you're happy with admixr itself, there is no compelling reason to move away from either of them.
To see many more examples of admixr in action, please check out the tutorial vignette.
If you want to stay updated on new admixr development, follow me on Twitter.