library(magrittr)

simulate_trajectory <- function(p_init, generations, Ne) {
  trajectory <- c(p_init)

  # for each generation...
  for (gen in seq_len(generations)) {
    # ... based on the allele frequency in the current generation...
    p_current <- trajectory[gen]
    # ... compute the frequency in the next generation...
    n_next <- rbinom(Ne, 1, p_current)
    p_next <- sum(n_next) / Ne
    # ... and save it in the trajectory vector
    trajectory[gen + 1] <- p_next
  }

  return(trajectory)
}

traj1 <- simulate_trajectory(p_init = 0.5, generations = 1000, Ne = 1000)
traj2 <- simulate_trajectory(p_init = 0.5, generations = 1000, Ne = 1000)
traj3 <- simulate_trajectory(p_init = 0.5, generations = 1000, Ne = 1000)
traj4 <- simulate_trajectory(p_init = 0.5, generations = 1000, Ne = 1000)
traj5 <- simulate_trajectory(p_init = 0.5, generations = 1000, Ne = 1000)

plot(traj1, type = "l", ylim = c(0, 1), xlab = "generations", ylab = "allele frequency")
lines(traj2)
lines(traj3)
lines(traj4)
lines(traj5)

R programming language
In the wider programming world, R sometimes has a slightly unfortunate reputation as a badly designed “calculator language”: a computing environment which is maybe good for working with data frames and creating figures, but that’s about it. While certainly very useful for data science, R is a full-blown programming language which is actually quite powerful even from a purely computer science perspective.
Although this entire workshop stands firmly in the field of data science and computational population genetics, and we will certainly use R as a statistical and visualization environment, neglecting the “normal” aspects of R as a “proper” programming language like any other can be quite detrimental. Worse still is not regarding our use of R as programming at all. Even when “just” doing data science and statistics, we still use programming constructs, and we need to be aware of data types, algorithmic thinking, etc. Neglecting these things makes it easy to introduce bugs into our code, makes those bugs hard to find, and makes our programs less efficient even when they do work.
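As a small illustration of how such bugs creep in, here is a sketch of two classic R pitfalls. The first shows why a loop like the one in simulate_trajectory() is safer with seq_len() than with 1:n. The second shows what can happen when a data type is not what we assume it is. All variable names here (n, iterations_colon, iterations_seq_len, sizes, codes, values) are made up for this example.

```r
# Pitfall 1: `1:n` counts down when n is 0, so a loop "over nothing" runs twice
n <- 0

iterations_colon <- 0
for (i in 1:n) iterations_colon <- iterations_colon + 1

iterations_seq_len <- 0
for (i in seq_len(n)) iterations_seq_len <- iterations_seq_len + 1

iterations_colon    # 2, because 1:0 is the vector c(1, 0)
iterations_seq_len  # 0, because seq_len(0) is an empty vector

# Pitfall 2: converting a factor to numbers returns its internal integer codes,
# not the values we see printed on screen
sizes <- factor(c("100", "1000", "500"))

codes <- as.numeric(sizes)                 # 1 2 3 (positions among the levels)
values <- as.numeric(as.character(sizes))  # 100 1000 500 (the intended numbers)
```

Neither mistake produces an error or a warning, which is exactly what makes this kind of bug hard to find.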
This chapter will help you get familiar with some of the less obvious aspects of the R language and of programming in general, aspects that definitely go beyond how data science is presented in undergraduate courses in the life sciences.
Because here’s a crucial idea you might not be aware of: even when you’re “just” writing data analysis scripts, even when you’re “just” plotting results, you’re still writing programs. You’re a programmer. Exercises in this chapter are designed to make you comfortable with programming and algorithmic thinking.