Comparing and optimizing performance of phylotypr to mothur (CC304)
Pat compares the performance of phylotypr to mothur and finds that mothur is faster. After revisiting his code, he switches from Rfast’s colsums to rowsums and, using the purrr package on a single processor, matches mothur’s performance. He then shows how to parallelize the R code with the furrr package. Finally, he shows how to include data files in a package and write a function to access the data in those files. This episode is part of an ongoing effort to develop an R package that implements the naive Bayesian classifier for classifying 16S rRNA gene sequences.
Code
You can browse the state of the repositories at the