Benchmarking R functions for reading tsv files (CC291)

June 24, 2024 • PD Schloss • 1 min read

Reading data tables into R is a very common task, and there are many ways to do it: read.delim in base R, the read_tsv function from readr, the vroom function from the vroom package, or the fread function from data.table. Pat benchmarks these four approaches and discusses the tradeoffs between speed and dependencies for package development. He then uses readr::read_tsv and test-driven development (TDD) to create a read_taxonomy function in his phylotypr R package. This episode is part of an ongoing effort to develop an R package that implements the naive Bayesian classifier.
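
To make the comparison concrete, here is a minimal sketch of how the four readers could be timed against each other. It is not the benchmark from the episode: the toy file, its column names (id, taxonomy), and the helper time_reader are hypothetical, and the timing uses base R's system.time rather than a dedicated benchmarking package. The non-base readers are only included if their packages are installed.

```r
# Write a small, hypothetical two-column TSV file to read back in
set.seed(1)
n <- 10000
tsv_file <- tempfile(fileext = ".tsv")
toy <- data.frame(
  id = paste0("seq_", seq_len(n)),
  taxonomy = sample(
    c("Bacteria;Firmicutes;", "Bacteria;Proteobacteria;"),
    n,
    replace = TRUE
  )
)
write.table(toy, tsv_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Candidate readers: each takes a file path and returns a data frame
readers <- list(
  base = function(f) read.delim(f, stringsAsFactors = FALSE)
)
if (requireNamespace("readr", quietly = TRUE)) {
  readers$readr <- function(f) {
    readr::read_tsv(f, show_col_types = FALSE, progress = FALSE)
  }
}
if (requireNamespace("vroom", quietly = TRUE)) {
  readers$vroom <- function(f) {
    vroom::vroom(f, delim = "\t", show_col_types = FALSE, progress = FALSE)
  }
}
if (requireNamespace("data.table", quietly = TRUE)) {
  readers$data.table <- function(f) data.table::fread(f, sep = "\t")
}

# time_reader (hypothetical helper): median elapsed seconds over a few runs
time_reader <- function(reader, file, reps = 5) {
  elapsed <- vapply(
    seq_len(reps),
    function(i) system.time(reader(file))[["elapsed"]],
    numeric(1)
  )
  median(elapsed)
}

# One median timing per available reader, fastest first
timings <- vapply(readers, time_reader, numeric(1), file = tsv_file)
print(sort(timings))
```

Timings will vary with machine and file size, and a file this small mostly measures fixed overhead. Note also that vroom reads character columns lazily, so timing the read call alone can understate the cost of actually using the data. For package development, the speed difference has to be weighed against the cost of adding each reader's package as a dependency.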


You can browse the state of the repository at the