Posts List

Reproducible package management in R

Reproducibility is a severe issue. Writing code usually helps, because the code is like a journal of your work, especially if you combine it with literate programming techniques, which in R’s world is so easy to do (Rmarkdown, knitr). However, there’s one thing, which can cause some problems - the packages versions. Some of the old code might not work, because there were changes in the API or in the behavior of the packages (I’m looking at you - dplyr).

customLayout 0.2.0 is now on CRAN

The new version of my customLayout package is on CRAN. It now supports working with PowerPoint slides using layouts created in R. For more information please read the vignette here. It also extends the idea of adjusting the font size for the flextables (see this post) - check the phl_adjust_table function. I also created a simple roadmap which describes my next steps. Please note that this package is still under development.

Notes on tidyeval

I recently watched the “Tidy eval: Programming with dplyr, tidyr, and ggplot2” video. It’s an excellent introduction to the concept of the tidy evaluation, which is the core concept for programming with dplyr and friends. In this video, Hadley showed on the slide the grouped_mean function (12:48). An attempt to implement this functions might be a good exercise in tidy evaluation, and an excellent opportunity to compare this approach with standard evaluation rules provided by the seplyr package.

FSelectorRcpp 0.2.1 release

New release of FSelectorRcpp (0.2.1) is on CRAN. I described near all the new functionality here. The last thing that we added just before release is an extract_discretize_transformer. It can be used to get a small object from the result of discretize function to transform the new data using estimated cutpoints. See the example below. library(FSelectorRcpp) set.seed(123) idx <- sort(, 100)) iris1 <- iris[idx, ] iris2 <- iris[-idx, ] disc <- discretize(Species ~ .

DepthProc hit 20k downloads.

My first package published on CRAN - DepthProc recently hit 20k downloads. library(cranlogs) library(ggplot2) downloads <- cran_downloads("DepthProc", from = "2014-08-21", to = "2018-06-10") ggplot(downloads) + geom_line(aes(x = date, y = cumsum(count))) + ylab("Downloads") + xlab("Date") + theme_bw() + ggtitle("DepthProc", "Download stats") There are some jumps on the line. I wondered if they all occurred just after the package release (old users updates to the new versions). Here’s some code to check this.

StarSpace in R

I enjoyed work with Facebook’s fastText ( library and its R’s wrapper fastrtext ( However, I want to spend some more time with StarSpace library (also Facebook’s library for NLP). Unfortunately, there’s no R package for StarSpace! It’s quite surprising because I there are thousands of packages. Nevertheless, this one is missing. In the end, I decided to write my wrapper - I had some problems with compilation because of dozens of compiler flags which must be set before compilation.