A new release of FSelectorRcpp (0.2.1) is on CRAN. I described nearly all of the new functionality here. The last thing we added just before the release is extract_discretize_transformer. It extracts a small object from the result of the discretize function, which can then be used to transform new data using the estimated cutpoints. See the example below.

library(FSelectorRcpp)
set.seed(123)
idx <- sort(sample.int(150, 100))
iris1 <- iris[idx, ]
iris2 <- iris[-idx, ]

disc <- discretize(Species ~ ., iris1)
discObj <- extract_discretize_transformer(disc)

# Print the object
discObj
## FsDiscretizeTransformer
## 
## Cutpoints:
##   Sepal.Length: -Inf, 5.55, Inf
##   Sepal.Width: -Inf, 3.15, Inf
##   Petal.Length: -Inf, 2.6, 4.75, Inf
##   Petal.Width: -Inf, 0.8, 1.7, Inf
## 
## FsDiscretizeTransformer allows to discretize data
## using discretize_transform(disc, newData) function.
## Sepal.Length

## Sepal.Width

## Petal.Length

## Petal.Width

## Species
head(discretize_transform(discObj, iris2))
##    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 2   (-Inf,5.55] (-Inf,3.15]   (-Inf,2.6]  (-Inf,0.8]  setosa
## 4   (-Inf,5.55] (-Inf,3.15]   (-Inf,2.6]  (-Inf,0.8]  setosa
## 10  (-Inf,5.55] (-Inf,3.15]   (-Inf,2.6]  (-Inf,0.8]  setosa
## 13  (-Inf,5.55] (-Inf,3.15]   (-Inf,2.6]  (-Inf,0.8]  setosa
## 19  (5.55, Inf] (3.15, Inf]   (-Inf,2.6]  (-Inf,0.8]  setosa
## 21  (-Inf,5.55] (3.15, Inf]   (-Inf,2.6]  (-Inf,0.8]  setosa
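
The transformation step can be pictured with base R's cut(). The sketch below is not the package's internal code. It only assumes that applying stored cutpoints amounts to binning new values with fixed breaks, and it reuses the Petal.Length cutpoints printed above. The values in newValues are made up for illustration.

```r
# Cutpoints for Petal.Length, copied from the printed transformer above
breaks <- c(-Inf, 2.6, 4.75, Inf)

# Hypothetical new observations (illustrative values only)
newValues <- c(1.4, 4.5, 5.1)

# Bin the new values with the fixed breaks; intervals are right-closed,
# so a value equal to a cutpoint falls into the lower interval
binned <- cut(newValues, breaks = breaks)
binned
```

Because the breaks are fixed in advance, the new data never influences where the cutpoints lie, which is the point of keeping the transformer separate from the training data.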

Impact of the new version on downloads

I’m always curious about the impact of a new release on the number of downloads. In the case of FSelectorRcpp it seems to be quite an important factor. See the code below. The vertical lines denote new versions.

library(crandb)
library(lubridate)
library(dplyr)
library(cranlogs)
library(ggplot2)

# Switch to an English locale (presumably so that date labels on the plots are in English)
invisible(Sys.setlocale(locale = "en_US.UTF-8"))

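# Fetch the package metadata for all released versions
pkg <- package("FSelectorRcpp", version = "all")
# Keep only the date part (YYYY-MM-DD) of the 0.2.1 release timestamp
newDate <- substring(pkg$timeline[["0.2.1"]], 1, 10) %>% ymd

# Daily downloads around the 0.2.1 release
downloads <- cran_downloads("FSelectorRcpp", from = newDate - 5, to = newDate + 3)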
ggplot(downloads) + 
  geom_line(aes(date, count)) + 
  ylab("Downloads") + xlab("Date") + 
  theme_bw() + 
  geom_vline(xintercept = newDate, lty = 2, color = "darkblue")

# cumulative downloads
firstRelease <- pkg$timeline[[1]] %>% substr(1,10) %>% ymd
allDownloads <- cran_downloads("FSelectorRcpp", from = firstRelease, to = newDate + 3)
allDownloads <- allDownloads %>% mutate(Total = cumsum(count))

newVersions <- pkg$timeline %>% unlist() %>% substr(1,10) %>% ymd

ggplot(allDownloads) + 
  geom_line(aes(date, Total)) + 
  ylab("Total downloads") + xlab("Date") + 
  theme_bw() + 
  geom_vline(xintercept = newVersions, lty = 2, color = "darkblue")
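
To put a rough number on what the plots suggest, one could compare the mean daily downloads before and after a release. The sketch below is only an illustration: the counts and the release date are made up, not real FSelectorRcpp figures, and the data frame merely mimics the date and count columns returned by cran_downloads.

```r
# Toy data: made-up daily counts, NOT real download numbers
toyDownloads <- data.frame(
  date  = as.Date("2018-01-01") + 0:8,
  count = c(10, 12, 9, 11, 10, 30, 28, 25, 27)
)
toyRelease <- as.Date("2018-01-06")  # hypothetical release day

# Label each day as falling before or on/after the release
period <- ifelse(toyDownloads$date < toyRelease, "before", "after")

# Mean daily downloads in each period
meanByPeriod <- tapply(toyDownloads$count, period, mean)
meanByPeriod
```

With real data, toyDownloads would be replaced by the result of cran_downloads and toyRelease by one of the dates in newVersions.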