Some interesting Data Science stuff found between 2018-01-16 and 2018-01-31.


https://simplystatistics.org/2018/01/22/the-dslabs-package-provides-datasets-for-teaching-data-science/ - (by Rafael Irizarry) the dslabs package, which contains datasets for teaching data science.

install.packages("dslabs") #CopyAndInstall

#applyrds #rstats #datascience https://t.co/db0LUvCBx8


https://github.com/facebookresearch/StarSpace - a general-purpose #NLP library from @fb_research. For now it works only from the command line, but it is easy to build and use.

#FacebookResearch #applyrds https://t.co/hcyjVdLdIZ


https://research.fb.com/facebook-open-sources-detectron/ - Facebook open-sources Detectron, a platform for object detection running on top of Caffe2.

#applyrds #deeplearning #caffe2 #FacebookResearch https://t.co/fVwdYKmgOa


http://www.win-vector.com/blog/2018/01/supercharge-your-r-code-with-wrapr/ - (by John Mount from Win-Vector) - an #rstats package with some syntactic sugar based on the := operator. Its let() function seems very useful, especially for programming with dplyr.

#rstats #pkg #applyrds https://t.co/KUTajME7wL


https://blog.datascienceheroes.com/exploratory-data-analysis-data-preparation-with-funmodeling/ - (by Data Science Heroes) - funModeling - an #rstats package for Exploratory Data Analysis. It contains functions for visualization, outlier detection, and more.

install.packages("funModeling") #CopyAndInstall

#applyrds #rstats #pkg https://t.co/6XupaosPlC


http://www.alexejgossmann.com/auc/ - (by Alexej Gossmann) a probabilistic interpretation of the AUC value.

“AUC is the probability of correct ranking of a random “positive”-“negative” pair.”
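
To make the quote concrete, here is a minimal sketch (not taken from the linked post) that computes AUC directly as the fraction of positive-negative pairs ranked correctly. The function name pairwise_auc and the toy scores and labels are made up for illustration, and ties are counted as half a correct ranking, which is the usual convention.

```python
from itertools import product


def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive scores higher.

    Ties count as half a correct ranking (an assumed, but common, convention).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: model scores and true labels (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.35, 0.6, 0.2]
labels = [1, 1, 1, 0, 0]

# 3 positives x 2 negatives = 6 pairs; only (0.35, 0.6) is ranked wrongly.
print(pairwise_auc(scores, labels))
```

This brute-force version is quadratic in the number of examples, so it is only meant to illustrate the interpretation, not to replace a library implementation.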

#auc #applyrds #datascience #statistics #machinelearning https://t.co/ZGMBCwk5Fh


https://blog.insightdatascience.com/how-to-solve-90-of-nlp-problems-a-step-by-step-guide-fda605278e4e - (by Emmanuel Ameisen) - some tips for working with NLP data.

https://news.ycombinator.com/item?id=16224346&utm_content=buffer62cea&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer - the Hacker News discussion of this post, with a bit more information on the topic and a few references (fastText, etc.).

#nlp #applyrds https://t.co/kb5SbCMJBz


http://www.win-vector.com/blog/2018/01/latest-vtreat-up-on-cran/ - a new version of the vtreat package by Nina Zumel and John Mount for preparing data for analysis - handling NAs, rare levels, and much more.

https://winvector.github.io/vtreat/articles/vtreat.html - getting started.

#applyrds #rstats https://t.co/932Pmybmoy


https://www.smartly.io/blog/tutorial-how-we-productized-bayesian-revenue-estimation-with-stan - (by Markus Ojala from smartly.io) using Stan in production to perform Bayesian inference.

#stan #bayesian #statistics #applyrds https://t.co/kXkAeKbBuj