Some interesting Data Science stuff found between 2018-02-01 and 2018-02-28.

- (by J.J. Allaire from RStudio) - Machine Learning with R and TensorFlow - a video introduction to Deep Learning in R.

#deeplearning #rstats #tensorflow #datascience

- (by James Le) - a collection of some basic ML algorithms for newbies. The pictures in the article are pretty good.

#datascience #ml

- (by Matt Dancho) - predicting customer churn using deep learning in R.

#rstats #keras #deeplearning

- (by Venkat Raman) - some intuitions behind content-based and collaborative filtering recommender systems, with a simple example in Python.

#datascience #recommenders #python

- (by Igor Bobriakov) - a collection of useful packages for doing data science in Scala (some machine learning and datavis).

#scala #datascience

- (by Andrie de Vries) - summary of Deep Learning talks from rstudio::conf 2018 (RStudio conference).

#rstats #deeplearning #datascience

- (by Nathan Yau) - a comparison of base R graphics and ggplot2. The author seems to prefer good old base graphics - and there’s something in it…

#datavis #rstats #base #datascience

- (by John Mount) - small R tip - it’s better to use `seq_len(n)` rather than `1:n`.
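A minimal sketch of why this tip matters, using only base R. The two helper functions are hypothetical names made up for this illustration (they are not from the linked article): each one just counts how many times a loop body runs for a given `n`.

```r
# Hypothetical helpers for illustration; only `:` and seq_len() are base R.

# Counts loop iterations when the index is built with 1:n.
count_iterations_colon <- function(n) {
  count <- 0
  for (i in 1:n) {
    count <- count + 1
  }
  count
}

# Counts loop iterations when the index is built with seq_len(n).
count_iterations_seq_len <- function(n) {
  count <- 0
  for (i in seq_len(n)) {
    count <- count + 1
  }
  count
}

# For n >= 1 both versions agree.
count_iterations_colon(3)
count_iterations_seq_len(3)

# For n == 0 they differ: 1:0 counts down and yields c(1, 0),
# so the loop body runs twice, while seq_len(0) is an empty
# integer vector and the loop body never runs.
count_iterations_colon(0)
count_iterations_seq_len(0)
```

The edge case usually shows up when `n` comes from `length(x)` or `nrow(df)` of something that happens to be empty.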

#rstats #datascience

- (by Dean Attali) - simple modals for shiny applications.

#rstats #shiny #datascience

- (by Martin Zinkevich) - Rules of Machine Learning: Best Practices for ML Engineering - a collection of “best practices” for Machine Learning.

#datascience #ml #bestpractices

- (by Marek Rogala from Appsilon Data Science) - how to set up continuous integration for your private R projects with CircleCI. CI is also a great way to deploy a model to production, so all data scientists should be familiar with it.

#rstats

- (by Hamel Husain) - a short introduction to Docker containers for data scientists.

#docker

- (by David Selby) - building a neural network from scratch in R.

#deeplearning #rstats

- (by David Smith) - summary of all Microsoft’s tools for the R ecosystem (R in SQL Server, Visual Studio, Power BI, foreach package, and much more).

#rstats #microsoft #datascience

- (by Matthew Mayo) - I’m not a fan of “5 … things … you must/should …” articles, but this is an interesting one. It covers fastText (NLP), Bayesian tools, CatBoost (an xgboost alternative) and Keras. It’s mostly Python stuff.

#python

- some guides from Andrew Gelman on teaching statistics to journalists. However, I think this knowledge can be used more broadly, for teaching any group interested in statistics.

#datascience

- (by Russell Jurney) - data scientists (and other members of the team) should have some general skills to speed up the development cycle. Sometimes being a generalist is better than being a specialist.

#agiledatascience

- (by Roland Stevenson) - bad data representation can be costly… This article reminds me of my first query on Hive - select * from table limit 10. I learned to include WHERE in every query ;)

#db #bigrquery #rstats

- (by Mathew McLean) - another package for working with dates in R. I think that lubridate solves 99% of my problems with dates, but AsDateTime might be even easier.

devtools::install_github("Displayr/flipTime") #copyAndInstall

#rstats #appyrds

- (by John Mount) - new ways of reshaping data with the cdata package from Win-Vector. It allows performing a lot of non-trivial transformations in just one step. It seems to be very promising.

#rstats #pkg

- (by Kevin Gray) - some thoughts on data analysis principles. How not to fool others (and yourself).