Some interesting Data Science stuff found between 2018-02-01 and 2018-02-28.
https://www.youtube.com/watch?v=atiYXm7JZv0 - (by J.J. Allaire from RStudio) - Machine Learning with R and TensorFlow - a video introduction to Deep Learning in R.
#deeplearning #rstats #tensorflow #datascience https://t.co/W4SjSTBYQq
https://towardsdatascience.com/a-tour-of-the-top-10-algorithms-for-machine-learning-newbies-dde4edffae11 - (by James Le) - a collection of some basic ML algorithms for newbies. Pictures in the article are pretty good.
#datascience #ml https://t.co/QGOliYhXgt
https://tensorflow.rstudio.com/blog/keras-customer-churn.html - (by Matt Dancho) - predicting customer churn using deep learning in R.
#rstats #keras #deeplearning https://t.co/01MBSSNc5G
https://towardsdatascience.com/recommender-engine-under-the-hood-7869d5eab072 - (by Venkat Raman) - some intuitions behind content-based and collaborative filtering recommender systems, with a simple example in Python.
#datascience #recommenders #python https://t.co/IOA7dBlw8h
https://activewizards.com/blog/top-15-scala-libraries-for-data-science/ - (by Igor Bobriakov) - a collection of useful packages for doing data science in Scala (some machine learning and datavis).
#scala #datascience https://t.co/KGvMXfDPkJ
https://rviews.rstudio.com/2018/02/14/deep-learning-rstudio-conf-2018/ - (by Andrie de Vries) - summary of the Deep Learning talks from rstudio::conf 2018 (the RStudio conference).
#rstats #deeplearning #datascience https://t.co/5uvWcr5w6C
http://flowingdata.com/2016/03/22/comparing-ggplot2-and-r-base-graphics/ - (by Nathan Yau) - a comparison of base R graphics and ggplot2. The author seems to prefer good old base graphics - and there’s something in it…
#datavis #rstats #base #datascience https://t.co/knCktHImNq
http://www.win-vector.com/blog/2018/02/r-tip-use-seq_len-to-avoid-the-backwards-sequence-bug/ - (by John Mount) - small R tip - it’s better to use `seq_len()` rather than `1:n`.
#rstats #datascience https://t.co/lIC8at6OGV
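To illustrate the tip above, here is a minimal sketch of the backwards-sequence problem when `n` is zero. The two helper functions are hypothetical names made up for this example; only `:` and `seq_len()` come from base R.

```r
# Hypothetical helpers that just count how many times a loop body runs.
count_iterations_colon <- function(n) {
  iterations <- 0
  for (i in 1:n) {          # when n == 0, 1:0 is c(1, 0), so the loop runs twice
    iterations <- iterations + 1
  }
  iterations
}

count_iterations_seq_len <- function(n) {
  iterations <- 0
  for (i in seq_len(n)) {   # seq_len(0) is integer(0), so the loop is skipped
    iterations <- iterations + 1
  }
  iterations
}

count_iterations_colon(0)    # 2
count_iterations_seq_len(0)  # 0
count_iterations_colon(3)    # 3
count_iterations_seq_len(3)  # 3
```

Both versions agree for positive `n`; they differ only in the empty case, which is exactly where the bug tends to hide.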
https://deanattali.com/blog/shinyalert-package/ - (by Dean Attali) - simple modals for shiny applications.
#rstats #shiny #datascience https://t.co/fAcnQH4IfT
http://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf - (by Martin Zinkevich) - Rules of Machine Learning: Best Practices for ML Engineering - a collection of “best practices” for Machine Learning.
#datascience #ml #bestpractices
https://appsilondatascience.com/blog/rstats/2018/02/07/circleci.html - (by Marek Rogala from Appsilon Data Science) - how to set up continuous integration for your private R projects with CircleCI. CI is also a great way to deploy a model to production, so it should be familiar to all data scientists.
https://www.kdnuggets.com/2018/01/docker-help-become-more-effective-data-scientist.html - (by Hamel Husain) - a short introduction to Docker containers for data scientists.
http://selbydavid.com/2018/01/09/neural-network/ - (by David Selby) - building a neural network from scratch in R.
#deeplearning #rstats https://t.co/WV4bJPMPhF
http://blog.revolutionanalytics.com/2018/02/what-does-microsoft-do-with-r.html - (by David Smith) - summary of all of Microsoft’s tools for the R ecosystem (R in SQL Server, Visual Studio, Power BI, foreach package, and much more).
#rstats #microsoft #datascience https://t.co/wfgIwSwTdj
https://www.kdnuggets.com/2018/02/5-machine-learning-projects-overlook-feb-2018.html - (by Matthew Mayo) - I’m not a fan of “5 … things … you must/should …” lists, but this is an interesting one. There’s something about fastText (NLP), Bayesian tools, CatBoost (an xgboost alternative) and Keras. It’s mostly Python stuff.
http://andrewgelman.com/2018/02/03/teach-statistics-course-journalists/ - some guidance from Andrew Gelman on teaching statistics to journalists. However, I think this advice applies more broadly, to teaching any group interested in statistics.
https://www.kdnuggets.com/2018/02/generalists-dominate-data-science.html - (by Russell Jurney) - data scientists (and other members of the team) should have some general skills to speed up the development cycle. Sometimes, being a generalist is better than being a specialist.
https://rviews.rstudio.com/2018/02/02/cost-effective-bigquery-with-r/ - (by Roland Stevenson) - bad data representation can be costly… This article reminds me of my first query: `select * from table limit 10`. I learned to include `WHERE` in every query ;)
#db #bigrquery #rstats https://t.co/Selg9OEF4z
https://www.displayr.com/r-date-conversion/ - (by Mathew McLean) - another package for working with dates in R. I think that `lubridate` solves 99% of my problems with dates, but `AsDateTime` might be even easier.
#rstats #appyrds https://t.co/4Ral1huXZ0
http://www.win-vector.com/blog/2018/01/big-cdata-news/ - (by John Mount) - new ways of reshaping data with the `cdata` package from Win-Vector. It allows performing a lot of nontrivial transformations in just one step. It seems very promising.
#rstats #pkg https://t.co/3YA75JVh3o
https://www.kdnuggets.com/2018/01/how-not-lie-statistics.html - (by Kevin Gray) - some thoughts on data analysis principles. How not to fool others (and yourself).