You don’t. Continue reading “How to do stepwise regression in R?”
I am not a podcast person. Most episodes are too long, there is a lot of nonsense talk with inside jokes, and I usually find information quicker when googling on my own. As the past months have been quite busy, however, I was looking to fill the time when I couldn’t read or do other things, such as when walking to the train station or driving the car. Somehow I came back to my old Overcast account with a lot of unplayed episodes. After going through my subscriptions and some lists, I created my own short list of podcasts that I actually listen to. Episodes are reasonably long, there is little chit-chat, and there is some thought-provoking content. In this post, I share six of those podcasts with you. Continue reading “Podcasts for Data Science Start-Ups”
Many blogs and podcasts discuss the question of how to get into data science. In my experience, data scientists mainly come from computer science, physics or statistics. Social scientists are rare among data scientists – but I believe that in many business contexts, social scientists and psychologists can provide a much-needed perspective on data analysis. While psychological theories, methods, and statistics provide a good starting point for data science adventures, most social scientists will need to learn and embrace additional skills. In this post, I highlight what I think they need to add to their training if they aim to pursue a data-driven career in what fancy people now call “data science”. Continue reading “From Psychologist to Data Scientist”
In September, my contract as a research assistant at the University of Bonn ended. I was lucky to have a 50% contract for three years and even luckier that I had the option to extend the contract for another year. Nevertheless, I will leave academia, as I’m close to finishing my PhD thesis and had planned to work outside of the university from the beginning. Continue reading “Leaving Academia: Goodbye, cruel world!”
Another presentation I gave at the General Online Research (GOR) conference in March was on our first approach to using topic modelling at SKOPOS: How can we extract valuable information from survey responses to open-ended questions automatically? Unsupervised learning is a very interesting approach to this question, but very hard to do right.
I rarely read pop-sci books, and I even more rarely review books in any form. However, I bought “Everybody Lies” some months ago and just finished reading it. It took me about four months to get through, partly because, as a researcher, reading it made me so angry. Continue reading “Book Review: Everybody Lies”
This year, the BVM (German professional association for market and social researchers) hosted its first Data Science Cup. There were four tasks involving the prediction of sales data for the online sci-fi game “EVE Online”.
It was my first year working in market research and applying statistics and machine learning algorithms in a real-world context. So, naturally, there is much room for improvement in my solution, but I ranked third out of five, so I’m right in the middle. I would do many things differently today, but that’s how it’s supposed to be, right? For example, I would go through with a multilevel model, since the data has a natural hierarchy that should be incorporated into the analysis.
I have uploaded my solution to a GitHub repository, so you can learn from my mistakes. In the README, I have also included some of my reasoning and some technical details. But beware, the code is messy and badly documented – proceed with caution.
This is an interesting article from The Guardian on “post-truth” politics, where statistics and “experts” are frowned upon by some groups. William Davies shows how statistics in the political debate have evolved from the 17th century until today, where statistics are no longer regarded as an objective approach to reality but as an arrogant and elitist tool to dismiss individual experiences. What comes next, however, is not the rule of emotions and subjective experience, but privatised data and data analytics that are only available to a few anonymous analysts in private corporations. This allows populist politicians to buy valuable insight without any accountability, which is exactly what Trump and Cambridge Analytica did. The article makes the case that this is troublesome for liberal, representative democracies.