Categories
Data Science

Creating commandline tools using R and optparse

Most of the time as a data scientist or data analyst in research, you will be writing your analysis script in RStudio and probably running it as you go. If you’re really savvy, you might also write your own R packages. If you’re looking to automate analyses or use them as part of a larger microservices infrastructure in the cloud, you might be looking for a way to build command line tools using R.

In such a use case, you might also want to pass optional arguments to the script so you can keep it generic and let command line arguments do all the configuration. Unfortunately, R does not come with tools to do this easily. Unlike users of other scripting languages such as Python, most R users do not have a strong background in software development, so it’s not a use case that is discussed very often.

Luckily, there are R packages such as {optparse} that do the heavy lifting for you. For my company blog at SKOPOS ELEMENTS, I have written a short introduction to optparse and how you can use it with R and Rscript to create small command line utilities: So schreibst Du Kommandozeilen-Tools mit R und optparse (as you might have guessed from the title, it is written in German as our main audience is German-speaking; Google Translate or DeepL might be able to help you out.)
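To give a rough idea of what such a tool can look like, here is a minimal sketch using optparse. It is not taken from the linked article; the option names (`--count`, `--mean`, `--verbose`), the helper `parse_cli()` and the toy simulation are all made up for illustration.

```r
#!/usr/bin/env Rscript
# Minimal sketch of a command line tool built with optparse.
# All option names and the toy "analysis" below are hypothetical.
library(optparse)

option_list <- list(
  make_option(c("-n", "--count"), type = "integer", default = 10L,
              help = "Number of values to simulate [default: %default]"),
  make_option(c("-m", "--mean"), type = "double", default = 0,
              help = "Mean of the normal distribution [default: %default]"),
  make_option(c("-v", "--verbose"), action = "store_true", default = FALSE,
              help = "Also print the simulated values")
)

# Parse a character vector of arguments; by default, use the arguments
# passed to the script on the command line.
parse_cli <- function(args = commandArgs(trailingOnly = TRUE)) {
  parse_args(OptionParser(option_list = option_list), args = args)
}

opts <- parse_cli()

# Toy analysis: draw random values and report their mean.
values <- rnorm(opts$count, mean = opts$mean)
if (opts$verbose) print(values)
cat("Sample mean:", mean(values), "\n")
```

Assuming the script is saved as `simulate.R` (a hypothetical file name), you could run it with `Rscript simulate.R --count 100 --mean 2 -v`. By default, optparse also adds a `--help` option that prints the help texts defined above.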

Categories
Science

How to do stepwise regression in R?

You don’t.

Categories
General

Podcasts for Data Science Start-Ups

I am not a podcast person. Most episodes are too long, there is a lot of nonsense talk with inside jokes, and I usually find information quicker when googling on my own. As the past months have been quite busy, however, I was looking to fill the time when I couldn’t read or do other things, like when walking to the train station or driving the car. Somehow I came back to my old Overcast account with a lot of unplayed episodes. After going through my subscriptions and some lists, I created my own short list of podcasts that I actually listen to. Episodes are reasonably long, there is little chit-chat, and there is some thought-provoking content. In this post, I share six of those podcasts with you.

Categories
General

From Psychologist to Data Scientist

Many blogs and podcasts discuss the question of how to get into data science. In my experience, data scientists mainly come from computer science, physics or statistics. Social scientists are rare among data scientists, but I believe that in many business contexts, social scientists and psychologists can provide a much-needed perspective on data analysis. While psychological theories, methods, and statistics provide a good starting point for data science adventures, most social scientists will need to learn and embrace additional skills. In this post, I highlight what I think they need to add to their training if they aim to pursue a data-driven career in what fancy people now call “data science”.

Categories
Science

Leaving Academia: Goodbye, cruel world!

In September, my contract as a research assistant at the University of Bonn ended. I was lucky to have a 50% contract for three years and even luckier that I had the option to extend the contract for another year. Nevertheless, I will leave academia as I’m close to finishing my PhD thesis and had planned to work outside of the university from the beginning.

Categories
Science

Why “Prestige” is Better Than Your h-Index

Psychological science is one of the fields that is undergoing drastic changes in how we think about research, conduct studies and evaluate previous findings. Most notably, many studies from well-known researchers are under increased scrutiny. Recently, journalists and researchers have reviewed the Stanford Prison Experiment that is closely associated with the name of Philip Zimbardo. Many consider Zimbardo a “prestigious” psychologist. In the discussion about how we should think about doing science in times of the “replicability crisis”, the issue of “prestige” comes up in different forms. Among the recurring questions: Should we trust what prestigious researchers say? Should we put more faith in articles published in prestigious journals?

It seems as if some consider prestige to be something bad: Prestige is not earned through hard work but through capitalizing on past successes and putting oneself in the spotlight. They, in my experience, often criticize journals or conferences for inviting only prestigious authors or speakers. Others (knowingly or unknowingly) put a lot of trust into prestige, for example by accepting theories from prestigious sources more easily. 

Categories
Science

Submission Criteria for Psychological Science

Daniël told me about this the other day: Our recent pre-print on informative ‘null effects’ is now cited in the submission criteria for Psychological Science in a paragraph on drawing inferences from ‘theoretically significant’ results that may not be ‘statistically significant’. I feel very honoured that the editorial board at PS considers our manuscript as a good reference. To me, this also shows the importance and usefulness of pre-prints: The manuscript is not yet published in a journal and is already well received through this blog and Twitter. Yay, future!

Categories
Science

New Preprint: Making “Null Effects” Informative

In February and March this year, I stayed at the Eindhoven University of Technology in the amazing group of Daniël Lakens, Anne Scheel and Peder Isager, who are actively researching questions of replicability in psychological science. Over the two months, I learned a lot, exchanged some great ideas with the three of them, and was able to work together with Daniël on a small overview article

Categories
General

Workshop “Einführung in die Datenanalyse mit R” (Post and Slides in German)

Last weekend, I gave a 1.5-day workshop for students at my university on data analysis using R. In this post I briefly share my experience along with the workshop slides and an example project, both of which are in German. If you are looking for an English introduction to R, have a look at Hadley Wickham’s excellent “R for Data Science”, which you can find here.

In our psychology program, as at many other universities, students are taught to use SPSS. The focus is essentially on applying the common hypothesis tests through the menus. So far, R has only been mentioned in passing, as an alternative when the research questions become somewhat more demanding. In the context of the open science discussions, however, R has also become an important building block when it comes to reproducible analyses and the use of free software.

Categories
Science

Update on the Replication Bayes Factor

In December, I blogged about the ReplicationBF package, which I made available on GitHub. It allows you to calculate Replication Bayes Factors for t- and F-tests. The preprint detailing the formulas for the latter was outdated and the method in the package was not optimal, so I recently updated both.