A Day in the Life of a (Biostatistics) PhD Student

Today, I thought I would share what a day in the life of a PhD student is like. Personally, I’ve always been interested in other people’s day-to-day experience at work, so I imagine that a glimpse into my day could be similarly interesting and informative, particularly for students thinking about getting a PhD in biostatistics. I took two days from my third-year calendar (i.e., pre-pandemic) to give a sense not only of my typical activities, but also of how drastically one day can differ from the next.

How Good is FiveThirtyEight's NBA Prediction Model?

As someone who watches basketball and enjoys sports analytics (see my previous post on estimating win probabilities live during an NBA game), I’ve been a fan of FiveThirtyEight’s NBA prediction models, which are always fun to follow and interesting to read about.

Calibration vs Accuracy

Recently, I came across an article by FiveThirtyEight in which they self-evaluated their prediction models. The primary metric they use to evaluate their model is calibration, that is, whether their forecasted probabilities match up with the actual probabilities.
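To make the distinction between calibration and accuracy concrete, here is a minimal Python sketch. It is not FiveThirtyEight's method: the function names, the equal-width binning, the 0.5 threshold, and the forecasts and outcomes are all made up for illustration.

```python
from collections import defaultdict


def calibration_table(probs, outcomes, n_bins=10):
    """Bin forecasts by probability and compare each bin's mean forecast
    with the observed win rate in that bin (hypothetical helper)."""
    bins = defaultdict(list)
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_forecast = sum(p for p, _ in pairs) / len(pairs)
        observed_rate = sum(y for _, y in pairs) / len(pairs)
        rows.append((idx, len(pairs), mean_forecast, observed_rate))
    return rows


def accuracy(probs, outcomes, threshold=0.5):
    """Share of games where the predicted favorite (p >= threshold) won."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, outcomes))
    return hits / len(probs)


# Made-up data: ten games forecast at 70% (7 wins) and ten at 30% (3 wins).
probs = [0.7] * 10 + [0.3] * 10
outcomes = [1] * 7 + [0] * 3 + [1] * 3 + [0] * 7

for idx, n, mean_forecast, observed_rate in calibration_table(probs, outcomes):
    print(f"bin {idx}: n={n}, forecast={mean_forecast:.2f}, observed={observed_rate:.2f}")
print(f"accuracy: {accuracy(probs, outcomes):.2f}")
```

In this toy data the forecasts are well calibrated (the 70% games are won 70% of the time, and the 30% games 30% of the time), yet the favorite wins only 70% of the games, which shows that the two metrics answer different questions.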

Tips for a First-Time Teaching Assistant

For the last 3 years, I’ve been teaching the lab sections as a teaching assistant for the statistics sequence at the Johns Hopkins Bloomberg School of Public Health. The course is geared towards master’s level students in public health, though some PhD students and doctors/members of the JHU Medicine community also take it. As you can imagine, there can be a range of familiarity in the classroom with regard to what students know about stats, math, and coding.

Reordering geom_bar and geom_col by Count or Value

One of the things I’m always looking up with ggplot2 is how to reorder the bars in my bar charts by their length (i.e., the count/frequency or value, depending on whether you’re using geom_bar or geom_col). A Google search turns up several solutions, but in this post I will document what I’ve found to be the cleanest and simplest one.

Reordering geom_bar by count

By default, the bars are arranged by the order (levels) of the factor variable.

Analyzing Metadata from Spotify Playlists

I recently came across this post about a Python module GSA that allows you to download metadata from Spotify playlists. This was great, because it was a perfect opportunity for me to test RStudio 1.4’s improved support for Python, which I had been excited to try out since I saw the news.

Download Metadata

The original blog post lays out the steps you need to install everything properly, so I won’t repeat them here.