Swimming + Data Science

New International Swimming League Season - New Version(s) of SwimmeR

Welcome back to Swimming + Data Science, friends. Last week something exciting happened - two things actually! First, the International Swimming League kicked off its 2020 season in Budapest. Second, and directly related, I released version 0.5.0 of SwimmeR to CRAN with functions to read ISL results from the inaugural 2019 season and from the newly begun 2020 season. The ISL functions in SwimmeR v0.5.0 were fully tested on all the available meets and were working great. Then, however, there was a problem. ISL did me dirty.

They changed their reporting for the second meet, in a way that broke SwimmeR v0.5.0. What a pain! I quickly patched the problem and was preparing another CRAN submission, but then I thought “hmmm, there are more ISL meets next week. What if they do it again?” I’ve decided that rather than pestering the good folks at CRAN with another version of SwimmeR every time someone at ISL decides to mess around, I’m going to limit myself to releasing development versions of SwimmeR throughout the 2020 ISL season. Hopefully that will give ISL time to settle on a results format, at which point I’ll do another CRAN release. We start today with the now-available SwimmeR v0.5.1.

SwimmeR v0.5.1 includes the function swim_parse_ISL specifically for dealing with ISL results. So update your version of SwimmeR with devtools::install_github() and let’s get going.

devtools::install_github("gpilgrim2670/SwimmeR", build_vignettes = TRUE, force = TRUE)

In addition to the new SwimmeR v0.5.1 we’ll use the always excellent dplyr, purrr, and stringr and take these new ISL results for a spin. I also want flextable for reporting, and my special flextable_style function.


library(SwimmeR)
library(dplyr)
library(purrr)
library(stringr)
library(flextable)

flextable_style <- function(x) {
  x %>%
    flextable() %>%
    bold(part = "header") %>% # bold header
    bg(bg = "#D3D3D3", part = "header") %>% # puts gray background behind the header row
    align_nottext_col(align = "center", header = TRUE, footer = TRUE) # center the non-text columns
}

ISL Results

There have been two ISL meets thus far, both in the “Budapest Bubble”. Results are available at SwimSwam.

match_1 <- "https://cdn.swimswam.com/wp-content/uploads/2020/10/Results_Book_Match_1_V2.pdf"
match_2 <- "https://cdn.swimswam.com/wp-content/uploads/2020/10/Results_Book_Full_M2-1.pdf"

ISL_matches <- c(match_1, match_2)

swim_parse_ISL works just like our old friend swim_parse. It takes the output of read_results and returns a data frame. It’s dead simple.

match_1 %>% 
  read_results() %>% 
  swim_parse_ISL() %>% 
  head(5) %>% 
  flextable_style()

We do have a list of match results though, so rather than doing them individually let’s do them all at once with some tidyverse magic. All we have to do is pass our list of ISL match results to read_results and then to swim_parse_ISL. Since we have a list we’ll use map to do the passing, applying read_results and swim_parse_ISL to each element, that is, each match result in the list of matches. Then we’ll name the resulting list elements, which are two data frames, by match number (1 and 2) with setNames and stick them together with bind_rows.

ISL_results <-
  map(ISL_matches, read_results) %>% # map SwimmeR::read_results over the list of links
  map(swim_parse_ISL) %>% # now it's swim_parse_ISL's turn
  setNames(c(1, 2)) %>%  # name the dataframes 1 and 2 respectively
  bind_rows(.id = "Match") %>% # stick the dataframes together, with a new column called "Match" which will contain the relevant dataframe name, either 1 or 2
  mutate(Match = as.numeric(Match))
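
The map, setNames, bind_rows pattern can also be tried without downloading anything. The sketch below uses two small made-up data frames standing in for the output of swim_parse_ISL. The names and times are invented for illustration and are not real ISL results, and the as.numeric step is only a stand-in for whatever function gets mapped over the list.

```r
library(dplyr)
library(purrr)

# two made-up "match" data frames standing in for parsed ISL results
toy_match_1 <- data.frame(
  Name = c("SWIMMER A", "SWIMMER B"),
  Time = c("29.50", "30.10"),
  stringsAsFactors = FALSE
)
toy_match_2 <- data.frame(
  Name = "SWIMMER C",
  Time = "29.80",
  stringsAsFactors = FALSE
)

toy_results <-
  list(toy_match_1, toy_match_2) %>%
  map(~ mutate(.x, Time = as.numeric(Time))) %>% # stand-in for a step applied to each list element
  setNames(c(1, 2)) %>% # name the list elements "1" and "2"
  bind_rows(.id = "Match") %>% # stack them, storing each element's name in a "Match" column
  mutate(Match = as.numeric(Match)) # list names are characters, so convert back to numeric

toy_results
```

The as.numeric call at the end is there because list names are always stored as characters, so the Match column comes out of bind_rows as "1" and "2" rather than 1 and 2.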

ISL Dataframe - Now What?

Now we have one big data frame of ISL results, almost exactly like we do when we use swim_parse.

ISL_results %>%
  head() %>%
  flextable_style()

Those of you who are close readers of SwimmeR documentation (so all of you, right?) know that Lilly King is a hero around these parts. Let’s see how she’s doing in the ISL.

ISL_results %>% 
  filter(Name == "KING Lilly") %>% # only want Lilly's results
  flextable_style()

So Lilly swam 6 races and won all of them. That sounds like her. Lilly only swam in the first match though. Let’s look at the women’s breaststrokes in both matches. We’ll exclude the skins races because in match 2 the skins races weren’t breaststroke. Probably because none of the teams in that match had Lilly King.

First we’ll filter out events that aren’t women’s breaststroke. Then we’ll create a new column with times as total seconds rather than minutes:seconds.hundredths using the sec_format function from SwimmeR. Next we’ll group_by event, arrange the entries in order of time, change the places with mutate to reflect our new ordering and finally, check out the results.

ISL_results %>% 
  filter(str_detect(Event, "Women's \\d{2,3}m Breaststroke$")) %>% # only want women's breaststroke events
  mutate(Time_sec = sec_format(Finals_Time)) %>% # convert times to seconds format
  group_by(Event) %>% 
  arrange(Time_sec) %>% # order entries by increasing time 
  mutate(Place = rank(Time_sec)) %>% # recode place to new order, based on time
  select(-Time_sec, -Lane, -Points) %>% # don't need these columns
  slice(1:3) %>% # top three finishers in each event
  flextable_style()

Turns out Lilly King is dominant in both matches. Makes sense. SwimmeR v0.5.1 is working well.
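
Two steps in the breaststroke pipeline, the regular expression filter and the conversion to seconds, can be checked in isolation. The sketch below uses only base R. The event names are made up to resemble what appears in the Event column, and to_seconds is a hypothetical helper that mimics the kind of conversion sec_format performs. It is not SwimmeR's implementation.

```r
# made-up event names, assumed to resemble those in the Event column
events <- c(
  "Women's 50m Breaststroke",
  "Women's 200m Breaststroke",
  "Men's 100m Breaststroke",
  "Women's 50m Breaststroke Skins Round 1",
  "Women's 100m Butterfly"
)

# same pattern as in the pipeline; the trailing $ rejects names that
# continue past "Breaststroke", which is how skins-style names drop out
pattern <- "Women's \\d{2,3}m Breaststroke$"
keep <- grepl(pattern, events, perl = TRUE)
events[keep]

# hypothetical helper: convert "minutes:seconds.hundredths" strings
# (or plain "seconds.hundredths" strings) to total seconds
to_seconds <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE), function(parts) {
    parts <- as.numeric(parts)
    if (length(parts) == 2) parts[1] * 60 + parts[2] else parts[1]
  }, numeric(1))
}

times <- c("29.45", "1:03.45", "2:15.60")
to_seconds(times)
```

Working in total seconds is what makes arrange and rank behave: sorting the original strings would put "1:03.45" ahead of "29.45".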

In Closing

So that’s how you can use SwimmeR v0.5.1 to get ISL results into R. Not too bad, right? Now we’ll just wait and see what other mischief the results people over at ISL can come up with next week. If and when they strike again I’ll update SwimmeR appropriately. If you find an issue please leave a comment, or post it on GitHub. That’s it until next time here at Swimming + Data Science.