1 Introduction

2 Goals

At the end of this Lab session, we will be able to:

  • Understand time series data and data structures
  • Create Time Series Visualizations
  • Perform modelling tasks on Time Series Data
  • Calculate and depict Averages, Seasons, Trends in Time Series
  • Perform Forecasting with Time Series

3 Packages

4 The tsibble data format

TBW. To be written up.

4.1 Case Study -1: Walmart Sales Dataset from timetk

Let us inspect what datasets are available in the package timetk. Type data(package = “timetk”) in your Console to see what datasets are available.

Let us choose the Walmart Sales dataset. See here for more details: Walmart Recruiting - Store Sales Forecasting |Kaggle

## Rows: 1,001
## Columns: 17
## $ id           <fct> 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_1, 1_…
## $ Store        <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ Dept         <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ Date         <date> 2010-02-05, 2010-02-12, 2010-02-19, 2010-02-26, 2010-03-05, 2010-03-12, 2010-03-19, 2010-03-26, 2010-04-02…
## $ Weekly_Sales <dbl> 24924.50, 46039.49, 41595.55, 19403.54, 21827.90, 21043.39, 22136.64, 26229.21, 57258.43, 42960.91, 17596.9…
## $ IsHoliday    <lgl> FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
## $ Type         <chr> "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A…
## $ Size         <dbl> 151315, 151315, 151315, 151315, 151315, 151315, 151315, 151315, 151315, 151315, 151315, 151315, 151315, 151…
## $ Temperature  <dbl> 42.31, 38.51, 39.93, 46.63, 46.50, 57.79, 54.58, 51.45, 62.27, 65.86, 66.32, 64.84, 67.41, 72.55, 74.78, 76…
## $ Fuel_Price   <dbl> 2.572, 2.548, 2.514, 2.561, 2.625, 2.667, 2.720, 2.732, 2.719, 2.770, 2.808, 2.795, 2.780, 2.835, 2.854, 2.…
## $ MarkDown1    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ MarkDown2    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ MarkDown3    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ MarkDown4    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ MarkDown5    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ CPI          <dbl> 211.0964, 211.2422, 211.2891, 211.3196, 211.3501, 211.3806, 211.2156, 211.0180, 210.8204, 210.6229, 210.488…
## $ Unemployment <dbl> 8.106, 8.106, 8.106, 8.106, 8.106, 8.106, 8.106, 8.106, 7.808, 7.808, 7.808, 7.808, 7.808, 7.808, 7.808, 7.…

The data is described as:

A tibble: 9,743 x 3

  • id Factor. Unique series identifier (4 total)
  • Store Numeric. Store ID.
  • Dept Numeric. Department ID.
  • Date Date. Weekly timestamp.
  • Weekly_Sales Numeric. Sales for the given department in the given store.
  • IsHoliday Logical. Whether the week is a “special” holiday for the store.
  • Type Character. Type identifier of the store.
  • Size Numeric. Store square-footage
  • Temperature Numeric. Average temperature in the region.
  • Fuel_Price Numeric. Cost of fuel in the region.
  • MarkDown1, MarkDown2, MarkDown3, MarkDown4, MarkDown5 Numeric. Anonymized data related to promotional markdowns that Walmart is running. MarkDown data is only available after Nov 2011, and is not available for all stores all the time. Any missing value is marked with an NA.
  • CPI Numeric. The consumer price index.
  • Unemployment Numeric. The unemployment rate in the region.

NOTE: 1. This is still a data.frame, with a time-oriented variable of course, and not yet a time-series object. Note that this data frame has the YMD columns repeated for each Dept. 2. The Date column has repeated entries for each Dept! To deal with this repetition, we will always need to split the Weekly_Sales by the Dept column before we plot or analyze.

Since our sales are weekly, we will convert Date to yearweek format:

4.1.1 Basic Time Series Plots

The easiest way is to use autoplot from the feasts package. You may need to specify the actual measured variable, if there is more than one numerical column:

The R package timetk gives us interactive plots that may be more evocative than the static plot above. The basic plot function with timetk is plot_time_series. There are arguments for the date variable, the value you want to plot, colours, groupings etc.

Let us explore this dataset using timetk, using our trusted method of asking Questions:

Q.1 How are the weekly sales different for each Department?

There are

number of Departments. So we should be fine plotting them and also facetting with them, as we will see in a bit:

Q.2. What do the sales per Dept look like during the month of December (Christmas time) in 2012? Show the individual Depts as facets.

We can of course zoom into the interactive plot above, but if we were to plot it anyway:

Clearly the “unfortunate” Dept#13 has seen something of a Christmas drop in sales, as has Dept#38 ! The rest, all is well, it seems…

Too much noise? How about some averaging?

Q.3 How do we smooth out some of the variations in the time series to be able to understand it better?

Sometimes there is too much noise in the time series observations and we want to take what is called a rolling average. For this we will use the function timetk::slidify to create an averaging function of our choice, and then apply it to the time series using regular dplyr::mutate. Let’s take the average of Sales for each month in each Department. Our function will be named “rolling_avg_month”:

OK, slidify creates a function! Let’s apply it to the Walmart Sales time series…

Graphs are smoother now. Need to check whether the averaging was done on a per-Dept basis…should we have had a group_by(Dept) before the averaging, and ungroup() before plotting? Try it !!

4.1.3 Case Study-2: Dataset from nycflights13

Let us try the flights dataset from the package nycflights13. Try data(package = "nycflights13") in your Console.

We have the following datasets in nycflights13:

  • airlines Airline names.
  • airports Airport metadata
  • flights Flights data
  • planes Plane metadata.
  • weather Hourly weather data

Let us analyze the flights data:

## Rows: 336,776
## Columns: 19
## $ year           <int> 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 201…
## $ month          <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ day            <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ dep_time       <int> 517, 533, 542, 544, 554, 554, 555, 557, 557, 558, 558, 558, 558, 558, 559, 559, 559, 600, 600, 601, 602, …
## $ sched_dep_time <int> 515, 529, 540, 545, 600, 558, 600, 600, 600, 600, 600, 600, 600, 600, 600, 559, 600, 600, 600, 600, 610, …
## $ dep_delay      <dbl> 2, 4, 2, -1, -6, -4, -5, -3, -3, -2, -2, -2, -2, -2, -1, 0, -1, 0, 0, 1, -8, -3, -4, -4, 0, 8, 11, 3, 0, …
## $ arr_time       <int> 830, 850, 923, 1004, 812, 740, 913, 709, 838, 753, 849, 853, 924, 923, 941, 702, 854, 851, 837, 844, 812,…
## $ sched_arr_time <int> 819, 830, 850, 1022, 837, 728, 854, 723, 846, 745, 851, 856, 917, 937, 910, 706, 902, 858, 825, 850, 820,…
## $ arr_delay      <dbl> 11, 20, 33, -18, -25, 12, 19, -14, -8, 8, -2, -3, 7, -14, 31, -4, -8, -7, 12, -6, -8, 16, -12, -8, -17, 3…
## $ carrier        <chr> "UA", "UA", "AA", "B6", "DL", "UA", "B6", "EV", "B6", "AA", "B6", "B6", "UA", "UA", "AA", "B6", "UA", "B6…
## $ flight         <int> 1545, 1714, 1141, 725, 461, 1696, 507, 5708, 79, 301, 49, 71, 194, 1124, 707, 1806, 1187, 371, 4650, 343,…
## $ tailnum        <chr> "N14228", "N24211", "N619AA", "N804JB", "N668DN", "N39463", "N516JB", "N829AS", "N593JB", "N3ALAA", "N793…
## $ origin         <chr> "EWR", "LGA", "JFK", "JFK", "LGA", "EWR", "EWR", "LGA", "JFK", "LGA", "JFK", "JFK", "JFK", "EWR", "LGA", …
## $ dest           <chr> "IAH", "IAH", "MIA", "BQN", "ATL", "ORD", "FLL", "IAD", "MCO", "ORD", "PBI", "TPA", "LAX", "SFO", "DFW", …
## $ air_time       <dbl> 227, 227, 160, 183, 116, 150, 158, 53, 140, 138, 149, 158, 345, 361, 257, 44, 337, 152, 134, 147, 170, 10…
## $ distance       <dbl> 1400, 1416, 1089, 1576, 762, 719, 1065, 229, 944, 733, 1028, 1005, 2475, 2565, 1389, 187, 2227, 1076, 762…
## $ hour           <dbl> 5, 5, 5, 5, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, …
## $ minute         <dbl> 15, 29, 40, 45, 0, 58, 0, 0, 0, 0, 0, 0, 0, 0, 0, 59, 0, 0, 0, 0, 10, 5, 10, 10, 7, 0, 0, 10, 15, 15, 30,…
## $ time_hour      <dttm> 2013-01-01 05:00:00, 2013-01-01 05:00:00, 2013-01-01 05:00:00, 2013-01-01 05:00:00, 2013-01-01 06:00:00,…

We have time-related columns; Apart from year, month, day we have time_hour; and time-event numerical data such as arr_delay (arrival delay) and dep_delay (departure delay). We also have categorical data such as carrier, origin, dest, flight and tailnum of the aircraft. It is also a large dataset containing 330K entries. Enough to play with!!

Let us replace the NAs in arr_delay and dep_delay with zeroes for now, and convert it into a time-series object with tsibble:

Let us proceed with our questions:

Q.1. Plot the monthly average arrival delay by carrier

Q.2. Plot a candlestick chart for total flight delays by month for each carrier

Q.2. Plot a heatmap chart for total flight delays by origin, aggregated by month

4.2 Conclusion

We can plot most of the common Time Series Plots with the help of the tidyverts packages: ( tsibble, feast, fable and fabletools) , along with timetk and ggformula.

There are other plot packages to investigate, such as dygraphs.

Recall that we have used the tsibble format for the data. There are other formats such as ts, xts and others which are also meant for time series analysis. But for our present purposes, we are able to do things with the capabilities of timetk.

4.3 References

  1. Rob J Hyndman and George Athanasopoulos. Forecasting: Principles and Practice (3rd ed). Available online at https://otexts.com/fpp3/
---
title: "Time Series in R"
subtitle: ""
author: Arvind Venkatadri
affiliation: 
output:
  html_document:
    theme: flatly
    toc: TRUE
    toc_float: TRUE
    toc_depth: 2
    number_sections: TRUE
    code_folding: hide
    code_download: TRUE
    df_print: paged
editor_options: 
  markdown: 
    wrap: 72
abstract: Part of my online course `R for Artists and Designers` to teach R using Metaphors and Code.
---

# Introduction

# Goals

At the end of this Lab session, we will be able to:

-   Understand time series data and data structures
-   Create Time Series Visualizations
-   Perform modelling tasks on Time Series Data
-   Calculate and depict Averages, Seasons, Trends in Time Series
-   Perform Forecasting with Time Series

# Packages

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = FALSE,warning = FALSE)
library(tidyverse)
library(lubridate) #Messing with Dates
library(tsibble) # Time-aware dataframes and tibbles
library(tsibbledata)
library(feasts) # visualising the data and extracting time series features.
library(fable) # forecasting methods for tsibble, such as ARIMA and ETS.
library(fabletools) #  modelling infrastructure to ease the programming with tsibble.
library(timetk) # Plotting and Analysing Time Series Data
library(ggformula)

######
library(prophet)
library(CausalImpact)

```

# The `tsibble` data format

TBW. To be written up.

## Case Study -1: Walmart Sales Dataset from timetk

Let us inspect what datasets are available in the package timetk. Type
data(package = "timetk") in your Console to see what datasets are
available.

Let us choose the Walmart Sales dataset. See here for more details:
Walmart Recruiting - Store Sales Forecasting \|Kaggle

```{r}

data("walmart_sales_weekly")
walmart_sales_weekly
glimpse(walmart_sales_weekly)

# Try this in your Console
# help("walmart_sales_weekly")

```

The data is described as:

A tibble: 9,743 x 3

-   id Factor. Unique series identifier (4 total)
-   Store Numeric. Store ID.
-   Dept Numeric. Department ID.
-   Date Date. Weekly timestamp.
-   Weekly_Sales Numeric. Sales for the given department in the given
    store.
-   IsHoliday Logical. Whether the week is a "special" holiday for the
    store.
-   Type Character. Type identifier of the store.
-   Size Numeric. Store square-footage
-   Temperature Numeric. Average temperature in the region.
-   Fuel_Price Numeric. Cost of fuel in the region.
-   MarkDown1, MarkDown2, MarkDown3, MarkDown4, MarkDown5 Numeric.
    Anonymized data related to promotional markdowns that Walmart is
    running. MarkDown data is only available after Nov 2011, and is not
    available for all stores all the time. Any missing value is marked
    with an NA.
-   CPI Numeric. The consumer price index.
-   Unemployment Numeric. The unemployment rate in the region.

NOTE: 1. This is still a data.frame, with a time-oriented variable of
course, and not yet a time-series object. Note that this data frame has
the YMD columns repeated for each Dept. 2. The Date column has repeated
entries for each Dept! To deal with this repetition, we will always need
to split the Weekly_Sales by the Dept column before we plot or analyze.

Since our sales are weekly, we will convert Date to yearweek format:

```{r}
walmart_time <- walmart_sales_weekly %>% 
  
  as_tsibble(index = Date, # Time Variable 
             
  key = Dept ) 
# Identifies unique "subject" who are measures 
# All other variables such as Weekly_sales become "measured variable" 
# Each observation should be uniquely identified by index and key
  
walmart_time
```

### Basic Time Series Plots

The easiest way is to use autoplot from the feasts package. You may need
to specify the actual measured variable, if there is more than one
numerical column:

```{r feasts-basic-plot}

autoplot(walmart_time,.vars = Weekly_Sales)
```

The R package `timetk` gives us interactive plots that may be more
evocative than the static plot above. The basic plot function with
`timetk` is `plot_time_series`. There are arguments for the date
variable, the value you want to plot, colours, groupings etc.

Let us explore this dataset using `timetk`, using our trusted method of
asking Questions:

**Q.1 How are the weekly sales different for each Department?**

There are `r walmart_sales_weekly %>% distinct(Dept) %>% count()` number
of Departments. So we should be fine plotting them and also facetting
with them, as we will see in a bit:

```{r}
walmart_time %>% 
  timetk::plot_time_series(Date, Weekly_Sales, 
                           .color_var = Dept, 
                           .legend_show = TRUE, 
                           .title = "Walmart Sales Data by Department",
                           .smooth = FALSE)
```

**Q.2. What do the sales per Dept look like during the month of December
(Christmas time) in 2012? Show the individual Depts as facets.**

We can of course zoom into the interactive plot above, but if we were to
plot it anyway:

```{r filtering-by-time}

# Only include rows from 1 to December 31, 2011
# Data goes only up to Oct 2012
walmart_time %>%
  
  # Each side of the time_formula below is specified as the character
  # 'YYYY-MM-DD HH:MM:SS',
  
  timetk::filter_by_time(.date_var = Date,
                         .start_date = "2011-12-01",
                         .end_date = "2011-12-31") %>%
  
  plot_time_series(
    .date_var = Date,
    .value = Weekly_Sales,
    .color_var = Dept,
    .facet_vars = Dept,
    .facet_ncol = 2,
    .smooth = FALSE
  ) # Only 4 points per graph
```

Clearly the "unfortunate" Dept#13 has seen something of a Christmas drop
in sales, as has Dept#38 ! The rest, all is well, it seems...

Too much noise? How about some averaging?

**Q.3 How do we smooth out some of the variations in the time series to
be able to understand it better?**

Sometimes there is too much noise in the time series observations and we
want to take what is called a *rolling average*. For this we will use
the function `timetk::slidify` to create an averaging function of our
choice, and then apply it to the time series using regular
`dplyr::mutate`. Let's take the average of Sales for each month in each
Department. Our **function** will be named "rolling_avg_month":

```{r  averaging-function}
rolling_avg_month = slidify(.period = 4, # every 4 weeks 
                            .f = mean, # The function to average 
                            .align = "center", # Aligned with middle of month 
                            .partial = TRUE) # To catch any leftover half weeks rolling_avg_month
```

OK, slidify creates a function! Let's apply it to the Walmart Sales time
series...

```{r averaging}
walmart_time %>% # group_by(Dept) %>%
  mutate(avg_monthly_sales = rolling_avg_month(Weekly_Sales)) %>%
  # ungroup() %>%
  timetk::plot_time_series(Date,
                           avg_monthly_sales,
                           .color_var = Dept,
                           .smooth = FALSE)
```

Graphs are smoother now. Need to check whether the averaging was done on
a per-Dept basis...should we have had a `group_by(Dept)` before the
averaging, and `ungroup()` before plotting? Try it !!

### Decomposing Time Series: Trends, Seasonal Patterns, and Cycles

Each data point ($Y_t$) at time $t$ in a Time Series can be expressed as
either a sum or a product of 4 components, namely, Seasonality($S_t$),
Trend($T_t$), Cyclic, and Error($e_t$) (a.k.a White Noise).

-   Trend: pattern exists when there is a long-term increase or decrease
    in the data.
-   Seasonal: pattern exists when a series is influenced by seasonal
    factors (e.g., the quarter of the year, the month, or day of the
    week).
-   Cyclic: pattern exists when data exhibit rises and falls that are
    not of fixed period (duration usually of at least 2 years).
-   Error or Noise: Random component

Decomposing non-seasonal data means breaking it up into trend and
irregular components. To estimate the trend component of a non-seasonal
time series that can be described using an additive model, it is common
to use a smoothing method, such as calculating the simple moving average
of the time series.

`timetk` has the ability to achieve this: Let us plot the trend,
seasonal, cyclic and irregular aspects of Weekly_Sales for Dept 38:

```{r decompose-time-series}
walmart_time %>% filter(Dept == "38") %>% 
  timetk::plot_stl_diagnostics(.date_var = Date, 
                               .value = Weekly_Sales)

```

We can do this for all Dept using `fable` and `fabletools`:

```{r Decomposing_trends}
walmart_decomposed <- walmart_time %>%
  
  # If we want to filter, we do it here # filter(Dept == "38") %>% #
  fabletools::model(stl = STL(Weekly_Sales))

fabletools::components(walmart_decomposed)

autoplot(components((walmart_decomposed)))

```

### Case Study-2: Dataset from nycflights13

Let us try the flights dataset from the package `nycflights13`. Try
`data(package = "nycflights13")` in your Console.

We have the following datasets in `nycflights13`:

-   airlines Airline names.
-   airports Airport metadata
-   flights Flights data
-   planes Plane metadata.
-   weather Hourly weather data

Let us analyze the flights data:

```{r}
data("flights", package = "nycflights13")
glimpse(flights)
```

We have time-related columns; Apart from year, month, day we have
`time_hour`; and time-event numerical data such as `arr_delay` (arrival
delay) and `dep_delay` (departure delay). We also have categorical data
such as `carrier`, `origin`, `dest`, `flight` and `tailnum` of the
aircraft. It is also a large dataset containing 330K entries. Enough to
play with!!

Let us replace the NAs in `arr_delay` and `dep_delay` with zeroes for
now, and convert it into a time-series object with `tsibble`:

```{r}
flights_delay_ts <- flights %>%
  mutate(arr_delay = replace_na(arr_delay, 0),
         dep_delay = replace_na(dep_delay, 0)) %>%
  select(time_hour,
         arr_delay,
         dep_delay,
         carrier,
         origin,
         dest,
         flight,
         tailnum) %>%
  tsibble::as_tsibble(
    index = time_hour,
    # All the remaining identify unique entries
    # Along with index # Many of these variables are common
    # Need all to make unique entries!
    key = c(carrier, origin, dest, flight, tailnum),
    validate = TRUE
  ) # Making sure each entry is unique

flights_delay_ts
```

Let us proceed with our questions:

**Q.1. Plot the monthly average arrival delay by carrier**

```{r}
mean_arr_delays_by_carrier <- flights_delay_ts %>%
  group_by(carrier) %>%
  index_by(month = ~ yearmonth(.)) %>%
  
  # index_by uses (year, yearquarter, yearmonth, yearweek, as.Date)
  # to create a new column to show the time-grouping # year / quarter / month/ week, or day…
  # which IS different from traditional dplyr
  
  summarise(mean_arr_delay = mean(arr_delay, na.rm = TRUE))

mean_arr_delays_by_carrier

mean_arr_delays_by_carrier %>% 
  timetk::plot_time_series(
  .date_var = month,
  .value = mean_arr_delay,
  .facet_vars = carrier,
  .smooth = FALSE,
  # .smooth_degree = 1, 
  # keep .smooth off since it throws warnings if there are too few points
  # Like if we do quarterly or even yearly summaries
  # Use only for smaller values of .smooth_degree (0,1)
  #
  .facet_ncol = 4,
  .title = "Average Monthly Arrival Delays by Carrier"
)
```

**Q.2. Plot a candlestick chart for total flight delays by month for
each carrier**

```{r}
flights_delay_ts %>% 
  mutate(total_delay = arr_delay + dep_delay) %>% 
  timetk::plot_time_series_boxplot(
  .date_var = time_hour,
  .value = total_delay,
  .color_var = origin,
  .facet_vars = origin,
  .period = "month",
  # same warning again .smooth = FALSE 
  )
```

**Q.2. Plot a heatmap chart for total flight delays by origin,
aggregated by month**

```{r}
avg_delays_month <-
  flights_delay_ts %>% 
  group_by(origin) %>% 
  mutate(total_delay = arr_delay + dep_delay) %>% 
  index_by(month = ~ yearmonth(.)) %>% 
  
  # index_by uses (year, yearquarter, yearmonth, yearweek, as.Date) 
  # to create a new column to show the time-grouping 
  # year / quarter / month/ week, or day… 
  # which IS different from traditional dplyr 
  
  summarise(mean_monthly_delay = mean(total_delay, na.rm = TRUE))
  
avg_delays_month 
# three origins 12 months therefore 36 rows 
# # Tsibble index_by + summarise also gives us a month` column

ggformula::gf_tile(
  origin ~ month,
  fill = ~ mean_monthly_delay,
  color = "black",
  data = avg_delays_month,
  title = "Mean Flight Delays from NY Airports in 2013"
) %>%
  gf_theme(theme_classic()) %>%
  gf_theme(scale_fill_viridis_c(option = "A")) %>%
  
  # "magma" (or "A") inferno" (or "B") "plasma" (or "C")
  # "viridis" (or "D") "cividis" (or "E")
  # "rocket" (or "F") "mako" (or "G") "turbo" (or "H")
  
  gf_theme(scale_x_time(breaks = 1:12, labels = month.abb))

```

## Conclusion

We can plot most of the common Time Series Plots with the help of the
`tidyverts` packages: ( `tsibble`, `feast`, `fable` and `fabletools`) , along
with `timetk` and `ggformula`.

There are other plot packages to investigate, such as `dygraphs`.

Recall that we have used the `tsibble` format for the data. There are
other formats such as `ts`, `xts` and others which are also meant for
time series analysis. But for our present purposes, we are able to do
things with the capabilities of `timetk`.

## References

1.  Rob J Hyndman and George Athanasopoulos. *Forecasting: Principles
    and Practice (3rd ed)*. Available online at
    <https://otexts.com/fpp3/>
    
    

