Raw data

The data for this exercise comes from the Bureau of Economic Analysis (BEA): https://www.bea.gov/news/2018/prototype-gross-domestic-product-county-2012-2015

There are a couple of Excel spreadsheets to choose from. We chose the “Data Table for GDP by County”. The file was downloaded on 16 Apr 2019 from: https://www.bea.gov/system/files/2018-12/GCP_Release_1.xlsx
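For reproducibility, the download can be scripted. The following is only a sketch: the helper name `download_gdp()` and the local file name are assumptions made here, and the URL may have moved since the file was originally retrieved.

```r
# URL as given in the text; it may no longer resolve.
gdp_url <- "https://www.bea.gov/system/files/2018-12/GCP_Release_1.xlsx"

# Hypothetical helper: fetch the spreadsheet once and reuse the local copy.
download_gdp <- function(dest = "GCP_Release_1.xlsx") {
  if (!file.exists(dest)) {
    # mode = "wb" writes the file as binary, which matters for .xlsx on Windows
    download.file(gdp_url, destfile = dest, mode = "wb")
  }
  dest
}

# Usage (requires network access):
# gdp_path <- download_gdp()
```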

Here’s a screenshot of the file open in Excel:

knitr::include_graphics("gdp_excel.png")

Tidying the data

The Excel file has multiple header rows. Based on a little experimentation, we drop the first two rows and manually assign column names.
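One way to express this step is sketched below, assuming the readxl package and a local copy of the spreadsheet. The column names are illustrative guesses, not taken from the file, and `read_gdp()` is a hypothetical helper; both should be adjusted to the actual layout.

```r
# Hypothetical column names; check them against the spreadsheet before use.
gdp_col_names <- c("fips", "county", "state", "line_code", "industry",
                   "gdp_2012", "gdp_2013", "gdp_2014", "gdp_2015")

# Hypothetical helper wrapping readxl::read_excel().
read_gdp <- function(path, n_header_rows = 2) {
  # skip drops the leading header rows; because col_names is a character
  # vector, the first row read is treated as data rather than as a header.
  readxl::read_excel(path, skip = n_header_rows, col_names = gdp_col_names)
}

# Usage (requires the downloaded file):
# gdp_raw <- read_gdp("GCP_Release_1.xlsx")
```

Passing the names through `col_names` avoids a separate renaming step after import.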

We usually check the tail of a dataset, and in this case the last several rows appear blank. Looking in the Excel file, we learn that they correspond to a couple of footnotes, so we need to omit those rows.
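A minimal sketch of that check on made-up data follows. The toy data frame only imitates the situation described (footnote text in the first column, everything else missing); the values and column names are invented for illustration.

```r
# Toy stand-in for the imported data; all values are made up.
gdp_raw <- data.frame(
  fips     = c("00001", "00002", "Made-up footnote 1", "Made-up footnote 2"),
  county   = c("County A", "County B", NA, NA),
  gdp_2012 = c(100, 200, NA, NA),
  stringsAsFactors = FALSE
)

# Inspect the last few rows to spot the trailing footnotes.
tail(gdp_raw, 3)

# Keep only rows that have a county name, which drops the footnote rows.
gdp_clean <- gdp_raw[!is.na(gdp_raw$county), ]
```

Filtering on a column that is always filled for real records is more robust than dropping a fixed number of rows from the end.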

Finally, we gather the GDP columns into a single column.
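The reshaping can be sketched with `tidyr::gather()` on a small made-up table. The column names (`gdp_2012`, `gdp_2013`) and values are assumptions for illustration only, not the real data.

```r
library(tidyr)

# Toy wide table: one GDP column per year (values are made up).
gdp_wide <- data.frame(
  county   = c("County A", "County B"),
  gdp_2012 = c(100, 200),
  gdp_2013 = c(110, 210),
  stringsAsFactors = FALSE
)

# Stack the GDP columns: the old column names go into "year",
# and the cell values go into "gdp".
gdp_long <- gather(gdp_wide, key = "year", value = "gdp", gdp_2012, gdp_2013)

# Turn "gdp_2012" into the integer 2012.
gdp_long$year <- as.integer(sub("gdp_", "", gdp_long$year))
```

The result has one row per county and year. In current versions of tidyr, `gather()` is documented as superseded by `pivot_longer()`, which performs the same reshaping.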