This repository contains the code, data, and figures produced during Duke ASA DataFest 2018, which was presented by Indeed.com. Indeed.com asked: “What advice would you give a new high school [student] about what major to choose in college? How does Indeed’s data compare to official government data on the labor market? Can it be used to provide good economic indicators?”
Our team conducted an end-to-end data analysis, including data cleaning, feature engineering, temporal and spatial modelling, and visualization of job search and job posting statistics in the United States. The final results were presented as a Shiny app, which provides an annual statistics map, dynamic trends, a Gaussian Process model visualization, and a Spatial Autoregressive model visualization.
- Conducted data cleaning, feature selection and creation, and exploratory data analysis (EDA) on a large dataset.
- Built Gaussian Process models for the temporal data and Spatial Autoregressive models for the spatial data.
- Used ggplot2 to visualize geographical data, and presented the visualization results as a Shiny app in R.
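
To give a feel for the temporal modelling step, here is a minimal base-R sketch of Gaussian Process regression on a time series. It is not the code in gauss_process.R: the function names (`sq_exp_kernel`, `gp_predict`), the squared-exponential kernel, the zero-mean prior, the hyperparameter values, and the synthetic `clicks` series are all assumptions made for illustration.

```r
# Squared-exponential (RBF) kernel between two numeric vectors.
# length_scale and signal_sd are illustrative hyperparameters, not tuned values.
sq_exp_kernel <- function(x1, x2, length_scale = 1, signal_sd = 1) {
  d2 <- outer(x1, x2, function(a, b) (a - b)^2)
  signal_sd^2 * exp(-d2 / (2 * length_scale^2))
}

# Posterior mean and variance of a zero-mean GP at the points in x_new.
gp_predict <- function(x_train, y_train, x_new,
                       length_scale = 1, signal_sd = 1, noise_sd = 0.1) {
  K    <- sq_exp_kernel(x_train, x_train, length_scale, signal_sd) +
          diag(noise_sd^2, length(x_train))
  K_s  <- sq_exp_kernel(x_new, x_train, length_scale, signal_sd)
  K_ss <- sq_exp_kernel(x_new, x_new, length_scale, signal_sd)

  R <- chol(K)  # upper-triangular factor with t(R) %*% R == K
  # Solve K %*% alpha = y_train using the two triangular systems.
  alpha <- backsolve(R, forwardsolve(t(R), y_train))
  v     <- forwardsolve(t(R), t(K_s))

  list(mean = as.vector(K_s %*% alpha),
       var  = pmax(diag(K_ss) - colSums(v^2), 0))
}

# Synthetic stand-in for a weekly series (NOT the DataFest data).
set.seed(1)
weeks  <- 1:30
clicks <- sin(weeks / 5) + rnorm(length(weeks), sd = 0.1)

# Predict on a finer grid so the fitted trend can be drawn as a smooth curve.
grid <- seq(1, 30, by = 0.5)
fit  <- gp_predict(weeks, clicks, grid, length_scale = 4, noise_sd = 0.1)
```

The returned `fit$mean` and `fit$var` can be turned into a trend line with an uncertainty band, for example `fit$mean + 2 * sqrt(fit$var)` and `fit$mean - 2 * sqrt(fit$var)`. In practice the hyperparameters would be estimated from the data rather than fixed by hand.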
The DataFest 2018 presented by BrunchLadies.pdf file contains the final presentation slides, which summarize the Shiny app and link to YouTube video demonstrations.
The gauss_process.R and Spatial_Model.R files are the scripts for building the temporal and spatial models, respectively.
The ShinyApp.r file is the main script for the Shiny app.
Pic 1: Total clicks on job posts by state over time