Clustering: how are jobs distributed in PA in 2019?

Only Jobs data

We use cluster analysis to find patterns in how jobs are distributed across block groups.

The dataset is WAC (Workplace Area Characteristics, in which jobs are totaled by workplace Census Block), a subset of the LEHD data. Each row gives the number of employees in one block group in one year, broken down along several dimensions.

[Figures: WAC data description, parts 1 and 2]
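To make the row structure concrete, here is a minimal sketch of rolling block-level rows up to block groups. It assumes the raw file is keyed by a 15-digit census block code in a `w_geocode` column, so that the first 12 digits identify the block group, and that columns such as `C000` (total jobs) and `CE01` to `CE03` (earnings bands) follow the WAC layout; check these names against the file you actually use. The sample rows are made up, and `aggregate_to_block_groups` is a hypothetical helper, not the code behind this post.

```python
import csv
import io
from collections import defaultdict

# Toy stand-in for a WAC file; the counts are made up for illustration.
SAMPLE_WAC = """w_geocode,C000,CE01,CE02,CE03
420010301011000,12,3,4,5
420010301011001,8,2,3,3
420010301012000,20,5,5,10
"""


def aggregate_to_block_groups(csv_text):
    """Sum every count column over the blocks that share a block group."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: a block code is state(2) + county(3) + tract(6) + block(4),
        # and the block group is the first 12 of those 15 digits.
        block_group = row["w_geocode"][:12]
        for column, value in row.items():
            if column != "w_geocode":
                totals[block_group][column] += int(value)
    return {bg: dict(cols) for bg, cols in totals.items()}


block_groups = aggregate_to_block_groups(SAMPLE_WAC)
for geoid, counts in block_groups.items():
    print(geoid, counts)
```

Each resulting record is one block group with one count per dimension, which is the shape the clustering step works on.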

4 Clusters

We use the elbow method and settle on 4 clusters for the dataset.

[Figure: elbow plot]
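The elbow method can be sketched as follows: fit k-means for a range of k, record the within-cluster sum of squared distances (often called inertia), and look for the k after which the curve flattens. The example below is a dependency-free illustration on made-up 2-D points, not the WAC features or the code used for this post. The helper names (`farthest_first`, `kmeans_inertia`, `elbow_curve`) and the deterministic farthest-first seeding are assumptions chosen to keep the example reproducible; a library implementation would normally be used instead.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans_inertia(points, k, max_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centroids = farthest_first(points, k)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def elbow_curve(points, k_values):
    """Map each candidate k to the inertia of its k-means fit."""
    return {k: kmeans_inertia(points, k) for k in k_values}


# Toy data (made up): four tight groups of five points each.
OFFSETS = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
CENTERS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
POINTS = [(cx + dx, cy + dy) for cx, cy in CENTERS for dx, dy in OFFSETS]

curve = elbow_curve(POINTS, range(1, 7))
for k, inertia in curve.items():
    print(k, round(inertia, 2))
```

On this toy data the inertia falls steeply up to k = 4 and changes little afterwards, which is the kind of bend the elbow method looks for. On real data the bend is usually softer and is read off the plot by eye.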

Below are the overall interactive charts. Select the factor you are interested in to see how the clusters differ. Note that the barplot of job types is separated from the main chart because of space limitations.

Overall factors of 4 clusters:

Job types of 4 clusters:

[Figure: education and earnings]

[Figure: ethnicity and race]

Why k-means?

We also tried DBSCAN, but its results were harder to interpret than those of k-means. It often produces an excessive number of clusters that show little difference among them. For example, the barplot below of job types by DBSCAN cluster provides little information.

[Figure: job types by DBSCAN cluster]
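One reason the two methods behave differently is that k-means is told how many clusters to return, while DBSCAN derives the count from two density parameters: a neighborhood radius and a minimum neighbor count. The sketch below is a minimal pure-Python DBSCAN run on made-up points, meant only to show how the number of clusters shifts with the radius. The parameter names `eps` and `min_samples` and the helper `summarize` are illustrative, and this is not the configuration used for the analysis above.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if sq_dist(points[i], q) <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


def summarize(labels):
    """Return (number of clusters, number of noise points)."""
    return len({l for l in labels if l != -1}), labels.count(-1)


# Toy data (made up): two dense runs on a line and one isolated point.
POINTS = [(0, 0), (1, 0), (2, 0), (5, 0), (6, 0), (7, 0), (20, 0)]

for eps in (0.5, 1.5, 3.5):
    print(eps, summarize(dbscan(POINTS, eps=eps, min_samples=2)))
```

The same seven points yield a different number of clusters for each radius, so the cluster count is an outcome of tuning rather than a choice. That makes it harder to steer DBSCAN toward a small set of comparable groups.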


Benjamin She,
Hanpu Yao
