on graduates and housing

Posted on: March 9, 2025 at 12:00 PM

Introduction

Recently, I stumbled across a few articles that have drawn a lot of attention:

As a soon-to-be graduate, and someone who’ll eventually enter the housing market, I thought it’d be a fun weekend project to take a look at publicly available data and try to draw conclusions from it.

Claims Made by Politicians (that I think are important to examine)

Gathering Data

Going into this, I found that the stats are available but scattered across different sources, which I assume is just how the government operates. For example, data.gov.sg was the first place I looked, but for some reason it only has Graduate Employment Survey (GES) data from 2013-2022. I found a script to parse more recent GES data (credits to hewliyang) here.

I then appended this to the GES data from data.gov.sg here, and also pulled the resale flat prices here.

I added them to separate tables in a SQLite database, and copied them over to a Grafana instance I could play around with.

Note: I ended up deciding to go back and manually get overall median salary and employment figures to get a better picture. I will say it’s quite appalling how the media tries to “hide” the employment figures by tweaking the phrasing about them. For example, in the 2024 article, “more fresh graduates were unemployed — 12.9% in 2024”: I suspect they are trying to avoid mentioning that the employment rate is at an all-time low.

Methodology

Note: Any stats with initial estimated calculations are marked with (Estimated). Everything else is marked with (Exact).

Now, my initial calculations were really, really rough. My defense is that I just need good-enough stats, not accurate stats, to paint a picture of trends. Comparing against published figures, my estimates tend to be underestimates. For example, in the CNA article above, it was said that 87.1% were employed, whereas my estimate is around 83.4%. I’ll quickly lay out some of my calculations and justifications below:

$$\text{Median Salary} = \mathrm{AVG}(\text{Median Salary})\ \text{GROUP BY year}$$

This isn’t the best way to calculate an overall median: averaging per-degree medians means degrees with more graduates become underrepresented and degrees with fewer graduates become overrepresented. But I can’t be bothered parsing another dumb PDF file of student intake for every university.
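To make the weighting problem concrete, here is a minimal sketch of the unweighted average next to a cohort-weighted one. The table name, column names and every number are made up for illustration (they are not the real GES schema or data), and the weighted version assumes you actually have graduate counts per degree, which I don’t.

```python
import sqlite3

# Hypothetical schema and made-up rows, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ges_data ("
    "year INTEGER, degree TEXT, median_salary REAL, graduates INTEGER)"
)
conn.executemany(
    "INSERT INTO ges_data VALUES (?, ?, ?, ?)",
    [
        (2022, "Degree A", 5000.0, 300),
        (2022, "Degree B", 3000.0, 100),
        (2023, "Degree A", 5200.0, 300),
        (2023, "Degree B", 3200.0, 100),
    ],
)

# Unweighted: every degree counts equally, regardless of cohort size.
simple = conn.execute(
    "SELECT year, AVG(median_salary) FROM ges_data "
    "GROUP BY year ORDER BY year"
).fetchall()

# Weighted by cohort size: larger degrees pull the figure towards them.
weighted = conn.execute(
    "SELECT year, SUM(median_salary * graduates) / SUM(graduates) "
    "FROM ges_data GROUP BY year ORDER BY year"
).fetchall()
conn.close()

print(simple)
print(weighted)
```

Even the weighted version is only an average of medians, not a true overall median, so it is still an approximation.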

The initial plan was to use this estimated data, but after looking at it, I decided that the estimated salary values were far too inaccurate, so I manually combed through newspaper articles to get the employment rates and median salaries, going back to 2007.

I’ve marked all the manual stats as (Exact) and the derived ones as (Estimated).

Something I did want to note is that the median salary doesn’t account for unemployment, which is too large to ignore. So I’m proposing a new metric I call expected median salary, which takes into account that your salary while unemployed is 0.

$$\text{Expected Median Salary} = (\text{Median Salary}) \times (\text{Employment Rate})\ \text{GROUP BY year}$$

I then just divide the prices of flats by the salaries to get the cost of resale flats as a multiple of (expected) median salary, which I think is a pretty good estimate of how prices have scaled over time.
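As a rough sketch of the two steps above: the function names and all the figures below are made up for illustration, and the code doesn’t care whether the salary is monthly or annual as long as you’re consistent across years.

```python
def expected_median_salary(median_salary, employment_rate):
    """Median salary scaled by the employment rate (a fraction from 0 to 1)."""
    return median_salary * employment_rate


def price_multiple(flat_price, salary):
    """Flat price expressed as a multiple of the given salary."""
    return flat_price / salary


# Made-up numbers, not taken from the dataset.
median_salary = 4000.0
employment_rate = 0.875
flat_price = 525000.0

expected = expected_median_salary(median_salary, employment_rate)
multiple_plain = price_multiple(flat_price, median_salary)
multiple_expected = price_multiple(flat_price, expected)

print(expected, multiple_plain, multiple_expected)
```

Because the employment rate is at most 1, the multiple based on expected median salary is never lower than the one based on plain median salary.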

Link to the Grafana Dashboard here

Notable Observations

Notably, the all-time highs of the resale market in 2013 have not been hit yet. Being relatively ignorant about the housing market back then, I had to Google to catch up with my knowledge gaps, but from what I understand, the prices of flats skyrocketed due to low interest rates and high demand, as well as an extremely limited number of BTO flats at the time.

Claims

Claim 1: Employment rates for autonomous universities have remained broadly stable across the decade

I don’t really know what “broadly stable” means, so I’m going to arbitrarily define it as a change of no more than 2-3% per year in employment rates. If we look at the year-on-year (YOY) change in employment rates, 2020, 2023 and 2024 are years I would barely consider “broadly stable”, with 2023 dropping by 4.2%, and 2024 continuing the decrease in employment rate.

Note: Interestingly, the drop in employment rate in 2023 is worse than in 2008, yet nobody has really sounded the alarm.
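The year-on-year check for Claim 1 can be sketched as below. The rates are made up (they are not the real GES figures), the 3.0 threshold is just my arbitrary cutoff, and I’m assuming the change is measured as a simple difference between consecutive years.

```python
def yoy_changes(rates_by_year):
    """Difference in employment rate between each year and the year before."""
    years = sorted(rates_by_year)
    return {
        curr: rates_by_year[curr] - rates_by_year[prev]
        for prev, curr in zip(years, years[1:])
    }


def unstable_years(rates_by_year, threshold=3.0):
    """Years whose change from the previous year exceeds the threshold."""
    changes = yoy_changes(rates_by_year)
    return [year for year, change in changes.items() if abs(change) > threshold]


# Made-up employment rates (in %), for illustration only.
rates = {2021: 90.0, 2022: 91.0, 2023: 86.5, 2024: 85.0}

changes = yoy_changes(rates)
flagged = unstable_years(rates)

print(changes)
print(flagged)
```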

“Broadly stable across the decade”, I guess, is technically correct if you consider the employment stats between 2014 and 2024 in isolation, but I think it’s a bit disingenuous to ignore the 2020-2024 period. Additionally, if you look further back than 2014, full-time employment rates have dropped almost 10% since 2007, the earliest year I could find stats for.

Some other stats I found interesting:

Claim 2: Graduates are worse off today in housing affordability

In his claim, Leong Mun Wai compares the housing situation in 1979 to now. I couldn’t find employment stats going that far back, but this is technically false according to stats from 2007 onwards. We were way worse off in 2012 and 2013, although it looks like we’re on track to exceed those years.

Claim 3: This comparison used “selective” data and failed to account for significant changes in education levels, housing quality, and the maturity of the resale market over the years

This is a fair argument against the initial data, but even if we look at recent stats, where there haven’t been significant changes in education levels, housing quality and the maturity of the resale market, the stats still don’t paint a good picture. If we were to use the “across the decade” metric, the resale flat to median salary ratio is at its highest point in the decade. The only times this ratio has been higher are 2012 and 2013.

My understanding here is also that he implicitly disagrees with the initial claim that graduates are worse off today, even though he doesn’t explicitly say so. I think a good way to measure that is the ratio of monthly mortgage payments to median salary. I think this is a cool metric because:

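If we look at this metric, the current ratio is just above 30%. If we take 30% of salary as the most you should be spending on a mortgage (my assumption for the cutoff), this means the median fresh graduate can’t afford the median resale flat. In other words, more than 50% of fresh graduates are unable to afford more than 50% of the resale flats on the market (not even counting 5-rooms and above).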
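For anyone who wants to play with the mortgage-to-salary ratio, here is a minimal sketch using the standard fixed-rate amortization formula. The loan amount, interest rate, tenure and salary are all made-up placeholders rather than the figures behind the dashboard, and the helper names are mine.

```python
def monthly_payment(principal, annual_rate, years):
    """Fixed monthly repayment on a fully amortizing loan.

    annual_rate is a fraction (0.03 means 3% a year), compounded monthly.
    """
    n = years * 12
    if annual_rate == 0:
        return principal / n
    r = annual_rate / 12
    growth = (1 + r) ** n
    return principal * r * growth / (growth - 1)


def mortgage_to_salary_ratio(principal, annual_rate, years, monthly_salary):
    """Share of monthly salary that goes to the mortgage payment."""
    return monthly_payment(principal, annual_rate, years) / monthly_salary


# Made-up inputs, for illustration only.
ratio = mortgage_to_salary_ratio(
    principal=400000, annual_rate=0.03, years=25, monthly_salary=4500
)
print(f"{ratio:.1%}")
```

The result is quite sensitive to the interest rate and tenure you plug in, so it’s better for comparing years under the same assumptions than as an absolute figure.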

Assumptions

The key assumption we have to make when looking at housing data in particular is that there is no significant disparity between BTO prices and resale flat prices. There probably is, but I don’t have BTO pricing data to play around with.

Comparison to other countries

So how does our ratio of housing prices to median salary compare to other countries? For this, I manually calculated the ratio of the average/median house price (depending on which stat I could find) to the median salary of that city.

Appendix

Here are some scripts I used if you want to recreate the data:

import sqlite3
import pandas as pd

# Define file paths
csv_file = "ges.csv"
db_file = "ges_database.db"

# Load CSV into DataFrame
df = pd.read_csv(csv_file)

# Connect to SQLite database (or create it)
conn = sqlite3.connect(db_file)

# Define table name
table_name = "ges_data"

# Convert DataFrame to SQLite
df.to_sql(table_name, conn, if_exists="replace", index=False)

# Commit and close connection
conn.commit()
conn.close()

print(f"CSV data successfully imported into {db_file}, table: {table_name}")

# Second script: merge the resale price CSV files into one
import pandas as pd
import glob

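# Get all CSV files in the current directory, skipping the merged output
# from any previous run so it is not merged into itself
output_file = 'merged_resale_prices.csv'
csv_files = [f for f in sorted(glob.glob('*.csv')) if f != output_file]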

# Initialize an empty list to store dataframes
dfs = []

# Read each CSV file
for file in csv_files:
    df = pd.read_csv(file)

    # Drop 'remaining_lease' column if it exists
    if 'remaining_lease' in df.columns:
        df = df.drop('remaining_lease', axis=1)

    dfs.append(df)

# Concatenate all dataframes
merged_df = pd.concat(dfs, ignore_index=True)

# Save the merged dataframe to a new CSV file
merged_df.to_csv('merged_resale_prices.csv', index=False)
print(f"Successfully merged {len(csv_files)} CSV files into merged_resale_prices.csv")