Introduction
Recently, I stumbled across a couple of articles that have drawn a lot of attention:
- Desmond Lee rebuts Leong Mun Wai’s claim that graduates are worse off today in housing affordability
- More fresh graduates from Singapore’s autonomous universities were unemployed in 2024
As a soon-to-be graduate, and someone who’ll eventually enter the housing market, I thought it’d be a fun weekend project to take a look at publicly available data and try to draw conclusions from it.
Claims Made by Politicians (that I think are important to examine)
- Gan Siow Huang - Employment rates for autonomous universities have remained broadly stable across the decade.
- Leong Mun Wai - Graduates are worse off today in housing affordability.
- Desmond Lee - (In response to Leong Mun Wai) This comparison used “selective” data and failed to account for significant changes in education levels, housing quality, and the maturity of the resale market over the years.
Gathering Data
Going into this, I found that the stats are available but scattered across different sources, which I assume is just how the government operates. For example, data.gov.sg was the first place I looked, but for some reason it only has Graduate Employment Survey (GES) data from 2013-2022. I found a script to parse more recent GES data (credits to hewliyang) here.
I then appended this to the GES data from data.gov.sg here, and also pulled the resale flat prices here.
I added them to separate tables in a SQLite database, and copied them over to a Grafana instance I could play around with.
Note: I ended up deciding to go back and manually get overall median salary and employment figures to get a better picture. I will say it’s quite appalling how the media tries to “hide” the employment figures by tweaking the phrasing about them. For example, in the 2024 article, “more fresh graduates were unemployed — 12.9% in 2024”: I suspect they are trying to avoid mentioning that the employment rate is at an all-time low.
Methodology
Note: Any stats with initial estimated calculations are marked with (Estimated). Everything else is marked with (Exact).
Now, my initial calculations were really, really rough. My defense is that I just need stats that are good enough to paint a picture of trends, not perfectly accurate ones. Compared against published figures, my estimates tend to be underestimates. For example, the CNA article above says that 87.1% were employed, whereas my estimates show around 83.4%. I’ll quickly lay out some of my calculations and justifications below:
This isn’t the best way to calculate median, as degrees with more graduates become underrepresented and degrees with fewer graduates become overrepresented, but I can’t be bothered parsing another dumb PDF file of student intake for every university.
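To make the bias above concrete, here is a rough sketch. The exact estimation formula isn't reproduced in this post, so this assumes the estimate is a plain median over per-degree median salaries. The degree names, salaries and graduate counts are made up purely for illustration, and both function names are hypothetical.

```python
import statistics

# Hypothetical per-degree rows: (degree, median monthly salary, number of graduates).
# These numbers are invented for illustration and are NOT real GES figures.
rows = [
    ("Degree A", 3000, 500),
    ("Degree B", 4000, 50),
    ("Degree C", 5000, 50),
]


def unweighted_median(rows):
    """Median of the per-degree medians, treating every degree equally."""
    return statistics.median(salary for _, salary, _ in rows)


def weighted_median(rows):
    """Median of the per-degree medians, weighted by number of graduates."""
    ordered = sorted(rows, key=lambda row: row[1])
    total = sum(count for _, _, count in ordered)
    running = 0
    for _, salary, count in ordered:
        running += count
        # Stop at the first salary where at least half of all graduates are covered.
        if running * 2 >= total:
            return salary
    raise ValueError("rows must not be empty")


print(unweighted_median(rows), weighted_median(rows))
```

With these made-up rows, the large Degree A cohort pulls the weighted figure down to 3000, while the unweighted figure sits at 4000. Even the weighted version is only an approximation of the true median, since it works from per-degree medians rather than individual salaries.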
The initial plan was to use this estimated data, but after looking at it, I decided that the estimated salary values were far too inaccurate, so I manually went through newspaper articles to get the employment rates and median salaries, going back to 2007.
I’ve marked all the manual stats as (Exact) and the derived ones as (Estimated).
Something I did want to note is that the median salary doesn’t account for unemployment, which is too large to ignore. So I propose a new metric I call expected median salary, which takes into account that your salary while unemployed is 0.
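A minimal sketch of the idea, assuming the metric is simply the median salary scaled by the employment rate (unemployed graduates contribute a salary of 0). The function name and the 4000 salary figure are hypothetical. Only the 87.1% employment rate comes from the article mentioned earlier.

```python
def expected_median_salary(median_salary, employment_rate_pct):
    """Scale the median salary of employed graduates by the employment rate.

    Assumption: an unemployed graduate earns 0, so the "expected" salary is
    median_salary * P(employed).
    """
    if not 0 <= employment_rate_pct <= 100:
        raise ValueError("employment_rate_pct must be between 0 and 100")
    return median_salary * employment_rate_pct / 100


# Hypothetical median salary of 4000 with the 87.1% employment rate quoted above.
print(round(expected_median_salary(4000, 87.1), 2))
```

Strictly speaking this is closer to an expected value than a true median, but it is good enough for comparing years against each other.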
I then just divide the prices of flats by the salaries to get the cost of resale flats as a multiple of (expected) median salary, which I think is a pretty good estimate of how prices have scaled over time.
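As a sketch of that division, assuming salaries are monthly figures that need to be annualised with 12 months per year (no bonuses). The function name and the example numbers are hypothetical, not values from my tables.

```python
def price_to_salary_multiple(median_flat_price, monthly_salary, months_per_year=12):
    """Resale flat price as a multiple of annual salary.

    Assumption: monthly_salary is a monthly figure, annualised by multiplying
    by months_per_year.
    """
    annual_salary = monthly_salary * months_per_year
    if annual_salary <= 0:
        raise ValueError("salary must be positive")
    return median_flat_price / annual_salary


# Hypothetical inputs: a 600,000 flat and a 4,000 monthly salary.
print(price_to_salary_multiple(600_000, 4_000))
```

Passing the expected median salary instead of the plain median salary gives the "expected" version of the same multiple.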
Link to the Grafana Dashboard here
Notable Observations
Notably, the all-time highs of the resale market in 2013 have not been hit again yet. Being relatively ignorant about the housing market back then, I had to Google to fill in my knowledge gaps, but from what I understand, flat prices skyrocketed due to low interest rates and high demand, as well as an extremely limited supply of BTO flats at the time.
Claims
Claim 1: Employment rates for autonomous universities have remained broadly stable across the decade
I don’t really know what “broadly stable” means, so I’m going to arbitrarily define it as a change of no more than 2-3% per year in employment rates. If we look at the year-on-year (YOY) change in employment rates, 2020, 2023 and 2024 are years I would barely consider “broadly stable”, with 2023 dropping by 4.2% and 2024 continuing the decrease in employment rate.
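Here is a small sketch of that check, assuming the YOY change is measured in percentage points and that the input has one rate per consecutive year. The function names, the 3-point threshold and the rates are all hypothetical placeholders, not the actual figures from the dashboard.

```python
def yoy_changes(rates_by_year):
    """Change in employment rate (percentage points) versus the previous entry.

    Assumption: the years are consecutive. A missing year would make this
    compare two non-adjacent years.
    """
    years = sorted(rates_by_year)
    return {
        year: rates_by_year[year] - rates_by_year[prev]
        for prev, year in zip(years, years[1:])
    }


def unstable_years(rates_by_year, threshold=3.0):
    """Years whose absolute YOY change exceeds the chosen threshold."""
    return [
        year
        for year, change in yoy_changes(rates_by_year).items()
        if abs(change) > threshold
    ]


# Hypothetical employment rates in percent, for illustration only.
rates = {2021: 90.0, 2022: 91.0, 2023: 86.5, 2024: 85.0}
print(yoy_changes(rates))
print(unstable_years(rates))
```

The threshold is the arbitrary part: tightening it to 2 points or loosening it to 3 changes which years count as "broadly stable".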
Note: Interestingly, the drop in employment rate in 2023 is worse than 2008, yet nobody has really sounded the alarms yet.
“Broadly stable across the decade” is, I guess, technically correct if you consider the employment stats between 2014 and 2024 in isolation, but I think it’s a bit disingenuous to ignore the 2020-2024 period. Additionally, if you look further back than 2014, full-time employment rates have dropped almost 10% since 2007, the earliest year I could find stats for.
Some other stats I found interesting:
- Graduate employment is at an all-time low (that is, since 2007, the earliest year I could find relevant stats for, so an 18-year low)
- Full-time employment dipped below 80%. This has only happened in 3 years since 2007 (2017, 2020 due to COVID, and 2024).
Claim 2: Graduates are worse off today in housing affordability
In his claim, Leong Mun Wai compares the housing situation in 1979 to now. I couldn’t find employment stats going that far back, but going by stats from 2007 onwards, this is technically false. We were way worse off in 2012 and 2013, although it looks like we’re on track to exceed those years.
Claim 3: This comparison used “selective” data and failed to account for significant changes in education levels, housing quality, and the maturity of the resale market over the years
This is a fair objection to the initial data, but even looking at recent stats, where there haven’t been significant changes in education levels, housing quality or the maturity of the resale market, the stats still don’t paint a good picture. If we were to use the “across the decade” window, the resale flat to median salary ratio is at its highest point of the decade. The only times this ratio has been higher are 2012 and 2013.
My understanding here is also that he is implying he disagrees with the initial claim that graduates are worse off today, even though he doesn’t explicitly say so. I think a good way to measure that is the ratio of monthly mortgage payments to median salary. I think this is a cool metric because:
- This accounts for real world repayments of housing loans. In particular,
  - There is a Loan to Value limit of 75% which is accounted for in this metric
  - The max loan period is also accounted for, being 25 years
- This also accounts for the fact that housing loans explicitly allow a maximum of 30% of your monthly salary to be used as mortgage payments.
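If we look at this metric, the current ratio is just above 30%. In other words, someone earning the median salary cannot service a loan on the median resale flat within the 30% cap, which means that at least 50% of fresh graduates are unable to afford at least 50% of the resale flats on the market (not even counting 5-rooms and above).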
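For reference, the mortgage-to-salary ratio can be sketched with the standard fixed-rate amortisation formula. The 75% Loan to Value limit and 25-year period come from the points above. The 2.6% annual interest rate is an assumption on my part (the post does not state the rate used), and the flat price, salary and function names are hypothetical.

```python
def monthly_mortgage_payment(price, ltv=0.75, annual_rate=0.026, years=25):
    """Fixed monthly repayment on a loan of price * ltv.

    Uses the standard amortisation formula: P * r / (1 - (1 + r) ** -n).
    annual_rate=0.026 is an assumed rate, not a figure from this post.
    """
    principal = price * ltv
    n_payments = years * 12
    monthly_rate = annual_rate / 12
    if monthly_rate == 0:
        # With no interest, the principal is simply spread evenly.
        return principal / n_payments
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)


def mortgage_to_salary_ratio(price, monthly_salary, **loan_kwargs):
    """Share of monthly salary that goes to the mortgage payment."""
    return monthly_mortgage_payment(price, **loan_kwargs) / monthly_salary


# Hypothetical inputs: a 600,000 flat and a 4,000 monthly salary.
ratio = mortgage_to_salary_ratio(600_000, 4_000)
print(f"{ratio:.1%} of salary, cap is 30%: affordable = {ratio <= 0.30}")
```

Any ratio above 0.30 means the loan would breach the 30% cap for a single borrower on that salary.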
Assumptions
The key assumption we have to make while looking at housing data in particular is that there is no significant disparity between BTO prices and resale flat prices. Which there probably is, but I don’t have BTO pricing data to play around with.
Comparison to other countries
So how does our ratio of housing prices to median salary compare to other countries? For this, I manually calculated the ratio of the average/median house price (depending on which stat I could find) to the median annual salary of each city.
- Singapore: 11.6 (it will take 11.6 years of the median salary to pay off the house)
- London: 10.7
- New York: 11.7-13.5 (depending on what source you find)
- Hong Kong: 37.7
- San Fran: 11.88
- Tokyo: 16.1
Appendix
Here are some scripts I used if you want to recreate the data:
- https://github.com/hewliyang/ges-report-to-csv/
- Converting csvs to sqlite (i’m pretty sure there’s an inbuilt command for this but I gpt’ed this)
import sqlite3
import pandas as pd
# Define file paths
csv_file = "ges.csv"
db_file = "ges_database.db"
# Load CSV into DataFrame
df = pd.read_csv(csv_file)
# Connect to SQLite database (or create it)
conn = sqlite3.connect(db_file)
cursor = conn.cursor()
# Define table name
table_name = "ges_data"
# Convert DataFrame to SQLite
df.to_sql(table_name, conn, if_exists="replace", index=False)
# Commit and close connection
conn.commit()
conn.close()
print(f"CSV data successfully imported into {db_file}, table: {table_name}")
- Merging all the resale flat prices into one big csv
import pandas as pd
import glob
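# Get all CSV files in the current directory, skipping the merged output
# so that re-running the script does not merge the previous result into itself
output_file = 'merged_resale_prices.csv'
csv_files = [f for f in glob.glob('*.csv') if f != output_file]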
# Initialize an empty list to store dataframes
dfs = []
# Read each CSV file
for file in csv_files:
    df = pd.read_csv(file)
    # Drop 'remaining_lease' column if it exists
    if 'remaining_lease' in df.columns:
        df = df.drop('remaining_lease', axis=1)
    dfs.append(df)
# Concatenate all dataframes
merged_df = pd.concat(dfs, ignore_index=True)
# Save the merged dataframe to a new CSV file
merged_df.to_csv('merged_resale_prices.csv', index=False)
print(f"Successfully merged {len(csv_files)} CSV files into merged_resale_prices.csv")