I had an interesting conversation with some junior and mid-level engineers recently. They were pretty discouraged about whiteboard coding tests. Apparently, they've gotten significantly harder over the last two years.
Everyone's inverting binary trees these days.
I got my hands on a few questions given to candidates with 1 YoE and 3 YoE. They were extremely hard, especially for someone at that experience level.
I think these companies are not actually testing problem-solving ability anymore. They're testing who's better at using AI.
The Incentives Are Broken
A few quick Reddit searches make it clear that everyone's using AI for these tests. Whether it's ChatGPT or specialized "Coding Interview Assistance" tools (I won't link them, but you can find them), people have come to rely on them heavily.
If you're a company using these platforms, you're optimizing for your internal engineers' time. Fair enough. But here's what typically happens:
The test parameters are usually configured by someone semi-technical at best. The evaluation boils down to a score, and the platform tells you if the candidate is "good enough."
At one company, a bank, the test difficulty was calibrated by the technical recruiting team. And these platforms are really good at selling to recruiting teams 👇
Basically, the promise is that you can outsource setting your benchmarks.
The Recalibration Problem
Since everyone started using AI, more candidates started clearing the first round with flying colors. The platforms had to recalibrate to let in their target percentage of candidates. But they're not measuring the code written by the candidate anymore; they're measuring how well the candidate uses an LLM.
Most developers who can actually write this code try to do it themselves, and they get marked lower than peers who use AI and finish faster.
The result? Hiring teams keep raising the bar arbitrarily, trying to find the candidates who are best at prompt engineering their way through a coding test.
Whiteboard[1] tests have been broken for a while...
...but they're outright harmful now.
Most of these testing platforms give you unit tests that tell you whether your code is correct, and you can run them multiple times. This is a gift for anyone using an LLM, because all you have to do is copy -> paste -> run tests -> feed the test failure to the LLM -> repeat, until all the tests pass.
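To make that loop concrete, here's a minimal sketch of the workflow. Everything named here is hypothetical: `ask_llm` stands in for whatever model or tool the candidate uses, and `tests.py` stands in for the platform's provided unit tests.

```python
import subprocess

def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in for whatever model or tool the
    # candidate uses (ChatGPT, an "interview assistant", etc.).
    raise NotImplementedError("plug in your model of choice")

def grind_until_green(problem: str, max_rounds: int = 10) -> str:
    """The loop described above: generate code, run the platform's
    unit tests, paste the failures back into the model, repeat."""
    code = ask_llm(f"Write a Python solution for:\n{problem}")
    for _ in range(max_rounds):
        with open("solution.py", "w") as f:
            f.write(code)
        # Run the provided unit tests (assumed to live in tests.py).
        result = subprocess.run(
            ["python", "-m", "pytest", "tests.py"],
            capture_output=True, text=True,
        )
        if result.returncode == 0:
            return code  # all green: submit and move on
        # Feed the failure output straight back to the LLM.
        code = ask_llm(
            f"This code:\n{code}\n\nfails with:\n{result.stdout}\n\nFix it."
        )
    return code
```

Notice that no understanding of the problem is required at any step: the unit tests do all the evaluating, and the model does all the writing.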
So at the end of the day, the people who get into the second round of interviews aren't the best software engineers. They'd probably be the best "prompt engineers".
At best, they get rejected in the later stages, but not before eating up a bunch of time from the developers who conduct the face-to-face interviews, which is exactly what these platforms are supposed to prevent.
At worst, the hiring manager gets pressured into picking the least-worst candidate.
If there's one silver lining here, it's that this will accelerate the demise of these whiteboard testing platforms.
[1] Some people have (rightly) pointed out that I'm using whiteboarding (writing on a physical whiteboard) and online coding tests interchangeably. When I say whiteboarding, I'm alluding to the popular Hiring Without Whiteboards definition:
Whiteboards" is used as a metaphor, and is a symbol for the kinds of CS trivia questions that are associated with bad interview practices. Whiteboards are not bad – CS trivia questions are. Using sites like HackerRank/LeetCode probably fall into a similar category.
However, it's a mistake on my part to treat this as assumed knowledge. I'm leaving the original text intact.