The dictionary was never the engine

Updated: March 18, 2026

12 min read

Why every skills taxonomy in production today is a static map for a shifting territory. And what it means for workforce planning.

In 2019, I watched a large European bank run a workforce planning exercise that took eleven months and cost, by their own estimate, roughly €4 million in consultant fees. The output was a spreadsheet of 3,400 job roles, each mapped to a curated list of skills. It was thorough. It was peer-reviewed. It was obsolete before it was printed.

I know this because I built the system that showed them what their workforce actually looked like in real time. The two pictures were so different they initially assumed we had made an error.

We had not made an error.

The popular narrative is that skills intelligence is an information problem. Gather more data. Curate better. Refresh more often. That misses the point entirely.

What we have is an architecture problem. The dominant paradigm for workforce skills intelligence, the taxonomy, is structurally incapable of representing the thing it claims to model. Not because the people building taxonomies are careless.

Because a tree cannot describe a network.

I want to be precise about what I mean, and precise about where my argument reaches the edge of what I can prove.

The three silences of the taxonomy

A taxonomy organizes skills into a hierarchy: parent categories, child categories, leaf nodes. "Technical skills" contains "Software Engineering" which contains "Python." This structure has one significant virtue: it is legible. You can print it. You can browse it. A committee can approve it.

It has three structural deficiencies that no amount of additional data can fix.

First - Distance: In a taxonomy, "Python" is adjacent to "Java" because they share a parent node. But the actual cognitive distance between a backend Python engineer and a machine learning engineer using Python is far greater than the shared language suggests. The ML engineer needs linear algebra, probabilistic reasoning, model evaluation, and an entirely different mental model for how software behaves. Meanwhile, the distance between a Python developer and a skilled SQL analyst may be smaller than the taxonomy implies, because both require the same underlying capacity for structured reasoning about data transformations. The tree cannot represent this. The tree has no metric.

Second - Velocity: A taxonomy entry exists when a human curator adds it. The green economy skills cluster (carbon accounting, lifecycle assessment, ESG financial modeling) was forming in the labor market throughout 2023. Most taxonomies did not name it until 2024. The lag is not a process failure. It is a category error. You cannot add a word to a dictionary before you have decided what the word means. But the labor market does not wait for committees.

Third - Context: A taxonomy lists "communication" once. In reality, communication for a nurse (patient empathy, care coordination, de-escalation in crisis) and communication for a PR executive (crisis messaging, stakeholder narrative, media cultivation) share a label and almost nothing else. The flat entry cannot hold that distinction. It collapses meaning precisely where meaning matters most.

I call these the three silences of the taxonomy: silence on distance, silence on velocity, silence on context. Every workforce planning decision made on top of a taxonomy is made in the presence of these silences. Most planners do not know they are there.

To be clear: the teams that built these taxonomies were not careless. They were executing well against a blueprint that made perfect sense when job roles remained stable for a decade and career paths were linear enough to classify. A static taxonomy was the best available tool for that era. The problem is not that they built the wrong thing. The problem is that the conditions those tools were designed for no longer exist.

What a knowledge graph actually computes

I co-founded SkyHive (now part of Cornerstone) because I believed we could eliminate all three silences by changing the underlying data structure from a static tree to a knowledge graph: a network of nodes and edges where the edges carry mathematical weight, direction, and contextual meaning. The graph does not describe the labor market the way a dictionary describes a language.

It models it the way a physics engine models motion.

Cornerstone’s skills intelligence engine is a knowledge graph that models the labor market as a network of mathematically weighted relationships between skills, roles, and observed workforce transitions, updated daily from millions of global hiring signals.

The difference is operational. In a taxonomy, "Python" is a noun. In the graph, "Python" is a position in a multidimensional embedding space, and its relationship to every adjacent skill is a computed distance. Not a parent-child classification. A measured gap.
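To make the idea concrete, here is a minimal sketch of "skill as a position with a computed distance." The four-dimensional vectors and the skill names are invented for illustration; a production system would learn embeddings with hundreds of dimensions from labor market signals, not hand-assign them.

```python
import math

# Toy skill embeddings (hypothetical values for illustration only).
embeddings = {
    "python_backend": [0.9, 0.2, 0.1, 0.3],
    "python_ml":      [0.6, 0.9, 0.8, 0.2],
    "sql_analysis":   [0.8, 0.3, 0.2, 0.4],
}

def cosine_distance(a, b):
    """1 - cosine similarity: 0 = same direction, 1 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

backend_to_ml = cosine_distance(embeddings["python_backend"], embeddings["python_ml"])
backend_to_sql = cosine_distance(embeddings["python_backend"], embeddings["sql_analysis"])
# With these toy vectors, the backend-to-SQL gap comes out smaller than the
# backend-to-ML gap, even though a taxonomy files both Python roles under the
# same parent node. The tree has no metric; the embedding space does.
```

The point is not the specific numbers. It is that "adjacent" becomes a measured quantity rather than a shared parent node.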

This means three things that matter for workforce planning.

It means we can give you a number, not a verdict. When a registered nurse matches 68% against a nurse practitioner role, that is not a classification. That is a distance. Seven specific micro-capabilities are missing. The upskilling path is fourteen months. Those are computed outputs, not editorial judgments.

For a COO, that changes everything about the decision. A computed 14-month upskilling path with seven identified gaps is not a talent metric. It is a financial input. It tells you whether to spend six to nine months retraining an internal cohort or twelve months recruiting externally at two to three times the cost.

It converts "we have a skills gap" from a slide deck abstraction into a build-versus-buy calculation with actual numbers attached.
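As a sketch of what that calculation looks like, here is the build-versus-buy arithmetic with hypothetical inputs. Every figure (headcount, costs, readiness share) is an assumption invented for illustration, not data from the bank case or from Cornerstone.

```python
# Hypothetical build-vs-buy inputs; all figures are illustrative assumptions.
open_roles = 40
external_cost_per_hire = 180_000   # assumed salary premium + fees + ramp-up
internal_training_cost = 60_000    # assumed structured upskilling per person
internal_ready_share = 0.7         # fraction whose computed gap is closable

internal_hires = round(open_roles * internal_ready_share)
external_hires = open_roles - internal_hires

# Blended strategy: retrain the closable cohort, hire externally for the rest.
blended_cost = (internal_hires * internal_training_cost
                + external_hires * external_cost_per_hire)
all_external_cost = open_roles * external_cost_per_hire

savings = all_external_cost - blended_cost
```

With these toy inputs, the blended strategy costs less than half of hiring everyone externally. The computed skill gaps are what make `internal_ready_share` a measured number instead of a guess.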

In the European bank case, this kind of precision reduced their projected external hiring spend by over 30% in the first wave of redeployment. They could see which internal candidates had a narrow enough gap to close through structured upskilling, and they stopped paying the open market for talent they already had.

It means the graph updates before you ask it to. The system ingests millions of global job postings and hiring signals daily. When prompt engineering began its rise from niche technical curiosity to core enterprise requirement between 2022 and 2025, the graph tracked the velocity of that shift in real time. It upgraded weights, remapped adjacencies, and recalculated upskilling paths before any HR committee had scheduled a meeting to discuss it.

And it means the same label carries different meaning in different contexts. The graph does not compute "communication" once. It computes it as dozens of contextually distinct nodes, each with its own weight distribution and transition probabilities depending on the role cluster it inhabits.
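A toy slice of such a graph makes the structure clear. The node keys, role clusters, and edge weights below are illustrative assumptions; the only point is that a flat label resolves to different nodes with different neighborhoods depending on context.

```python
# Hypothetical graph fragment: one label, context-specific nodes.
graph = {
    "communication@nursing": {
        "adjacent": {"patient_empathy": 0.9, "care_coordination": 0.8,
                     "crisis_deescalation": 0.7},
    },
    "communication@public_relations": {
        "adjacent": {"crisis_messaging": 0.9, "stakeholder_narrative": 0.8,
                     "media_cultivation": 0.7},
    },
}

def resolve(skill_label, role_cluster):
    """Look up the context-specific node for a flat skill label."""
    return graph.get(f"{skill_label}@{role_cluster}")

# Same label, disjoint neighborhoods: the two contexts share no adjacent skills.
nursing = set(resolve("communication", "nursing")["adjacent"])
pr = set(resolve("communication", "public_relations")["adjacent"])
```

A taxonomy stores one "communication" entry; the graph stores one per role cluster, each with its own weighted edges.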

An LLM knows how human beings talk. Cornerstone’s skills engine knows how human beings work.

I want to be careful here. I am not claiming the graph is complete. It reflects what is observable in structured labor market signals: job postings, hiring patterns, role transitions, patent filings, academic output. It does not capture tacit knowledge, informal credentialing, or the organizational dynamics that shape actual career movement. These are real limitations. I will return to them.

Why LLM-only approaches fail on different terms

In the last two years, several vendors have taken a different path: use a large language model to infer skills relationships from text. The approach is seductive because it scales. It does not require a graph infrastructure. It does not require observed transition data. It requires only text, and there is plenty of text.

The failure mode of LLM-only skills mapping is specific. I call it the Fluency Trap: the false assumption that linguistic co-occurrence in an AI model equates to actual skills adjacency in the labor market.

LLMs are trained on language. They know that "data science" and "Python" appear together frequently in text, so they infer a relationship. They know that "leadership" and "strategy" are semantically proximate. They can generate a career path from junior analyst to chief data officer that is grammatically coherent and structurally plausible. But linguistic co-occurrence is not labor market reality. The path that sounds right and the path that is right are different paths, and the model cannot distinguish between them.

This should be called out for what it really is: a generative AI hallucination. These are career path recommendations that are grammatically coherent and structurally plausible, but trace no viable, observed path through the real workforce.

Here is what that looks like in practice. An LLM might look at a Customer Success Manager and a Sales Engineer and suggest a fluid transition because the words client-facing, onboarding, and product overlap heavily in their descriptions. But the graph knows the observed transition probability is near zero without an intervening technical certification. The language says these roles are neighbors. The labor market says they are not. The LLM hallucinates a bridge that does not exist.
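The gap between the two signals can be sketched directly. Below, the term sets and the transition counts are invented for illustration; the contrast they produce is the structural point, not the specific values.

```python
# Signal 1: vocabulary overlap between the two role descriptions, which is
# roughly what a language-only system rewards. All term sets are invented.
csm_terms = {"client-facing", "onboarding", "product", "renewals", "stakeholders"}
se_terms  = {"client-facing", "onboarding", "product", "demos", "stakeholders", "apis"}
semantic_overlap = len(csm_terms & se_terms) / len(csm_terms | se_terms)  # Jaccard

# Signal 2: how often people actually made the move, which is what a
# transition graph measures. Counts are invented for illustration.
observed_moves = {("customer_success_mgr", "sales_engineer"): 3}
total_departures = {"customer_success_mgr": 1200}
transition_prob = (observed_moves[("customer_success_mgr", "sales_engineer")]
                   / total_departures["customer_success_mgr"])

# The language signal says "neighbors" (overlap well above 0.5); the movement
# signal says "rarely" (a fraction of a percent). These disagree, and only the
# second one reflects what the labor market actually does.
```

A system whose core adjacency signal is the first number recommends the bridge; a system grounded in the second number knows the bridge is not there.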

I should be honest about the competitive landscape. Some vendors in this space do more than raw LLM text inference. They train deep learning models on large profile datasets, sometimes combined with job posting analysis or historical hiring patterns. That is more sophisticated than pure language-based inference, and I am not going to pretend otherwise. But the core question still separates the approaches: does the system model skills adjacency based on how people describe work in text, or based on how people actually move through the workforce? Semantic proximity versus observed transition probability. That is the fault line, regardless of how much data sits behind the model.

You cannot fix this within the LLM-only paradigm. A larger model will not solve it. Better prompts will not solve it. The model is optimizing for linguistic plausibility. The workforce is optimizing for economic survival. These are different objective functions.

At Cornerstone, we use generative AI for the interface: the conversational layer, the natural language outputs, the narrative interpretation. The underlying recommendations come from the knowledge graph, grounded in observed transitions. The graph anchors the AI to what actually happened, not to what sounds like it might have happened.

Not fluency. Fidelity.

The legitimate counterargument

I want to name the strongest version of the objection, because I think it is legitimate.

The objection is this: the knowledge graph is only as good as the data it ingests. Job postings and hiring signals capture formal labor market activity. They miss the informal economy, the credential substitutions, the internal mobility paths that never surface in external postings, the career moves that happen through networks rather than applications. If the graph is built on what is visible, it inherits the biases of what is visible.

This is true. I do not have a clean answer to it. What I have is a directional one.

The graph is more complete than the taxonomy for the same reason that a live flight tracker is more complete than a printed timetable. Not because it is perfect, but because it updates in response to reality rather than in response to committees. The appropriate comparison is not "knowledge graph versus ground truth." It is "knowledge graph versus the alternative you are actually using."

The alternative most organizations are using is a taxonomy last reviewed eighteen months ago by a team that had never spoken to a hiring manager.

What this means for what you are building

Here is the comparison that matters for anyone currently evaluating skills intelligence platforms.

| Capability | Static taxonomies | LLM / semantic inference systems | Cornerstone’s skills engine |
| --- | --- | --- | --- |
| Business outcome | Delayed hiring decisions, recruitment spend often misaligned with actual gaps | Unvalidated talent deployment, unpredictable redeployment ROI | Precision upskilling, quantified build-vs-buy decisions, accelerated internal mobility |
| Update frequency | Bi-weekly / manual | Periodic model retrain | Continuous, daily ingest |
| Context awareness | None. Flat label. | Partial. Inferred from text. | Full. Same skill, different meaning per role cluster. |
| Transition data | Not modeled | Inferred from linguistic similarity (semantic proximity, not observed movement) | Observed real workforce movements and hiring outcomes |
| Emerging skills | Added after human curation | Requires model retrain | Detected from signal patterns before named by any committee |
| Gap measurement | Qualified / not qualified | Plausible-sounding paths | Precise mathematical distance with identified micro-gaps |

The table is honest but it requires interpretation. "Continuous daily ingest" does not mean omniscience. "Detected before named by any committee" does not mean the detection is always correct. Early cluster detection involves signal extraction from noisy data, and we have false positives. What the graph provides is the earliest available warning from structured labor market signals. Not a crystal ball. A seismograph.

I should also be direct about the middle column. It is not monolithic. Some platforms in that space combine language models with profile data, hiring history, or labor market feeds. The question is not whether they use data beyond text. The question is whether their core inference mechanism models observed workforce transitions or infers adjacency from language patterns. That is where the approaches diverge.

The question behind the question

When HR leaders ask me whether they should replace their taxonomy with a graph-based system, they are usually asking a smaller question than they think they are.

The smaller question is: will this be more accurate? The answer is yes, with the caveats I have named.

The larger question, the one worth asking, is what it means to make workforce planning decisions using a model of the labor market that was designed for the conditions of 1995. When roles were stable enough to be classified. When skills changed slowly enough to be curated. When career paths were linear enough to be mapped on a tree.

Those conditions do not exist anymore. AI is restructuring job roles faster than any committee can track. The half-life of a relevant skill in technical domains is now measured in months. Career transitions that would have seemed lateral a decade ago are now the primary path to the most in-demand roles. The engineer who moves from backend systems to machine learning infrastructure is not getting a promotion or making a pivot. That is a drift through a continuous skill landscape, and the landscape is moving.

A taxonomy cannot model drift. It can only model positions.

I am not arguing that every organization needs to rip out their current system tomorrow. I am arguing that every organization needs to understand what their current system cannot see, and make decisions with that blindness accounted for.

The three silences of the taxonomy, on distance, on velocity, on context, are not inconveniences. They are the dimensions along which the next five years of workforce transformation will actually move.

Most of the planning I see is happening in a system that is silent on all three.

The European bank I mentioned at the beginning learned this the hard way. They spent €4 million and eleven months to produce a snapshot of exactly where their workforce was standing six months earlier. But workforce planning is not about knowing where your people are standing. It is about knowing where they are capable of moving next.

It is time to stop drawing maps. And start building the engine.
