Google’s Vantage Experiment: Using GenAI to Grade Soft Skills Like Critical Thinking

Google Research just dropped something interesting: a research experiment called Vantage that uses generative AI to assess what it calls “future-ready” skills. Think critical thinking, collaboration, and creative thinking, the bucket of durable competencies that won’t be automated away.

They partnered with NYU on this, and the results are surprisingly solid. The AI scoring was on par with human experts in their study. Vantage is now available on Google Labs for sign-up, at least in English.

The hard problem of measuring soft skills

Here’s the thing about future-ready skills: everyone agrees they matter, but nobody has figured out how to test them at scale. Multiple choice won’t cut it when you’re trying to assess whether someone can build on others’ ideas or handle conflict in a group setting.

Standardized tests are too rigid. Real human interaction is the gold standard, but good luck standardizing that across hundreds of students. How do you fairly grade someone’s conflict resolution skills if their group never disagrees? Or their creative collaboration if everyone just nods at the first idea?

This is the problem Vantage is trying to solve. Instead of forcing students into artificial test scenarios, it drops them into simulated conversations with AI avatars. Think preparing for a debate or pitching a creative vision — messy, realistic scenarios where skills actually get exercised.

How Vantage works

The setup is clever. You’re placed in a multi-party conversation with AI avatars. An “Executive LLM” runs the show, using a rubric to steer the conversation. It dynamically introduces challenges — pushing back on an idea, creating conflict — to give you opportunities to demonstrate your skills.

It’s basically an adaptive assessment engine that keeps the dialogue going until it has enough information to grade you. The AI avatars aren’t just passive listeners; they’re active participants designed to elicit specific behaviors.
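To make that loop concrete, here is a minimal sketch of how I picture it. This is my own toy version, not Google's implementation: no model is called, and the rubric entries, the `Executive` class, the `min_evidence` threshold, and the `respond` callback are all hypothetical names I made up for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical rubric: skill -> a challenge an avatar could raise to elicit it.
RUBRIC = {
    "conflict_resolution": "An avatar pushes back hard on your proposal.",
    "building_on_ideas": "An avatar offers a half-formed idea and asks for input.",
}


@dataclass
class Executive:
    """Toy stand-in for the 'Executive LLM'; it only tracks evidence counts."""
    min_evidence: int = 2  # assumed threshold, not taken from the source
    evidence: dict = field(default_factory=lambda: {s: 0 for s in RUBRIC})

    def next_skill(self):
        # Target the skill with the least evidence so far; None when all are covered.
        missing = {s: n for s, n in self.evidence.items() if n < self.min_evidence}
        if not missing:
            return None
        return min(missing, key=missing.get)

    def record(self, skill, demonstrated):
        # Count a turn as evidence only if the skill was actually shown.
        if demonstrated:
            self.evidence[skill] += 1


def run_session(executive, respond, max_turns=10):
    """Keep the dialogue going until every skill has enough evidence or turns run out."""
    transcript = []
    for _ in range(max_turns):
        skill = executive.next_skill()
        if skill is None:
            break
        challenge = RUBRIC[skill]
        # `respond` stands in for the student's turn plus whatever judges it.
        demonstrated = respond(skill, challenge)
        executive.record(skill, demonstrated)
        transcript.append((skill, demonstrated))
    return transcript
```

Calling `run_session(Executive(), lambda skill, challenge: True)` simulates a student who shows the targeted skill on every turn, so the session ends as soon as each skill has two pieces of evidence. The `max_turns` cap is there because a student who never shows a skill would otherwise keep the loop running forever.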

I’ve seen similar approaches tried before — AI-driven roleplay for training — but this one is specifically built for assessment, not just practice. That’s a meaningful distinction.

The methodology borrows from how we assess core academic subjects like math or science. Same systematic approach, just applied to squishier skills.

Is this actually useful?

Look, I’m not going to pretend this is perfect. The idea of an AI grading your “soft skills” makes me uncomfortable in ways I can’t fully articulate. There’s something fundamentally human about these competencies that resists quantification.

But here’s the reality: schools are already bad at teaching and measuring these skills. Something is better than nothing, especially if it gives students a sandbox to practice in a low-stakes environment before they’re in real situations.

The fact that the AI scoring matched human experts is promising, but I’d want to see more independent validation. Google’s own research showing that its own tool works is table stakes, not proof.
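I don't know which agreement statistic the study used, so treat this as an assumption on my part: one common way to check whether two raters "match" is Cohen's kappa, which corrects raw agreement for the agreement you'd expect by chance. A small sketch, with made-up rubric scores that come from me and not from the study:

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical scores to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of items")
    n = len(rater_a)
    # Fraction of items where both raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's score distribution.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters always give the same single score
    return (observed - expected) / (1 - expected)


# Illustrative scores only (1-3 rubric levels), invented for this example.
human_scores = [1, 2, 3, 3, 2, 1]
ai_scores = [1, 2, 3, 2, 2, 1]
kappa = cohens_kappa(human_scores, ai_scores)
```

A kappa near 1 means the two raters agree far more than chance would predict; a kappa near 0 means the agreement is about what you'd get from guessing with the same score distributions. That is the kind of number I'd want third parties to reproduce.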

The bigger picture

What’s interesting to me is how this fits into the broader trend of AI being used for education beyond just content delivery. We’ve seen AI tutors, AI writing assistants, now AI assessors for skills that were previously considered unmeasurable at scale.

The OECD and WEF have been pushing frameworks around future-ready skills for years. The question has always been how to operationalize them. Vantage is one answer, but it raises uncomfortable questions about what we lose when we let machines judge human interaction.

For now, Vantage is a research experiment. Sign up on Google Labs if you’re curious. But I’d approach the results with healthy skepticism until we see third-party studies and real-world deployment data.

One thing’s for sure: this space is going to get a lot more attention as AI continues to evolve. Whether that’s a good thing depends on how thoughtfully we implement it.
