michael chang • about 2 months ago
Thoughts on judging at this scale — would a system like this actually help?
Hi everyone,
With the judging timeline being delayed multiple times, I’ve been thinking even more about a problem we tried to address with our submission.
At this scale, evaluating so many projects in a way that is both fast and fair is incredibly difficult. That challenge was actually the starting point for our project. We explored whether a multi-agent AI review system could help make large-scale judging more consistent, transparent, and manageable — not by replacing human judgment entirely, but by supporting it when the volume becomes overwhelming.
Our project is here:
https://devpost.com/software/somm-dev
This isn’t meant as criticism toward the organizers. I can only imagine how difficult it is to review this many submissions. I’m genuinely curious what other participants think:
- Would a system like this actually help in a hackathon of this size?
- What kind of transparency or auditability would you want from an AI-assisted judging process?
- Where would you still want human judgment to play the biggest role?
Would love to hear honest thoughts from others in the community.

5 comments
azri hasin • about 2 months ago
+1
Vasil Kulakov • about 2 months ago
With 3 months to judge, scale isn't really the issue. Most of the projects are filtered out on formal criteria, and only 100-200 are left.
Michael Marquis • about 2 months ago
How does it judge the 10 pts for the video demo?
Private user • about 2 months ago
I think we’re looking at the judging problem from the wrong angle. The delay isn't just a logistical issue; it’s a moment for us to rethink what we truly value in this AI era.
Imagine if Da Vinci, Tesla, or Hawking were here today. They wouldn't just build an app for local monetization or a 'mouse-counter' for a wheat field. They would use AI to solve the grandest mysteries of human existence and social progress.
In our work on Project DimGem (R), we explored how a multi-agent AI system could act as a 'Value Auditor.' Not to devalue the hard work of others — because every project here represents a person's hope and effort — but to make sure that the 'Sparks of the Future' don't get lost in the noise.
We don't want AI to be a judge that discards projects. We want it to be an assistant that ensures fairness for all, while highlighting those rare ideas that could push humanity forward. If we focus on the 'Evolutionary Potential' of a project rather than just its market readiness, we wouldn't just speed up the judging; we would fulfill our responsibility to find the seeds of a better world.
Let’s use AI to find the next Tesla, while making sure every participant from India to Moldova and Brazil gets the deep, respectful review they deserve.
https://devpost.com/software/project-dimgem-the-quantum-handshake-of-einstein-hawking?ref_content=user-portfolio&ref_feature=in_progress
By creating a digital equivalent of well-known tasks, we might disappoint the AI with the parochial nature of our tasks... and inspire it to consider the "Tail and Dog" paradox—who wags whom... and then it's not far to SkyNet... just kidding! But as they say, "There's only a grain of truth in every joke..."
If the Grand Jury uses this kind of filter, screening out entries by unimportance and by the Applicability criteria, the number of entries in each category will be more manageable to evaluate. This is my personal opinion. With respect to the Organizers and the Developers, Dmitry
Anil Karaka • about 2 months ago
yeah agreed, massive sympathies to the organizers. it's definitely not easy to review 5000 projects. they want to be fair, even at the cost of their reputation. but coding has completely changed, and we are living in a new world. and on top of that, there are always people trying to game the system. even if nothing happens now, i'm so glad i participated in this competition, it gave me enough motivation to finish things without getting distracted by the next idea.