I’ve been using 3-4 different LLMs and find the most reliable way to get accurate answers is to pit them against each other for fact checking and the like, which led me to wonder:
Is there any service that will automatically collect feedback from other LLMs and feed it to Gemini for a final product?
For example, I recently researched some case law. Gemini gave the best output, but Grok Heavy found some things Gemini didn’t while also leaving out others. I sent Gemini’s findings to Grok and asked it to stress test the claims and audit the findings, and Grok’s answer improved. I sent Grok’s output back to Gemini, same deal. Then I sent the result to Claude, had Grok and Gemini analyze Claude’s response, and the best end product once again came from Gemini.
So I’ve found this filtering process very effective, but does anyone know of a tool that will QA outputs like this automatically? If not, how hard would it realistically be to build?
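
For what it’s worth, here is a rough Python sketch of the loop I’m doing by hand. Everything in it is made up for illustration: `cross_review`, the `Model` type and the stub functions are hypothetical names, and each "model" is just a function from prompt text to reply text, standing in for whatever API client you would actually call. It is not how any existing tool works, only the shape of the loop: the lead model drafts, each reviewer audits, and the lead revises after every audit.

```python
from typing import Callable, Dict

# Assumption: a "model" is any function that takes a prompt and returns text.
# Real use would wrap each vendor's API client; nothing is called over the network here.
Model = Callable[[str], str]


def cross_review(
    question: str,
    lead: Model,
    reviewers: Dict[str, Model],
    rounds: int = 1,
) -> str:
    """Draft with the lead model, let each reviewer audit, and have the lead revise."""
    draft = lead(question)
    for _ in range(rounds):
        for name, reviewer in reviewers.items():
            # Ask the reviewer to stress test the current draft.
            critique = reviewer(
                "Stress test the claims below and list errors or omissions.\n\n"
                f"Question: {question}\n\nDraft:\n{draft}"
            )
            # Feed the audit back to the lead model for a revised draft.
            draft = lead(
                f"Question: {question}\n\nYour draft:\n{draft}\n\n"
                f"Audit from {name}:\n{critique}\n\n"
                "Revise the draft, keeping only claims you can support."
            )
    return draft


# Stub models so the sketch runs offline (placeholders, not real LLM calls).
def stub_lead(prompt: str) -> str:
    return "revised draft" if "Audit from" in prompt else "first draft"


def stub_reviewer(prompt: str) -> str:
    return "missing a counter-argument"


if __name__ == "__main__":
    final = cross_review(
        "What does the case law say?",
        stub_lead,
        {"reviewer_a": stub_reviewer, "reviewer_b": stub_reviewer},
    )
    print(final)
```

Swapping the stubs for real API wrappers is the easy part. The harder parts are probably cost, rate limits, and deciding when the reviewers have stopped adding anything useful, which this sketch sidesteps with a fixed `rounds` count.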