
I am currently taking an AWS Cloud Solutions Architect course and had an assignment to design a client's architecture migration to the cloud (Web App + Hadoop). I spent hours brainstorming the solution with Grok, using Socratic questioning to refine every detail. Once the requirements were locked in, I decided to see which LLMs could take those text requirements and output a raw, fully functional .drawio file in a single pass.
Problem Requirements
The prompt was not pro-level, but it was rigorously organized, using tactical constraints to define the topology. I fed this exact same prompt to the highest-reasoning models available on the free tiers of Gemini, DeepSeek, ChatGPT, Claude, Mistral, and Grok.
The goal was to produce a visual schema that accurately reflected every detail of the architecture plan we had designed, including topology, subnets, and component relationships, without needing follow-up prompts.
(Note: I haven't included the full prompt here to keep the post readable, but I’m happy to share the specific assignment details and the exact prompt input if you're interested.)
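For readers unfamiliar with the format: a .drawio file is just XML in draw.io's mxGraph schema, which is what made a "single pass" test feasible. A minimal sketch of the kind of file the models were asked to emit, built here with Python's standard library (all ids, labels, styles, and coordinates are illustrative placeholders, not from my actual prompt):

```python
import xml.etree.ElementTree as ET

# Minimal .drawio skeleton: an <mxfile> wrapping an <mxGraphModel>.
# draw.io expects two reserved cells (id 0 and 1) before any real shapes.
mxfile = ET.Element("mxfile")
diagram = ET.SubElement(mxfile, "diagram", name="architecture")
model = ET.SubElement(diagram, "mxGraphModel")
root = ET.SubElement(model, "root")
ET.SubElement(root, "mxCell", id="0")
ET.SubElement(root, "mxCell", id="1", parent="0")

# A node is a cell with vertex="1" plus an mxGeometry child for placement.
vpc = ET.SubElement(root, "mxCell", id="vpc", value="VPC 10.0.0.0/16",
                    style="rounded=1", vertex="1", parent="1")
ET.SubElement(vpc, "mxGeometry", x="40", y="40", width="200", height="120",
              **{"as": "geometry"})  # "as" is a Python keyword, hence the unpack

web = ET.SubElement(root, "mxCell", id="web", value="Web tier",
                    style="rounded=1", vertex="1", parent="1")
ET.SubElement(web, "mxGeometry", x="300", y="40", width="160", height="80",
              **{"as": "geometry"})

# An edge is a cell with edge="1" whose source/target reference cell ids.
edge = ET.SubElement(root, "mxCell", id="e1", edge="1", parent="1",
                     source="vpc", target="web")
ET.SubElement(edge, "mxGeometry", **{"as": "geometry"})

xml_text = ET.tostring(mxfile, encoding="unicode")
print(xml_text[:80])
```

A full architecture diagram is just hundreds of these vertex and edge cells, which is exactly why malformed syntax anywhere kills the whole file.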
Result Comparison
Here is how they performed, ranked from best to worst:
1. Gemini
- Performance: Exceptional.
- Details: It was the clear winner. It produced a visually precise diagram using correct AWS icons and clearly documented relationships. While it didn't strictly visualize the multi-AZ implementation, it correctly labeled every subnet as "multi-az". The output was appealing, complete, and technically usable.
- Rating: 8/10
- Summary: Very acceptable draft – not production ready
- See diagram snapshot here
2. DeepSeek
- Performance: Strong logic, poor implementation.
- Details: Unexpectedly strong on technical details. It created a very clear topology, dedicating specific subnets to Availability Zones with correct IP addressing. Visually, however, it was a mess: it missed almost all AWS icons, and the relationship arrows were scattered and overlapping, making the diagram hard to parse.
- Rating: 5/10
- Summary: Ugly and confusing visuals but strong logic and topology
- See diagram snapshot here
3. Grok
- Performance: Disorganized.
- Details: This was disappointing given that I used Grok to brainstorm the original idea. The diagram was poorly organized, visually weak, and ignored several details we had just discussed in the chat context.
- Rating: 3/10
- Summary: Disappointing mess – not worth fixing; better to start from scratch myself
- See diagram snapshot here
4. ChatGPT
- Performance: Broken.
- Details: It returned XML that didn't even parse. The syntax was flawed, rendering the file useless.
- Rating: 1/10
- Summary: Syntax Error
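This failure mode is trivial to catch before ever opening draw.io. A quick sanity check I now run on any model-generated file, using only Python's standard library (function name and sample strings are mine, and this checks well-formedness only, not whether draw.io will render the result sensibly):

```python
import xml.etree.ElementTree as ET

def is_valid_drawio(xml_text: str) -> bool:
    """Return True if the text is well-formed XML containing an mxGraphModel."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # exactly the failure mode ChatGPT's output hit
    # Accept either a bare <mxGraphModel> or one wrapped in <mxfile>/<diagram>.
    return root.tag == "mxGraphModel" or root.find(".//mxGraphModel") is not None

print(is_valid_drawio("<mxfile><diagram><mxGraphModel/></diagram></mxfile>"))  # True
print(is_valid_drawio("<mxfile><diagram>"))  # False: unclosed tags
```

One caveat: some draw.io exports store the diagram as compressed base64 inside `<diagram>`, so a stricter check would need to inflate that payload first; for raw LLM output, plain XML parsing is enough.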
5. Mistral
- Performance: Refusal.
- Details: It didn't return code at all. It simply gave me text suggestions on how I should implement it myself.
- Rating: 1/10
- Summary: Lazy Refusal
6. Claude
- Performance: Incomplete / Timeout.
- Details: The most frustrating experience. It started writing the code, thought for a long time, and then hit an "execution paused" error after exhausting its output limit, even though the total context was well under 20k tokens.
- Rating: 0/10
- Summary: System Failure
Verdict
Gemini was the only model capable of producing a workable diagram schema in a single shot (not production level, though). DeepSeek proved it has the reasoning engine to compete on logic but lacks the visual/layout understanding to organize the output. The others totally failed the assignment.
Summary:
- Gemini demonstrated reasoning capabilities that frankly amazed me. I am seeing genuine productivity improvements here compared to other models for structural tasks.
- DeepSeek is very capable. If you need raw logic and networking structure, it works, but don't expect it to look good.
- ChatGPT is becoming sloppy. It hallucinates more frequently and struggles to deliver complex syntax without errors. I find myself moving away from it more each day.
- Claude is unusable for this type of task on the free tier. If I can't even get a full response to verify the model's capability, I will never convert to a paid Pro tier.
