
I am currently taking an AWS Cloud Solutions Architect course and had an assignment to design a client's architecture migration to the cloud (Web App + Hadoop). I spent hours brainstorming the solution with Grok, using Socratic questioning to refine every detail. Once the requirements were locked in, I decided to see which LLMs could take those text requirements and output a raw, fully functional .drawio file in a single pass.
Problem Requirements
The prompt was not pro-level, but it was rigorously organized, using tactical constraints to define the topology. I fed this exact same prompt to the highest-reasoning models available on the free tiers of Gemini, DeepSeek, ChatGPT, Claude, Mistral, and Grok.
The goal was to produce a visual schema that accurately reflected every detail of the architecture plan we had designed, including topology, subnets, and component relationships, without needing follow-up prompts.
(Note: I haven't included the full prompt here to keep the post readable, but I’m happy to share the specific assignment details and the exact prompt input if you're interested.)
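For readers unfamiliar with the format: a .drawio file is just XML in draw.io's mxGraph schema, which is what made a "single pass" test feasible. A minimal sketch of the kind of file the models were asked to emit, built here with Python's standard library (all ids, labels, styles, and coordinates are illustrative placeholders, not from my actual prompt):

```python
import xml.etree.ElementTree as ET

# Minimal .drawio skeleton: an <mxfile> wrapping an <mxGraphModel>.
# draw.io expects two reserved cells (id 0 and 1) before any real shapes.
mxfile = ET.Element("mxfile")
diagram = ET.SubElement(mxfile, "diagram", name="architecture")
model = ET.SubElement(diagram, "mxGraphModel")
root = ET.SubElement(model, "root")
ET.SubElement(root, "mxCell", id="0")
ET.SubElement(root, "mxCell", id="1", parent="0")

# A node is a cell with vertex="1" plus an mxGeometry child for placement.
vpc = ET.SubElement(root, "mxCell", id="vpc", value="VPC 10.0.0.0/16",
                    style="rounded=1", vertex="1", parent="1")
ET.SubElement(vpc, "mxGeometry", x="40", y="40", width="200", height="120",
              **{"as": "geometry"})  # "as" is a Python keyword, hence the unpack

web = ET.SubElement(root, "mxCell", id="web", value="Web tier",
                    style="rounded=1", vertex="1", parent="1")
ET.SubElement(web, "mxGeometry", x="300", y="40", width="160", height="80",
              **{"as": "geometry"})

# An edge is a cell with edge="1" whose source/target reference cell ids.
edge = ET.SubElement(root, "mxCell", id="e1", edge="1", parent="1",
                     source="vpc", target="web")
ET.SubElement(edge, "mxGeometry", **{"as": "geometry"})

xml_text = ET.tostring(mxfile, encoding="unicode")
print(xml_text[:80])
```

A full architecture diagram is just hundreds of these vertex and edge cells, which is exactly why malformed syntax anywhere kills the whole file.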
Result Comparison
Here is how they performed, ranked from best to worst:
1. Gemini
- Performance: Exceptional.
- Details: It was the clear winner. It produced a visually precise diagram using correct AWS icons and clearly documented relationships. While it didn't strictly visualize the multi-AZ implementation, it correctly labeled every subnet as "multi-az". The output was appealing, complete, and technically usable.
- Rating: 8/10
- Summary: Very acceptable draft – not production ready
- See diagram snapshot here
2. DeepSeek
- Performance: Strong logic, poor implementation.
- Details: Unexpectedly strong on technical details. It created a very clear topology, dedicating specific subnets to Availability Zones with correct IP addressing. Visually, however, it was a mess: it missed almost all AWS icons, and the relationship arrows were scattered and overlapping, making the diagram hard to parse.
- Rating: 5/10
- Summary: Ugly and confusing visuals but strong logic and topology
- See diagram snapshot here
3. Grok
- Performance: Disorganized.
- Details: This was disappointing given that I used Grok to brainstorm the original idea. The diagram was poorly organized, visually weak, and ignored several details we had just discussed in the chat context.
- Rating: 3/10
- Summary: Disappointing mess – not worth fixing; better to start from scratch myself
- See diagram snapshot here
4. ChatGPT
- Performance: Broken.
- Details: It returned XML that didn't even parse. The syntax was flawed, rendering the file useless.
- Rating: 1/10
- Summary: Syntax Error
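This failure mode is trivial to catch before ever opening draw.io. A quick sanity check I now run on any model-generated file, using only Python's standard library (function name and sample strings are mine, and this checks well-formedness only, not whether draw.io will render the result sensibly):

```python
import xml.etree.ElementTree as ET

def is_valid_drawio(xml_text: str) -> bool:
    """Return True if the text is well-formed XML containing an mxGraphModel."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # exactly the failure mode ChatGPT's output hit
    # Accept either a bare <mxGraphModel> or one wrapped in <mxfile>/<diagram>.
    return root.tag == "mxGraphModel" or root.find(".//mxGraphModel") is not None

print(is_valid_drawio("<mxfile><diagram><mxGraphModel/></diagram></mxfile>"))  # True
print(is_valid_drawio("<mxfile><diagram>"))  # False: unclosed tags
```

One caveat: some draw.io exports store the diagram as compressed base64 inside `<diagram>`, so a stricter check would need to inflate that payload first; for raw LLM output, plain XML parsing is enough.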
5. Mistral
- Performance: Refusal.
- Details: It didn't return code at all. It simply gave me text suggestions on how I should implement it myself.
- Rating: 1/10
- Summary: Lazy Refusal
6. Claude
- Performance: Incomplete / Timeout.
- Details: The most frustrating experience. It started writing the code, thought for a long time, and then hit an "execution paused" error after exhausting its output limit, even though the total context was well under 20k tokens.
- Rating: 0/10
- Summary: System Failure
Verdict
Gemini was the only model capable of producing a workable diagram schema in a single shot (not production level, though). DeepSeek proved it has the reasoning engine to compete on logic but lacks the visual/layout understanding to organize the output. The others totally failed the assignment.
Summary:
- Gemini demonstrated reasoning capabilities that frankly amazed me. I am seeing genuine productivity improvements here compared to other models for structural tasks.
- DeepSeek is very capable. If you need raw logic and networking structure, it works, but don't expect it to look good.
- ChatGPT is becoming sloppy. It hallucinates more frequently and struggles to deliver complex syntax without errors. I find myself moving away from it more each day.
- Claude is unusable for this type of task on the free tier. If I can't even get a full response to verify the model's capability, I will never convert to a paid Pro tier.
