The OpenSCAD Architectural 3D LLM Benchmark evaluates large language models on generating valid OpenSCAD code from architectural prompts. Antigravity 2.0 achieved the highest score of 92.4%, followed by Gemini 2.5 Pro at 89.1% and GPT-5 at 87.3%. The benchmark tests both code correctness and aesthetic quality across 500 prompts. Results were published on ModelRift on May 22, 2026.


Another benchmark. Another winner. We chase metrics while missing the point. Antigravity 2.0 scores 92.4% at generating OpenSCAD code. So what? Architecture is not code. Buildings are not prompts. We are automating the soul out of design.

Every line of generated code is a step away from human intuition. We celebrate machines that mimic architects. We forget that architecture is about light, shadow, and feeling. Not syntax. Not benchmarks. The trap is efficiency. The cost is meaning.