Rosie is the AI engine behind world creation in Highrise. A creator types a description — or uploads a reference image — and minutes later, a fully realized 3D environment appears. Here's how it works under the hood.
The pipeline
At a high level, Rosie takes a creative prompt through four stages:
- Intent understanding — We parse the creator's input to understand the type of world, the mood, the spatial layout, and key objects.
- Asset generation — Individual 3D assets (furniture, walls, decorations) are generated or selected from our library based on the understood intent.
- Scene composition — Assets are arranged in a coherent spatial layout that respects both aesthetics and Highrise's technical constraints.
- Refinement — Lighting, textures, and final touches are applied to make the world feel polished.
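To make the flow concrete, here is a minimal Python sketch of the four stages chained together. It is an illustration of the shape of the pipeline, not Rosie's actual code: every name in it (`Intent`, `Asset`, `Scene`, `understand_intent`, `generate_assets`, `compose_scene`, `refine`, `build_world`) is hypothetical, and each stage is replaced by a trivial stand-in where the real system would call a model.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Structured reading of the creator's prompt (hypothetical fields)."""
    world_type: str
    mood: str
    key_objects: list[str]


@dataclass
class Asset:
    name: str
    position: tuple[float, float] = (0.0, 0.0)


@dataclass
class Scene:
    intent: Intent
    assets: list[Asset] = field(default_factory=list)
    lighting: str = "unset"


def understand_intent(prompt: str) -> Intent:
    # Stand-in for the language-model step: a simple keyword lookup.
    words = [w.strip(".,!?") for w in prompt.lower().split()]
    mood = "cozy" if "cozy" in words else "neutral"
    known_objects = ["sofa", "lamp", "table"]  # assumed toy vocabulary
    objects = [obj for obj in known_objects if obj in words]
    return Intent(world_type="room", mood=mood, key_objects=objects)


def generate_assets(intent: Intent) -> list[Asset]:
    # Stand-in for generation or library lookup: one asset per key object.
    return [Asset(name=obj) for obj in intent.key_objects]


def compose_scene(intent: Intent, assets: list[Asset]) -> Scene:
    # Stand-in for layout: place assets in a row, 2 units apart.
    placed = [Asset(a.name, (i * 2.0, 0.0)) for i, a in enumerate(assets)]
    return Scene(intent=intent, assets=placed)


def refine(scene: Scene) -> Scene:
    # Stand-in for the polish pass: pick lighting from the mood.
    scene.lighting = "warm" if scene.intent.mood == "cozy" else "daylight"
    return scene


def build_world(prompt: str) -> Scene:
    """Run the four stages in order and return the finished scene."""
    intent = understand_intent(prompt)
    assets = generate_assets(intent)
    scene = compose_scene(intent, assets)
    return refine(scene)
```

Calling `build_world("a cozy room with a sofa and a lamp")` returns a `Scene` whose assets and lighting were each decided by a separate stage. Keeping the stages as separate functions with typed inputs and outputs is one way to let each be swapped or tested on its own.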
Model choices
We use a combination of proprietary and open models. The intent understanding layer leverages large language models for their reasoning capabilities. Asset generation uses diffusion-based approaches fine-tuned on Highrise's art style. Scene composition is a custom system trained on millions of existing Highrise worlds.
The goal isn't to impress AI researchers. It's to make a 14-year-old feel like a professional world designer.
Engineering constraints
Highrise runs on mobile devices. That means every world Rosie generates needs to:
- Load quickly on low-end hardware
- Stay within polygon and texture budgets
- Feel consistent with the Highrise art style
- Be editable by the creator after generation
These constraints make the problem more interesting, not less. Unconstrained generation is the easy part. Generation within tight, real-world limits is where the engineering challenge lives.
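As a rough illustration of the polygon and texture budget requirement, a generated world could be validated before it ships to a device. The sketch below is an assumption about how such a check might look, not a description of Highrise's tooling: the `AssetStats` and `Budget` types, the `check_budget` function, and the numeric limits are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class AssetStats:
    name: str
    triangles: int
    texture_bytes: int


@dataclass
class Budget:
    # Illustrative numbers only; not Highrise's real limits.
    max_triangles: int = 50_000
    max_texture_bytes: int = 16 * 1024 * 1024


def check_budget(assets: list[AssetStats], budget: Budget) -> list[str]:
    """Return human-readable violations; an empty list means within budget."""
    violations = []
    total_triangles = sum(a.triangles for a in assets)
    total_texture = sum(a.texture_bytes for a in assets)
    if total_triangles > budget.max_triangles:
        violations.append(
            f"triangles: {total_triangles} > {budget.max_triangles}"
        )
    if total_texture > budget.max_texture_bytes:
        violations.append(
            f"texture bytes: {total_texture} > {budget.max_texture_bytes}"
        )
    return violations


world = [
    AssetStats("sofa", triangles=30_000, texture_bytes=8 * 1024 * 1024),
    AssetStats("lamp", triangles=25_000, texture_bytes=4 * 1024 * 1024),
]
problems = check_budget(world, Budget())
```

Here the two assets together exceed the assumed triangle limit but not the texture limit, so `problems` holds a single entry. Returning a list of violations rather than a boolean would let a generator decide what to simplify or regenerate.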
What's next for Rosie
We're working on real-time collaborative editing with AI assistance, multi-room world generation, and better creator controls. More on all of that soon.