How We Score Interactive 3D Apps: Our Evaluation Criteria Explained
Every app in Interact Gallery carries two scores: a Performance score and a UX score. Together they make up the Overall score you see at the top of each app page. But what do these numbers actually measure — and why do we score apps the way we do?
This post walks through all ten sub-criteria behind both dimensions. Understanding them will help you read our reviews with more nuance, and may give you a useful framework if you're evaluating or building a 3D configurator yourself.
The Two Dimensions
We split evaluation into two independent dimensions on purpose. A technically flawless app can still be frustrating to use, and a beautifully designed experience can fall apart if it takes 30 seconds to load on a mobile connection. Separating Performance from UX lets you see which problem an app has — not just that it has one.
Each dimension is the average of five sub-scores, all rated from 1 to 5.
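As a minimal sketch of that arithmetic (the function names are illustrative, and we're assuming the Overall score is a simple average of the two dimensions, which is what the averaging scheme implies):

```python
from statistics import mean

def dimension_score(sub_scores: list[float]) -> float:
    """Average five 1-5 sub-scores into one dimension score."""
    assert len(sub_scores) == 5 and all(1 <= s <= 5 for s in sub_scores)
    return round(mean(sub_scores), 1)

def overall_score(performance: list[float], ux: list[float]) -> float:
    """Assumed here: Overall is the mean of the two dimension scores."""
    return round(mean([dimension_score(performance), dimension_score(ux)]), 1)

# Example: strong Performance, middling UX
print(overall_score([5, 4, 5, 4, 5], [3, 3, 4, 3, 2]))  # -> 3.8
```

The example shows why the two dimensions are worth reading separately: a 3.8 overall can hide a 4.6 on one dimension and a 3.0 on the other.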
Performance (5 Criteria)
Performance captures everything that sits between the user's intent and the app's ability to execute it smoothly. It's mostly invisible when it's good, and impossible to ignore when it isn't.
1. Stability
The most fundamental check: does the app work reliably? We watch for WebGL context losses, unexpected crashes, configuration states that produce broken 3D renders, options that silently fail to apply, and memory leaks that degrade the experience over a longer session. A score of 5 here means the app ran without incident across multiple testing sessions and option combinations. A 1 means the app consistently failed in ways that blocked normal use.
2. Initial Load Feel
3D assets are heavy. The question isn't whether there's a loading phase — there almost always is — but how well the app manages the perception of that wait. Does it show a meaningful loading indicator with real progress? Does the UI appear and become usable before every asset is resolved? Does something render quickly so the user isn't staring at a blank screen? Apps that stream assets progressively, use skeleton placeholders, or show a partial scene while the rest loads score significantly higher than those that block behind a long opaque spinner.
3. Responsiveness
Once the app is loaded, how quickly does it react to input? This covers camera control smoothness (orbiting, panning, zooming), the latency between selecting an option and seeing the 3D model update, and frame rate consistency during interaction. A responsive configurator feels like direct manipulation of a physical object. A low-scoring one feels like submitting a form and waiting for a page refresh.
4. Asset Strategy
Efficient asset delivery is what separates well-engineered configurators from expensive demos that happen to work. We look at whether assets are compressed appropriately (Draco, KTX2, WebP), whether the app loads only what it needs for the current configuration state rather than everything upfront, whether level-of-detail (LOD) techniques are used for complex geometry, and whether switching between options triggers unnecessary full reloads. Good asset strategy is invisible to the end user but has an enormous effect on load time, battery life on mobile, and scalability to complex product catalogs.
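The "load only what the current configuration needs" idea is language-agnostic; here is a hypothetical sketch of it as a lazy, deduplicating asset cache (the part and variant names are made up, and `fetch` stands in for a real network download):

```python
class AssetCache:
    """Fetch each asset at most once, and only when a configuration needs it."""

    def __init__(self, fetch):
        self._fetch = fetch   # in a real app: an async network request
        self._cache = {}

    def assets_for(self, config: dict) -> dict:
        # Resolve only the assets the current option selection references,
        # instead of downloading the whole catalog upfront.
        needed = {f"{part}/{variant}" for part, variant in config.items()}
        for key in needed:
            if key not in self._cache:
                self._cache[key] = self._fetch(key)
        return {k: self._cache[k] for k in needed}

fetch_log = []
cache = AssetCache(lambda key: fetch_log.append(key) or f"<mesh:{key}>")
cache.assets_for({"body": "oak", "legs": "steel"})  # two fetches
cache.assets_for({"body": "oak", "legs": "brass"})  # only "legs/brass" fetched
print(sorted(fetch_log))  # -> ['body/oak', 'legs/brass', 'legs/steel']
```

Note the second configuration change triggers a single fetch, not a full reload: that is the behaviour we reward under this criterion.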
5. Feedback & Constraints
Does the app communicate clearly when something is happening in the background? When a user selects an incompatible option, does the app explain why it's unavailable — or does it silently do nothing? When an async operation is in flight, is there a visual indicator? This criterion rewards apps that set accurate expectations: clear loading states during option switches, disabled states with explanatory tooltips, and honest communication when a feature requires a slower operation like fetching a new model variant from the server.
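The difference between silently doing nothing and explaining a constraint can be sketched in a few lines. This is purely illustrative: the rule table, part names, and options below are invented for the example, not taken from any real app:

```python
# Hypothetical rule table: (part, option) pairs that conflict, with a reason.
RULES = {
    ("roof", "panoramic"): ("roof_rack", "any",
                            "Panoramic roof cannot be combined with a roof rack."),
}

def explain_unavailable(selection: dict, part: str, option: str):
    """Return a human-readable reason if (part, option) conflicts, else None."""
    for (p1, o1), (p2, o2, reason) in RULES.items():
        if selection.get(p1) == o1 and part == p2 and (o2 == "any" or option == o2):
            return reason
    return None

print(explain_unavailable({"roof": "panoramic"}, "roof_rack", "crossbars"))
# -> Panoramic roof cannot be combined with a roof rack.
print(explain_unavailable({"roof": "standard"}, "roof_rack", "crossbars"))
# -> None
```

A low-scoring app has the first branch of this logic (the option is disabled) without the second (the reason is surfaced to the user, for example as a tooltip on the disabled control).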
UX (5 Criteria)
UX captures the design quality of the configurator as an interactive product. This is where we ask whether the app actually helps users accomplish their goal — usually understanding and personalising a product — with minimal friction.
6. Mobile
A substantial share of product research happens on phones, yet many 3D configurators are clearly designed for desktop and left to fend for themselves on mobile. We test every app on a real mobile device and score the layout, touch control quality, legibility of option panels, and whether the 3D viewport stays usable on a small screen. Pinch-to-zoom on a 3D model, swipe-to-rotate, and a UI that doesn't require precise tap targets are table stakes for a high score here.
7. Interactivity
The quality of the 3D interaction itself. Can the user freely orbit the model, zoom into fine details, and examine the product from every angle? Are there guided camera paths that highlight key features? Do animations — opening a door, extending a component, demonstrating a mechanism — fire at the right moments? This criterion is about whether the 3D medium is genuinely being used to inform — or whether the app is just a slowly rotating render that could have been a photograph.
8. Clarity
Can a first-time user understand what to do without reading instructions? Is the option panel organised in a logical sequence? Are option labels self-explanatory, or do they rely on internal product codes? Do selected states feel selected? This criterion covers information architecture, label writing, visual hierarchy, and the overall learnability of the configurator's interface. A 5 here means we handed the app to someone unfamiliar with the product and they figured it out within seconds.
9. Findability
Related to but distinct from clarity — findability is about whether users can discover the full scope of what's configurable. Many configurators hide powerful options behind collapsed panels, non-obvious tabs, or interaction patterns that aren't signposted. We ask: if this app has 40 configurable parameters, can a motivated user find all of them? Do navigation patterns match the mental model of someone choosing a product? High-scoring apps make the complete feature set feel discoverable; low-scoring ones routinely leave users unaware of options they would have wanted.
10. Decision Aids
The best configurators don't just show options — they help users choose between them. Decision aids are any feature that reduces the cognitive burden of making a choice: side-by-side comparison views, real-time price updates, material zoom previews, augmented reality placement so you can see how a sofa fits your living room, dimension overlays for furniture, compatibility warnings that prevent mismatched selections, and AI-assisted recommendation flows. This is often the criterion that separates a configurator that converts customers from one that looks impressive but sends people back to a sales rep.
What the Scores Don't Cover
Our scoring is limited to what's observable from the browser. We don't evaluate backend integrations, CPQ logic complexity, rendering pipeline architecture, or business conversion metrics. An app can score well on our rubric and still have a poor quote-to-order rate for reasons unrelated to the interactive experience. Equally, a technically modest app can perform brilliantly in a sales context if it's deployed thoughtfully and supported well.
The scores are also a snapshot. Apps improve — and sometimes regress — over time. We flag the last-updated date on each review and revisit apps when we're aware of significant changes.
Why a Rubric at All?
Structured scoring makes comparisons possible across very different products and industries. Without a rubric, every review is just an opinion. With one, you can meaningfully compare a luxury watch configurator against a flat-pack furniture builder — not to say one is better than the other, but to surface what each does well and what it trades away. That's useful both for buyers evaluating solutions and for teams building their own.
If you have questions about how a specific score was assigned, or want to suggest refinements to the methodology, we'd love to hear from you.