Perspective

Your scan is actually two scans

Reality capture splits cleanly into render work and model work. They use different tools, target different audiences, tolerate different amounts of AI inference, and answer to different parts of the org. Conflating them is the most expensive mistake in the field today.

Jim Coleman

A 3D scanner produces one file. The downstream uses for that file fall into two almost-disjoint industries. Most of the budget burned on reality capture in 2026 is burned by people who don’t yet know which of the two industries they’re in.

This is a short field note on the distinction, why it matters, where each side is going, and how to tell, on the day you commission a capture, which one you’re actually buying.

The two scans

Every dataset coming off a terrestrial LiDAR, a drone-photogrammetry rig, or a mobile mapper splits, in practice, into two derived artifacts:

The render. A communication object. The audience is human. The job is to convey, accurately enough for the audience’s purpose, what the place looks like. A render can be a video, a web viewer, a VR experience, or a Gaussian Splat that you fly through in a stakeholder meeting. AI-driven inference is welcome here: when the splat invents the bit you didn’t scan, that’s a feature. The quality bar is perceptual. The failure mode is “looks ugly,” “looks fake,” or “doesn’t load on the client’s hardware.”

The model. A computational substrate. The primary audience is software — BIM packages, finite-element solvers, change-detection pipelines, code-compliance checkers, energy simulators — and the engineers reading what that software outputs. The job is to represent geometry, with stated uncertainty, accurately enough that the next system can compute on it. AI-driven inference is dangerous here. A wall in the wrong place becomes a real wall in the wrong place. The quality bar is dimensional and topological. The failure mode is “someone built it.”

The same point cloud feeds both, but the workflows downstream of the cloud have almost nothing in common.

The contrast

| | Render | Model |
| --- | --- | --- |
| Audience | Humans | Software, then engineers |
| Primary unit | A frame, a viewport, a flythrough | A surface, a primitive, a tagged object |
| Inference is | Welcome | Auditable at best, banned at worst |
| Best-in-class today | 3D Gaussian Splatting, NeRF, image-to-3D models like Trellis | NKSR, Poisson reconstruction, scan-to-BIM with human review |
| Native formats | .splat, .ply (gaussians), MP4, WebGL viewers | IFC, OBJ + metadata, COPC, RVT, USD |
| Storage | Big single artifact, re-rendered cheaply | Smaller, versioned, signed-off |
| Cadence | Re-render on demand | Versioned, change-tracked, reviewed |
| Quality metric | Perceived realism | mm/cm error against control points |
| Org owner | Marketing, comms, design viz, training | Engineering, AEC, GIS, asset operations |
| What kills it | Boredom, low frame rate, uncanny valley | Geometric error, missing semantics, broken topology |

The two columns are not just different tools. They’re different release cycles, different acceptance criteria, different headcount, different vocabulary, and increasingly different vendors.

The conflation traps

There are exactly two of them and they cost real money.

Trap one: the render artifact in a model-shaped meeting. A vendor produces a Gaussian Splat. It looks spectacular. Someone in the meeting points at a wall in the splat and asks “is that the structural column?” The honest answer is “the splat is rendering the column from photometry; the actual geometry behind that wall isn’t measured, it’s interpolated.” But everyone in the room, including the engineers, has been visually overwhelmed. A decision gets made. Three months later someone runs an actual measurement against control points and the column is 12 cm away from where the splat said it was. That doesn’t sound like a lot until you’re routing a chase through it.

The trap here isn’t that splats are bad. It’s that the rendered output passes the visual smell-test for “model” so thoroughly that nobody asks for the audit trail.
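The measurement that eventually exposes the column can be run before the meeting rather than three months after it. A minimal sketch, assuming you can read coordinates for a few named control points off the artifact and have surveyed coordinates for the same points; the function name, point ids, coordinates, and the 2 cm tolerance are all hypothetical, chosen only to mirror the 12 cm example above.

```python
import math

def control_point_report(measured, surveyed, tolerance_m):
    """Compare coordinates read off an artifact against surveyed control points.

    measured / surveyed: dicts mapping a control-point id to (x, y, z) in metres.
    Returns the RMSE, the worst residual, and whether all points are within tolerance.
    """
    residuals = {}
    for point_id, truth in surveyed.items():
        if point_id not in measured:
            raise KeyError(f"control point {point_id!r} missing from artifact")
        # Euclidean distance between the artifact's position and the surveyed one.
        residuals[point_id] = math.dist(measured[point_id], truth)
    rmse = math.sqrt(sum(r * r for r in residuals.values()) / len(residuals))
    worst_id = max(residuals, key=residuals.get)
    return {
        "rmse_m": rmse,
        "worst_id": worst_id,
        "worst_m": residuals[worst_id],
        "passes": residuals[worst_id] <= tolerance_m,
    }

# Hypothetical data: two points agree, the column is displaced 12 cm in x.
surveyed = {"CP1": (0.0, 0.0, 0.0), "CP2": (10.0, 0.0, 0.0), "COL_A": (4.0, 6.0, 0.0)}
measured = {"CP1": (0.0, 0.0, 0.0), "CP2": (10.0, 0.0, 0.0), "COL_A": (4.12, 6.0, 0.0)}
report = control_point_report(measured, surveyed, tolerance_m=0.02)
print(report)
```

Note that the RMSE across all three points is well under the worst residual, which is why the sketch gates on the worst point rather than the average: one displaced column can hide inside an otherwise respectable summary number.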

Trap two: the model artifact in a render-shaped meeting. The engineering team produces a clean NKSR mesh, perfect for downstream BIM. Flat-shaded, no textures, looks like an architectural drawing from 1995. The client is paying real money and was expecting the splat. They are unimpressed. The mesh ends up gathering dust in a project folder while marketing commissions a separate Gaussian Splat from a different vendor for the website. The org now owns two scans of the same building, neither of which talks to the other.

This trap costs less per incident but is much more common. It’s also the one that quietly erodes the perceived ROI of the entire reality-capture program.

The cleanest indicator that a client has been bitten by either trap: they own multiple scan-derived artifacts of the same site, produced by different vendors, that nobody can reconcile against each other.

What “filling in the shadows” actually means in each branch

A frequent buyer question — the one that prompted this note — is “can AI fill in the bits the scanner couldn’t see?” The answer depends entirely on which branch you’re asking about.

On the render branch: yes, and the field is moving fast. 3D Gaussian Splatting (Kerbl et al., SIGGRAPH 2023) and its 2024-2025 follow-ons (2DGS, SuGaR, Frosting, and tooling such as gsplat) train a photometric scene representation from imagery, sometimes supplemented with LiDAR, and the renderer fills shadows with whatever pixels are most consistent with the imagery you fed it. The result is striking. Trellis (Microsoft, late 2024) and the LRM-family models (Meta/Adobe) take a small handful of images and produce a complete object. For a “let me walk through this place” deliverable, this is the bucket to be in.

On the model branch: mature within reason, generative beyond it. For interpolating across small geometric gaps in surfaces you mostly captured, NKSR (NVIDIA, SIGGRAPH 2023) and classical Poisson reconstruction both produce defensible meshes. They will not invent the back of a building you didn’t scan. For true generative completion of unobserved structure, the published methods (DiffComplete, LION, scene-scale diffusion) remain research-grade and are not suitable for engineering decisions. The honest production answer is: you scan it, or you don’t have it.

This asymmetry is itself a useful design rule: if a deliverable depends on geometry that wasn’t measured, it belongs on the render branch. Putting it on the model branch creates exactly the kind of plausible-but-fake artifact that causes Trap One.
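The rule above is simple enough to write down as a check. A rough sketch, assuming each deliverable can list the regions of the site it depends on and the capture log can list the regions that were actually measured; the function and the region names are hypothetical.

```python
def assign_branch(required_regions, measured_regions):
    """Apply the design rule: a deliverable that depends on any unmeasured
    geometry goes to the render branch.

    Returns (branch, unmeasured), where unmeasured lists the required regions
    that have no measurements behind them.
    """
    unmeasured = sorted(set(required_regions) - set(measured_regions))
    branch = "model" if not unmeasured else "render"
    return branch, unmeasured

# Hypothetical capture: the rear elevation was never scanned.
measured = ["north facade", "roof", "lobby"]
print(assign_branch(["north facade", "roof"], measured))
print(assign_branch(["north facade", "rear elevation"], measured))
```

Returning the list of unmeasured regions alongside the branch keeps the decision explainable: the alternative to moving the deliverable to the render branch is to go back and scan what is on that list.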

How to tell which one you’re buying, on day one

A short checklist that mostly works:

  1. Who reads the output? A person scrolling on a phone, or a piece of software with a schema?
  2. What happens if the geometry is 5cm wrong? Nothing visible, or someone re-routes a pipe?
  3. What does “done” look like? A signed-off file with a quality report, or a video everyone says “wow” at?
  4. What’s the version-control story? A render gets re-made; a model gets versioned, signed, and audited.
  5. Who owns the budget? Engineering and ops want models. Marketing, sales, design viz, and training want renders.

If the answers point cleanly to one column, build for that column and don’t try to make the artifact serve the other one. If they’re genuinely split — a hospital expansion project might legitimately need both — the right move is to derive two artifacts from one capture rather than one artifact that hedges.
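For teams that want the checklist in an intake form, it can be sketched as five yes/no answers and a vote. This is only an illustration: the field names are hypothetical, and treating "cleanly" as "unanimous" is an assumption of the sketch, not a rule from the checklist itself.

```python
from dataclasses import dataclass, astuple

@dataclass
class CaptureBrief:
    # Each field is True when the answer points at the model column.
    reader_is_software: bool             # 1. who reads the output?
    error_of_5cm_has_consequences: bool  # 2. what happens if geometry is 5 cm wrong?
    done_means_signed_off_file: bool     # 3. what does "done" look like?
    needs_versioning_and_audit: bool     # 4. what's the version-control story?
    budget_owner_is_engineering: bool    # 5. who owns the budget?

def classify(brief):
    """Return 'model', 'render', or 'both' (derive two artifacts from one capture)."""
    answers = astuple(brief)
    model_votes = sum(answers)
    if model_votes == len(answers):
        return "model"
    if model_votes == 0:
        return "render"
    # Assumption: any disagreement counts as a genuine split.
    return "both"

# Hypothetical hospital expansion: engineering needs geometry, comms needs a walkthrough.
hospital = CaptureBrief(True, True, True, True, False)
print(classify(hospital))
```

A stricter or looser threshold is a judgement call; the point of the sketch is that "both" is an explicit outcome that triggers two artifacts, not a reason to build one artifact that hedges.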

The “one artifact that hedges” pattern is what produces the BIM exchange that doesn’t look good in marketing renders, and the Gaussian Splat that nobody can derive a quantity surveyor’s takeoff from. Both ends of the org are unhappy with the artifact and the capture program gets blamed.

Where AI fits in each branch

Different shape entirely.

On the render branch, AI is the engine. Without modern generative methods you have a flythrough of static point cloud and that’s about it. Gaussian Splatting, neural rendering, image-to-3D, and the stack of view-synthesis methods are the deliverable. The AI is doing the work end to end and the human is curating the output.

On the model branch, AI is a labour-saving prep step in a longer pipeline. Semantic segmentation (the kind we ran in the Mask3D experiment) gives you a tagged substrate. Surface reconstruction gives you a mesh. Plane and primitive fitting gives you something that can be exchanged with BIM. Each step uses some AI; none of them are end-to-end AI; and every step’s output is auditable against the underlying measurements. The engineering judgement about which AI to apply, and how to verify its output, is where most of the actual professional work lives.

A heuristic: on the render branch, more AI is usually better. On the model branch, more AI without a corresponding audit layer is usually worse.

A worked case from the lab

The scan we’ve been classifying with Mask3D is a model-branch project end-to-end. The classification is the labelled substrate another piece of software can compute on, whether that is a BIM exchange, a change-detection baseline, or an energy or daylighting model. Everything in the experiment (the segmentation model, the CSF/HAG rules, the building-footprint sanity check, the 2D foundation-model fusion) is model-side work. The 98.4% classification number is meaningful because every label is traceable back to measured points in the actual cloud, not hallucinated from photometry.

The same source .las file would also support a render-branch project — train a Gaussian Splat from the X9’s onboard imagery and the LiDAR as a depth prior, ship a navigable WebGL splat for a client demo. That would be a beautiful artifact. It would be a different deliverable, a different storage format, a different review process, and a different audience. We have not built it yet because we didn’t need it for the model-branch story; it’s the next planned experiment specifically to put the two artifacts side by side.

If we were doing both for a client, the right structure would be: one capture, two pipelines, two artifacts, two acceptance criteria, two owners. Not one deliverable that tries to be both.
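That structure can be written down as a small manifest that refuses the hedged version. A sketch under stated assumptions: the class names, file names, pipeline descriptions, and owners below are hypothetical placeholders, not the lab's actual configuration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Deliverable:
    branch: str      # "render" or "model"
    artifact: str    # what gets shipped
    pipeline: str    # how it is derived from the capture
    acceptance: str  # what "done" means for this branch
    owner: str       # who signs it off

@dataclass(frozen=True)
class CapturePlan:
    source: str          # the single capture both artifacts derive from
    deliverables: tuple  # Deliverable instances

    def validate(self):
        """Return a list of problems; an empty list means the plan is well-formed."""
        problems = []
        branches = sorted(d.branch for d in self.deliverables)
        if branches != ["model", "render"]:
            problems.append("expected exactly one render and one model deliverable")
        for attr in ("artifact", "pipeline", "acceptance", "owner"):
            values = {getattr(d, attr) for d in self.deliverables}
            if len(values) != len(self.deliverables):
                problems.append(f"deliverables share the same {attr}")
        return problems

plan = CapturePlan(
    source="site_scan.las",  # hypothetical file name
    deliverables=(
        Deliverable("render", "walkthrough.splat", "gaussian splat training",
                    "stakeholder visual review", "design viz"),
        Deliverable("model", "building.ifc", "segmentation, reconstruction, primitive fitting",
                    "residuals against control points within tolerance", "engineering"),
    ),
)
print(plan.validate())
```

The validation is deliberately blunt: if the two deliverables share an owner or an acceptance criterion, the plan has quietly collapsed back into one artifact trying to be both.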

The consulting take

For most of the buyers we talk to, the practical implications come down to three:

  1. Decide which scan you’re buying before the scanner is on site. The capture parameters (pose density, image overlap, control-point regime) are different for render-optimised vs. model-optimised work. A capture optimised for a Gaussian Splat is poorly matched to a BIM exchange, and vice versa. You can do both from one capture, but only if you plan for both up front.
  2. Match the vendor to the branch. A reality-capture firm that excels at scan-to-BIM is rarely the same firm that produces gallery-quality splats. Both are legitimate disciplines. Buying the wrong one for the job is a category error, not a quality issue.
  3. Audit AI inference on the model branch ruthlessly. If a deliverable on the model branch contains generative content, it has to be tagged as such, with explicit uncertainty, and reviewed before it informs a real-world decision. The next decade of reality-capture liability case law will be about teams that didn’t do this.
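The third point can be enforced mechanically if every element in a model-branch deliverable carries its provenance. A minimal sketch, assuming a simple per-element record; the field names and example elements are hypothetical, and a real pipeline would attach this metadata in whatever schema the exchange format supports.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Element:
    name: str
    source: str                           # "measured" or "inferred"
    uncertainty_m: Optional[float] = None  # stated uncertainty, in metres
    reviewed_by: Optional[str] = None      # who signed off the inferred content

def audit(elements):
    """Return {element name: reasons} for content that must not yet inform a decision.

    Anything not explicitly tagged "measured" is treated as inferred, which is
    the conservative reading of an untagged or mistagged element.
    """
    blocked = {}
    for element in elements:
        if element.source == "measured":
            continue
        reasons = []
        if element.uncertainty_m is None:
            reasons.append("no stated uncertainty")
        if element.reviewed_by is None:
            reasons.append("not reviewed")
        if reasons:
            blocked[element.name] = reasons
    return blocked

# Hypothetical deliverable with one measured wall and two gap-filled ones.
elements = [
    Element("wall_W1_north", "measured"),
    Element("wall_W2_east", "inferred", uncertainty_m=0.05, reviewed_by="structural lead"),
    Element("wall_W3_rear", "inferred"),
]
print(audit(elements))
```

An empty result is the gate for release; a non-empty one is the list of elements someone has to either measure, review, or remove.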

This is the framework we walk new clients through before we touch a scanner or a model. Once the render-vs-model split is clear, the rest of the engagement — tool choice, vendor selection, acceptance criteria, review cadence — falls out of it almost mechanically.

If you’re sitting on capture data and aren’t sure which branch you’re on, that’s a useful conversation to have early. The longer the artifact lives without an explicit branch assignment, the harder it gets to retrofit one.

