Image selection and workflow: factors influencing the quality of 3D reconstructions

In this extended series of experiments, we investigate how the number of images, their alignment and pre-processing affect the quality of 3D reconstructions created with Gaussian splatting. Using interactive web demos, we compare different data sets and show how even large numbers of images can be processed efficiently and presented with high visual quality.

About the experiment

In this second series of experiments, we deepen our investigation of 3D reconstruction using Gaussian splatting. Building on the first experiment, we extend the methodology with systematic pre-processing and differently sized image datasets. The aim is to evaluate the effects of image selection, number of images and alignment methods on the quality and efficiency of the resulting splats.

Interactive Demonstrator

The headframe over time

In this interactive demonstrator, you can take a virtual tour of the headframe of the German Mining Museum Bochum. The reconstructions depict both the state in 2024 and the state after extensive restoration in 2025.

Open Demonstrator

Experiment

To illustrate the results, an interactive web demonstrator was implemented with PlayCanvas. This allows users to view and compare the generated Gaussian splats directly in the browser.

Special functions of the demonstrator:

Animated transition between splats from different years: Users can switch seamlessly between reconstructions from different points in time and thus see structural changes or the influence of weather and vegetation. Using the headframe as an example, the demonstrator documents the restoration carried out between summer 2023 and 2025.
Day-night change: A simple light simulation switches between day and night lighting to show how the scene appears under different lighting conditions.
Relightable Gaussian Splatting (experimental): Initial tests with a simple approach to dynamically lit rendering have been integrated. This allows rudimentary light interaction within the splats and forms the basis for further work on more realistic light models in future versions.
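
The animated transition between two splats can be thought of as a cross-fade with easing. The following is a minimal sketch in Python, not the demonstrator's PlayCanvas code: the function names, the smoothstep easing and the idea of driving two opacity values are assumptions made for illustration only.

```python
def smoothstep(t: float) -> float:
    """Ease-in/ease-out curve; t is clamped to the range [0, 1]."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def crossfade_weights(elapsed_s: float, duration_s: float) -> tuple[float, float]:
    """Return (opacity_old, opacity_new) for a transition lasting duration_s seconds.

    Hypothetical helper: the old splat fades out while the new one fades in,
    and the two weights always sum to 1.
    """
    if duration_s <= 0:
        return 0.0, 1.0  # no animation requested: show the new splat immediately
    w = smoothstep(elapsed_s / duration_s)
    return 1.0 - w, w


if __name__ == "__main__":
    # Sample a two-second transition at a few points in time.
    for t in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(t, crossfade_weights(t, 2.0))
```

In a render loop, the two weights would be applied to the opacity of the older and the newer reconstruction on every frame until the transition is complete.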

Data basis

For this experiment, 1,909 drone images with a resolution of 5280 × 3956 pixels were recorded. The data set is made up of four series of images. These include orbital flights, in which the drone circled the headframe at several altitude levels, and a top-down series, in which the headframe was captured vertically from above. In addition, two vertical flights were carried out in which the drone captured the two inner courtyards at two points each by rotating around its own axis; these were not yet used in this experiment.

The images were first cropped to 3450 × 3000 pixels to remove vignetting and edge blurring. The data set was then reduced to 1,247 high-quality images using automatic sharpness measurement and manual evaluation.
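
To make the cropping step concrete, the sketch below computes a crop box and the share of pixels that remains. It assumes a centered crop, which is not stated above; the function name is hypothetical. The box follows the (left, upper, right, lower) convention used by common imaging libraries such as Pillow.

```python
def center_crop_box(src_w: int, src_h: int, dst_w: int, dst_h: int) -> tuple[int, int, int, int]:
    """Return (left, upper, right, lower) of a centered crop window.

    Assumption: the crop is centered on the image; the text above only
    gives the source and target sizes.
    """
    if dst_w > src_w or dst_h > src_h:
        raise ValueError("crop size must not exceed the source size")
    left = (src_w - dst_w) // 2
    upper = (src_h - dst_h) // 2
    return left, upper, left + dst_w, upper + dst_h


if __name__ == "__main__":
    box = center_crop_box(5280, 3956, 3450, 3000)
    kept = (3450 * 3000) / (5280 * 3956)
    print(box)                      # crop window in pixel coordinates
    print(f"{kept:.1%} of the pixels are kept")
```

Under this assumption, 915 pixels are removed on the left and right and 478 pixels at the top and bottom, so roughly half of the original pixels remain per image.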

Three different image sets were initially created:

Best100 – Selection of the 100 sharpest images
Best500 – Selection of the 500 sharpest images
All – All 1,247 images
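
A sharpness-based selection of this kind can be sketched as follows. The metric actually used for the automatic sharpness measurement is not specified here; the sketch assumes the variance of the Laplacian, a common focus measure, and works on plain lists of grayscale values so that it runs without external libraries. All function names are hypothetical.

```python
from statistics import pvariance


def laplacian_variance(gray: list[list[float]]) -> float:
    """Variance of the 4-neighbour Laplacian over all interior pixels.

    Higher values indicate more high-frequency detail, i.e. a sharper image.
    Assumed metric, chosen for illustration only.
    """
    h, w = len(gray), len(gray[0])
    if h < 3 or w < 3:
        raise ValueError("image must be at least 3 x 3 pixels")
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1] - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def select_sharpest(scores: dict[str, float], n: int) -> list[str]:
    """Return the names of the n highest-scoring images, sharpest first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


if __name__ == "__main__":
    flat = [[10] * 5 for _ in range(5)]                                  # no detail
    checker = [[255 * ((x + y) % 2) for x in range(5)] for y in range(5)]  # maximal detail
    scores = {"flat.jpg": laplacian_variance(flat), "checker.jpg": laplacian_variance(checker)}
    print(select_sharpest(scores, 1))
```

With one score per image, sets such as Best100 and Best500 would correspond to select_sharpest(scores, 100) and select_sharpest(scores, 500); the manual evaluation described above is not covered by this sketch.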

Influence of the number of images on the reconstruction quality

3D reconstructions of the headframe in comparison

In this experiment, various reconstructions of the headframe of the German Mining Museum Bochum are compared with each other.

Open Demonstrator

Experiment

A central aspect of the test series was the relationship between the number of images used and the quality of the resulting 3D reconstruction. For this purpose, a demonstrator was implemented with PlayCanvas that enables a direct visual comparison between the splats of the data sets.

The demonstrator shows that appealing models can be generated with just 100 carefully selected images. As the number of images increases, the density and detail of the reconstruction increase noticeably, especially with manual alignment, but so do the computing effort and training time. The interactive comparison provides a clear basis for weighing data volume, quality and processing costs in future projects.

Comparison of the reconstruction quality

A comparison of the reconstructions shows clear differences in the quality and cleanliness of the results. With only 100 or 500 images, numerous floaters occur, particularly in the sky and at the edges of the object. Floaters are splats that are not part of the actual structure of the headframe but appear as noise. These artifacts sometimes require time-consuming manual post-processing, for example targeted removal in PlayCanvas.

In contrast, the complete set of 1,247 images offers a significantly higher density and precision of the reconstruction: floaters occur only sporadically and the scene is much better differentiated overall. Although training takes considerably longer because of the larger amount of data, this effort is offset by a significantly shorter post-processing time. The trade-off between computing time and manual effort is therefore a key criterion when choosing the size of the image set.

Tools & workflow used

Input Processing

Adobe Bridge: For selecting, evaluating and editing images (white balance, cropping, etc.)
Sharp Frames: Automated image selection and removal of outliers based on image sharpness

Pre-Processing

RealityScan: The images were first aligned automatically for each image set. Manual alignment was additionally performed, in particular for the complete data sets, in order to better combine the orbit and top-down images.

Training and Reconstruction

Postshot: Used for training the Gaussian splats.
- Each image set was trained with a maximum of 3 million splats and 30,000 training steps (125,000 for the complete data sets).
- Splatting Algorithm: Splat3
- Training duration: approx. 7 hours per complete data set; Best100 and Best500 were significantly shorter.
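
For bookkeeping, the training settings listed above can be captured in a small record per image set. This is only a sketch for documenting runs: it is not Postshot's configuration format, and the class, field and function names are hypothetical. It also assumes that "All" is the complete data set that received 125,000 steps.

```python
from dataclasses import dataclass

MAX_SPLATS = 3_000_000  # upper limit used for every image set


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical record of one training run; not a Postshot file format."""
    image_set: str
    image_count: int
    max_splats: int
    training_steps: int


def config_for(image_set: str, image_count: int, complete: bool) -> TrainingConfig:
    """Complete data sets were trained for 125,000 steps, reduced sets for 30,000."""
    steps = 125_000 if complete else 30_000
    return TrainingConfig(image_set, image_count, MAX_SPLATS, steps)


CONFIGS = [
    config_for("Best100", 100, complete=False),
    config_for("Best500", 500, complete=False),
    config_for("All", 1247, complete=True),  # assumption: "All" is the complete set
]

if __name__ == "__main__":
    for cfg in CONFIGS:
        print(cfg)
```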

Post-Processing

Self-Organizing Gaussian Splatting (SOGS): Lossy compression to reduce the file size by a factor of ~20 (e.g. from ~800 MB to 30-40 MB), with a barely visible loss of quality. Implemented here experimentally using a Python package.
SuperSplat Editor: For inspection, clean-up and further reduction of 3D reconstructions.
PlayCanvas Engine: Programming the interactive demonstrators
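
The compression figures quoted for SOGS can be checked with simple arithmetic. The sketch below uses only the approximate sizes given above; the function name is hypothetical.

```python
def compression_factor(original_mb: float, compressed_mb: float) -> float:
    """Ratio of the original to the compressed file size."""
    if compressed_mb <= 0:
        raise ValueError("compressed size must be positive")
    return original_mb / compressed_mb


if __name__ == "__main__":
    # Approximate sizes from the text: ~800 MB before, 30-40 MB after compression.
    for compressed in (40, 30):
        print(f"800 MB -> {compressed} MB: factor {compression_factor(800, compressed):.1f}")
```

For these sizes the factor lies between 20 and roughly 27, which is consistent with the stated reduction by a factor of about 20.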

Result & conclusion

The test series confirms the high relevance of targeted pre-processing for the quality of the resulting 3D splats. In particular, manual alignment and the targeted reduction of the input images to particularly sharp and relevant images led to significantly better results. The influence of the number of images was also visible: while the “Best100” set was very efficient to handle, the complete data sets with manual alignment produced reconstructions that came closest to reality.

Building on these findings, future research will explore the impact of point cloud density on training outcomes more systematically. One promising avenue is to compare training splats directly from reduced, sparse point clouds with training on denser point clouds and applying sparsification during post-processing. This would allow a more precise understanding of where and how reconstruction quality can be preserved or even improved while optimizing computational resources. Such comparative studies could also help develop best-practice guidelines for balancing pre-processing effort, data volume and reconstruction fidelity in practical applications.