This thesis focuses on integrating openly available and novel deep neural networks into the photogrammetric pipeline, aiming to enhance 3D reconstruction performance. The efforts are directed towards three key areas:
The initial research area involves a comparative analysis between Deep Feature Matcher (DFM), an open-source deep learning-based method, and Scale-Invariant Feature Transform (SIFT), its traditional counterpart. This analysis necessitated the creation of an incremental Structure-from-Motion pipeline to assess the performance of each method under ideal conditions. The findings underscore the limitations of both methods and emphasize the value of applying deep learning to the upstream photogrammetric process of image acquisition.
A second research area addresses image utility assignment, a crucial aspect of efficient data acquisition. Utility is assigned by sorting images (views) according to the reconstruction performance of a deep learning-based Single-view AutoEncoder (SAE) trained to reconstruct 3D geometry from single views. The sorted position of a view indicates its utility, facilitating improvement of downstream reconstruction tasks and augmentation of image datasets. Moreover, the high-utility views identified by this novel method exhibit robustness to noise and offer insights into how autoencoder-like networks encode 2D image features into 3D structure.
In the final research area, the utility of sorted views is further tested, and the challenge of camera guidance during image acquisition is addressed. A novel deep learning-based model, the Graph-Best-ViewFinder (GBVF), is proposed. This model fuses a graph neural network designed for camera relocalization with a Long Short-Term Memory layer for sequential capture processing. Trained on an augmented view dataset leveraging the utility metric provided by the SAE, the GBVF offers an end-to-end application solution: given a sequence of color images, it iteratively constructs a graph embedding of the 3D scene and suggests the pose of the next best view. An analysis of trajectory generation and 3D reconstruction performance highlights the learning capabilities of GBVF. Alongside GBVF development, a comprehensive package named the View Planning Toolbox is created and made openly available to automate view planning dataset generation, trajectory visualization, and reconstruction coverage evaluation for researchers in the field of view planning.