The human visual system is generally recognized as superior to all known artificial vision systems. Visual attention, one of many processes involved in human vision, is responsible for identifying the regions of a scene that are relevant for further processing. In most cases, analyzing an entire scene is unnecessary and time-consuming, so restricting analysis to attended regions can be advantageous. The subfield of computer vision that computationally emulates this functionality has shown considerable potential for solving real-world vision problems effectively. In this monograph, elements of visual attention are explored, and algorithms are proposed that exploit these elements to enhance image understanding capabilities. Satellite images are given special attention because of their practical relevance, the inherent complexity of their contents, and their resolution. Visual attention is particularly helpful for processing such large images, since relevant regions can be identified first and detailed analysis then applied to those regions only.
Bottom-up features, which are derived directly from the scene contents, are at the core of visual attention and help identify salient image regions. In the literature, intensity, orientation, and color are used almost universally as the dominant features for computing bottom-up attention. The effect of adding an entropy feature to these three is also studied. This investigation demonstrates that the integration makes visual attention more sensitive to fine details and therefore has the potential to be exploited in suitable contexts. One interesting application of bottom-up attention, also examined in this work, is image segmentation. Because low-saliency regions generally correspond to homogeneously textured regions in the input image, a model can be learned from a homogeneous region and used to group similar textures in other image regions. Experiments demonstrate that the proposed method produces realistic segmentations of satellite images.
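As a rough illustration of what an entropy feature can look like, the sketch below computes the Shannon entropy of the gray levels in a small window around each pixel. It is not the formulation used in this work: the function name `local_entropy`, the square window, and the `radius` parameter are assumptions made for the example only.

```python
import math
from collections import Counter


def local_entropy(image, radius=1):
    """Return a map of Shannon entropy (in bits) of the gray levels found
    in a square window around each pixel.

    `image` is a list of equal-length rows of integer gray levels.
    The window shape and `radius` are illustrative assumptions.
    """
    height, width = len(image), len(image[0])
    entropy_map = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Gather gray levels in the window, clipped at the image border.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            counts = Counter(window)
            n = len(window)
            entropy_map[r][c] = -sum(
                (k / n) * math.log2(k / n) for k in counts.values()
            )
    return entropy_map
```

A uniform region yields zero entropy everywhere, while finely varying texture yields high values, which is consistent with the observation that an entropy feature responds to fine detail.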
Top-down attention, on the other hand, is influenced by the observer’s current state, such as knowledge, goals, and expectations. It can be exploited to locate target objects on the basis of various features, and it increases search or recognition efficiency by concentrating on the relevant image regions only. This is very helpful when processing large images such as satellite images. A novel algorithm for computing top-down attention is proposed that learns and quantifies important bottom-up features from a set of training images and enhances those features in a test image in order to localize objects with similar features. An object recognition technique is then deployed that extracts potential target objects from the computed top-down attention map and attempts to recognize them. The object descriptor is based on physical appearance and combines texture and shape information, a combination shown to be especially useful in the object recognition phase. The proposed texture descriptor is based on Legendre moments computed on local binary patterns, while shape is described using Hu moment invariants.
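The two building blocks named in the descriptor can be sketched in a few lines. The example below shows the basic 8-neighbour local binary pattern code for a single pixel and the first two Hu moment invariants of a binary shape mask. It is a minimal sketch under stated assumptions, not the descriptor proposed here: the Legendre moment step is omitted, and the function names `lbp_code` and `hu_first_two` are hypothetical.

```python
def lbp_code(image, r, c):
    """Basic 8-neighbour local binary pattern code of interior pixel (r, c).

    Each neighbour at least as bright as the center sets one bit.
    The clockwise bit ordering starting at the top-left is an assumption.
    """
    center = image[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def hu_first_two(mask):
    """First two Hu moment invariants of a binary mask (rows of 0/1)."""
    m00 = sum(sum(row) for row in mask)
    if m00 == 0:
        raise ValueError("mask contains no foreground pixels")
    # Centroid of the shape.
    x_bar = sum(x * v for row in mask for x, v in enumerate(row)) / m00
    y_bar = sum(y * v for y, row in enumerate(mask) for v in row) / m00

    def eta(p, q):
        # Central moment, normalized for scale invariance.
        mu = sum(
            (x - x_bar) ** p * (y - y_bar) ** q * v
            for y, row in enumerate(mask)
            for x, v in enumerate(row)
        )
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    h1 = n20 + n02
    h2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    return h1, h2
```

Because the Hu invariants are built from normalized central moments, a shape and its translated or 90-degree-rotated copy give the same values, which is what makes them suitable for describing object shape independently of position and orientation.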
Several tools and techniques, such as different types of moment functions and combinations of different measures, have been applied in the experiments. The developed algorithms are general, efficient, and effective, and have the potential to be deployed on real-world problems. A dedicated software testing platform has been designed to facilitate the manipulation of satellite images and to support a modular and flexible implementation of computational methods, including various components of visual attention models.