
# Open Physics

### formerly Central European Journal of Physics

Volume 15, Issue 1

# Comparisons of feature extraction algorithm based on unmanned aerial vehicle image

Wenfei Xi / Zhengtao Shi (corresponding author; College of Tourism and Geographic Sciences, Yunnan Normal University, Kunming 650050, Yunnan, China) / Dongsheng Li
Published Online: 2017-07-03 | DOI: https://doi.org/10.1515/phys-2017-0053

## Abstract

Feature point extraction has become a research hotspot in photogrammetry and computer vision. Commonly used point feature extraction operators include the SIFT, Forstner, Harris and Moravec operators. Because of their high spatial resolution, UAV images differ from traditional aerial images. Taking these characteristics of unmanned aerial vehicle (UAV) imagery into account, this paper uses the four operators above to extract feature points from images of buildings, grassland, shrubbery and vegetable greenhouses. Through practical case analysis, the performance, advantages, disadvantages and adaptability of each algorithm are compared in terms of speed and accuracy. Finally, suggestions are given on how to choose among the algorithms in different environments.

PACS: 89.20.Bb; 89.20.Ff

## 1 Introduction

Unmanned aerial vehicles (UAVs) have become fast, efficient, low-cost and flexible remote sensing data acquisition systems for obtaining high-resolution images [1–3]. When UAV images are used for aerial triangulation, geometric reconstruction can be carried out to obtain a digital surface model and point cloud data [4], which are widely used in fields such as archaeology, vegetation monitoring, traffic monitoring, disaster management and 3D reconstruction [5–11]. In the geometric reconstruction of a UAV model, extracting and optimizing the matching points is a critical step of aerial triangulation [12]. In traditional aerial photography, the commonly used methods for extracting matching points fall roughly into two categories: matching based on pixel gray values and matching based on features [13]. In traditional, standard aerial triangulation it is convenient to obtain matching points with the gray-value-based method. UAV imaging, however, suffers from problems such as irregular image overlap and large rotation angles, so difficulties arise when the common methods and software are applied to UAV images. In other words, a great deal of labor and material resources is needed [14–16].

In computer vision and photogrammetry, many feature point extraction operators have been proposed. For example, Dickscheid and Förstner compared and analyzed the performance of feature extraction operators for image orientation. However, they used only low-resolution close-range images and analog images in their experiments [17]. Lowe applied the SIFT algorithm to machine vision, image matching, image retrieval and other fields [18]. Wang et al. and Zhang et al. compared and evaluated the performance of several commonly used operators [19, 20]. However, they compared only image data of buildings, without considering other objects.

Based on the previous research, this paper uses the SIFT operator, Forstner operator, Harris operator and Moravec operator to extract feature points from images of different objects and analyzes the results. Some of the mathematical formulas used in this article can be found in [21–23].

## 2.1 SIFT algorithm

The SIFT algorithm, proposed by David G. Lowe, is a feature-based matching algorithm built on scale space, mainly using the Gaussian difference pyramid model [24]. The Gaussian pyramid is essentially a multi-scale representation of a signal: the same signal or picture is Gaussian-blurred many times and down-sampled to produce multiple sets of signals or pictures at different scales for subsequent processing.

The application of the SIFT matching algorithm in photogrammetry has drawn increasing attention. Its superior performance has greatly raised the degree of automation of orientation in close-range photography and unconventional aerial photography (such as UAV imaging). With it, automatic aerial triangulation can still be carried out without GPS/IMU data support [25].

## 2.1.1 The establishment of the Gaussian difference pyramid

Koenderink and Lindeberg showed that, under a series of reasonable assumptions, the Gaussian function is the only possible scale-space kernel for realizing image scale conversion. The scale space of an image can be defined as:

$L(x,y,s)=G(x,y,s)\star I(x,y),$ (1)

where $\star$ denotes the convolution operation, (x, y) denotes the spatial coordinates of a pixel, s is the scale coordinate, and G(x, y, s) is the variable-scale Gaussian kernel function, defined as:

$G(x,y,s)=\frac{1}{2\pi s^{2}}e^{-(x^{2}+y^{2})/2s^{2}}.$ (2)
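Eqs. (1) and (2) can be illustrated with a short, dependency-free Python sketch. It is not the authors' MATLAB code: the function names, the truncation of the kernel at three times the scale, the normalisation of the sampled kernel and the clamped border handling are all assumptions made for this illustration.

```python
import math


def gaussian_kernel(s, radius=None):
    """Sample G(x, y, s) of Eq. (2) on a square grid and normalise it to sum 1."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * s)))  # assumption: truncate at 3*s
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            value = math.exp(-(x * x + y * y) / (2 * s * s)) / (2 * math.pi * s * s)
            row.append(value)
        kernel.append(row)
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def convolve(image, kernel):
    """Compute L = G * I of Eq. (1) for a 2-D list of gray values.

    The kernel is symmetric, so correlation and convolution coincide.
    Borders are handled by clamping indices (an assumption of this sketch).
    """
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + r][dx + r] * image[yy][xx]
            out[y][x] = acc
    return out
```

Calling `convolve(image, gaussian_kernel(s))` for increasing values of `s` gives progressively smoother versions of the same image, which is the multi-scale representation the text describes.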

David G. Lowe proposed using the Gaussian difference operator to build the Gaussian difference scale space of an image. Gaussian difference images D(x, y, s) at different scales are obtained by convolving the image with Gaussian difference kernels of different scales. All Gaussian difference images together make up the Gaussian difference pyramid of the image:

$D(x,y,s)=(G(x,y,ks)-G(x,y,s))\star I(x,y)=L(x,y,ks)-L(x,y,s).$ (3)

The construction of the Gaussian difference pyramid is shown in the left part of Figure 1. Within a group, the image is convolved with Gaussian functions of different scales to obtain scale-space images whose scales grow by a constant factor k. In other words, the scale of the first layer is s, the scale of the second layer is ks, and each following layer obeys the same rule. The first-layer image of each group is obtained by sampling the previous group of images. After sampling, the image has 1/4 of the size of the images in the previous group, but its scale is twice the original one. In this way, the pyramid is built up group by group. The right part of Figure 1 describes how the pyramid is constructed: within a group, each Gaussian difference image is obtained by subtracting adjacent images. Repeating this operation over the full set of images yields the Gaussian difference pyramid.

Figure 1

The generation of Gaussian difference pyramid
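The construction just described can be sketched for a single group as follows. This is a minimal illustration, not the authors' implementation: the default values `s=1.6` and `k=2**0.5`, the number of layers, the separable blur with clamped borders, and the choice to blur the input directly at each scale (rather than incrementally) are all assumptions.

```python
import math


def blur(image, s):
    """Gaussian-blur a 2-D list with scale s (separable kernel, clamped borders)."""
    r = max(1, int(math.ceil(3 * s)))  # assumption: truncate at 3*s
    weights = [math.exp(-(i * i) / (2 * s * s)) for i in range(-r, r + 1)]
    total = sum(weights)
    weights = [v / total for v in weights]
    h, w = len(image), len(image[0])

    def clamp(v, hi):
        return min(max(v, 0), hi)

    # horizontal pass, then vertical pass
    rows = [[sum(weights[d + r] * row[clamp(x + d, w - 1)] for d in range(-r, r + 1))
             for x in range(w)] for row in image]
    return [[sum(weights[d + r] * rows[clamp(y + d, h - 1)][x] for d in range(-r, r + 1))
             for x in range(w)] for y in range(h)]


def dog_octave(image, s=1.6, k=2 ** 0.5, layers=4):
    """Return (gaussians, dogs) for one group.

    gaussians[i] is L(x, y, s * k**i); dogs[i] is the difference of
    adjacent layers, i.e. D of Eq. (3).
    """
    gaussians = [blur(image, s * k ** i) for i in range(layers)]
    dogs = []
    for lower, upper in zip(gaussians, gaussians[1:]):
        dogs.append([[u - l for u, l in zip(row_u, row_l)]
                     for row_u, row_l in zip(upper, lower)])
    return gaussians, dogs


def downsample(image):
    """Keep every second row and column, leaving 1/4 of the pixels."""
    return [row[::2] for row in image[::2]]
```

Feeding `downsample(...)` of one group's images into `dog_octave` again would produce the next group, so repeating the two calls builds the pyramid level by level.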

## 2.1.2 Generation of feature point

After the SIFT feature region of a key point is generated, the coordinate axes are first rotated to the main direction of the key point, which ensures rotation invariance. Then a neighborhood window centered on the key point is selected, as shown in the left part of Figure 2. The central pixel is the key point to be described. The small surrounding grid cells represent the pixels in the neighborhood of the key point; the length of each arrow represents the gradient magnitude of the pixel and its direction represents the gradient direction. The circle in the figure represents the range of the Gaussian weighting. As shown in the right part of Figure 2, computing the gradient direction histogram over eight directions in each sub-block gives one seed point. Once all the seed points are obtained, they form the descriptor.

Figure 2

Feature vector generated by critical point neighborhood gradient information
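The eight-direction histogram of one sub-block can be sketched as below. This is a simplified illustration under stated assumptions: gradients are central differences, the rotation to the main direction and the Gaussian weighting mentioned in the text are omitted, and the function name is hypothetical.

```python
import math


def orientation_histogram(patch, bins=8):
    """Accumulate gradient magnitudes of a square patch into direction bins.

    patch is a 2-D list of gray values; only interior pixels are used so
    that central differences are defined. Returns a list of `bins` sums,
    which plays the role of one seed point.
    """
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2 * math.pi)  # direction in [0, 2*pi)
            index = int(angle / (2 * math.pi) * bins) % bins
            hist[index] += magnitude
    return hist
```

Concatenating the histograms of all sub-blocks around a key point gives a descriptor vector in the sense of Figure 2.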

## 2.2 Forstner operator [26]

First, the Roberts gradient and the gray covariance matrix of each pixel are calculated. Then the points whose error ellipses are as small and as close to a circle as possible are taken as the feature points.

The process is described as follows:

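1. Calculate the Roberts gradient for each pixel: $g_{u}=\frac{\partial g}{\partial u}=g_{i+1,j+1}-g_{i,j},\quad g_{v}=\frac{\partial g}{\partial v}=g_{i,j+1}-g_{i+1,j}.$ (4)

2. Calculate the covariance matrix of the gray values in the window centered at pixel (c, r): $Q=N^{-1}=\begin{bmatrix}\sum g_{u}^{2} & \sum g_{u}g_{v}\\ \sum g_{v}g_{u} & \sum g_{v}^{2}\end{bmatrix}^{-1},$ where $\sum g_{u}^{2}=\sum_{i=c-k}^{c+k-1}\sum_{j=r-k}^{r+k-1}(g_{i+1,j+1}-g_{i,j})^{2},\quad \sum g_{v}^{2}=\sum_{i=c-k}^{c+k-1}\sum_{j=r-k}^{r+k-1}(g_{i,j+1}-g_{i+1,j})^{2},\quad \sum g_{u}g_{v}=\sum_{i=c-k}^{c+k-1}\sum_{j=r-k}^{r+k-1}(g_{i+1,j+1}-g_{i,j})(g_{i,j+1}-g_{i+1,j}).$ (5)

3. Calculate the interest values q and w: $w=\frac{1}{\mathrm{tr}\,Q}=\frac{\mathrm{Det}\,N}{\mathrm{tr}\,N},\quad q=\frac{4\,\mathrm{Det}\,N}{(\mathrm{tr}\,N)^{2}},$ (6) where Det N is the determinant of N and tr N is the trace of N.

4. Determine the candidate points: $T_{q}=0.5\sim 0.75,\quad T_{w}=\begin{cases}f\bar{w} & (f=0.5\sim 1.5)\\ c\,w_{c} & (c=5)\end{cases}$ (7) A pixel is accepted as a candidate point when q > Tq and w > Tw hold at the same time.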

5. Select extremum points

The extreme points are selected by weight: within a suitable window, the candidate point with the largest weight is kept and the remaining candidates are eliminated.
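Steps 1 to 4 can be sketched in plain Python as follows. This is an illustration rather than the authors' implementation: the function names are hypothetical, the first index of the gray-value array is taken as i and the second as j, the default `t_q=0.65` assumes that the threshold reported in the experiment refers to Tq, and `t_w=1.0` is only a placeholder.

```python
def forstner_interest(g, c, r, k):
    """Return (q, w) of Eq. (6) for the window centred at g[c][r].

    g is a 2-D list of gray values indexed as g[i][j]; the sums follow
    Eq. (5), with i from c-k to c+k-1 and j from r-k to r+k-1.
    """
    suu = svv = suv = 0.0
    for i in range(c - k, c + k):
        for j in range(r - k, r + k):
            gu = g[i + 1][j + 1] - g[i][j]   # Roberts gradient, Eq. (4)
            gv = g[i][j + 1] - g[i + 1][j]
            suu += gu * gu
            svv += gv * gv
            suv += gu * gv
    det_n = suu * svv - suv * suv
    tr_n = suu + svv
    if tr_n == 0:
        return 0.0, 0.0                      # flat window: no interest
    q = 4 * det_n / (tr_n * tr_n)
    w = det_n / tr_n
    return q, w


def is_candidate(q, w, t_q=0.65, t_w=1.0):
    """Step 4: accept the pixel only if both thresholds are exceeded."""
    return q > t_q and w > t_w
```

On a synthetic right-angle corner q is close to 1, whereas on a straight edge the determinant of N vanishes and both q and w drop to 0, which is why the two thresholds together reject edge points.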

## 2.3 Harris operator

The Harris operator, presented by Harris and Stephens, is a signal-based point feature extraction operator [27].

The process is stated as follows:

$M=G(\tilde{s})\otimes\begin{bmatrix}g_{x}^{2} & g_{x}g_{y}\\ g_{x}g_{y} & g_{y}^{2}\end{bmatrix},\quad R=\det(M)-k\,\mathrm{tr}^{2}(M),\quad k=0.04\sim 0.06,$ (8)

where $g_{x}$ is the gradient in the x direction, $g_{y}$ is the gradient in the y direction, $G(\tilde{s})$ is the Gaussian template, det is the matrix determinant, tr is the matrix trace, k is a constant, and R is the interest value of the corresponding pixel in the image.
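The interest value R of Eq. (8) can be sketched for a single pixel as below. The central-difference gradients, the 3 × 3 Gaussian-weighted window and the function name are assumptions of this sketch, not details given in the paper.

```python
import math


def harris_response(g, y, x, k=0.04, radius=1, s=1.0):
    """Return R = det(M) - k * tr(M)**2 at pixel (y, x) of a 2-D gray image.

    M is built from Gaussian-weighted sums of gx*gx, gx*gy and gy*gy over a
    (2*radius+1)-wide window. The pixel must lie at least radius+1 pixels
    away from the image border.
    """
    a = b = c = 0.0  # M = [[a, b], [b, c]]
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = y + dy, x + dx
            gx = (g[yy][xx + 1] - g[yy][xx - 1]) / 2.0
            gy = (g[yy + 1][xx] - g[yy - 1][xx]) / 2.0
            weight = math.exp(-(dx * dx + dy * dy) / (2 * s * s))
            a += weight * gx * gx
            b += weight * gx * gy
            c += weight * gy * gy
    return (a * c - b * b) - k * (a + c) ** 2
```

With this definition R is positive at a corner, negative along a straight edge and zero in a flat region, so thresholding R and keeping local maxima yields the corner points.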

## 2.4 Moravec operator

The Moravec operator, proposed by Moravec, extracts feature points using the variance of gray values [28].

1. Calculate the interest value (IV) of each pixel. In the w × w window centered on pixel (m, n), the sums of squared gray-level differences between adjacent pixels in the four directions are: $V_{1}=\sum_{i=-k}^{k-1}(g_{m+i,n}-g_{m+i+1,n})^{2},\quad V_{2}=\sum_{i=-k}^{k-1}(g_{m+i,n+i}-g_{m+i+1,n+i+1})^{2},\quad V_{3}=\sum_{i=-k}^{k-1}(g_{m,n+i}-g_{m,n+i+1})^{2},\quad V_{4}=\sum_{i=-k}^{k-1}(g_{m+i,n-i}-g_{m+i+1,n-i-1})^{2},$ (9)

where k = int(w/2). The smallest of the four values is taken as the interest value of the pixel, that is, $IV_{m,n}=\min\{V_{1},V_{2},V_{3},V_{4}\}.$

2. Given an empirical threshold, pixels whose interest value exceeds the threshold are chosen as candidate points. The threshold should be selected so that the candidate points include the required feature points without including too many non-feature points.

3. Select the extreme values among the candidate points as feature points. Within a window of a certain size (which may differ from the window used to calculate the interest value), all candidates whose interest value is not the largest are eliminated, leaving only the one with the largest interest value. The pixel that remains is a feature point.
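The three steps can be sketched in plain Python as follows. The function names, the default window sizes and the treatment of ties in the suppression window (tied maxima are all kept) are assumptions of this sketch; the default threshold of 10000 is the value reported for the experiment in this paper.

```python
def moravec_interest(g, m, n, w=5):
    """Interest value IV of pixel (m, n): the minimum of V1..V4 in Eq. (9)."""
    k = w // 2
    v1 = sum((g[m + i][n] - g[m + i + 1][n]) ** 2 for i in range(-k, k))
    v2 = sum((g[m + i][n + i] - g[m + i + 1][n + i + 1]) ** 2 for i in range(-k, k))
    v3 = sum((g[m][n + i] - g[m][n + i + 1]) ** 2 for i in range(-k, k))
    v4 = sum((g[m + i][n - i] - g[m + i + 1][n - i - 1]) ** 2 for i in range(-k, k))
    return min(v1, v2, v3, v4)


def moravec_points(g, w=5, threshold=10000, nms=5):
    """Return (row, column) feature points of a 2-D gray image."""
    k = w // 2
    h, wd = len(g), len(g[0])
    # Step 1: interest value of every pixel whose window fits in the image.
    iv = [[0] * wd for _ in range(h)]
    for m in range(k, h - k):
        for n in range(k, wd - k):
            iv[m][n] = moravec_interest(g, m, n, w)
    r = nms // 2
    points = []
    for m in range(k, h - k):
        for n in range(k, wd - k):
            value = iv[m][n]
            if value <= threshold:           # Step 2: candidate test
                continue
            neighbourhood = [iv[y][x]
                             for y in range(max(m - r, 0), min(m + r + 1, h))
                             for x in range(max(n - r, 0), min(n + r + 1, wd))]
            if value == max(neighbourhood):  # Step 3: keep local maxima only
                points.append((m, n))
    return points
```

Because the interest value is the minimum over the four directions, a pixel on a straight edge receives a value of 0 along the edge direction and is never selected.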

## 3.1 UAV data acquisition

This paper takes the Yunnan Normal University campus as an example, where an unmanned aerial vehicle (UAV) is used as the platform to collect image data. To verify how well the four operators extract feature points, images of buildings, grassland, shrubbery and vegetable greenhouses are obtained. For this flight we chose a DJI (Dajiang) Phantom 3 UAV, which uses the GPS/GLONASS positioning system and carries a 12.4-megapixel camera. Its take-off weight is 1280 g. The experiment is designed as follows:

1. Use the UAV to obtain the original image data. The overlap between images should not be less than 60%. Uniform lighting and a constant flight height should be ensured during field data collection.

2. Check the quality of the acquired image data. Software is used to check whether the images meet the quality requirements for modeling. If not, the UAV must fly again to collect sufficient data.

3. Extract feature points with each of the operators.

A single UAV image is shown in Figure 3.

Figure 3

Original image data

## 3.1.1 The operator feature extraction

The operators are programmed in MATLAB 2014. The running environment is a 64-bit Windows 7 system with 4.00 GB of memory. Features are extracted with the SIFT, Forstner, Harris and Moravec operators in turn. The threshold of the Forstner operator is 0.65, the k value of the Harris operator is 0.06, and the interest threshold of the Moravec operator is 10000. The extraction results are shown in Figures 4, 5, 6 and 7. The image resolutions are: building (638 × 535), grassland (686 × 524), shrubbery (858 × 583) and vegetable greenhouses (701 × 522).

Figure 4

SIFT operator feature extraction

Figure 5

Forstner operator feature extraction

Figure 6

Harris operator feature extraction

Figure 7

Moravec operator feature extraction

## 3.2 Speed and accuracy analysis

Under the same experimental conditions, the extraction speeds of the operators are compared in Table 1.

Table 1

Time comparison of each operator’s extraction (unit: second)

As Table 1 shows, the Harris operator is the slowest when extracting feature points from the building image; the Forstner operator is the slowest for the grassland image; the SIFT operator is the slowest for the shrubbery image; and the Harris operator is the slowest for the vegetable greenhouse image.
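A timing comparison of this kind could be scripted along the following lines. The paper does not describe how the times were measured, so the helper below is purely an assumed harness: it takes a dictionary of hypothetical operator functions and reports the best wall-clock time of a few repeated runs.

```python
import time


def time_operators(operators, image, repeats=3):
    """Return {name: best wall-clock seconds} for each callable in operators.

    operators maps a label to a function that accepts the image; taking the
    minimum over several runs reduces the influence of background load.
    """
    timings = {}
    for name, func in operators.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(image)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings
```

Running every operator on the same image, on the same machine and with the same number of repeats keeps the comparison fair across image types.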

To compare the accuracy of feature point extraction, the corners of the building are selected for analysis. We measured the coordinates of 45 corner points and compared the measured values with the coordinates of the extracted feature points. The results are shown in Table 2.

Table 2

The RMS of each algorithm

As Table 2 shows, the SIFT operator has the lowest extraction accuracy for the building, while the Forstner operator has the highest. The SIFT operator extracts round feature points better than corner points, so it is not well suited to corner feature extraction.
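The paper does not state the exact formula behind the RMS values. One common definition, assumed here, is the root mean square of the point-to-point distances between each measured corner and the matching extracted feature point. The function name and the pairing of points by list position are likewise assumptions.

```python
import math


def rms_error(measured, extracted):
    """Root mean square of distances between paired (x, y) coordinates."""
    if not measured or len(measured) != len(extracted):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum((xm - xe) ** 2 + (ym - ye) ** 2
                  for (xm, ym), (xe, ye) in zip(measured, extracted))
    return math.sqrt(squared / len(measured))
```

Under this definition a lower RMS means that the extracted points lie closer to the measured corners, so the operator with the smallest value has the highest accuracy.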

## 4 Conclusion

In this paper, feature extraction algorithms are studied, and the main algorithms of photogrammetry and computer vision are compared systematically in terms of speed and accuracy. The results show that each algorithm has its own advantages and disadvantages, so the appropriate algorithm should be selected according to the requirements of the specific task.

## Acknowledgement

We thank the reviewers for their constructive comments on improving the quality of this paper. This research was supported in part by the Research Foundation of the Yunnan Provincial Department of Education (2016ZZX067).

## References

• [1]

Colomina I., Molina P., Unmanned aerial systems for photogrammetry and remote sensing: a review, ISPRS J. Photogramm. Remote Sens., 2014, 92, 79-97.

• [2]

Harwin S., Lucieer A., Assessing the accuracy of georeferenced point clouds produced via multi-view stereopsis from unmanned aerial vehicle (UAV) imagery, Remote Sens., 2012, 4, 1573-1599.

• [3]

Gerke M., Przybilla H.J., Accuracy analysis of photogrammetric UAV image blocks: influence of onboard RTK-GNSS and cross flight patterns, Photogramm. Fernerkund. Geoinf., 2016, 14, 17-30.

• [4]

Nex F., Remondino F., UAV for 3D mapping applications: a review, Appl. Geomat., 2014, 6, 1-15.

• [5]

Eisenbeiss H., Sauerbier M., Investigation of UAV systems and flight modes for photogrammetric applications, Photogramm. Rec., 2011, 26, 400-421.

• [6]

Fernández-Hernandez J., González-Aguilera D., Rodríguez-Gonzálvez P., Mancera-Taboada J., Image-based modelling from Unmanned Aerial Vehicle (UAV) photogrammetry: an effective, low-cost tool for archaeological applications, Archaeometry, 2015, 57, 128-145.

• [7]

Zhang C., Kovacs J.M., The application of small unmanned aerial systems for precision agriculture: a review, Precis. Agric., 2012, 13, 693-712.

• [8]

Berni J., Zarco-Tejada P., Suárez L., González-Dugo V., Fereres E., Remote sensing of vegetation from UAV platforms using lightweight multispectral and thermal imaging sensors, Proc. ISPRS, 2009, 38, 22-29.

• [9]

Bendea H., Boccardo P., Dequal S., Giulio Tonolo F., Marenchino D., Piras M., Low cost UAV for post-disaster assessment, Proc. ISPRS, 2008, 37, 1373-1379.

• [10]

Chou T.Y., Yeh M.L., Chen Y.C., Chen Y.H., Disaster monitoring and management by the unmanned aerial vehicle technology, Proc. ISPRS, 2010, 35, 137-142.

• [11]

Irschara A., Kaufmann V., Klopschitz M., Bischof H., Leberl F., Towards fully automatic photogrammetric reconstruction using digital images taken from UAVs, Proc. ISPRS, 2010, 2010, 1-6.

• [12]

Zhang Y.J., Geometric processing of low altitude remote sensing images captured by unmanned airship, Geomatics and Information Science of Wuhan University, 2009, 34, 284-288.

• [13]

Yuan X.X., Ming Y., POS-supported Matching Method for Aerial Images between Neighboring Strips, Acta Geodaetica et Cartographica Sinica, 2010, 39, 156-161.

• [14]

Förstner W., Dickscheid T., Schindler F., Detecting interpretable and accurate scale-invariant keypoints, Proceedings of the 12th IEEE International Conference on Computer Vision, Kyoto, Japan, 2009.

• [15]

Schmid C., Mohr R., Bauckhage C., Evaluation of interest point detectors, Int. J. Comput. Vision, 2000, 37, 151-172.

• [16]

Mikolajczyk K., Schmid C., Scale and affine invariant interest point detectors, Int. J. Comput. Vision, 2004, 60, 63-86.

• [17]

Dickscheid T.F., Förstner W., Evaluating the Suitability of Feature Detectors for Automatic Image Orientation Systems, Proceedings of the 7th ICVS, Liege, Belgium, Springer, 2009.

• [18]

Lowe D., Distinctive image features from scale-invariant key points, Int. J. Comput. Vision, 2004, 60, 91-110.

• [19]

Wang Q.C., Guo G.L., Zha J.F., Liu S., Comparative on interest point detect operators based on image’s gray and improvement, Journal of Geodesy and Geodynamics, 2012, 32, 148-154.

• [20]

Zhang C.M., Gong Z.H., Huang Y., Performance evaluation and improvement of several feature point detectors, Journal of Geomatics Science and Technology, 2008, 25, 231-234.

• [21]

Gao W., Guo Y., Wang K.Y., Ontology algorithm using singular value decomposition and applied in multidisciplinary, Cluster Comput., 2016, 19, 2201-2210.

• [22]

Gao W., Wang W.F., The fifth geometric arithmetic index of bridge graph and carbon nanocones, J. Differ. Equ. Appl., 2017, http://dx.doi.org/10.1080/10236198.2016.1197214.

• [23]

Gao W., Wang W.F., The eccentric connectivity polynomial of two classes of nanotubes, Chaos Soliton. Fract., 2016, 89, 290-294.

• [24]

Lowe D.G., Distinctive Image Features from Scale-invariant Keypoints, Int. J. Comput. Vision, 2004, 60, 91-110.

• [25]

Snavely N., Seitz S. M., Szeliski R., Modeling the world from internet photo collections, Int. J. Comput. Vision, 2008, 80, 189-210.

• [26]

Förstner W., Gülch E., A fast operator for detection and precise location of distinct points, corners and centres of circular features, Proceedings of Intercommission Workshop on Fast Processing of Photogrammetric Data, Interlaken, Switzerland, 1987.

• [27]

Harris C.G., Stephens M.J., A combined corner and edge detector, Proceedings of the Fourth Alvey Vision Conference, Manchester, 1988.

• [28]

Moravec H.P., Towards automatic visual obstacle avoidance, Int. Joint Conf. of Artif. Intelligence, 1977.

Accepted: 2017-02-16

Published Online: 2017-07-03

Conflict of interest: The authors declare that there is no conflict of interest regarding the publication of this paper.

Citation Information: Open Physics, Volume 15, Issue 1, Pages 472–478, ISSN (Online) 2391-5471,
