Unmanned aerial vehicles (UAVs) have become fast, efficient, low-cost, and flexible remote sensing data acquisition systems for obtaining high-resolution images [1–3]. If UAV images are used for aerial triangulation, we can carry out geometric reconstruction and obtain digital surface models and point cloud data, which are widely used in fields like archaeology, vegetation monitoring, traffic monitoring, disaster management, and 3D reconstruction [5–11]. In the process of UAV model geometry reconstruction, extracting and optimizing the matching points in aerial triangulation is a very critical step. In traditional aerial photography, the commonly used matching feature point extraction methods can be roughly divided into two categories: matching based on pixel gray values and matching based on features. In traditional, standard aerial triangulation, it is very convenient to obtain the matching points with the gray-value-based method. However, UAV imaging suffers from problems such as irregular image overlap and large rotation deviation angles, so difficulties are encountered when the common methods and software are applied to UAV images; in other words, a great deal of labor and material resources are needed [14–16].
In the fields of computer vision and photogrammetry, many feature point extraction operators have been proposed. For example, Dickscheid and Förstner compared and analyzed the image orientation performance of these feature extraction operators; however, they only used low-resolution close-range images and simulated images in their experiments. Lowe applied the SIFT algorithm to machine vision, image matching, image retrieval, and other fields. Wang and Zhang compared and evaluated the performance of several commonly used operators [19, 20]. However, they only compared image data of buildings, without referring to other objects.
Based on previous research, this paper uses the SIFT, Forstner, Harris, and Moravec operators to extract features from image data of different objects and analyzes the results. For some of the mathematical formulas used in this article, refer to [21–23].
2 Feature operator extraction
2.1 SIFT algorithm
The SIFT algorithm, proposed by David G. Lowe, is a feature-based matching algorithm built on scale space, mainly using the Gaussian difference pyramid model. The Gaussian pyramid is essentially a multi-scale representation of a signal: the same signal or picture is Gaussian-blurred many times and down-sampled to produce multiple sets of signals or pictures at different scales for subsequent processing.
The application of the SIFT matching algorithm in photogrammetry has drawn more and more attention. Its superior performance has greatly enhanced the degree of automatic orientation in close-range photography and unconventional aerial photography (such as UAV images). In this way, automatic aerial triangulation can still be carried out without GPS/IMU data support.
2.1.1 The establishment of Gaussian differential pyramid
On the basis of a series of reasonable assumptions, Koenderink and Lindeberg proved that the Gaussian function is the only possible linear scale-space kernel for image scale transformation. The scale space of an image can be defined as:

L(x, y, s) = G(x, y, s) ⋆ I(x, y)   (1)

where ⋆ represents the convolution operation, (x, y) denotes the spatial coordinates of a pixel, s stands for the scale coordinate, and G(x, y, s) is the scale-variable Gaussian kernel function, which is defined as:

G(x, y, s) = (1 / (2πs²)) e^(−(x² + y²) / (2s²))   (2)
David G. Lowe proposed using the Gaussian difference operator to establish the image Gaussian difference scale space. Gaussian difference images D(x, y, s) at different scales are obtained by convolving the image with Gaussian difference kernels at different scales; all the Gaussian difference images together make up the Gaussian difference pyramid of the image:

D(x, y, s) = (G(x, y, ks) − G(x, y, s)) ⋆ I(x, y) = L(x, y, ks) − L(x, y, s)   (3)
The construction process of the image Gaussian difference pyramid is shown in the left part of Figure 1. Within a group, the image of the upper layer is convolved with Gaussian functions of different scales to obtain spatial images whose scale increases by the constant factor k. In other words, the scale of the first layer is s, the scale of the second layer is ks, and each following layer continues this rule. The first layer of each group is obtained by down-sampling an image from the previous group; the sampled image has 1/4 the area (half the size in each dimension) of the images of the previous group, while its scale is twice the original. In this way, the Gaussian pyramid is constructed. The right part of Figure 1 describes how the Gaussian difference pyramid is constructed: within a group, a Gaussian difference image is obtained by subtracting adjacent images. After repeating this operation over the full set of images, we get the Gaussian difference pyramid.
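The construction described above can be sketched as follows. This is a minimal illustration in Python (the paper's experiments use MATLAB), assuming a grayscale float image; `scipy.ndimage.gaussian_filter` stands in for the Gaussian convolution of Eq. (1), and the function name and parameter defaults are illustrative, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_pyramid(image, n_octaves=3, n_scales=5, sigma0=1.6):
    """Return a list of octaves; each octave is a list of DoG images D(x, y, s)."""
    k = 2 ** (1.0 / (n_scales - 3))  # constant scale factor between adjacent layers
    pyramid = []
    base = image.astype(np.float64)
    for _ in range(n_octaves):
        # Blur the octave base with increasing scales s, ks, k^2 s, ...
        gaussians = [gaussian_filter(base, sigma0 * k ** i) for i in range(n_scales)]
        # Subtraction between adjacent layers gives the DoG images (Eq. 3)
        dogs = [g2 - g1 for g1, g2 in zip(gaussians, gaussians[1:])]
        pyramid.append(dogs)
        # Next octave: down-sample by 2, so image area becomes 1/4
        base = gaussians[-3][::2, ::2]
    return pyramid
```

Down-sampling the third-from-last Gaussian image keeps the effective scale of each octave's base at twice that of the previous octave, matching the doubling of scale described in the text.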
2.1.2 Generation of feature point
After the SIFT feature area of a key point is generated, the axis direction is first rotated to the main direction of the key point, which ensures rotation invariance. Then we select a neighboring window with the key point as the center, as shown in the left part of Figure 2. The central pixel is the key point to be described. The surrounding small squares represent the pixels neighboring the key point; the length of each arrow represents the gradient magnitude of the pixel and its direction represents the gradient direction. The circles in the figure represent the range of Gaussian weighting of the pixel values. As shown in the right part of Figure 2, in every sub-block we compute a gradient direction histogram over eight directions to obtain one seed point. When all the seed points are obtained, we get a descriptor.
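The eight-direction histogram of one sub-block ("seed point") can be sketched as below. This is a simplified illustration assuming a small grayscale patch; `seed_histogram` is a hypothetical name, and a full SIFT descriptor would concatenate a 4 × 4 grid of such histograms (128 values), with Gaussian weighting and rotation to the main direction omitted here.

```python
import numpy as np

def seed_histogram(patch):
    """patch: small 2-D gray array; returns an 8-bin gradient-direction histogram."""
    gy, gx = np.gradient(patch.astype(np.float64))  # gradients along rows, columns
    magnitude = np.hypot(gx, gy)
    direction = np.arctan2(gy, gx)  # angle in (-pi, pi]
    # Quantize each pixel's gradient direction into one of 8 bins
    bins = ((direction + np.pi) / (2 * np.pi) * 8).astype(int) % 8
    hist = np.zeros(8)
    for b, m in zip(bins.ravel(), magnitude.ravel()):
        hist[b] += m  # each pixel votes with its gradient magnitude
    return hist
```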
2.2 Forstner operator 
First, the Robert gradient and the gray-value covariance matrix of each pixel are calculated. Then the points whose error ellipse is smallest and closest to a circle are selected as feature points.
The process is described as follows:
Calculate the Robert gradient for each pixel:

g_u = g(i+1, j+1) − g(i, j)
g_v = g(i, j+1) − g(i+1, j)   (4)
Calculate the covariance matrix N of the gray values within the window:

N = [ Σg_u²      Σg_u·g_v ]
    [ Σg_u·g_v   Σg_v²    ]   (5)
Calculate the interest values q and w:

w = DetN / trN,   q = 4·DetN / (trN)²   (6)

where DetN represents the determinant of N, and trN represents the trace of the matrix N.
Determine candidate points:

Tq = 0.5 ∼ 0.75,   Tw = f·w̄ (f = 0.5 ∼ 1.5)   (7)

where w̄ is the mean of w over the image. When q > Tq and w > Tw at the same time, the pixel is confirmed to be a candidate point.
Select extremum points
The extreme points are selected based on the weights: within an appropriate window, the candidate point with the largest interest value is retained and the remaining points are eliminated.
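Eqs. (4)–(6) can be sketched as follows. This is a simplified illustration assuming a grayscale float image; the thresholding of Eq. (7) and the local-maximum selection are omitted, and the function name is illustrative.

```python
import numpy as np

def forstner_interest(g, i, j, win=5):
    """Compute the interest values (q, w) for pixel (i, j) over a win x win window."""
    h = win // 2
    patch = g[i - h:i + h + 2, j - h:j + h + 2].astype(np.float64)
    # Robert gradients inside the window (Eq. 4)
    gu = patch[1:, 1:] - patch[:-1, :-1]
    gv = patch[:-1, 1:] - patch[1:, :-1]
    # Gray-value covariance matrix N (Eq. 5)
    N = np.array([[np.sum(gu * gu), np.sum(gu * gv)],
                  [np.sum(gu * gv), np.sum(gv * gv)]])
    det, tr = np.linalg.det(N), np.trace(N)
    # Interest values (Eq. 6): w measures the size of the error ellipse,
    # q its roundness (q is in [0, 1], with 1 for a perfect circle)
    w = det / tr if tr else 0.0
    q = 4 * det / tr ** 2 if tr else 0.0
    return q, w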
2.3 Harris operator
The Harris operator, presented by Harris and Stephens, is a signal-based feature extraction operator.
The process is stated as follows:

M = G(s̃) ⊗ [ gx²      gx·gy ]
            [ gx·gy    gy²   ]

R = det M − k·(tr M)²   (8)

where gx is the gradient in the x-axis direction, gy is the gradient in the y-axis direction, G(s̃) is the Gaussian template, det is the matrix determinant, tr is the matrix trace, k is a constant (typically 0.04–0.06), and R represents the interest value of the corresponding pixel.
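Eq. (8) can be sketched as below. This is an illustrative implementation assuming a grayscale float image; `scipy.ndimage.gaussian_filter` plays the role of the Gaussian template G(s̃), and the function name and defaults are assumptions rather than the authors' code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(image, sigma=1.0, k=0.06):
    """Return the Harris interest value R (Eq. 8) for every pixel."""
    g = image.astype(np.float64)
    gy, gx = np.gradient(g)  # gradients in y and x directions
    # Gaussian-weighted products of gradients (the matrix M, per pixel)
    A = gaussian_filter(gx * gx, sigma)
    B = gaussian_filter(gy * gy, sigma)
    C = gaussian_filter(gx * gy, sigma)
    det = A * B - C * C   # det M
    tr = A + B            # tr M
    return det - k * tr ** 2
```

Corners give large positive R, edges give negative R, and flat regions give R near zero; feature points are then taken at local maxima of R above a threshold.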
2.4 Moravec operator
The Moravec operator, proposed by Moravec, uses the characteristics of gray-value variance to extract feature points.
Calculate the pixel IV (Interest Value). In a w × w image window centered on pixel (m, n), the sums of squared gray-value differences of adjacent pixels along the four directions are:

V1 = Σ_{i=−k}^{k−1} (g_{m,n+i} − g_{m,n+i+1})²
V2 = Σ_{i=−k}^{k−1} (g_{m+i,n+i} − g_{m+i+1,n+i+1})²
V3 = Σ_{i=−k}^{k−1} (g_{m+i,n} − g_{m+i+1,n})²
V4 = Σ_{i=−k}^{k−1} (g_{m+i,n−i} − g_{m+i+1,n−i−1})²   (9)

where k = int(w/2). Take the smallest of the four values as the interest value of the pixel, that is, IV_{m,n} = min{V1, V2, V3, V4}.
Give an empirical threshold value; pixels whose interest value exceeds the threshold are chosen as candidate points. The threshold should be selected so that the required feature points are included among the candidates without admitting excessive non-feature points.
Select extreme points as feature points: within a window of a certain size (which can differ from the interest-value calculation window), eliminate every candidate whose interest value is not the largest, leaving only the one largest interest value. The remaining pixel is a feature point.
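The interest value of Eq. (9) can be sketched as below. This is an illustrative version assuming a grayscale float image; thresholding and local-maximum selection are omitted, and the function name is hypothetical.

```python
import numpy as np

def moravec_iv(g, m, n, w=5):
    """Moravec interest value IV at pixel (m, n) over a w x w window (Eq. 9)."""
    k = w // 2
    # Step directions: horizontal (V1), diagonal (V2), vertical (V3), anti-diagonal (V4)
    dirs = [(0, 1), (1, 1), (1, 0), (1, -1)]
    values = []
    for dm, dn in dirs:
        # Sum of squared differences of adjacent pixels along this direction
        v = sum((g[m + i * dm, n + i * dn] - g[m + (i + 1) * dm, n + (i + 1) * dn]) ** 2
                for i in range(-k, k))
        values.append(v)
    return min(values)  # IV = min(V1, V2, V3, V4)
```

Taking the minimum over the four directions suppresses edge points, which show high variance in only some directions, while true corner-like points score high in all four.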
3 The experimental case
3.1 UAV data acquisition
This paper takes the campus of Yunnan Normal University as an example, with an unmanned aerial vehicle (UAV) chosen as the platform to collect image data. In order to verify the effect of the four operators in extracting feature points, images of a building, grassland, shrubbery, and vegetable greenhouses were obtained. For this UAV flight, we chose the DJI Phantom 3 UAV, which uses a GPS/GLONASS positioning system and has a 12.4-megapixel camera; its take-off weight is 1280 g. The main idea of this experiment is designed as follows:
Use the unmanned aerial vehicle (UAV) to obtain the original image data. In this process, the overlap degree of the obtained images should not be less than 60%. Uniform lighting and the same flight height should be ensured during field data collection.
The quality of the acquired image data must be checked. The software is used to check whether the image meets the quality requirements of modeling. If not, the UAV must fly again to collect enough data.
Use all kinds of operators to extract feature points.
The single image data from unmanned aerial vehicle (UAV) is shown in Figure 3.
3.1.1 The operator feature extraction
We use MATLAB 2014 for programming. The running environment is a Windows 7 system with 4.00 GB of memory and a 64-bit operating system. The features are extracted using the SIFT operator, Forstner operator, Harris operator, and Moravec operator, respectively. The Forstner operator has a threshold of 0.65, the k value of the Harris operator is 0.06, and the interest threshold of the Moravec operator is 10000. The results of the extraction are shown in Figures 4, 5, 6, and 7. The resolutions of the images are: building (638 × 535), grassland (686 × 524), shrubbery (858 × 583), vegetable greenhouses (701 × 522).
3.2 Speed and accuracy analysis
In the same experimental conditions, the extraction speed of each operator is compared, which is shown in Table 1.
It can be seen from Table 1 that the Harris operator is the slowest for building feature point extraction, the Forstner operator is the slowest for grassland, the SIFT operator is the slowest for shrubbery, and the Harris operator is the slowest for vegetable greenhouses.
In this experiment, in order to compare the accuracy of feature point extraction, the corners of the building are selected for analysis. We measure 45 corner point coordinates and then compare and analyze the measured values against the coordinate values of the extracted feature points. The result is shown in Table 2.
As we can see from Table 2, the extraction accuracy of the SIFT operator on the building is the lowest, while the extraction accuracy of the Forstner operator is the highest. The SIFT operator extracts round, blob-like feature points better than corner points, so it is not well suited to corner feature extraction.
In this paper, feature extraction algorithms are studied, and the main algorithms of photogrammetry and computer vision are analyzed and compared systematically in terms of speed and accuracy. The results show that each algorithm has its own advantages and disadvantages, so the appropriate algorithm should be selected according to the specific task requirements.
We thank the reviewers for their constructive comments on improving the quality of this paper. This research was supported in part by the Yunnan Provincial Department of Education research foundation (2016ZZX067).
Harwin S., Lucieer A., Assessing the accuracy of georeferenced point clouds produced via multi-view stereopsis from unmanned aerial vehicle (UAV) imagery, Remote Sens., 2012, 4, 1573-1599.
Gerke M., Przybilla H.J., Accuracy analysis of photogrammetric UAV image blocks: influence of onboard RTK-GNSS and cross flight patterns, Photogramm. Fernerkund. Geoinf., 2016, 14, 17-30.
Fernández-Hernandez J., González-Aguilera D., Rodríguez-Gonzálvez P., Mancera-Taboada J., Image-based modelling from Unmanned Aerial Vehicle (UAV) photogrammetry: an effective, low-cost tool for archaeological applications, Archaeometry, 2015, 57, 128-145.
Berni J., Zarco-Tejada P., Suárez L., González-Dugo V., Fereres E., Remote sensing of vegetation from UAV platforms using lightweight multispectral and thermal imaging sensors, Proc. ISPRS, 2009, 38, 22-29.
Bendea H., Boccardo P., Dequal S., Giulio Tonolo F., Marenchino D., Piras M., Low cost UAV for post-disaster assessment, Proc. ISPRS, 2008, 37, 1373-1379.
Chou T.Y., Yeh M.L., Chen Y.C., Chen Y.H., Disaster monitoring and management by the unmanned aerial vehicle technology, Proc. ISPRS, 2010, 35, 137-142.
Irschara A., Kaufmann V., Klopschitz M., Bischof H., Leberl F., Towards fully automatic photogrammetric reconstruction using digital images taken from UAVs, Proc. ISPRS, 2010, 1-6.
Zhang Y.J., Geometric processing of low altitude remote sensing images captured by unmanned airship, Geomatics and Information Science of Wuhan University, 2009, 34, 284-288.
Yuan X.X., Ming Y., POS-supported matching method for aerial images between neighboring strips, Acta Geodaetica et Cartographica Sinica, 2010, 39, 156-161.
Förstner W., Dickscheid T., Schindler F., Detecting interpretable and accurate scale-invariant keypoints, Proceedings of the 12th IEEE International Conference on Computer Vision, Kyoto, Japan, 2009.
Dickscheid T., Förstner W., Evaluating the suitability of feature detectors for automatic image orientation systems, Proceedings of the 7th ICVS, Liege, Belgium, Springer, 2009.
Wang Q.C., Guo G.L., Zha J.F., Liu S., Comparative study on interest point detection operators based on image gray value and improvement, Journal of Geodesy and Geodynamics, 2012, 32, 148-154.
Zhang C.M., Gong Z.H., Huang Y., Performance evaluation and improvement of several feature point detectors, Journal of Geomatics Science and Technology, 2008, 25, 231-234.
Förstner W., Gülch E., A fast operator for detection and precise location of distinct points, corners and centres of circular features, Proceedings of the Intercommission Workshop on Fast Processing of Photogrammetric Data, Interlaken, Switzerland, 1987.
Harris C.G., Stephens M.J., A combined corner and edge detector, Proceedings of the Fourth Alvey Vision Conference, Manchester, 1988.
Moravec H.P., Towards automatic visual obstacle avoidance, Proc. Int. Joint Conf. on Artificial Intelligence, 1977.
Published Online: 2017-07-03
Conflict of interest: The authors declare that there is no conflict of interest regarding the publication of this paper.