COMPARATIVE ANALYSIS OF KEYPOINT DETECTION METHODS IN IMAGES
Abstract
Introduction. Image search – finding one image within another – is a fundamental
problem in computer vision, with applications in object recognition, motion tracking, 3D
reconstruction, aerial image analysis and autonomous navigation systems. The core challenge is
identifying local image features, called keypoints, that remain stable under changes in scale, rotation,
viewpoint and illumination. Since the pioneering work of Moravec and Harris on corner detection,
and the subsequent development of SIFT by Lowe, the field has produced a number of competing
methods that vary significantly in their theoretical foundations, accuracy and computational
efficiency.
Purpose. This paper presents a systematic comparison of six keypoint detection methods –
SIFT, SURF, BRISK, ORB, KAZE and AKAZE – evaluated on images of varying geometric
complexity. Two evaluation criteria are used: the spatial configuration quality of detected keypoints
relative to salient image features, and the average computation time per keypoint.
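The second criterion can be illustrated with a minimal sketch. It assumes that "average computation time per keypoint" means total detection time divided by the total number of keypoints found over repeated trials; the abstract does not state the exact procedure, so this definition, the helper name time_per_keypoint and the stand-in toy_detector are hypothetical.

```python
import time
from typing import Any, Callable, Sequence, Tuple


def time_per_keypoint(detect: Callable[[Any], Sequence[Any]],
                      image: Any,
                      trials: int = 20) -> Tuple[int, float]:
    """Return (total keypoints, average milliseconds per keypoint) over `trials` runs.

    Assumption: the average is total detection time / total keypoint count.
    """
    total_keypoints = 0
    total_seconds = 0.0
    for _ in range(trials):
        start = time.perf_counter()
        keypoints = detect(image)
        total_seconds += time.perf_counter() - start
        total_keypoints += len(keypoints)
    if total_keypoints == 0:
        # No keypoints were found, so a per-keypoint time is undefined.
        return 0, float("nan")
    return total_keypoints, 1000.0 * total_seconds / total_keypoints


def toy_detector(image):
    """Hypothetical stand-in detector: reports every nonzero pixel as a keypoint."""
    return [(r, c) for r, row in enumerate(image)
            for c, value in enumerate(row) if value]


if __name__ == "__main__":
    count, ms = time_per_keypoint(toy_detector, [[0, 1], [1, 1]], trials=20)
    print(count, ms)
```

With OpenCV installed, `detect` could be, for example, `cv2.ORB_create().detect`, and likewise for `cv2.SIFT_create()`, `cv2.BRISK_create()`, `cv2.KAZE_create()` and `cv2.AKAZE_create()`. SURF is typically available only in opencv-contrib builds with non-free modules enabled. Whether the article used these exact calls is not stated in the abstract.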
Results. The mathematical background of SIFT is described in detail, covering scale-space
extrema detection via the Difference of Gaussians function, precise keypoint localization using Taylor
series expansion of the scale-space function, orientation assignment and descriptor computation. For
all six methods, experiments were conducted on four synthetic images (line segments, closed polylines,
ellipses, and a color image with programming language logos) and two real-world images (book
covers). Results show that BRISK and ORB are the fastest methods, with per-keypoint times of 0.032
ms and 0.070 ms respectively on real images. However, BRISK detects an excessive number of
keypoints (148,300 over 20 trials), many of which are unstable. ORB provides a more balanced result:
a moderate keypoint count (10,000), the second-fastest computation time, and a spatial configuration
concentrated near meaningful image features such as corners and intersections. SIFT, while
markedly slower than BRISK and ORB (0.220 ms per keypoint), provides a theoretically well-grounded
and selective response. KAZE is the slowest method across all test cases (0.861 ms per keypoint),
limiting its practical use in time-critical applications.
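For reference, the first two SIFT steps named above are conventionally written as follows. The notation is that of Lowe (2004) and is an assumption about the article's own notation, which the abstract does not reproduce.

```latex
% Scale space L and Difference of Gaussians D
% (k is the constant scale ratio between adjacent levels, * denotes convolution)
\[
L(x, y, \sigma) = G(x, y, \sigma) * I(x, y), \qquad
G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} \, e^{-(x^{2}+y^{2})/(2\sigma^{2})}
\]
\[
D(x, y, \sigma) = L(x, y, k\sigma) - L(x, y, \sigma)
\]

% Second-order Taylor expansion of D about a sample point,
% with offset \mathbf{x} = (x, y, \sigma)^{T} and derivatives evaluated at the sample point
\[
D(\mathbf{x}) = D
  + \frac{\partial D}{\partial \mathbf{x}}^{T} \mathbf{x}
  + \frac{1}{2} \, \mathbf{x}^{T} \frac{\partial^{2} D}{\partial \mathbf{x}^{2}} \, \mathbf{x}
\]

% Refined extremum location and the function value there
\[
\hat{\mathbf{x}} = -\left( \frac{\partial^{2} D}{\partial \mathbf{x}^{2}} \right)^{-1}
  \frac{\partial D}{\partial \mathbf{x}}, \qquad
D(\hat{\mathbf{x}}) = D + \frac{1}{2} \, \frac{\partial D}{\partial \mathbf{x}}^{T} \hat{\mathbf{x}}
\]
```

In Lowe's formulation, the value of D at the refined location is used to discard low-contrast extrema, which is one reason the SIFT response is selective.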
Conclusion. No single method dominates across all evaluation criteria and image types. The
optimal choice depends on the characteristics of the target images and the specific requirements of the
application. For real-time systems with limited computational resources, ORB provides the best trade-off
between speed and keypoint quality. For tasks where accuracy is prioritized over speed, SIFT
remains competitive. These findings highlight the importance of method selection guided by systematic
benchmarking rather than general assumptions.
This work is licensed under a Creative Commons Attribution 4.0 International License.
References
Bay H., Ess A., Tuytelaars T., Van Gool L. (2008) Speeded-Up Robust Features (SURF).
Computer Vision and Image Understanding, 110(3), 346–359.
Brown M., Lowe D.G. (2002) Invariant features from interest point groups. Proceedings of the
British Machine Vision Conference, Cardiff, Wales, 656–665.
Crowley J.L., Parker A.C. (1984) A representation for shape based on peaks and ridges in the
difference of low-pass transform. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 6(2), 156–170.
Harris C., Stephens M. (1988) A combined corner and edge detector. Proceedings of the Fourth
Alvey Vision Conference, Manchester, UK, 147–151.
Leutenegger S., Chli M., Siegwart R.Y. (2011) BRISK: Binary Robust Invariant Scalable
Keypoints. Proceedings of the IEEE International Conference on Computer Vision, 2548–2555.
Lindeberg T. (1993) Detecting salient blob-like image structures and their scales with a scale-space
primal sketch. International Journal of Computer Vision, 11(3), 283–318.
Lowe D.G. (1999) Object recognition from local scale-invariant features. Proceedings of the
International Conference on Computer Vision, 1150–1157.
Lowe D.G. (2004) Distinctive image features from scale-invariant keypoints. International Journal
of Computer Vision, 60(2), 91–110.
Matas J., Chum O., Urban M., Pajdla T. (2002) Robust wide baseline stereo from maximally
stable extremal regions. Proceedings of the British Machine Vision Conference, Cardiff, Wales,
–393.
Mikolajczyk K., Schmid C. (2002) An affine invariant interest point detector. Proceedings of the
European Conference on Computer Vision, Copenhagen, Denmark, 128–142.
Mikolajczyk K., Zisserman A., Schmid C. (2003) Shape recognition with edge-based features.
Proceedings of the British Machine Vision Conference, Norwich, UK.
Moravec H. (1981) Rover visual obstacle avoidance. Proceedings of the International Joint
Conference on Artificial Intelligence, Vancouver, Canada, 785–790.
Nelson R.C., Selinger A. (1998) Large-scale tests of a keyed, appearance-based 3-D object
recognition system. Vision Research, 38(15), 2469–2488.
Pope A.R., Lowe D.G. (2000) Probabilistic models of appearance for 3-D object recognition.
International Journal of Computer Vision, 40(2), 149–167.
Rublee E., Rabaud V., Konolige K., Bradski G. (2011) ORB: an efficient alternative to SIFT or
SURF. Proceedings of the IEEE International Conference on Computer Vision, 2564–2571.
Schmid C., Mohr R. (1997) Local grayvalue invariants for image retrieval. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 19(5), 530–534.
Shokoufandeh A., Marsic I., Dickinson S.J. (1999) View-based object recognition using saliency
maps. Image and Vision Computing, 17, 445–460.
Torr P. (1995) Motion Segmentation and Outlier Detection. Ph.D. Thesis, University of Oxford,
UK.
Zhang Z., Deriche R., Faugeras O., Luong Q.T. (1995) A robust technique for matching two
uncalibrated images through the recovery of the unknown epipolar geometry. Artificial
Intelligence, 78, 87–119.
OpenCV: Open Source Computer Vision Library. Available at: https://opencv.org/