Friday, October 17, 2008

Performance Characterization in Computer Vision: A Tutorial

The discipline variously known as Computer Vision, Machine Vision and Image Analysis has its origins in the early artificial intelligence research of the late 1950s and early 1960s. Hence, roughly two generations of researchers have pitted their wits against the problem. The pioneers of the first generation worked with computers that were barely capable of handling image data — processing had to be done line-by-line from backing store — and programs almost always had to be run as batch jobs, ruling out any form of interaction. Even capturing digital images was an impressive feat. Under such difficult conditions, the techniques that were developed were inevitably based on the mathematics of image formation and exploited the values of pixels in neighbouring regions. Implementing them was a non-trivial task, so much so that pretty well any result was an impressive achievement.

The second generation of researchers coincided with the birth of the workstation. At last, an individual researcher could process images online, display them, and interact with them. These extra capabilities allowed researchers to develop algorithms that involved significant amounts of processing. A major characteristic of many algorithms developed during this second generation was the quest for optimality. By formulating and manipulating a set of equations that described the nature of the problem, a solution could usually be obtained by a least-squares method which, of course, is in some sense optimal. Consequently, any number of techniques appeared with this ‘optimality’ tag. Sadly, none of these papers was able to provide credible experimental evidence that the results from the optimal technique were significantly better than those from existing (presumably sub-optimal) ones.

We are now in the early years of the third generation. Computers, even PCs, are so fast and so well-endowed with storage that it is entirely feasible to process large datasets of images in a reasonable time — and this means it is possible to quantify the performance of an algorithm. As a result, the vision community has finally started to turn its attention to issues related to testing and comparing algorithms: performance assessment. The most visible (no pun intended) aspect of this is the competitions that are often organized in association with major vision conferences. These essentially ask the question “which algorithm is best?” Although a natural enough question to ask, it lacks subtlety and is potentially rather dangerous: if the community as a whole adopts an algorithm as “the standard” and concentrates on improving it further, that action can stifle research into other algorithms.

