The design of vision solutions
Introduction
Purpose. This document describes a plan for providing software support in the development of vision solutions. It aims at consolidating the products of the IOP-project. Without an explicit software consolidation step as described here, it is fair to conclude that the code developed in the IOP-programme would largely be lost.
Overview. Software support should not be driven by what the technology happens to deliver, but by the real needs of the designer of a vision solution. To that end, we start by analyzing the state of the art in vision. Then we consider the development of a solution both as a business process and as a software design process. From there we derive requirements on vision libraries tailor-made for the business and software design process. Finally, we specify a product and a plan to develop it.
Application areas. The analysis considers vision as used in manufacturing, inspection, steering, observing, guiding a vehicle or any similar industrial activity. With minor adaptations, the analysis is applicable to robotics, agriculture and multimedia analysis. Development in medical image analysis may differ in part, given its large amounts of detailed a priori knowledge on the shape and appearance of human organs, its high demands on interactivity and visualization, and the legal consequences of errors in the processing.
No easy solutions. The most important asset of an application centre for computer vision is its capability to demonstrate or to falsify the feasibility of a solution. Problem statements from industry, agriculture or medicine never arrive in a form that matches an off-the-shelf solution when they first reach an application centre. It is rarely the case that tuning some parameters alone will provide a solution. The process from problem articulation to feasibility and on to practical solutions is intensive in communication, labour, experience and knowledge.
Seeing is believing. At the current state of the art it is safe to conclude that problems in industrial, agricultural and medical vision are undecided. That is, when confronted with a problem, in the majority of cases one cannot predict whether the problem is solvable. At each stage of the design process, some solution has to be implemented before it can be concluded that the solution is really viable. In short, for most vision problems the existence of a solution cannot be assured until the solution is actually shown. In this respect, vision departs drastically from other information systems, where it is usually known that a problem can be solved before the precise solution is known.
Stages in design. The design of vision systems is often described as a staged process. First, quickly work out a slow solution on a very limited dataset, visualising all intermediate processing results, as a proof of concept. Then proceed to more solid, less visualised code while shifting the emphasis of evaluation towards robustness on large datasets, expressed in a figure of merit. Finally, redesign the code to optimize it for the run-time platform and perform run-time tests.
Development as a process
Design process. To support the design process in an application centre with software, we take a more detailed look at the design cycle of industrial, medical or agricultural vision applications. The design process still has many aspects of a craft. For the purpose of analysis, we break the design cycle down into a number of boxes.
[Figure: the design cycle as a chain of five boxes (first communication, diagnostics, research, design, development) leading to solutions. Labels in the figure: problem articulation, problem statement, first data, feasible, proof of possibility, real-life data, algorithmic design, proof of principle, real-life & simulated data, algorithmic performance, robustness tests, zero product, on-line trials, prepare production, ready for product. Exit labels: not likely feasible, not feasible, unsolvable, dropped.]
1. Articulation. First of all, the problem needs to be articulated as a commercial opportunity that is economically founded and feasible from a manufacturing or production point of view. Within the first box, the emphasis is on communication with the client. The problem statement will likely cover:
classification by problem type,
data in all its variability,
availability of test data,
sensor, light and scene,
the desired outcome,
the desired accuracy,
operational circumstances,
the type of hardware,
manufacturing issues,
critical conditions: time, parameter dependency, fault tolerance,
the requested adaptability,
the embedding system, software, middleware,
interactivity,
maintenance and support,
heritage, and
possibilities for multipliers into other problems.
We will not deal with these aspects here. We recognize, however, that the articulation of the problem is an important step for success, if not the most important one. The expertise required in this box is communication skills, intuition for computer vision solutions, the ability and experience to see the white spots and black holes, and broad operational access to a large variety of solutions. This phase needs to be completed to have a full risk-reward assessment available. The software tools required here are demonstration tools and design tools.
2. Diagnostics. In the second box, which we call problem diagnostics, it is assessed whether the problem is understood correctly. How can the problem be solved by analyzing similar problems? Can trial solutions be designed on an example image? Compared to current practice, this stage needs to be separated from the next one. Too many projects skip this stage and stumble into the next stage, where they solve the wrong problem. In order to shorten the time to market, this phase needs to be shortened most. The typical target time is one week, achieved mostly by restricting the work here to parameter tweaking. The skills required for this box are mastery of the state of the art and short development cycles, and hence programming by picking from function libraries, parameter setting and visualisation. The software tools required here are prototyping tools, large libraries of solutions, and visualisation tools.
3. Research. In the third box, which we call research, it is assessed whether the problem can be solved at all. Compared to current practice, this stage needs to be sped up. This is the most costly stage in terms of time, risk and money spent. The better and broader the tools available in this stage, the better the chance that the research is finished in time. The skills required for this box are in-depth expertise in vision, knowledge of the literature, design creativity, and an abstract and future-safe programming style. The tools required here are a large library of basic tools in the form of amendable software libraries with visualisation capabilities, embedded in an experimentation environment. The libraries should be available at the level of source code to keep the effort minimal. In fact, new code is best designed starting from abstract patterns for which the basics of pixel addressing have already been solved.
4. Design. In the design box, the solution in principle is designed further to match all operational conditions. Compared to current practice, this stage needs to be matched better with the preceding stage of the development. A seamless transition between the research solution and the design stage helps to shorten the overall design cycle. The skills required for this box are an eye for practicality and for practical problem hot spots, design creativity, and a problem-specific programming style. The software tools required here are amendable source code of the solution, embedded in a design environment. In the development stage, the software environment requires menu tools so that non-vision-experts can address the functions in the library efficiently.
5. Development. Once a proven, robust design is there, it is developed further in the development box into a commercially viable product. Compared to current practice, this stage needs to be matched better with the preceding stage of the development. A seamless transition between the design result and the development stage helps to shorten the overall design cycle. The skills required for this box are optimisation for processing speed and a platform-specific programming style. The software tools are centred on vision platforms and on-line experimentation.
Types of tools. Software support of the design process may be achieved by three types of tools: demonstrators, to show how a similar problem has been solved; prototyping, to shed light on the feasibility of a potential solution; and computerized instruction from textbooks, to consult theoretical or practical guidance when considering a particular solution.
Demonstrators. Demonstrators are particularly useful when the idea has been insufficiently articulated. Seeing a vision solution applied to one problem may generate the spark to specify a solution to another problem as well. Software support in the form of demonstrators is a matter of experience and proper version management. What was once generated for another solution may serve in the articulation of the next problem, provided it is still operational and parts of that solution can be rapidly employed in the next one.
Prototypes. Prototyping aims at generating a trial solution in a very short time, with the focus on testing the feasibility of a solution. Can the problem be solved at all? At this stage time-critical performance is not the prime issue; what matters is whether a solution exists and under what circumstances it can be reached. Prototyping is indispensable for quick product articulation. From a software point of view, prototyping is highly demanding. It requires that the loop between method specification and program specification is short. Prototyping relies on the availability of the broadest possible image processing library, with good coverage that includes recent scientific advancements. These advancements should be addressable and tunable at various levels of understanding: the vision expert as well as the domain expert should be capable of employing the operation. As a consequence, the interactivity and the visual rendering of the rationale behind the solution should be high. Ideally, no separate step is needed to go from the prototype to the real product code. In practice, the barrier between the prototyping phase and the production-oriented phase may still be considerable, because the prototype phase demands breadth and state-of-the-art coverage at all times whereas the product phase demands narrow, speed-optimized solutions. In that case the barrier should be kept as small as possible to shorten the overall design cycle.
Instructors. Instructions are helpful in the articulation phase, as they provide clear bounds on what can and cannot be done, provided that such knowledge is available from textbooks. At the current state of the art, very few methods come with clear instructions on when to use one algorithm and when to use another.
Data management. In the expression of the problem, the data need to be known in sufficient variety and detail, specified by their visual, geometrical, physical and stochastic properties. When the problem is error-critical, all outliers of the data need to be known as well. Acquisition requires data management tools to synthesize or gather data, to store large quantities of data, to annotate them, to delineate the ideal segmentation, and to analyze and visualize the results.
Development of an algorithm
Algorithm specification. The development of an algorithm in vision can be seen as a walk through four levels of specification as shown in the figure: the computational concept, the method, the program specification and the actual code.
[Figure: Computational Concept, Computational Method, Program Specification, Program Code]
For each computational concept several computational methods may exist. In turn, for each method several program specifications may exist, and for each specification several codings may exist.
a. Concept. As an example of this chain, consider the computational concept of an edge. It may be computed by the method of Prewitt, Sobel or Canny, each with its own strengths and drawbacks. The concept is usually specified at the level of continuous mathematics, as if the image data field were still dense.
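As a minimal sketch of "one concept, several methods", the Python fragment below computes an edge strength with the standard 3x3 Prewitt and Sobel masks. The gradient magnitude as edge measure, the helper names and the small step image are assumptions made for illustration only; Canny is left out because it involves further steps.

```python
import math

# Standard 3x3 masks for the horizontal derivative.
PREWITT_X = [[-1, 0, 1],
             [-1, 0, 1],
             [-1, 0, 1]]
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]


def transpose(kernel):
    """The vertical-derivative mask is the transpose of the horizontal one."""
    return [list(row) for row in zip(*kernel)]


def correlate3x3(image, kernel, y, x):
    """Weighted sum of the 3x3 neighbourhood centred on (y, x)."""
    return sum(kernel[j][i] * image[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))


def gradient_magnitude(image, kernel_x):
    """Gradient magnitude at interior pixels; border pixels are left at 0."""
    kernel_y = transpose(kernel_x)
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = correlate3x3(image, kernel_x, y, x)
            gy = correlate3x3(image, kernel_y, y, x)
            out[y][x] = math.sqrt(gx * gx + gy * gy)
    return out


# Hypothetical test image: dark left half, bright right half.
step_image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]

prewitt_edges = gradient_magnitude(step_image, PREWITT_X)
sobel_edges = gradient_magnitude(step_image, SOBEL_X)
```

Both methods respond at the two columns next to the step and are zero in the flat regions; they differ only in how strongly they weight the centre row.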
b. Computational method. The computational concept of Canny edge detection requires a Gaussian filter, for which several computational methods are known: the anonymous parallel implementation with separation by dimension, the recursive implementation due to Deriche, and the improved version due to Young. The computational method is usually specified in discrete mathematics, corresponding to the situation where the image data have been sampled from the real dense field.
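The separation by dimension can be sketched as follows: a sampled 1-D Gaussian is applied along the rows and then along the columns, and the result is compared with a direct 2-D filter built from the same weights. The function names, the clamped border handling and the kernel radius are assumptions of this sketch, not part of the original method descriptions.

```python
import math


def gaussian_kernel(sigma, radius):
    """Sampled, normalised 1-D Gaussian of length 2 * radius + 1."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def clamp(index, size):
    """Border rule (assumption): repeat the nearest edge pixel."""
    return min(max(index, 0), size - 1)


def filter_rows(image, kernel):
    """Filter every row with the 1-D kernel."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [[sum(kernel[k + radius] * row[clamp(x + k, width)]
                 for k in range(-radius, radius + 1))
             for x in range(width)]
            for row in image]


def transpose(image):
    return [list(column) for column in zip(*image)]


def gaussian_separable(image, sigma, radius):
    """Two 1-D passes: along the rows, then along the columns."""
    kernel = gaussian_kernel(sigma, radius)
    rows_done = filter_rows(image, kernel)
    return transpose(filter_rows(transpose(rows_done), kernel))


def gaussian_direct(image, sigma, radius):
    """Reference: 2-D kernel formed as the outer product of the 1-D kernel."""
    kernel = gaussian_kernel(sigma, radius)
    height, width = len(image), len(image[0])
    return [[sum(kernel[j + radius] * kernel[i + radius]
                 * image[clamp(y + j, height)][clamp(x + i, width)]
                 for j in range(-radius, radius + 1)
                 for i in range(-radius, radius + 1))
             for x in range(width)]
            for y in range(height)]


# Hypothetical 4x5 test image.
image = [[1, 2, 3, 4, 5],
         [0, 9, 0, 9, 0],
         [5, 4, 3, 2, 1],
         [7, 7, 7, 7, 7]]
separable = gaussian_separable(image, sigma=1.0, radius=2)
direct = gaussian_direct(image, sigma=1.0, radius=2)
largest_difference = max(abs(a - b)
                         for row_a, row_b in zip(separable, direct)
                         for a, b in zip(row_a, row_b))
```

With a kernel of length n, the separable form needs 2n multiplications per pixel where the direct form needs n squared, while the two outputs agree up to rounding.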
c. Program. For the computational method of the Gaussian filter in parallel implementation, several program specifications exist: a parallel specification that computes the pixels in the image independently, and a sequential specification that computes the pixels in the image in a fixed scan order. The program specification can be written as data-flow diagrams, using symbolic representations for building blocks, or by other program specification means.
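The difference between the two program specifications can be illustrated with a small 1-D sketch. Each output pixel reads only input pixels, so it may be computed in a fixed scan order or in any other order with the same result. The function names, the three-tap binomial kernel standing in for a Gaussian and the visiting order are assumptions for illustration; the "parallel" variant is only simulated by an arbitrary visiting order.

```python
def output_pixel(signal, kernel, x):
    """Value of one output pixel; reads only the input, never other outputs."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    return sum(kernel[k + radius] * signal[min(max(x + k, 0), last)]
               for k in range(-radius, radius + 1))


def filter_scan_order(signal, kernel):
    """Sequential specification: visit the pixels from left to right."""
    out = []
    for x in range(len(signal)):
        out.append(output_pixel(signal, kernel, x))
    return out


def filter_any_order(signal, kernel, visit_order):
    """Parallel specification: pixels are independent, so any order will do.
    The caller-supplied visiting order stands in for concurrent execution."""
    out = [None] * len(signal)
    for x in visit_order:
        out[x] = output_pixel(signal, kernel, x)
    return out


signal = [0.0, 0.0, 4.0, 0.0, 0.0, 8.0, 8.0]
kernel = [0.25, 0.5, 0.25]  # small binomial approximation of a Gaussian
scan = filter_scan_order(signal, kernel)
scattered = filter_any_order(signal, kernel, [6, 0, 3, 5, 1, 4, 2])
```

A recursive filter would not allow the second form, since each of its outputs depends on outputs computed earlier in the scan.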
d. Code. For the program specification of the Gaussian filter in parallel implementation, several codings exist: a specific implementation of the Gaussian filter, an instantiation of the abstract linear filter, an instantiation of the symmetric and separable linear filter, or an instantiation of the abstract parallel pattern. The code specification is done in a programming language such as C.
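The alternative codings can be sketched in a few lines, here in Python rather than C for brevity. A hypothetical neighbourhood pattern takes care of pixel addressing and borders, the linear filter is one instance of that pattern, and the Gaussian filter is one instance of the linear filter. A dedicated coding with its own loops is included for comparison. All names and the clamped border rule are assumptions of this sketch.

```python
import math


def neighbourhood_pattern(signal, radius, pixel_function):
    """Abstract parallel pattern: pixel addressing and border handling live
    here; pixel_function receives the clamped neighbourhood of one pixel."""
    last = len(signal) - 1
    return [pixel_function([signal[min(max(x + k, 0), last)]
                            for k in range(-radius, radius + 1)])
            for x in range(len(signal))]


def linear_filter(signal, kernel):
    """Abstract linear filter, written as an instance of the pattern."""
    radius = len(kernel) // 2
    return neighbourhood_pattern(
        signal, radius,
        lambda window: sum(w * v for w, v in zip(kernel, window)))


def gaussian_weights(sigma, radius):
    """Sampled, normalised Gaussian weights."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_filter(signal, sigma, radius):
    """Gaussian filter as an instance of the abstract linear filter."""
    return linear_filter(signal, gaussian_weights(sigma, radius))


def gaussian_filter_specific(signal, sigma, radius):
    """Dedicated coding of the same specification, with its own loops."""
    weights = gaussian_weights(sigma, radius)
    last = len(signal) - 1
    out = []
    for x in range(len(signal)):
        acc = 0.0
        for k in range(-radius, radius + 1):
            acc += weights[k + radius] * signal[min(max(x + k, 0), last)]
        out.append(acc)
    return out


signal = [0.0, 1.0, 5.0, 2.0, 2.0, 9.0, 0.0, 0.0]
generic = gaussian_filter(signal, sigma=1.0, radius=2)
specific = gaussian_filter_specific(signal, sigma=1.0, radius=2)
```

The generic coding reuses addressing code that has been tested once, which fits the research stage; the specific coding leaves more room for platform tuning in the development stage.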
Conditions of failure. In the expression of the problem, the algorithm needs to be verified at all steps of development. The computational method needs to represent the methodological concept under all circumstances it will encounter in practice, and the conditions of failure should be accurately listed. When the concept of an edge is needed in the solution and the Prewitt edge detector fails to detect an edge at sharp corners in the contour, this may be problematic, depending on the presence and the relevance of sharp corners to the solution. The program specification also needs to represent the computational model under all circumstances it will encounter in practice, and, again, the conditions of failure should be accurately recorded. When a recursive linear filter is specified as a very quick but approximate solution for computing a filter and the operational characteristics do not permit the approximation, the solution may fail. Finally, the code needs to represent the specification accurately, in addition to being logically flawless and sufficiently fast for all data variations it might encounter. Code has rarely been tested to the degree that one can guarantee it has seen everything. When all this is fulfilled, the development of the solution loops back to questioning the assumptions underlying the computational concept. By testing the solution on the data, one tests whether the concept needs adaptation to accurately fit the current problem.
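Recording a condition of failure for an approximate filter can be sketched as a comparison against a reference under a stated tolerance. Note the substitution: instead of a recursive filter, this sketch uses a Gaussian kernel truncated at a small radius as the quick but approximate solution, because it is simpler to show. The tolerance, the radii, the wide reference kernel treated as exact, and the step-edge test signal are all hypothetical.

```python
import math


def gaussian_kernel(sigma, radius):
    """Sampled, normalised 1-D Gaussian truncated at the given radius."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, kernel):
    """1-D filter; borders repeat the nearest edge sample."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [sum(kernel[k + radius] * signal[min(max(x + k, 0), last)]
                for k in range(-radius, radius + 1))
            for x in range(len(signal))]


def max_deviation(signal, sigma, approx_radius, reference_radius):
    """Largest absolute difference between approximate and reference output."""
    approx = smooth(signal, gaussian_kernel(sigma, approx_radius))
    reference = smooth(signal, gaussian_kernel(sigma, reference_radius))
    return max(abs(a - r) for a, r in zip(approx, reference))


SIGMA = 2.0
REFERENCE_RADIUS = 10   # 5 sigma, treated here as the exact answer (assumption)
TOLERANCE = 0.01        # hypothetical accuracy demanded by the application
step_edge = [0.0] * 20 + [1.0] * 20

error_fast = max_deviation(step_edge, SIGMA, 2, REFERENCE_RADIUS)  # 1 sigma
error_safe = max_deviation(step_edge, SIGMA, 6, REFERENCE_RADIUS)  # 3 sigma

fast_is_acceptable = error_fast <= TOLERANCE
safe_is_acceptable = error_safe <= TOLERANCE
```

Under these assumptions the kernel truncated at one sigma exceeds the tolerance on a step edge while the one truncated at three sigma stays within it, which is the kind of result that belongs in the recorded conditions of failure.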
Experience with software support for vision
Vision in NL. In the uptake of vision in the Netherlands, a few scientific centres that concentrate on vision have played a leading role. The vision library and development interface SCIL-Image has played an important role in the development of the discipline in the Netherlands, alongside the CBP-course. SCIL-Image not only served as the major platform where scientific centres shared their toolkits during most of the 1990s, it also served as a medium for the export of knowledge and people to industrial research and development.
SCIL-Image. The unique element of SCIL-Image is the interactive C-interpreter, which enables a very short development cycle from an interactively specified sequence of methods to a program coded in C. Newly developed code could be added to the library of vision processing functions with little additional effort. As a consequence, the library grew over the years to its largest extent of about 500 functions callable by the developer. Therefore, SCIL-Image is a good vehicle for the evaluation of new concepts and the subsequent exchange of new implementations among developers in the Netherlands. SCIL-Image has sold over a thousand copies in spite of a limited commercial effort and a price of klf 10 per item. Philips Medical Systems built the product into their common vision module to coordinate the world-wide development of vision solutions. Over the years the revenues have balanced the maintenance costs.
Existing vision libraries
Off the shelf. One point of view is that the choice of a software toolset should be limited to commercial packages, as they are the only ones ready for long-term support. Commercial software is preferred when it is capable of solving the problem at hand, as the cost of coding is very high. The likelihood that such an off-the-shelf solution exists for a commercially interesting vision application is low, since in that case the product would most likely already exist. For a review see: E. Malamas, E. Petrakis, M. Zervakis, L. Petit and J-D. Legat "Industrial vision: state of the art, problems and prospects". They mention dozens of packages. The most important competitors are Khoros, Visilog, and specialized libraries as available under Matlab and Labview.
Matlab. Matlab was originally designed for matrix manipulations. It is in widespread use as a mathematical and numerical toolbox for engineers. Its visualization tools are also well known. The open architecture makes Matlab easy to use. Content-wise, Matlab is particularly good on its home ground: mathematics. Relatively recently it has been extended with image processing capabilities, but these are considered rather limited in their coverage of modern image processing techniques. An exception is the mathematical morphology library, which is good.