Details of image processing research.

1. Introduction.

High level features carry information about an image in an abstracted or propositional form, [1]. The use of high level features is appropriate in model based computer vision, as illustrated in figure 1, where the matching is applied to attributed relational graphs, which are widely used as a flexible high level feature.

This paper describes a method of extracting such graphs from images of 3-D objects.

There are few accounts of practical methods for extracting any sort of high level feature from images; this is an important gap in the computer vision and image processing literature.

As figure 1 illustrates, real world data is processed so as to create an attributed relational graph describing the image of an object. The reference representation is computed off-line from a geometric or ray-tracing model. It predicts the appearance of a 2-D projection of the real world as a graph with the same conventions as the graph created from the real world data.

A graph matching procedure is used to associate fragments of the two graphs and obtain a cost. A control strategy selects candidate classes and presents fragments of reference graphs for matching. It also monitors system performance and allocates system resources to more promising candidates, [2]. This paper is limited to describing the extraction of the relational graphs.

2. Construction of attributed relational graphs.

2.1 Overview.

The following describes the method that is in use for constructing relational graphs from colour and range image data. The relational graphs are built from nodes, representing the contours of regions, connected by arcs representing the relative positions of the contours in the image's 2-D coordinate system. This basic graph structure has attributes attached to it creating a flexible and extensible propositional representation.

2.2 Steps in the generation of the relational graphs.

2.2.1 Separation of regions.

The input image(s) are processed to identify and label connected regions. The standard definition of a connected region is employed: a '...set of pixels that are connected under some definition and share some common property...'. 4-connectedness is required and is taken as an indication of adequate sampling, [3].
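As an aside for the reader, the labelling step is easy to reproduce with standard tools. The following Python sketch (not part of the implementation described here) uses scipy.ndimage.label, whose default 2-D structuring element is exactly the 4-connected cross:

import numpy as np
from scipy.ndimage import label

# Binary mask: True where a pixel satisfies the 'common property'.
mask = np.array([[0, 1, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 1, 1],
                 [0, 0, 1, 1]], dtype=bool)

# The default structuring element of scipy.ndimage.label is the
# 4-connected cross, matching the connectivity required above.
labels, n_regions = label(mask)
print(n_regions)   # 2: a diagonal touch does not join regions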

The examples presented here employ colour images registered (to within 1 pixel) with range images, forming an image of 4-component vector valued pixels.

In the present implementation, the colour image has the dominant role in defining regions. The 'common property' is a uniformity of colour ratio, r:g:b, while the total intensity remains above a given threshold. Regions are extracted by subjecting the 3-component colour images to systematic averaging in small patches to reduce variance without crossing the 'walls' of high variance between regions.

Figure 2 shows a colour image of an object.

Figure 3 shows a depth image that is registered with the image in figure 2.

The images have been segmented; figure 4 shows the regions identified in the images, which are used as masks to control the extraction of data to complete the graph.

2.2.2 Extracting the nodes.

The next stage is the tracing of the contours of the regions such as those of figure 4.

The output is a list of coordinates of the 8-connected contour pixels. The contour trace defines pixels in the image which are used to calculate a number of numerical and logical attributes for each contour.
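The border-following trace itself is not reproduced here, but the set of pixels it visits can be sketched in a few lines: the contour of a region consists of the region pixels that have at least one 4-neighbour outside the region. The sketch below (Python with numpy, illustrative only) returns these pixels unordered; the actual implementation orders them by following the border.

import numpy as np

def contour_pixels(region):
    # A pixel is on the contour if it belongs to the region and at
    # least one of its 4-neighbours does not (the image border
    # counts as outside).
    padded = np.pad(region, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return np.argwhere(region & ~interior)    # (row, col) pairs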

The attributes extracted in the present implementation are:
1. Centroid, cg(x,y);
2. Area, aa(area);
3. Moments, mm(m1,m2,m3,m4);
4. Mean colour values, cr(r,g,b);
5. Plane equation coefficients, pl(a,b,c,d);
6. Shape classification;
7. External/internal contour flag.

The image is a projection onto 2-D of the original 3-D object via the optics of the camera and the quantisations implicit in the image capture device. The attributes are calculated using the image coordinate system. The known problems of camera imperfections, [4], are not dealt with in the present implementation.

2.2.2.1 Centroid.

The centroid of the contour is calculated. This is expressed by the proposition cg(xbar,ybar). The centroid is used as the 'handle' by which to describe the position of the contour and region.

2.2.2.2 Area.

The area enclosed by each external contour, excluding areas enclosed by any internal contours, is computed. The area is the pixel count in the image. The proposition is aa(area).
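Both attributes reduce to a few array operations. A minimal sketch, computed here directly from a boolean region mask (the implementation derives the centroid from the contour trace, and the mask is assumed to exclude any internal contours):

import numpy as np

def cg_and_aa(region):
    # Centroid cg(xbar, ybar) and area aa(area) of a region mask.
    ys, xs = np.nonzero(region)
    return (xs.mean(), ys.mean()), xs.size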

2.2.2.3 Moments.

The existence of rotation and scale invariant combinations of body centred moments is well known. The first 4 such combinations are calculated from the coordinates of the contour pixels and are expressed as the proposition mm(m1,m2,m3,m4).

The values are scaled and truncated to become large integers for ease of transmission and processing. The use of the rotation and scale invariant combinations rather than the raw moments does not discard size and orientation information, which remains available from the area and centroid attributes. It is convenient to use the moment attribute purely to represent shape.
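The paper does not name the four combinations; Hu's first four invariant moments are the classic choice and serve as a sketch here. Normalising each body centred moment by the appropriate power of the point count makes it scale invariant, and Hu's combinations are also rotation invariant:

import numpy as np

def mm_attribute(points):
    # points: N x 2 array of contour pixel coordinates.
    x = points[:, 0] - points[:, 0].mean()     # body centred
    y = points[:, 1] - points[:, 1].mean()

    def eta(p, q):
        # Normalised central moment; len(x) plays the role of mu00.
        return np.sum(x**p * y**q) / len(x) ** (1 + (p + q) / 2.0)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    e30, e03, e21, e12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    m1 = e20 + e02
    m2 = (e20 - e02)**2 + 4 * e11**2
    m3 = (e30 - 3*e12)**2 + (3*e21 - e03)**2
    m4 = (e30 + e12)**2 + (e21 + e03)**2
    return m1, m2, m3, m4      # scaled to integers in practice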

2.2.2.4 Mean colour values.

The mean colour values found under the track of the contour trace are computed yielding values for red, green and blue. The regions were defined based on a uniformity of colour ratio. The proposition is cr(r,g,b).
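A one-line sketch, given the colour image and the (row, col) coordinates produced by the contour trace:

import numpy as np

def cr_attribute(rgb, contour):
    # Mean (r, g, b) under the track of the contour trace.
    return rgb[contour[:, 0], contour[:, 1]].mean(axis=0)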

2.2.2.5 Plane equation coefficients.

If the contour belongs to a region that is approximately planar on the 3-D object, then the equation of the most closely fitting plane, in the form ax+by+cz+d=0, is useful and can be computed from the range image. The calculation chooses (a,b,c,d) to minimise the sum of absolute distances between the plane and the points on the contour. If no acceptable fit is possible, the contour is assumed to be non-planar and no attribute is given; otherwise the propositional form is pl(a,b,c,d).

The a,b,c,d values are expressed in a coordinate system after perspective projection because of the viewing conditions; work remains to recover a plane equation valid in the real world coordinate system.
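As a sketch of the fitting step (the implementation minimises the sum of absolute distances; ordinary least squares via the SVD is used below for brevity), the plane normal is the singular vector of the centred points with the smallest singular value:

import numpy as np

def pl_attribute(xyz):
    # xyz: N x 3 points (image x, image y, range z) on the contour.
    centroid = xyz.mean(axis=0)
    _, _, vt = np.linalg.svd(xyz - centroid)
    a, b, c = vt[-1]                  # unit normal of the best plane
    d = -vt[-1] @ centroid
    return a, b, c, d   # emitted only if the fit is acceptable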

2.2.2.6 Limited shape classification.

The 4 rotation and scale invariant body centred moment combinations are used to form a minimum distance classification of the contour against three references: a circle, a square and a triangle. This crude classification can save time in further feature processing. The proposition is one of sqr, cir or tri.
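A sketch of the minimum distance rule follows; the reference vectors below are placeholders, whereas in practice they would be the mm(m1,m2,m3,m4) values computed once from ideal circle, square and triangle contours:

import numpy as np

# Placeholder reference vectors (illustrative values only).
REFS = {'cir': np.array([0.159, 0.0, 0.0, 0.0]),
        'sqr': np.array([0.167, 0.0, 0.0, 0.0]),
        'tri': np.array([0.204, 0.011, 0.014, 0.0])}

def shape_proposition(mm):
    # Nearest reference in moment space: 'cir', 'sqr' or 'tri'.
    return min(REFS, key=lambda k: np.linalg.norm(REFS[k] - mm))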

2.2.2.7 External/Internal contour flag.

The discovery of contours in the image allows a determination of whether they are internal or external. This property is indicated with the proposition ext or int.

2.2.3 Creating the arcs.

The arcs of the relational graph are labelled with codes that show the geometrical relation between the contours.

2.2.3.1 Creating arc labels.

The existence of an arc between two nodes implies that they are within a certain threshold distance (between centroids) of each other. Without this filtering, the graph would be fully connected.

In the present implementation, the label on each arc is one of 16 direction codes. These correspond to the compass headings North, North-North-East, North-East, East-North-East, and so on.

An arc can also carry one of the labels 'outside' or 'inside'. These, if present, show that there is a significant intrusion of one contour into the bounding rectangle of the other.
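The quantisation into direction codes can be sketched as follows; the numbering of the 16 sectors is an assumption, since the code assignment is not given here. Note that image rows grow downwards, so North is the negative y direction:

import math

def direction_code(cg1, cg2):
    # Compass sector 0..15 from centroid cg1 to centroid cg2,
    # with 0 = North, 1 = NNE, 2 = NE, ... in 22.5 degree steps.
    dx, dy = cg2[0] - cg1[0], cg2[1] - cg1[1]
    bearing = math.degrees(math.atan2(dx, -dy)) % 360.0
    # Offset by half a sector so each code is centred on its heading.
    return int((bearing + 11.25) // 22.5) % 16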

2.2.3.2 Creating arc attributes.

Attributes on the arcs indicate the importance of including the arcs in the matching. They are used for scheduling matching attempts, [2], as part of a technique to partition the matching and monitor its progress, [5].

2.3 Algorithms.

The block with a cylinder mounted on it, figures 2 and 3, will continue to be used as a running example.

The creation of a relational graph as a reference can begin either with a geometric model such as a CAD/CAM model, or can employ ray tracing, [6].

The use of a CAD/CAM model requires issues of surface visibility and colour to be handled explicitly.

In the examples here, a z-buffer algorithm is used to determine visibility and colour values are appended after all geometric processing has been completed.

When ray tracing is employed, these issues are handled implicitly after the 'studio definition' has been given.

Whichever synthesis is used, the resulting image is processed by the same feature extraction software that is applied to real image data. The sequence of steps in creating the attributed relational graph is:
(1) capture or synthesise an image, including a depth image;
(2) grow regions in the colour image and extract region masks;
(3) construct the relational graph using contours in the region mask to provide the graph's nodes.

Figure 5, below, gives pseudo code for growing regions with uniform colour ratios and intensities above a threshold.

___________________________________________________________________

Input: 3 component image;
       R:G:B ratio variance tolerance, Tol;
       Intensity threshold, Thr;
       Iteration count, Itcnt;
       Neighbourhood half-width, n (patch size (2n+1)^2).

Output: Regions labelled on a pixel grid.

For each pixel position
Begin
      If the total brightness exceeds Thr
      Then Begin
                 Rescale the pixels in the 3 colour
                 planes so that the brightest is Imax;

           End
      Else Zeroise all 3 pixels at this position;
End;

For i:= 1 To Itcnt Do
Begin
       For each pixel position in the image that
           can serve as a patch centre
       Begin

              If variance over patch size (2n+3)^2
                  <=
                 Tol * variance over patch size (2n+1)^2
              Then
              Begin
                    Average pixels in each colour plane over
                    the patch of size (2n+1)^2;
              End;

       End
End


For each pixel position in the image that can be the centre
    of a patch of size (2n+1)^2, compute a new image pixel
    whose value is the sum of the variances of the red, green
    and blue image patches.

Threshold the variance image with Thr, invert it and use the
result as a visibility mask for the colour planes;

Label the 4-connected regions with unique identifiers.

Output the labelled image.


              Figure 5. Pseudocode for growing regions.
_________________________________________________________________
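For concreteness, figure 5 can be rendered in a few lines of Python with numpy and scipy. This is an illustrative sketch rather than a faithful reimplementation: the variance test is applied per colour plane, and the rescaling step is simplified.

import numpy as np
from scipy.ndimage import uniform_filter, label

def local_variance(plane, size):
    mean = uniform_filter(plane, size=size)
    return uniform_filter(plane * plane, size=size) - mean * mean

def grow_regions(rgb, tol, thr, itcnt, n, imax=255.0):
    img = rgb.astype(float)
    # Rescale pixels above the intensity threshold so the brightest
    # plane reaches imax; zeroise the rest.
    peak = np.maximum(img.max(axis=2, keepdims=True), 1.0)
    bright = img.sum(axis=2, keepdims=True) > thr
    img = np.where(bright, img * (imax / peak), 0.0)

    inner, outer = 2 * n + 1, 2 * n + 3
    for _ in range(itcnt):
        for c in range(3):
            plane = img[..., c]
            # Average only where the (2n+3)^2 variance does not grow
            # relative to the (2n+1)^2 variance, i.e. away from the
            # high-variance 'walls' between regions.
            flat = (local_variance(plane, outer)
                    <= tol * local_variance(plane, inner))
            img[..., c] = np.where(flat, uniform_filter(plane, inner),
                                   plane)

    # Sum of per-plane variances, thresholded and inverted, gives the
    # visibility mask; its 4-connected components are the regions
    # (scipy's default structuring element is the 4-connected cross).
    variance = sum(local_variance(img[..., c], inner) for c in range(3))
    labels, _ = label(variance < thr)
    return labels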



Figure 6, below, gives pseudocode for building a relational graph from the image data and the colour segmentation.


______________________________________________________________________________

Input: 3 colour planes, Region mask, depth image all registered.

Output: Relational graph describing the image.


Scan the region mask and allocate unique labels to 4-connected
regions. Open a tabular data structure with the unique region
labels as the key.


For each region in the region mask,
Begin
     Trace the contour, compute the following and
     enter the result in the table:

    - the  image coordinates of the centroid;
    - the mean colour values around the track of the contour;
    - the first four invariant combinations of body centred moments;
    - the area enclosed by the contour;
    - the equation of the best fitting plane Ax+By+Cz+D=0 to the range
      image data masked by the current region;

End;


For each record, R1, in the table
Begin
      For each other record, R2, in the table
      Begin
            If the distance between the centroids
               is less than Thr
            Then
            Begin
                  Compute the direction code from the centroid in R1
                  to the centroid in R2;

                  Examine the values in the table for priority cues;

                   Output the relation between R1 and R2 in the format
                   of figure 7;
            End;

      End;
End;

[Notes: The parameter Thr controls the extent to which the graph is connected:
        Thr >= sqrt(2) * SIZE (the diagonal of a SIZE x SIZE image) guarantees
        that the graph is fully connected, while Thr = 0 suppresses all arcs.
        The algorithm above creates a relation from node p to node q and
        an equal relation from node q to node p. An asymmetry may be exploited
        in later work.]



        Figure 6. Pseudocode to create a relational graph from region data.

_____________________________________________________________________________
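The arc-building loop of figure 6 is equally compact. A sketch (the direction code numbering is assumed, as before, and the symmetric pair of relations matches the note in figure 6):

import math

def build_arcs(centroids, thr):
    # centroids: dict of node id -> (x, y); emits (p, q, code) for
    # every ordered pair closer than thr, so each relation appears
    # in both directions, as in figure 6.
    arcs = []
    for p, c1 in centroids.items():
        for q, c2 in centroids.items():
            if p != q and math.dist(c1, c2) < thr:
                bearing = math.degrees(
                    math.atan2(c2[0] - c1[0], c1[1] - c2[1])) % 360.0
                arcs.append((p, q, int((bearing + 11.25) // 22.5) % 16))
    return arcs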

Figure 7 shows a fragment of the machine readable relational graph obtained from the example object. Only one relation is shown and the values of the attributes can be seen. The nodes are related by a code number, 5 in this example, that indicates the relative positions of the centroids of the node contours.

The relation has a 'strength' parameter, @rep=1 in this case. This value will be used in future to indicate the reliability of the estimate of the relation. It has been used previously to represent the length of a line segment when a line drawing is being represented in this graph format.

__________________________________________________________________________

             Graph components                            Notes

[A[@id=crot3.77;ext:sqr:aa(1659):mm(19815,148,0,0):    <- Leading node
cg(62,86):cr(255,153,153):pl(-204,574,792,10353)]]        with attributes

[[5][@rep=   1 ]]                                      <- Relation
                                                          and strength

[B[@id=crot3.77;ext:sqr:aa(199):mm(17759,244,1,0):     <- Trailing node
cg(98,77):cr(153,153,255):pl(966,2,258,-79198)]]$         with attributes



      Figure 7. Fragment of relational graph showing attributes.

__________________________________________________________________________

3. Building a control structure for matching.

One of the main motives for studying structural features is to automate the building of a control strategy, shown in figure 1, for the matching, [7], [8].

4. Conclusions.

Structural features seem to provide a robust and concise description of objects.

The use of the same feature extraction methods on real and synthetic images requires the image synthesis to be realistic.

The use of structural features in matching seems to be extremely flexible, both from the point of view of software development and in the ability to merge data from multi-sensor sources, illustrated here by colour and range images.

5. References.

[1] F. van der Heijden, Image Based Measurement Systems, Wiley, 1994.
[2] R. E. Blake, Development of an Incremental Graph Matching Device, Pattern Recognition Theory and Applications, NATO ASI Series, Vol 30, pp355-366, Springer, Berlin, 1987.
[3] T. Pavlidis, Algorithms for Graphics and Image Processing, Computer Science Press, 1982.
[4] N. Thune, Stereo Vision: An Integrated Approach, dr.ing. thesis, University of Trondheim, 1991.
[5] R. E. Blake, Partitioning Graph Matching with Constraints, Pattern Recognition, Vol 27, No.3, pp 439-446, March 1994.
[6] J. Foley, A. van Dam, S. Feiner and J. Hughes, Computer Graphics: Principles and Practice, Second Edition, Addison-Wesley, Reading, MA., 1990.
[7] R. E. Blake and P. Boros, The extraction of structural features for use in computer vision, Proceedings of the Second Asian Conference on Computer Vision, Singapore, December 1995.
[8] P. Boros and R. E. Blake, The calculation of the aspect graph from a CAD model, Proceedings of the Second Asian Conference on Computer Vision, Singapore, December 1995.
