Automatic 3D object segmentation in multiple views using volumetric graph-cuts

Neill D.F. Campbell, G. Vogiatzis, C. Hernández, R. Cipolla

    Research output: Contribution to journal › Article › peer-review


    We propose an algorithm for automatically obtaining a segmentation of a rigid object in a sequence of images that are calibrated for camera pose and intrinsic parameters. Until recently, the best segmentation results have been obtained by interactive methods that require manual labelling of image regions. Our method requires no user input but instead relies on the camera fixating on the object of interest during the sequence. We begin by learning a model of the object's colour, from the image pixels around the fixation points. We then extract image edges and combine these with the object colour information in a volumetric binary MRF model. The globally optimal segmentation of 3D space is obtained by a graph-cut optimisation. From this segmentation an improved colour model is extracted and the whole process is iterated until convergence. Our first finding is that the fixation constraint, which requires that the object of interest is more or less central in the image, is enough to determine what to segment and initialise an automatic segmentation process. Second, we find that by performing a single segmentation in 3D, we implicitly exploit a 3D rigidity constraint, expressed as silhouette coherency, which significantly improves silhouette quality over independent 2D segmentations. We demonstrate the validity of our approach by providing segmentation results on real sequences.
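
The loop described in the abstract (seed a colour model from the fixation region, solve a binary MRF by graph-cut, re-estimate the colour model, repeat until the labelling stops changing) can be illustrated with a toy sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: a 1D row of cells with one intensity each stands in for the voxel grid and its projections into the calibrated images, a mean-intensity model stands in for the colour model, the pairwise weight is a generic contrast-sensitive term, and the names `min_cut_labels` and `segment` and the parameters `seed_radius`, `lam`, and `beta` are hypothetical. The max-flow solver is a plain Edmonds-Karp, chosen only to keep the example self-contained.

```python
from collections import deque
import math


def min_cut_labels(cost_obj, cost_bg, edges):
    """Minimise a binary MRF energy with an s-t min cut (1 = object, 0 = background)."""
    n = len(cost_obj)
    s, t = n, n + 1                      # extra source and sink nodes
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add(u, v, c):
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)        # reverse residual edge

    for i in range(n):
        add(s, i, cost_bg[i])   # cut when cell i ends up on the sink (background) side
        add(i, t, cost_obj[i])  # cut when cell i ends up on the source (object) side
    for u, v, w in edges:       # symmetric smoothness term between neighbours
        add(u, v, w)
        add(v, u, w)

    while True:
        # Breadth-first search for a shortest augmenting path (Edmonds-Karp).
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No path left: cells still reachable from the source are the object.
            return [1 if i in parent else 0 for i in range(n)]
        flow, v = math.inf, t
        while parent[v] is not None:     # bottleneck capacity along the path
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:     # push flow, update residuals
            u = parent[v]
            cap[u][v] -= flow
            cap[v][u] += flow
            v = u


def segment(values, seed_radius=1, lam=0.1, beta=10.0, max_iters=10):
    """Iterate colour-model estimation and graph-cut until the labels stop changing."""
    n = len(values)
    centre = n // 2
    # Toy "fixation" assumption: the object covers the cells around the centre.
    labels = [1 if abs(i - centre) <= seed_radius else 0 for i in range(n)]
    # Weak links across strong intensity edges, strong links inside flat regions.
    edges = [(i, i + 1, lam * math.exp(-beta * (values[i] - values[i + 1]) ** 2))
             for i in range(n - 1)]
    for _ in range(max_iters):
        obj = [v for v, lab in zip(values, labels) if lab == 1]
        bg = [v for v, lab in zip(values, labels) if lab == 0]
        if not obj or not bg:
            break                        # degenerate labelling, nothing to re-estimate
        mu_obj = sum(obj) / len(obj)     # toy colour models: one mean per class
        mu_bg = sum(bg) / len(bg)
        cost_obj = [(v - mu_obj) ** 2 for v in values]
        cost_bg = [(v - mu_bg) ** 2 for v in values]
        new_labels = min_cut_labels(cost_obj, cost_bg, edges)
        if new_labels == labels:
            break                        # converged
        labels = new_labels
    return labels
```

For example, calling `segment([0.1, 0.15, 0.1, 0.8, 0.9, 0.85, 0.9, 0.2, 0.1, 0.12])` seeds the object from the three central cells and then grows the object label to cover all four bright cells, giving `[0, 0, 0, 1, 1, 1, 1, 0, 0, 0]`. In the method the abstract describes, the same single cut is taken over 3D space, which is what lets one segmentation enforce silhouette coherency across all views.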

    Original language: English
    Pages (from-to): 14-25
    Number of pages: 12
    Journal: Image and Vision Computing
    Issue number: 1
    Publication status: Published - 1 Jan 2010

    Bibliographical note

    © 2010, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International


    • Graph-cut
    • Multiple view
    • Segmentation

