Jacobus J Barnard
Publications
Abstract:
We decided to test a surprisingly simple hypothesis; namely, that the relationship between an image of a scene and the chromaticity of scene illumination could be learned by a neural network. The thought was that if this relationship could be extracted by a neural network, then the trained network would be able to determine a scene's Illuminant from its image, which would then allow correction of the image colors to those relative to a standard illuminance, thereby providing color constancy. Using a database of surface reflectances and illuminants, along with the spectral sensitivity functions of our camera, we generated thousands of images of randomly selected illuminants lighting `scenes' of 1 to 60 randomly selected reflectances. During the learning phase the network is provided the image data along with the chromaticity of its illuminant. After training, the network outputs (very quickly) the chromaticity of the illumination given only the image data. We obtained surprisingly good estimates of the ambient illumination lighting from the network even when applied to scenes in our lab that were completely unrelated to the training data.
Abstract:
We develop a comprehensive Bayesian generative model for understanding indoor scenes. While it is common in this domain to approximate objects with 3D bounding boxes, we propose using strong representations with finer granularity. For example, we model a chair as a set of four legs, a seat and a backrest. We find that modeling detailed geometry improves recognition and reconstruction, and enables more refined use of appearance for scene understanding. We demonstrate this with a new likelihood function that rewards 3D object hypotheses whose 2D projection is more uniform in color distribution. Such a measure would be confused by background pixels if we used a bounding box to represent a concave object like a chair. Complex objects are modeled using a set or re-usable 3D parts, and we show that this representation captures much of the variation among object instances with relatively few parameters. We also designed specific data-driven inference mechanisms for each part that are shared by all objects containing that part, which helps make inference transparent to the modeler. Further, we show how to exploit contextual relationships to detect more objects, by, for example, proposing chairs around and underneath tables. We present results showing the benefits of each of these innovations. The performance of our approach often exceeds that of state-of-the-art methods on the two tasks of room layout estimation and object recognition, as evaluated on two bench mark data sets used in this domain. © 2013 IEEE.
Abstract:
We propose a new Web image selection method which employs the region-based bag-of-features representation. The contribution of this work is (1) to introduce the region-based bag-of-features representation into an Web image selection task where training data is incomplete, and (2) to prove its effectiveness by experiments with both generative and discriminative machine learning methods. In the experiments, we used a multiple-instance learning SVM and a standard SVM as discriminative methods, and pLSA and LDA mixture models as probabilistic generative methods. Several works on Web image filtering task with bag-of-features have been proposed so far. However, in case that the training data includes much noise, sufficient results could not be obtained. In this paper, we divide images into regions and classify each region instead of classifying whole images. By this region-based classification, we can separate foreground regions from background regions and achieve more effective image training from incomplete training data. By the experiments, we show that the results by the proposed methods outperformed the results by the whole-image-based bag-of-features. Copyright 2010 ACM.
Abstract:
We propose a method for understanding the 3D geometry of indoor environments (e.g. bedrooms, kitchens) while simultaneously identifying objects in the scene (e.g. beds, couches, doors). We focus on how modeling the geometry and location of specific objects is helpful for indoor scene understanding. For example, beds are shorter than they are wide, and are more likely to be in the center of the room than cabinets, which are tall and narrow. We use a generative statistical model that integrates a camera model, an enclosing room box, frames (windows, doors, pictures), and objects (beds, tables, couches, cabinets), each with their own prior on size, relative dimensions, and locations. We fit the parameters of this complex, multi-dimensional statistical model using an MCMC sampling approach that combines discrete changes (e.g, adding a bed), and continuous parameter changes (e.g., making the bed larger). We find that introducing object category leads to state-of-the-art performance on room layout estimation, while also enabling recognition based only on geometry. © 2012 IEEE.
Abstract:
We propose measuring "visualness" of concepts with images on the Web, that is, what extent concepts have visual characteristics. This is a new application of "Web image mining". To know which concept has visually discriminative power is important for image recognition, since not all concepts are related to visual contents. Mining image data on the Web with our method enables it. Our method performs probabilistic region selection for images and computes an entropy measure which represents "visualness" of concepts. In the experiments, we collected about forty thousand images from the Web for 150 concepts. We examined which concepts are suitable for annotation of image contents.