Visual Computing

Our research in visual computing lies at the intersection of visualization, computer graphics, and computer vision. It spans a wide range of topics, including biomedical visualization, image and video analysis, 3D fabrication, and data science.

Our Research

Our goal is to combine interactive computer systems with the perceptual and cognitive power of human observers to solve practical problems in science and engineering. We provide visual analysis tools and methods that help scientists and researchers better process and understand large, multi-dimensional data sets in domains such as neuroscience, genomics, systems biology, astronomy, and medicine. We are also developing data-driven approaches for the acquisition, modeling, visualization, and fabrication of complex objects.


Michaela Kapp
Administrative Manager of Research

33 Oxford Street
Maxwell Dworkin 143
Cambridge, MA 02138
Office Phone: (617) 496-0964

Our Lab

Our group belongs to Harvard's School of Engineering and Applied Sciences and the Center for Brain Science. We are located in the Maxwell Dworkin Building (33 Oxford St.) as well as the Northwest Laboratory (52 Oxford St.) on Harvard's main campus in Cambridge, Massachusetts.

Recent Publications

J. Pan, D. Sun, M.-H. Yang, and H. Pfister, “Blind Image Deblurring Using Dark Channel Prior,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
We present a simple and effective blind image deblurring method based on the dark channel prior. Our work is inspired by the interesting observation that the dark channel of blurred images is less sparse. While most image patches in the clean image contain some dark pixels, these pixels are not dark when averaged with neighboring high-intensity pixels during the blur process. This change in the sparsity of the dark channel is an inherent property of the blur process, which we both prove mathematically and validate using training data. Therefore, enforcing the sparsity of the dark channel helps blind deblurring on various scenarios, including natural, face, text, and low-illumination images. However, sparsity of the dark channel introduces a non-convex non-linear optimization problem. We introduce a linear approximation of the min operator to compute the dark channel. Our look-up-table-based method converges fast in practice and can be directly extended to non-uniform deblurring. Extensive experiments show that our method achieves state-of-the-art results on deblurring natural images and compares favorably with methods that are well-engineered for specific scenarios.
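The observation in this abstract can be illustrated with a small numerical sketch. It is not the authors' method: it only computes a dark channel (using the common definition as the minimum over color channels and over a local patch) for a synthetic image before and after blurring, and compares the fraction of exactly-zero entries. The image, the box blur, the patch size, and all function names are assumptions made for illustration.

```python
import numpy as np

def dark_channel(image, patch=5):
    """Minimum over color channels, then minimum over a patch x patch window."""
    min_rgb = image.min(axis=2)
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")
    h, w = min_rgb.shape
    out = np.full((h, w), np.inf)
    # Sliding-window minimum built from shifted views of the padded image.
    for dy in range(patch):
        for dx in range(patch):
            out = np.minimum(out, padded[dy:dy + h, dx:dx + w])
    return out

def box_blur(image, k=5):
    """Uniform k x k blur; a stand-in for an unknown blur kernel."""
    pad = k // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w, _ = image.shape
    acc = np.zeros(image.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            acc += padded[dy:dy + h, dx:dx + w, :]
    return acc / (k * k)

# Synthetic "clean" image: random colors with some fully dark pixels.
rng = np.random.default_rng(0)
sharp = rng.random((64, 64, 3))
sharp[rng.random((64, 64)) < 0.2] = 0.0
blurred = box_blur(sharp)

# Sparsity measured as the fraction of dark-channel entries that are exactly zero.
zeros_sharp = float(np.mean(dark_channel(sharp) == 0))
zeros_blurred = float(np.mean(dark_channel(blurred) == 0))
print(f"zero fraction, sharp:   {zeros_sharp:.3f}")
print(f"zero fraction, blurred: {zeros_blurred:.3f}")
```

In this toy setting, blurring averages dark pixels with brighter neighbors, so far fewer dark-channel entries remain at zero, which is the loss of sparsity the abstract describes.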
M. Piovarči, et al., “An Interaction-Aware, Perceptual Model For Non-Linear Elastic Objects,” ACM Transactions on Graphics 35(4) (Proc. SIGGRAPH 2016, Anaheim, California, USA). 2016.

Everyone, from a shopper buying shoes to a doctor palpating a growth, uses their sense of touch to learn about the world. 3D printing is a powerful technology because it gives us the ability to control the haptic impression an object creates. This is critical for both replicating existing, real-world constructs and designing novel ones. However, each 3D printer has different capabilities and supports different materials, leaving us to ask: How can we best replicate a given haptic result on a particular output device? In this work, we address the problem of mapping a real-world material to its nearest 3D printable counterpart by constructing a perceptual model for the compliance of nonlinearly elastic objects. We begin by building a perceptual space from experimentally obtained user comparisons of twelve 3D-printed metamaterials. By comparing this space to a number of hypothetical computational models, we identify those that can be used to accurately and efficiently evaluate human-perceived differences in nonlinear stiffness. Furthermore, we demonstrate how such models can be applied to complex geometries in an interaction-aware way, where the compliance is influenced not only by the properties of the material from which the object is made but also by its geometry. We demonstrate several applications of our method in the context of fabrication and evaluate them in a series of user experiments.

VESICLE: Volumetric Evaluation of Synaptic Interfaces using Computer Vision at Large Scale
W. G. Roncal, et al., “VESICLE: Volumetric Evaluation of Synaptic Interfaces using Computer Vision at Large Scale,” in Proceedings of the British Machine Vision Conference (BMVC), 2015, pp. 81.1-81.13.

An open challenge at the forefront of modern neuroscience is to obtain a comprehensive mapping of the neural pathways that underlie human brain function; an enhanced understanding of the wiring diagram of the brain promises to lead to new breakthroughs in diagnosing and treating neurological disorders. Inferring brain structure from image data, such as that obtained via electron microscopy (EM), entails solving the problem of identifying biological structures in large data volumes. Synapses, which are a key communication structure in the brain, are particularly difficult to detect due to their small size and limited contrast. Prior work in automated synapse detection has relied upon time-intensive, error-prone biological preparations (isotropic slicing, post-staining) in order to simplify the problem. This paper presents VESICLE, the first known approach designed for mammalian synapse detection in anisotropic, non-poststained data. Our methods explicitly leverage biological context, and the results exceed existing synapse detection methods in terms of accuracy and scalability. We provide two different approaches, a deep learning classifier (VESICLE-CNN) and a lightweight Random Forest approach (VESICLE-RF), to offer alternatives in the performance-scalability space. Addressing this synapse detection challenge enables the analysis of high-throughput imaging that is soon expected to produce petabytes of data, and provides tools for more rapid estimation of brain-graphs. Finally, to facilitate community efforts, we developed tools for large-scale object detection, and demonstrated this framework to find ~50,000 synapses in 60,000 μm³ (220 GB on disk) of electron microscopy data.

A Crowdsourced Alternative to Eye-tracking for Visualization Understanding
N. W. Kim, Z. Bylinskii, M. A. Borkin, A. Oliva, K. Z. Gajos, and H. Pfister, “A Crowdsourced Alternative to Eye-tracking for Visualization Understanding,” in CHI’15 Extended Abstracts, Seoul, Korea, 2015, pp. 1349-1354.

In this study we investigate the utility of using mouse clicks as an alternative to eye fixations in the context of understanding data visualizations. We developed a crowdsourced study online in which participants were presented with a series of images containing graphs and diagrams and asked to describe them. Each image was blurred so that the participant needed to click to reveal bubbles: small, circular areas of the image at normal resolution. This is similar to having a confined area of focus like the fovea of the human eye. We compared the bubble click data with the fixation data from a complementary eye-tracking experiment by calculating the similarity between the resulting heatmaps. A high similarity score suggests that our methodology may be a viable crowdsourced alternative to eye-tracking experiments, especially when little to no eye-tracking data is available. This methodology can also be used to complement eye-tracking studies with an additional behavioral measurement, since it is specifically designed to measure which information people consciously choose to examine for understanding visualizations.
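As a rough illustration of the heatmap comparison mentioned in this abstract, the sketch below turns two sets of points into smoothed heatmaps and scores their agreement. The abstract does not say which similarity measure was used, so the Pearson correlation here is an assumption, as are the Gaussian spread, the image size, the function names, and every coordinate, all of which are made up for the example.

```python
import numpy as np

def heatmap(points, shape, sigma=10.0):
    """Sum of Gaussians centered on (x, y) points, normalized to sum to 1."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    hm = np.zeros(shape, dtype=float)
    for x, y in points:
        hm += np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
    total = hm.sum()
    return hm / total if total > 0 else hm

def similarity(a, b):
    """Pearson correlation between two heatmaps (an assumed metric)."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

shape = (100, 100)
clicks = [(30, 40), (32, 42), (80, 20)]     # hypothetical bubble clicks
fixations = [(31, 41), (79, 22), (33, 39)]  # hypothetical eye fixations
unrelated = [(5, 90), (95, 95), (10, 85)]   # hypothetical control points

score_close = similarity(heatmap(clicks, shape), heatmap(fixations, shape))
score_far = similarity(heatmap(clicks, shape), heatmap(unrelated, shape))
print(f"clicks vs. fixations: {score_close:.3f}")
print(f"clicks vs. unrelated: {score_far:.3f}")
```

Points that land in roughly the same regions produce a correlation close to 1, while points in different regions score much lower, which is the sense in which a high score suggests clicks can stand in for fixations.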

State-of-the-Art in GPU-Based Large-Scale Volume Visualization
J. Beyer, M. Hadwiger, and H. Pfister, “State-of-the-Art in GPU-Based Large-Scale Volume Visualization,” Computer Graphics Forum, 2015.

This survey gives an overview of the current state of the art in GPU techniques for interactive large-scale volume visualization. Modern techniques in this field have brought about a sea change in how interactive visualization and analysis of giga-, tera-, and petabytes of volume data can be enabled on GPUs. In addition to combining the parallel processing power of GPUs with out-of-core methods and data streaming, a major enabler for interactivity is making both the computational and the visualization effort proportional to the amount and resolution of data that is actually visible on screen, i.e., “output-sensitive” algorithms and system designs. This leads to recent output-sensitive approaches that are “ray-guided,” “visualization-driven,” or “display-aware.” In this survey, we focus on these characteristics and propose a new categorization of GPU-based large-scale volume visualization techniques based on the notions of actual output-resolution visibility and the current working set of volume bricks, that is, the current subset of data that is minimally required to produce an output image of the desired display resolution. Furthermore, we discuss the differences and similarities of different rendering and data traversal strategies in volume rendering by putting them into a common context: the notion of address translation. For our purposes here, we view parallel (distributed) visualization using clusters as an orthogonal set of techniques that we do not discuss in detail but that can be used in conjunction with what we discuss in this survey.
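The ideas of a working set of bricks and of address translation can be sketched on the CPU in a few lines. This is a toy model under stated assumptions, not a technique taken from the survey: the brick size, the `BrickCache` class, the dictionary used as a page table, and the `load_brick` function are all hypothetical, and a real system would run on the GPU with a fixed-size cache and eviction.

```python
import numpy as np

BRICK = 4  # hypothetical brick edge length in voxels

class BrickCache:
    """Loads bricks on demand and translates volume addresses to cache addresses."""

    def __init__(self, loader):
        self.loader = loader   # callable: brick index -> brick data
        self.page_table = {}   # brick index (bz, by, bx) -> slot number
        self.slots = []        # resident bricks, i.e. the current working set

    def voxel(self, z, y, x):
        brick = (z // BRICK, y // BRICK, x // BRICK)
        slot = self.page_table.get(brick)
        if slot is None:       # cache miss: fetch only the brick that is needed
            self.slots.append(self.loader(brick))
            slot = len(self.slots) - 1
            self.page_table[brick] = slot
        # Address translation: (slot, offset inside brick) replaces (z, y, x).
        return self.slots[slot][z % BRICK, y % BRICK, x % BRICK]

# A small in-memory array stands in for an out-of-core volume.
volume = np.arange(16 ** 3, dtype=np.int64).reshape(16, 16, 16)

def load_brick(brick):
    bz, by, bx = (i * BRICK for i in brick)
    return volume[bz:bz + BRICK, by:by + BRICK, bx:bx + BRICK].copy()

cache = BrickCache(load_brick)
samples = [(0, 0, 0), (1, 2, 3), (5, 5, 5), (5, 6, 7)]  # e.g. samples along a ray
values = [int(cache.voxel(*s)) for s in samples]
resident = len(cache.page_table)
print(values, f"{resident} of {(16 // BRICK) ** 3} bricks resident")
```

Only the bricks actually touched by the samples are loaded, so the work scales with what is sampled rather than with the size of the full volume, which is the output-sensitive behavior the abstract refers to.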

Large-Scale Automatic Reconstruction of Neuronal Processes from Electron Microscopy Images
V. Kaynig, et al., “Large-Scale Automatic Reconstruction of Neuronal Processes from Electron Microscopy Images,” Medical Image Analysis, vol. 22, no. 1, pp. 77-88, 2015.

Automated sample preparation and electron microscopy enable acquisition of very large image data sets. These technical advances are of special importance to the field of neuroanatomy, as 3D reconstructions of neuronal processes at the nm scale can provide new insight into the fine-grained structure of the brain. Segmentation of large-scale electron microscopy data is the main bottleneck in the analysis of these data sets. In this paper we present a pipeline that provides state-of-the-art reconstruction performance while scaling to data sets in the GB-TB range. First, we train a random forest classifier on interactive sparse user annotations. The classifier output is combined with an anisotropic smoothing prior in a Conditional Random Field framework to generate multiple segmentation hypotheses per image. These segmentations are then combined into geometrically consistent 3D objects by segmentation fusion. We provide qualitative and quantitative evaluation of the automatic segmentation and demonstrate large-scale 3D reconstructions of neuronal processes from a 27,000 μm³ volume of brain tissue over a cube of 30 μm in each dimension, corresponding to 1,000 consecutive image sections. We also introduce Mojo, a proofreading tool including semi-automated correction of merge errors based on sparse user scribbles.