This paper describes a systolic-array convolution structure for perspective projection in real-time volume graphics based on the shear-warp method. In the original method, the farther a ray proceeds, the more voxels are required to calculate the convolution; this growth in required voxels makes the method difficult to implement in a VLSI-oriented architecture. We implement the 3D convolution as three serial 1D convolutions along the X, Y, and Z axes, which reduces the number of calculation units from M^3 to 3M when the convolution covers an M^3 region. The number of ray pipelines is V^2 for a V^3 voxel dataset. If a single hardware pipeline can process V rays, then each implemented pipeline is assigned to V theoretical pipelines (covering V^2 rays). In an actual implementation, the number of hardware pipelines must be much smaller than the number of theoretical pipelines, so we fold the theoretical pipelines onto a fixed number of hardware pipelines and examine the relation between this folding process and the time delay it requires. The architecture can generate an image of a 256^3 voxel dataset (V=256) at 30 Hz with four pipelines, and it extends easily to 512^3 (V=512) and 1024^3 (V=1024) datasets with 32 and 256 pipelines, respectively. Our architecture thus offers processing scalability.
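The separable decomposition above can be illustrated with a short NumPy sketch (an illustration with an invented kernel and random data, not the paper's hardware datapath): three serial 1D passes along X, Y, and Z, using 3M weights in total, reproduce the full 3D convolution with the corresponding separable M^3 kernel. Circular (wrap-around) boundaries are used here purely to keep the comparison exact.

```python
import numpy as np
from itertools import product

def circ_conv1d(vol, kernel, axis):
    """Circular 1D convolution along one axis of a 3D volume."""
    r = len(kernel) // 2
    out = np.zeros_like(vol)
    for i, w in enumerate(kernel):
        out += w * np.roll(vol, r - i, axis=axis)
    return out

rng = np.random.default_rng(0)
vol = rng.random((8, 8, 8))
k = np.array([0.25, 0.5, 0.25])   # M = 3 taps per axis -> 3M = 9 weights

# Three serial 1D passes along X, Y, Z (3M calculation units).
out_sep = circ_conv1d(circ_conv1d(circ_conv1d(vol, k, 0), k, 1), k, 2)

# Reference: direct 3D convolution with the separable M^3 = 27-tap kernel.
k3d = np.einsum("i,j,k->ijk", k, k, k)
r = len(k) // 2
out_full = np.zeros_like(vol)
for i, j, l in product(range(len(k)), repeat=3):
    out_full += k3d[i, j, l] * np.roll(vol, (r - i, r - j, r - l), axis=(0, 1, 2))

assert np.allclose(out_sep, out_full)
```

The same argument gives the M^3 → 3M reduction in hardware: each 1D pass needs only M multiply-accumulate units, and the three passes can be pipelined back to back.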
Surface elements (surfels) are a powerful paradigm to efficiently render complex geometric objects at interactive frame rates. Unlike classical surface discretizations, i.e., triangle or quadrilateral meshes, surfels are point primitives without explicit connectivity. Surfel attributes comprise depth, texture color, normal, and others. As a pre-process, an octree-based surfel representation of a geometric object is computed. During sampling, surfel positions and normals are optionally perturbed, and different levels of texture colors are prefiltered and stored per surfel. During rendering, a hierarchical forward warping algorithm projects surfels to a z-buffer. A novel method called visibility splatting determines visible surfels and holes in the z-buffer. Visible surfels are shaded using texture filtering, Phong illumination, and environment mapping using per-surfel normals. Several methods of image reconstruction, including supersampling, offer flexible speed-quality tradeoffs. Due to the simplicity of the operations, the surfel rendering pipeline is amenable to hardware implementation. Surfel objects offer complex shape, low rendering cost, and high image quality, which makes them specifically suited for low-cost, real-time graphics, such as games.
Over the last decade, volume rendering has become an invaluable visualization technique for a wide variety of applications. This paper reviews three special-purpose architectures for interactive volume rendering: texture mapping, VIRIM, and VolumePro. Commercial implementations of these architectures are available or underway. The discussion of each architecture focuses on the algorithm, system architecture, memory system, and volume rendering performance.
Real-time visualization of large volume datasets demands high-performance computation, pushing storage, processing, and data-communication requirements to the limits of current technology. General-purpose parallel processors have been used to visualize moderate-size datasets at interactive frame rates; however, the cost and size of these supercomputers inhibit their widespread use for real-time visualization. This paper surveys several special-purpose architectures that seek to render volumes at interactive rates. These specialized visualization accelerators have cost, performance, and size advantages over parallel processors. All architectures implement ray casting using parallel and pipelined hardware. We introduce a new metric that normalizes performance to compare these architectures. The architectures included in this survey are VOGUE, VIRIM, Array-Based Ray Casting, EM-Cube, and VIZARD II. We also discuss future applications of special-purpose accelerators.
Imagine a doctor having the ability to visualize and diagnose a defect in an unborn baby’s heart. Imagine a team of geophysicists being able to interact in real-time with seismic data in the discovery of a deep ocean reservoir of oil. Imagine a cell biologist visualizing in-vivo the precise molecular structure that allows a new AIDS drug to attack mutant strains of the virus. Imagine that every piece of luggage moving through airports is instantly inspected with high accuracy to ensure the safety of all traveling passengers. This is the promise of real-time volume rendering.
VolumePro, Mitsubishi Electric's new family of PCI boards, provides the power to solve these and other difficult problems for the first time on PC-class computers. VolumePro achieves significantly higher levels of performance and image quality than previously existed. It visualizes not only external but also internal properties of acquired or simulated 3D data through real-time volume rendering.
This paper describes VolumePro, the world's first single-chip real-time volume rendering system for consumer PCs. VolumePro implements ray-casting with parallel slice-by-slice processing. Our discussion of the architecture focuses mainly on the rendering pipeline and the memory organization. VolumePro has hardware for gradient estimation, classification, and per-sample Phong illumination. The system does not perform any pre-processing and makes parameter adjustments and changes to the volume data immediately visible. We describe several advanced features of VolumePro, such as gradient magnitude modulation of opacity and illumination, supersampling, cropping, and cut planes. The system renders 500 million interpolated, Phong-illuminated, composited samples per second. This is sufficient to render volumes with up to 16 million voxels (e.g., 256^3) at 30 frames per second.
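The quoted figure can be checked with back-of-the-envelope arithmetic (an illustration, assuming one composited sample per voxel per frame, i.e., no supersampling):

```python
voxels = 256 ** 3                             # 16,777,216 voxels in a 256^3 dataset
frames_per_second = 30
samples_needed = voxels * frames_per_second   # one sample per voxel per frame
print(f"{samples_needed:,} samples/s")        # 503,316,480 -> roughly 500 million
```

So a sustained rate of about 500 million samples per second is indeed what a 256^3 volume at 30 frames per second requires.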
This paper describes an object-order real-time volume rendering architecture that uses an adaptive resampling scheme to perform resampling operations in a unified parallel-pipeline manner for both parallel and perspective projections. Unlike parallel projections, perspective projections require a variable resampling structure due to diverging perspective rays. To address this issue, we propose an adaptive pipelined convolution block for resampling that uses the level of resolution to keep the parallel-pipeline structure regular. We also propose using multi-resolution datasets prepared for different levels of grid resolution to bound the convolution operations. The proposed convolution block is organized as a systolic array, which works well with a distributed skewed memory for conflict-free access to voxels. We present results from software simulators of the proposed architecture and discuss important technical issues.
We present Cube-4, a special-purpose volume rendering architecture that is capable of rendering high-resolution (e.g., 1024^3) datasets at 30 frames per second. The underlying algorithm, called slice-parallel ray-casting, uses tri-linear interpolation of samples between data slices for parallel and perspective projections. The architecture uses a distributed interleaved memory, several parallel processing pipelines, and an innovative parallel data flow scheme that requires no global communication, except at the pixel level. This leads to local, fixed-bandwidth interconnections and has the benefits of high memory bandwidth, real-time data input, modularity, and scalability. We have simulated the architecture and have implemented a working prototype of the complete hardware on a configurable custom hardware machine. Our results indicate true real-time performance for high-resolution datasets and linear scalability of performance with the number of processing pipelines.
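The conflict-free property of a distributed skewed memory can be illustrated with the standard linear skewing used by the Cube family, where voxel (x, y, z) is assigned to module (x + y + z) mod M. The sketch below (illustrative; not the actual hardware addressing logic) checks that any axis-aligned run of M voxels lands in M distinct modules, so a whole beam can be fetched in a single memory cycle:

```python
def module(x, y, z, m):
    # Linear skewing: voxel (x, y, z) lives in memory module (x + y + z) mod m.
    return (x + y + z) % m

M = 4  # number of parallel memory modules (one per processing pipeline)

# Any axis-aligned run of M consecutive voxels maps to M distinct modules,
# because the coordinate sum increases by exactly 1 per step along any axis.
for axis in range(3):
    base = [5, 9, 2]                      # arbitrary starting voxel
    mods = set()
    for i in range(M):
        p = list(base)
        p[axis] += i
        mods.add(module(p[0], p[1], p[2], M))
    assert len(mods) == M
```

This is what lets M pipelines read one voxel each per cycle without any two pipelines contending for the same memory module.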
We present two implementations of the Cube-4 volume rendering architecture on the Teramac custom computing machine. Cube-4 uses a slice-parallel ray-casting algorithm that allows for a parallel and pipelined implementation of ray-casting with tri-linear interpolation and surface-normal estimation from interpolated samples. Shading, classification, and compositing are part of the rendering pipeline. With the partitioning schemes introduced in this paper, Cube-4 is capable of rendering large datasets with a limited number of pipelines. The Teramac hardware simulator at the Hewlett-Packard research laboratories, Palo Alto, CA, on which Cube-4 was implemented, belongs to the new class of custom computing machines. Teramac combines the speed of special-purpose hardware with the flexibility of general-purpose computers. With Teramac as a development tool, we were able to implement working Cube-4 prototypes in just five weeks, capable of rendering, for example, datasets of 128^3 voxels in 0.65 seconds at 0.96 MHz processing frequency. The performance results from these implementations indicate real-time performance for high-resolution datasets.
This paper presents a novel approach to assist the user in exploring appropriate transfer functions for the visualization of volumetric datasets. The search for a transfer function is treated as a parameter optimization problem and addressed with stochastic search techniques. Starting from an initial population of (random or pre-defined) transfer functions, the evolution of the stochastic algorithms is controlled by either direct user selection of intermediate images or automatic fitness evaluation using user-specified objective functions. This approach essentially shields the user from the complex and tedious "trial and error" approach, and demonstrates effective and convenient generation of transfer functions.
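The stochastic search described above can be sketched as a minimal evolutionary loop (an illustration with an invented fitness function, not the paper's actual system): each individual is an opacity transfer function sampled at a few control points, and selection plus mutation drives the population toward an automatic, user-specified objective. In the interactive mode, the fitness call would be replaced by the user picking rendered images.

```python
import random

random.seed(1)
N_POINTS = 8                 # control points of an opacity transfer function
POP, GENS = 20, 50           # population size and number of generations

def fitness(tf):
    # Hypothetical automatic objective: prefer a smooth ramp from 0 to 1.
    target = [i / (N_POINTS - 1) for i in range(N_POINTS)]
    return -sum((a - b) ** 2 for a, b in zip(tf, target))

def mutate(tf):
    # Perturb one control point, clamped to the valid opacity range [0, 1].
    i = random.randrange(N_POINTS)
    child = list(tf)
    child[i] = min(1.0, max(0.0, child[i] + random.gauss(0, 0.1)))
    return child

# Initial population of random transfer functions.
population = [[random.random() for _ in range(N_POINTS)] for _ in range(POP)]
for _ in range(GENS):
    population.sort(key=fitness, reverse=True)
    survivors = population[: POP // 2]   # selection (or: user picks images)
    population = survivors + [mutate(random.choice(survivors)) for _ in survivors]

best = max(population, key=fitness)
```

Because the survivors are carried over unchanged (elitism), the best fitness in the population never decreases from one generation to the next.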
In this paper we present our research efforts towards a scalable volume rendering architecture for the real-time visualization of dynamically changing high-resolution datasets. Using a linearly skewed memory interleaving, we were able to develop a parallel dataflow model that leads to local, fixed-bandwidth interconnections between processing elements. This parallel dataflow model differs from previous work in that it requires no global communication of data except at the pixel level. Using this dataflow model we are developing Cube-4, an architecture that is scalable to very high performance and allows for modular and extensible hardware implementations.
This paper describes a high-performance special-purpose system, Cube-3, for displaying and manipulating high-resolution volumetric datasets in real-time. A primary goal of Cube-3 is to render 512^3, 16-bit per voxel, datasets at about 30 frames per second. Cube-3 implements a ray-casting algorithm in a highly-parallel and pipelined architecture, using a 3D skewed volume memory, a modular fast bus, 2D skewed buffers, 3D interpolation and shading units, and a ray projection cone. Cube-3 will allow users to interactively visualize and investigate in real-time static (3D) and dynamic (4D) high-resolution volumetric datasets.
In this paper we present a technique for the interactive control and display of static and dynamic 3D datasets. We describe novel ways of tri-linear interpolation and gradient estimation for a real-time volume rendering system, using coherency between rays. We show simulation results that compare the proposed methods to traditional algorithms and present them in the context of Cube-4, a special-purpose architecture capable of rendering 512^3 16-bit per voxel datasets at over 20 frames per second.
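Tri-linear interpolation, the basic resampling step these architectures implement in hardware, reconstructs a sample between its eight surrounding voxels as three nested linear interpolations. A plain-Python sketch (an illustration, not the hardware datapath or the paper's coherency-based variant):

```python
def lerp(a, b, t):
    # Linear interpolation between a and b with weight t in [0, 1].
    return a + t * (b - a)

def trilinear(vol, x, y, z):
    # vol: 3D nested list of voxel values; (x, y, z): continuous sample position.
    x0, y0, z0 = int(x), int(y), int(z)
    fx, fy, fz = x - x0, y - y0, z - z0
    # First interpolate the 8 corner voxels pairwise along Z ...
    c = [[vol[x0 + i][y0 + j][z0] * (1 - fz) + vol[x0 + i][y0 + j][z0 + 1] * fz
          for j in range(2)] for i in range(2)]
    # ... then along Y, then along X: 7 lerps in total.
    return lerp(lerp(c[0][0], c[0][1], fy), lerp(c[1][0], c[1][1], fy), fx)

# Tri-linear interpolation reproduces any (tri)linear field exactly.
vol = [[[x + 2 * y + 3 * z for z in range(4)] for y in range(4)] for x in range(4)]
s = trilinear(vol, 1.5, 0.25, 2.0)   # 1.5 + 2*0.25 + 3*2.0 = 8.0
```

The ray-coherency methods in the paper reduce how many of these per-sample lerps (and the corresponding voxel fetches) must be recomputed between neighboring rays.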
VolVis is a diversified, easy-to-use, extensible, high-performance, and portable volume visualization system for scientists and engineers as well as for visualization developers and researchers. VolVis accepts as input 3D scalar volumetric data as well as 3D volume-sampled and classical geometric models. Interaction with the data is controlled by a variety of 3D input devices in an input device-independent environment. VolVis output includes navigation preview, static images, and animation sequences. A variety of volume rendering algorithms are supported, ranging from fast rough approximations, to compression-domain rendering, to accurate volumetric ray tracing and radiosity, and irregular grid rendering.
This paper describes a high-performance special-purpose system, the Cube-3 machine, for displaying and manipulating high-resolution volumetric datasets in real-time. Cube-3 will allow scientists, engineers, and biomedical researchers to interactively visualize and investigate their static high-resolution sampled, simulated, or computed volumetric datasets. Furthermore, once acquisition devices or mechanisms are capable of acquiring a complete high-resolution dynamic dataset in real-time, Cube-3, tightly coupled with them, will be capable of delivering real-time 4D (spatial-temporal) volume visualization, a task currently not possible with present technologies.