GPUs are very powerful
GPUs are very powerful in computations that can be mapped to SIMD
algorithms. For such workloads they provide better performance than
today's CPUs, but on the other hand GPU programs are very limited in
functionality, complexity, and memory access.

Flexibility
Ray tracing allows us to trace individual or unstructured
groups of rays. This provides for efficient computation
of just the required information, e.g. for sampling
narrow glossy highlights, for filling holes in image-based
rendering, and for importance sampling of illumination.
Eventually this flexibility is required if we want to achieve
interactive global illumination simulations based on ray
tracing.

Occlusion Culling and Logarithmic Complexity
Ray tracing enables efficient rendering of complex scenes
through its built-in occlusion culling as well as its logarithmic
complexity in the number of scene primitives. Using
a simple search data structure, it can quickly locate the
relevant geometry in a scene and stop its front-to-back
processing as soon as visibility has been determined. This
approach of processing geometry on demand stands in strong
contrast to the "send all geometry and discard at the end"
approach taken by current triangle rasterization hardware.

Efficient Shading
With ray tracing, samples are only shaded
after visibility has been determined. Given the trend
toward more and more realistic and complex shading, this
avoids redundant computations for invisible geometry.

Simpler Shader Programming
Programming shaders that
create special lighting and appearance effects has been at
the core of realistic rendering. While writing shaders (e.g.
for the RenderMan standard) is fairly straightforward,
adapting these shaders to be used in the pipeline model of
rasterization has been very difficult. Since ray tracing is
not limited to this pipeline model, it can make direct use
of shaders.

Correctness
By default ray tracing computes physically
correct reflections, refractions, and shading. In case the
correct results are not required or are too costly to compute,
ray tracing can easily make use of the same approximations
used to generate these effects for rasterization-based
approaches, such as reflection or environment maps. This
is contrary to rasterization, where approximations are the
only option and it is difficult to even come close to realistic
effects.

Parallel Scalability
Ray tracing is known for being trivially
parallel as long as a high enough bandwidth to the
scene data is provided. Given the exponential growth of
available hardware resources, ray tracing should be better
able to utilize them than rasterization, which has been difficult
to scale efficiently. However, the initial resources
required for a hardware ray tracing engine are higher than
those for a rasterization engine.

Coherence
Coherence is the key to efficient rendering. Due to the low
coherence between rays in traditional recursive ray tracing
implementations, performance has been rather low.
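The front-to-back traversal with early termination described under Occlusion Culling and Logarithmic Complexity above can be sketched as follows. This is a minimal illustration in Python, not an implementation taken from the text: the bounding volume hierarchy over spheres, the median split, and the names hit_sphere, hit_box, build, and trace are all assumptions chosen for brevity.

```python
import math

def hit_sphere(o, d, center, radius):
    """Distance to the first hit of ray o + t*d (d normalized), or None.
    Rays starting inside a sphere are ignored in this sketch."""
    oc = [o[i] - center[i] for i in range(3)]
    b = sum(oc[i] * d[i] for i in range(3))
    disc = b * b - (sum(x * x for x in oc) - radius * radius)
    if disc < 0.0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 0.0 else None

def hit_box(o, d, lo, hi):
    """Slab test: entry distance of the ray into the box, or None on a miss."""
    t0, t1 = 0.0, math.inf
    for i in range(3):
        if d[i] == 0.0:
            if not lo[i] <= o[i] <= hi[i]:
                return None
            continue
        a, b = (lo[i] - o[i]) / d[i], (hi[i] - o[i]) / d[i]
        t0, t1 = max(t0, min(a, b)), min(t1, max(a, b))
        if t0 > t1:
            return None
    return t0

def build(spheres):
    """Median-split hierarchy over a non-empty list of (center, radius).
    A node is (lo, hi, left, right, leaf_sphere)."""
    lo = [min(c[i] - r for c, r in spheres) for i in range(3)]
    hi = [max(c[i] + r for c, r in spheres) for i in range(3)]
    if len(spheres) == 1:
        return (lo, hi, None, None, spheres[0])
    axis = max(range(3), key=lambda i: hi[i] - lo[i])
    ordered = sorted(spheres, key=lambda s: s[0][axis])
    mid = len(ordered) // 2
    return (lo, hi, build(ordered[:mid]), build(ordered[mid:]), None)

def trace(node, o, d, best=math.inf, stats=None):
    """Closest hit distance (math.inf on a miss).
    The nearer child is visited first; any subtree whose box starts
    behind the closest hit found so far is skipped entirely."""
    lo, hi, left, right, leaf = node
    entry = hit_box(o, d, lo, hi)
    if entry is None or entry >= best:
        return best                      # occluded or missed: culled
    if leaf is not None:
        if stats is not None:
            stats["tests"] += 1          # count primitive intersection tests
        t = hit_sphere(o, d, *leaf)
        return t if t is not None and t < best else best

    def entry_of(n):
        e = hit_box(o, d, n[0], n[1])
        return math.inf if e is None else e

    for child in sorted((left, right), key=entry_of):   # front to back
        best = trace(child, o, d, best, stats)
    return best

# Assumed test scene: 1024 spheres lined up along the x axis.
spheres = [((2.0 * k, 0.0, 0.0), 0.5) for k in range(1, 1025)]
stats = {"tests": 0}
t = trace(build(spheres), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), stats=stats)
print(t, stats["tests"])
```

In this assumed scene every sphere lies on the ray, yet once the nearest one has been hit all farther subtrees are pruned, so only a few of the 1024 spheres are ever tested.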

Random memory access
A simple recursive approach tends toward random memory access. Primary
rays exhibit a fairly high degree of coherence, but secondary rays are
nearly random in general scenes. Random memory access is very limited
on today's GPUs, so more complex algorithms and data structures are
needed to mitigate this problem.

Coherent algorithms are complex
Current GPU architectures limit the number of instructions per program
kernel, so a GPU program must be lightweight to run in a single pass.
If more complex computations are required, the program must be split
into several kernels that are executed consecutively as a multi-pass
algorithm. This increases bandwidth requirements and overhead, because
the initial and final stages must be executed in every kernel: program
state has to be saved and restored between consecutive kernels. The
data are saved to video memory (VRAM), which consumes bandwidth on the
bus between the GPU and VRAM.

Algorithms are limited in general functionality
Because of the SIMD architecture, some operations common on CPUs are
not available on GPUs. For example, basic algorithms such as sorting
cannot be implemented efficiently because of the restrictions on
random memory access. They remain possible, but only as massive
multi-pass solutions whose performance overhead makes them unusable.
SIMD replacements for these basic algorithms need to be found and
applied effectively on the GPU.

Advanced acceleration structures are complex
Great effort has been invested in developing sophisticated data
structures for accelerating ray-scene traversal, and many advanced
data structures and algorithms have been presented. On today's GPUs
most of them are unusable because of GPU restrictions. The challenge
is to find a data structure that is simple enough for the GPU yet
effective enough to accelerate ray traversal. This will always be a
trade-off between program or data structure complexity and brute-force
raytracing.

GPUs are not designed for raytracing
Current GPUs are specialised for rasterization and are not designed
for complex raytracing. The task is to reformulate raytracing so that
it fits existing GPU programmability and features. This will probably
require assumptions about scene types or limits on general raytracing
functionality.

Dynamic scene updates are problematic
To trace rays on the GPU, all required scene data must be present in
fast VRAM. Sending the scene data to the GPU every frame is not
efficient, so displaying dynamic scenes requires some form of
incremental update of the data structure on the GPU. This increases
the complexity of the algorithms.

Technical problems
Modern GPUs are evolving very fast (in terms of transistor counts and
performance, faster than Moore's law). Hardware may support new
features early while driver or software API support is delayed or
incorrect. This is another caveat for GPU programmers.
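
One well-known candidate for the SIMD replacements called for under "Algorithms are limited in general functionality" is a sorting network such as bitonic sort, in which the compare-exchange pattern is fixed in advance and does not depend on the data. The sketch below emulates it in Python on the CPU. It is an illustration chosen here, not a method taken from the text, and the function names bitonic_pass and bitonic_sort are hypothetical. Each call to bitonic_pass stands for one rendering pass in which every output element is computed independently from exactly two input elements.

```python
def bitonic_pass(data, k, j):
    """One emulated kernel pass over the whole array.
    Every index i reads itself and a fixed partner, then writes one value."""
    out = list(data)
    for i in range(len(data)):           # on a GPU: one thread/fragment per i
        partner = i ^ j                  # partner index depends only on i and j
        ascending = (i & k) == 0         # sort direction of the block holding i
        a, b = data[i], data[partner]
        # The lower index of a pair keeps the minimum in ascending blocks
        # and the maximum in descending blocks; the upper index gets the rest.
        keep_min = (i < partner) == ascending
        out[i] = min(a, b) if keep_min else max(a, b)
    return out

def bitonic_sort(data):
    """Sort a list whose length is a power of two.
    Returns the sorted list and the number of passes that were needed."""
    n = len(data)
    if n & (n - 1) != 0:
        raise ValueError("length must be a power of two")
    passes = 0
    k = 2
    while k <= n:                        # size of the bitonic blocks being merged
        j = k // 2
        while j >= 1:                    # compare distance within the merge
            data = bitonic_pass(data, k, j)
            passes += 1
            j //= 2
        k *= 2
    return data, passes

result, passes = bitonic_sort([7, 3, 6, 0, 5, 1, 4, 2])
print(result, passes)
```

The pass count grows as log2(n) * (log2(n) + 1) / 2, so the approach avoids data-dependent memory access at the price of the multi-pass overhead discussed above.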