This past week was spent in learning and setting up CUDA, stabilizing and rendering using my existing photon tracer and writing a simple CUDA ray tracer.
For the existing photon tracer (ktracer) I wrote last semester for CIS-660 (Advanced Topics in Computer Graphics), it is able to photon trace any given triangular mesh scene in 3DS format. It is able to merge multiple 3ds files into a single scene. It uses hierarchical uniform grid as acceleration structure, supports multi-threaded rendering and uses SIMD intrinsic on low-level computation. For photon mapping part, irradiance caching is done on the photons so that final gathering could be greatly accelerated. Here's some result of the current ktracer:
To showcase the ability to take advantage of the uniform grid, two car models and a house model is added to the scene. Now with 65K triangles and the same rendering settings, the above image is rendered in 27.43s.
The existing ktracer provides a baseline performance. It is still far from being interactive. In order to be interactive, the bottom scene needs to be rendered in ~1fps. In other words, without final gathering and using only one sample per pixel, we should get at least 50fps. This will be the performance target for my senior project.
For the rest of the week, I read a number of CUDA/raytracing tutorials. Here's a list of them that I found particularly helpful:
CUDA, Supercomputing for the Masses (http://www.drdobbs.com/high-performance-computing/207200659;jsessionid=XNULCSWKTFGIZQE1GHPCKH4ATMY32JVN)
Triers CUDA Raytracing Tutorial (http://cg.alexandra.dk/2009/08/10/triers-cuda-ray-tracing-tutorial/) They provided some great skeletal code for starting to write a ray tracer. Part of my project is based on the CUDA/OpenGL code of the tutorial.
NVIDIA CUDA C Programming Guide. This is the official guide with up-to-date information.
I set up my newly built test bench with i7-950 processor, 4GB memory and GeForce GTX 470 graphics card. Then I downloaded and installed CUDA Toolkit and GPU Computing SDK 3.2 RC. I also found that the latest 1.5 RC version of Parallel NSight, which supports Visual Studio 2010, is open to Beta program participants. So I applied to their program and soon got admitted. In my opinion, the folks in NVIDIA have done a really good job integrating with VS2010. Also, with a host and a target machine, we can enable remote debugging of CUDA programs with easy operations such as setting breakpoints, a very neat feature.
After quite some time of coding, I finished the initial ray tracer that is able to load OBJ files with MTL support. It currently doesn't have any acceleration structures, but for the Cornell Box scene, it renders it very smoothly at over 60 fps. I started off porting the 3DS loader from my previous renderer to the new one, but ended up getting all kinds of compiler errors. Eventually, I gave up that idea and instead wrote an OBJ loader and used 3DS MAX to export the file to OBJ format. A screenshot and a video below showcases the current renderer:
Here is a video showing some real-time camera movement: