Ocean simulation sample in C++ AMP
Hi, my name is Kevin Gao and I am an SDET on the C++ AMP team. In this blog post, I am sharing an ocean simulation sample.
The simulation theory comes from the paper, “Simulating Ocean Water”, written by Jerry Tessendorf. The original sample was written in HLSL by NVIDIA, and I ported the code to use C++ AMP.
The UI layer of the app depends on the DXUT framework, which is released with the DirectX 11 SDK. The app renders a 512 x 512 grid of points on the ocean surface, and when you run it there is a “Toggle wireframe” option that shows how the surface is rendered.
A global ocean_simulator object, pointed to by g_pocean_simulator, initializes the program and computes every point each frame. The ocean_simulator::update_displacement_map method is called once per frame. It first calls update_spectrum, a function in ocean_simulator.cpp that calculates the wave spectrum for the current frame, and then calls csfft512x512_plan::fft_512x512_c2c_amp to calculate the height of each point. The fft_512x512_c2c_amp method is a parallel fast Fourier transform (FFT). Its src parameter is an array_view that holds the x and y of the spatial spectrum at each point, and its dst parameter is an array_view that receives the output. The method calls csfft512x512_plan::radix008A_amp six times, and each of those calls invokes parallel_for_each. After the height of each point has been calculated, the displacement function runs. This is UpdateDisplacementPS, which lives in the shader file ocean_simulator_vs_ps.hlsl. Rendering starts after that, and I will not cover it here.
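To make the spectrum-to-height step more concrete, here is a minimal CPU-side sketch in standard C++. It is not the sample's code. The names inverse_dft_1d and heights_from_spectrum are hypothetical, and a naive O(n^2) inverse DFT stands in for the radix-8 GPU passes. The sign convention, the missing 1/n scaling, and the choice to read the height from the real part are all assumptions made for illustration. What it does show is the row-then-column structure of a 2D transform, where every row (and then every column) can be processed independently, which is the kind of work that parallel_for_each spreads across GPU threads. Note also that 512 x 512 = 8^6, which would fit six radix-8 passes.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<float>;

// Naive, unnormalized inverse DFT over n elements of `data`, starting at
// `offset` and spaced `stride` apart. This is an O(n^2) stand-in for the
// radix-8 FFT passes (assumption: e^{+i...} sign, no 1/n scaling).
void inverse_dft_1d(std::vector<cplx>& data, std::size_t offset,
                    std::size_t stride, std::size_t n)
{
    const float two_pi = 6.28318530717958647692f;
    std::vector<cplx> out(n);
    for (std::size_t x = 0; x < n; ++x) {
        cplx sum(0.0f, 0.0f);
        for (std::size_t k = 0; k < n; ++k) {
            // Reduce k*x modulo n to keep the angle small and accurate.
            const float angle = two_pi * static_cast<float>((k * x) % n)
                                / static_cast<float>(n);
            sum += data[offset + k * stride]
                   * cplx(std::cos(angle), std::sin(angle));
        }
        out[x] = sum;
    }
    for (std::size_t x = 0; x < n; ++x) {
        data[offset + x * stride] = out[x];
    }
}

// Hypothetical helper: turns an n x n row-major spectrum into a height field
// by transforming all rows, then all columns. Each row (or column) is
// independent of the others, so each loop body could run in parallel.
std::vector<float> heights_from_spectrum(std::vector<cplx> spectrum,
                                         std::size_t n)
{
    assert(spectrum.size() == n * n);
    for (std::size_t row = 0; row < n; ++row) {
        inverse_dft_1d(spectrum, row * n, 1, n);   // contiguous row
    }
    for (std::size_t col = 0; col < n; ++col) {
        inverse_dft_1d(spectrum, col, n, n);       // strided column
    }
    std::vector<float> height(n * n);
    for (std::size_t i = 0; i < n * n; ++i) {
        height[i] = spectrum[i].real();            // assumption: height = real part
    }
    return height;
}
```

For example, calling heights_from_spectrum with an 8 x 8 spectrum that is zero except for a single coefficient of 1 at index 1 produces a cosine wave along x that is identical in every row.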
You can download the ZIP file that contains the sample project attached to this blog post. Refer to the README.txt for known issues. The issues will be fixed in the next release. To build this project you need Visual Studio 11.
Comments
Anonymous
March 17, 2012
How does this compare in FPS terms to the original NVIDIA sample?
Anonymous
March 20, 2012
Hi Mahesh, the DirectCompute/HLSL implementation (done by NVIDIA) is what Microsoft’s implementation of C++ AMP builds on. So naturally, while our goal is to strive for parity, we can’t always achieve that. Having said that, we measured the perf and found our implementation to be faster(!), which doesn’t make a lot of sense (to me at least), so I think we’ll need to look into why that is instead of sharing those numbers directly. If you want to compare the two implementations on your machine with your graphics card (whether it is from AMD or NVIDIA or someone else), please try our implementation from above (in release mode obviously) and the HLSL/DirectCompute implementation from the CUDA SDK (“C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.1\DirectCompute\bin\win64\Release”). Have fun!