Exercise code for stencil computation
Todo
1. 7-point 3D stencil for Jacobi iteration
2. Measure GPU elapsed time to evaluate performance
3. Use shared memory to improve computation performance
4. Use constant memory
5. Use 3D texture memory to implement 3D stencil
6. Use more than one stream to compute the stencil
7. Add argument input support for blocks, grids and time tiling
8. Add file input/output support
9. Support input of any size in the computation
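
As a starting point for item 1, here is a minimal CPU reference sketch of one 7-point Jacobi sweep in plain C. It is not the GPU kernel; it is the kind of host-side baseline a GPU version could be checked against. The names idx3 and jacobi7_step, the x-fastest memory layout, the two coefficients c0 and c1, and the choice to copy boundary points unchanged are all assumptions made for illustration, not something this list specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Linear index of point (x, y, z) in a grid with nx * ny points per z-plane,
 * stored with x varying fastest (assumed layout). */
static size_t idx3(size_t x, size_t y, size_t z, size_t nx, size_t ny)
{
    return (z * ny + y) * nx + x;
}

/* One Jacobi sweep over an nx * ny * nz grid (hypothetical helper).
 * Interior points: out = c0 * center + c1 * (sum of the 6 face neighbours).
 * Boundary points are copied unchanged (assumed boundary handling).
 * Jacobi reads only old values, so in and out must be different buffers. */
void jacobi7_step(const float *in, float *out,
                  size_t nx, size_t ny, size_t nz,
                  float c0, float c1)
{
    assert(in != out);

    const size_t sy = nx;      /* stride between neighbours in y */
    const size_t sz = nx * ny; /* stride between neighbours in z */

    for (size_t z = 0; z < nz; ++z) {
        for (size_t y = 0; y < ny; ++y) {
            for (size_t x = 0; x < nx; ++x) {
                const size_t i = idx3(x, y, z, nx, ny);

                /* Points on any face of the grid have no full neighbourhood. */
                if (x == 0 || x == nx - 1 ||
                    y == 0 || y == ny - 1 ||
                    z == 0 || z == nz - 1) {
                    out[i] = in[i];
                    continue;
                }

                out[i] = c0 * in[i]
                       + c1 * (in[i - 1]  + in[i + 1]
                             + in[i - sy] + in[i + sy]
                             + in[i - sz] + in[i + sz]);
            }
        }
    }
}
```

To run several iterations, call jacobi7_step repeatedly and swap the in and out pointers after each sweep. Because the loops take nx, ny and nz as parameters, the same reference works for any grid size, which may help when validating item 9.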