2009-03-13 21:52:13
So, I've been trying my hand at writing a physics engine recently. Not that there aren't plenty of good ones out there already; this is strictly for personal growth. My little dream is to implement everything necessary, and then implement it all again using the fragment shader in OpenGL. CUDA could be used (I have an NVIDIA 9800GT), but CUDA isn't available everywhere GLSL is and, to the best of my knowledge, there isn't any physics engine out there that leverages OpenGL specifically.

Originally I wanted to use the vertex shader and feedback mode (glFeedbackBuffer) to accomplish this...but feedback mode is slower than...well, it's very slow. Moral of the story: use the fragment shader, because as far as I can tell there is no other practical way to get computed data back out of OpenGL 2.0.

My little debug output below is from a fragment shader that performs 1000 divide operations per fragment on a quad that fills the screen. I iterate 100 times between the start and end marks, clearing the framebuffer each time AND reading back the top-left 100x100 pixels. I wanted to get a clear indicator of what kind of throughput I could expect. The results were...promising. Next I need to update my benchmark to include loading textures. My thought is that I can load all the relevant data for each of my objects into a set of textures and then access them in the fragment shader. I can use a few different shader programs to perform the various operations I'll need.

It may be that I'll have to pause physics simulations, though, if the view area shrinks below the number of pixels I'd need to calculate all objects. Eh, it's a work in progress...I'll figure it out.
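
A rough sketch of that "is the view area big enough?" check, in C++. The function name and the `fragments_per_object` parameter are made up for illustration; how many fragments each object really needs depends on a data layout that hasn't been settled yet.

```cpp
#include <cstdint>

// Hypothetical helper: can a width x height render target give every object
// the fragments it needs? fragments_per_object is an assumed layout parameter.
bool viewport_can_hold(std::int64_t width, std::int64_t height,
                       std::int64_t object_count,
                       std::int64_t fragments_per_object)
{
    if (width <= 0 || height <= 0) {
        return false;  // a collapsed or minimized window holds nothing
    }
    // One fragment is one pixel of the render target.
    return width * height >= object_count * fragments_per_object;
}
```

If this returned false for the current window size, the simulation would be paused (or the work split over several draws) rather than silently dropping objects.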

{later} After some googling, it looks like I could just use a framebuffer object (FBO) and render off-screen. It should be widely available enough and probably isn't dropping out of the specification any time soon.

* Before you mention it...yes, I know that NVIDIA has cards that can outperform 17.2 gigaflops. I'm impressed nevertheless.


trace: new dim 944x479 (in "c:\users\schang\documents\visual studio 2008\projects\opengl physics tests\jvsphysics\GLContext.h" at line 85)
trace: new dim 1280x698 (in "c:\users\schang\documents\visual studio 2008\projects\opengl physics tests\jvsphysics\GLContext.h" at line 85)
trace: SelectTest: listbox index:2 (in ".\OpenGL Physics Tests.cpp" at line 249)
debug: Factory::getSimulator passing back the default simulator (in ".\Factory.cpp" at line 97)
debug: Factory::getCollisionDetector returning the AABBCollisionDetector (in ".\Factory.cpp" at line 64)
trace: JvsPhysicsTest::JvsPhysicsTest holding the simulator pointer (in ".\JvsPhysicsTest.cpp" at line 29)
trace: GLSLBenchTest::GLSLBenchTest beginning (in ".\GLSLBenchTest.cpp" at line 22)
trace: Ready for OpenGL 2.0 (in ".\GLSLBenchTest.cpp" at line 27)
trace: creating shader program to run on fragments (in ".\GLSLBenchTest.cpp" at line 50)
trace: (1) : warning C7011: implicit cast from "int" to "float"
(1) : warning C7011: implicit cast from "int" to "float"
(1) : warning C7502: OpenGL does not allow type suffix 'f' on constant literals
(in ".\GLSLBenchTest.cpp" at line 266)
trace: Fragment info
-------------
(1) : warning C7011: implicit cast from "int" to "float"
(1) : warning C7011: implicit cast from "int" to "float"
(1) : warning C7502: OpenGL does not allow type suffix 'f' on constant literals
(in ".\GLSLBenchTest.cpp" at line 284)
trace: creating display list of a plane (in ".\GLSLBenchTest.cpp" at line 68)
trace: SelectTest: set currentTest=GLSLBenchTest (in ".\OpenGL Physics Tests.cpp" at line 269)
trace: max texture size:3379 (in ".\GLSLBenchTest.cpp" at line 108)
trace: starting opengl at 5621 (in ".\GLSLBenchTest.cpp" at line 125)
trace: ending opengl at 10814 (in ".\GLSLBenchTest.cpp" at line 153)
trace: max texture size:3379 (in ".\GLSLBenchTest.cpp" at line 108)
trace: starting opengl at 10819 (in ".\GLSLBenchTest.cpp" at line 125)
trace: ending opengl at 16003 (in ".\GLSLBenchTest.cpp" at line 153)
trace: max texture size:3379 (in ".\GLSLBenchTest.cpp" at line 108)
trace: starting opengl at 16003 (in ".\GLSLBenchTest.cpp" at line 125)
trace: ending opengl at 21196 (in ".\GLSLBenchTest.cpp" at line 153)

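fragment shader program
=======================
void main()
{
    float f = 1000;                      // int literal: implicit int-to-float cast (warning C7011)
    vec4 a = vec4(0.0, 0.0, 0.0, 1.0);   // opaque black; only the red channel is written
    // 1000 divides per fragment; comparing int i with float f is the second C7011 warning
    for (int i = 0; i < f; i++)
        a.r = 600.0f / gl_FragCoord.x;   // the 'f' suffix is what triggers warning C7502
    gl_FragColor = a;
}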

1280x698 screen dimension = 893440 pixels
893440 * 100 clear/rewrites = 89344000 pixels
89344000 pixels x 1000 divide operations in the fragment shader = 89344000000 operations
21196 end clock - 16003 start clock = 5193 milliseconds = 5.193 seconds
89344000000 operations / 5.193 = 17204698632 ops per second

OR 17.2 gigaflops!!!!
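
The arithmetic above can be redone in a few lines of C++, which makes it easy to plug in the other two runs from the log. This is only a sketch: the function and parameter names are invented here, and it counts every loop iteration as one executed divide, which assumes the driver's shader compiler did not collapse the loop.

```cpp
#include <cstdint>

// Divides per second for a full-screen benchmark run.
// Names are illustrative; the inputs come straight from the log.
double divides_per_second(std::int64_t width, std::int64_t height,
                          std::int64_t frames,
                          std::int64_t divides_per_fragment,
                          std::int64_t start_ms, std::int64_t end_ms)
{
    // Total work: every pixel, every frame, every loop iteration.
    const std::int64_t total_ops =
        width * height * frames * divides_per_fragment;
    // The log timestamps are in milliseconds.
    const double seconds = static_cast<double>(end_ms - start_ms) / 1000.0;
    return static_cast<double>(total_ops) / seconds;
}
```

Calling `divides_per_second(1280, 698, 100, 1000, 16003, 21196)` reproduces the third run; swapping in 10819 and 16003 gives the second run, which took 5184 milliseconds.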





today i determined that each component still seems to be limited to 8 bits...so i'll have to approach each axis as an independent iteration. fortunately, most physics calculations can be performed on each axis independently. the challenge will be reducing redundant calculations in the fragment shader. the essential sequence is: sum forces, calculate velocities, calculate new positions, detect collisions, back-step to the moment of collision, then finally resolve the collision response. detecting collisions and back-stepping are the time-consuming parts. well, as far as i've discovered so far...i can imagine lots of other time sinks. i'd like to do fluid and cloth simulations.
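
to make the "each axis independently" idea concrete, here is a small CPU-side sketch in C++ of the first three steps (sum forces, velocity, position) for one axis at a time. the struct and function names are hypothetical, and semi-implicit Euler is an assumption on my part; nothing above commits to a particular integrator.

```cpp
#include <array>
#include <cstddef>

// State of one body along a single axis. Names are illustrative.
struct AxisState {
    float position;
    float velocity;
};

// Advance one axis by dt: summed force -> velocity -> position.
// Semi-implicit Euler is an assumption, not something the post specifies.
AxisState step_axis(AxisState s, float force_sum, float mass, float dt)
{
    const float acceleration = force_sum / mass;
    s.velocity += acceleration * dt;   // calculate velocity
    s.position += s.velocity * dt;     // calculate new position
    return s;
}

// The three linear axes are stepped one at a time with no data shared
// between them, which is what would let each axis become its own pass.
std::array<AxisState, 3> step_linear_axes(
    const std::array<AxisState, 3>& axes,
    const std::array<float, 3>& force_sums,
    float mass, float dt)
{
    std::array<AxisState, 3> next{};
    for (std::size_t i = 0; i < axes.size(); ++i) {
        next[i] = step_axis(axes[i], force_sums[i], mass, dt);
    }
    return next;
}
```

collision detection, back-stepping and response are left out here; those are the steps where the axes stop being quite so independent.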

so for each of those steps above, i'll have a pass for each of the 3 linear axes and each of the 3 rotational axes: 6 passes for what would otherwise be a single-pass step. for back-stepping (if i don't do ccd), that will be 6 passes for each object in collision, as many times as is necessary. i could cap that at 10 steps, though, and then just take the two deepest interpenetrating faces.
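
a sketch of what a capped back-step could look like on a single axis, in C++. bisection is an assumed method (the post doesn't say how the back-step is done), the names are hypothetical, and the example only handles a point moving in the positive direction toward a wall at a fixed position.

```cpp
// Bisection back-step along one axis (an assumed method, not from the post).
// A point starts at start_pos, moves at constant positive velocity for dt,
// and ends up past a wall at wall_pos. Returns the latest time found, within
// max_steps halvings, at which the point has not yet passed the wall.
float back_step_time(float start_pos, float velocity, float dt,
                     float wall_pos, int max_steps = 10)
{
    float safe = 0.0f;   // latest time known to be non-penetrating
    float hit = dt;      // earliest time known to be penetrating
    for (int i = 0; i < max_steps; ++i) {
        const float mid = 0.5f * (safe + hit);
        if (start_pos + velocity * mid > wall_pos) {
            hit = mid;   // still inside the wall: look earlier
        } else {
            safe = mid;  // not touching yet: look later
        }
    }
    return safe;
}
```

with the cap at 10, the remaining uncertainty is dt / 1024, so the cost is bounded even when an object never quite settles on an exact contact time.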

i guess i could also interleave the axes based on the modulus of the x coordinate in gl_FragCoord...that would probably be good enough. so, if x%3==1, then i'm on the y axis; 0 for the x axis, and 2 for the z axis.
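
the x%3 mapping, written out as a C++ sketch so it can be checked on the CPU. two assumptions here that aren't in the text above: each object owns three consecutive columns (so the object index is x/3), and the incoming coordinate sits at the pixel center the way gl_FragCoord.x does by default (0.5, 1.5, 2.5, ...), which is why it gets floored first. the type and function names are made up.

```cpp
#include <cmath>

enum class Axis { X = 0, Y = 1, Z = 2 };

// Which object and which axis a fragment column is responsible for.
struct FragmentSlot {
    int object_index;  // assumed layout: three consecutive columns per object
    Axis axis;         // 0 -> x axis, 1 -> y axis, 2 -> z axis
};

// frag_coord_x is expected to be non-negative, like gl_FragCoord.x.
FragmentSlot slot_for_fragment(float frag_coord_x)
{
    // Pixel centers are at half-integers, so floor to get the column index.
    const int column = static_cast<int>(std::floor(frag_coord_x));
    return FragmentSlot{ column / 3, static_cast<Axis>(column % 3) };
}
```

in the shader itself the same thing would be a floor and a mod on the fragment's x coordinate; this version is just for checking the layout before committing to it.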

type conversions are starting to worry me, though. the code that's been running here with only shader warnings (yes, i know i should reduce them...i just don't care to yet. i'm still in exploration) doesn't compile on my lappy.




{later} Turns out, the more recent video cards support 32-bit floating point textures via an extension. GL_RGBA32F_ARB, it seems, is what i want to use. I found a great tutorial on gpgpu.org, so I'm just going to go with that till I find a reason to depart.
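
With four 32-bit floats per texel, a whole position plus one extra value fits in a single RGBA texel instead of being spread over one pass per axis. A CPU-side sketch of packing object data for such a texture, in C++: the `BodyState` layout (position in RGB, mass in A), the square texture shape, and all the names are assumptions for illustration, not the tutorial's scheme.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-object data: one RGBA texel per body.
struct BodyState {
    float x, y, z;  // position -> R, G, B
    float mass;     // -> A
};

// Smallest square side whose area holds `count` texels (at least 1).
std::size_t square_side_for(std::size_t count)
{
    std::size_t side = 1;
    while (side * side < count) {
        ++side;
    }
    return side;
}

// Flatten bodies into a side*side RGBA float buffer, zero-padded at the end.
std::vector<float> pack_rgba32f(const std::vector<BodyState>& bodies)
{
    const std::size_t side = square_side_for(bodies.size());
    std::vector<float> texels(side * side * 4, 0.0f);
    for (std::size_t i = 0; i < bodies.size(); ++i) {
        texels[i * 4 + 0] = bodies[i].x;
        texels[i * 4 + 1] = bodies[i].y;
        texels[i * 4 + 2] = bodies[i].z;
        texels[i * 4 + 3] = bodies[i].mass;
    }
    return texels;
}
```

The resulting buffer is in the shape glTexImage2D expects when called with GL_RGBA32F_ARB as the internal format, GL_RGBA as the format and GL_FLOAT as the type, provided the card exposes the float texture extension.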
Love _is_ acceptance and appreciation.
New 16
By Jon Schang

It's another experimental song, for me anyways. A mixture of some very impending doom sounding guitar and acoustic anticipation buildment.

All vocals in these are really just filler. My primary focus when writing these was music. I know, that doesn't explain "New 15"...I just happen to like that one.