What does Flip really do?

Fry Crayola(Posted 2006) [#1]
I was playing about with some code, having a look to see where the framerate was taking its biggest hits.

I was a bit surprised to find that the "hit" came at Flip (even when set to Flip 0), rather than any of the drawing operations. In fact, everything else was virtually instant until flip came along.

Intriguingly, the more I "draw" earlier in the loop, the longer Flip takes. This is the bit I'm interested in. I'd have thought the act of drawing to the backbuffer would be the time consuming task, with the Flip taking the same length of time regardless of how many objects you've drawn.

It seems not.

I'd love to know what Flip's doing, to see if I can understand what's going on a little better.


H&K(Posted 2006) [#2]
OK, Flip can do one of two things: swap the backbuffer for the front buffer immediately (Flip 0), or wait for a screen refresh, often about 16 milliseconds (Flip 1).

So do flip -1 and see what happens

Flip swaps the front and back buffers of the current graphics object.

If sync is 0, then the flip occurs as soon as possible. If sync is 1, then the flip occurs on the next vertical blank.

If sync is -1 and the current graphics object was created with the Graphics command, then flips will occur at the graphics object's refresh rate regardless of whether or not the graphics hardware supports such a refresh rate.

If sync is -1 and the current graphics object was NOT created with the Graphics command, then the flip will occur on the next vertical blank.
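
A rough way to see the three sync modes side by side is to time a fixed number of frames under each one. This is only a sketch: the 60 Hz refresh rate passed to Graphics and the 120-frame sample size are assumed values, and the numbers will vary by driver and card.

```blitzmax
SuperStrict

' Windowed mode, 60 Hz requested (assumed values for this sketch)
Graphics 800, 600, 0, 60

Const NUM_FRAMES:Int = 120

' Try Flip -1, Flip 0 and Flip 1 in turn
For Local sync:Int = -1 To 1
	Local start:Int = MilliSecs()
	For Local frame:Int = 1 To NUM_FRAMES
		Cls
		DrawRect 10, 10, 100, 100
		Flip sync
		If KeyHit(KEY_ESCAPE) Then End	' also keeps the window responsive
	Next
	Local perFrame:Float = (MilliSecs() - start) / Float(NUM_FRAMES)
	Print "Flip " + sync + ": " + perFrame + " ms per frame"
Next
```

If Flip 0 comes out far below the other two, the time is going on waiting for the refresh rather than on drawing.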


If Flip 1, 0 and -1 are all still giving a big hit, then post some code, because maybe it's a graphics card issue.

Or, and I'm not sure about this, so we'll wait for the big boys, but if you have done lots of DrawPoly and stuff, maybe OpenGL and DirectX don't do any of the drawing until flip. That is, maybe we are just telling the card what we want, and then flip is waiting for the card to do it. I agree a better breakdown would be nice, but normally Flip 0 makes things lots and lots faster for me.


Dreamora(Posted 2006) [#3]
Fry: Flip always takes the same amount of time. But your drawing commands force uploads of textures to the graphics card, and the more you draw, the longer that takes.
And as you've noticed, this is done on Flip, which itself holds the rendering command. (Blitz3D has separate commands for that, Flip and RenderWorld.)


Fry Crayola(Posted 2006) [#4]
Here's some code that perfectly illustrates it.

SuperStrict

Graphics 800,600, 0, 0
SetBlend ALPHABLEND

While Not KeyHit(KEY_J)
		
	Cls
	
	'Draw
	Local start:Int = MilliSecs()
	For Local count:Int = 1 To 1
		DrawRect 10, 10, 600, 480
	Next
	Local endtime:Int = MilliSecs()
	Print "Draw Time: "+(endtime - start)
	
	'FLIP	
	Local flipstart:Int = MilliSecs()
	Flip -1
	Local flipend:Int = MilliSecs()
	Print "Flip Time: "+(flipend - flipstart)
	
Wend


On the system I'm using at the moment (P4 2.2GHz, 32MB shared video RAM, 736MB RAM) that tells me that Flip is taking 8 millisecs, while the drawing takes 0. Debug mode off.

If I increase the number of rectangles to draw to 10, the drawing time remains 0 while the Flip increases to 34. 100 rectangles, you get a draw time of 0, a flip time of 292.

It's a strange result, as though everything's just waiting to be drawn at the flip instead of drawn to the backbuffer when I call DrawRect().

Edit: Which Dreamora seems to be pretty much confirming. Aha!

This does make it difficult to isolate which drawing commands are particularly slow, however. It seems I can't "time" anything individually, as they'll all come up pretty quickly and the effects won't show until flip.

Still, nice to know how Flip works.


bradford6(Posted 2006) [#5]
here it is...straight from the graphics.bmx file

Function Flip( sync=-1 )
	RunHooks FlipHook,Null
	If sync<>-1
		_driver.Flip sync
		Return
	EndIf
	If _graphics<>_exGraphics Or Not _softSync
		Local sync=False
		If _gDepth sync=True
		_driver.Flip sync
		Return
	EndIf
	_syncTime:+_syncPeriod
	_syncAccum:+_syncFrac
	If _syncAccum>=_syncRate
		_syncAccum:-_syncRate
		_syncTime:+1
	EndIf
	Local dt=_syncTime-MilliSecs()
	If dt>0
		Delay dt
	Else
		_syncTime:-dt
	EndIf
	_driver.Flip False
End Function




SpaceAce(Posted 2006) [#6]
On my machine (AMD Athlon 64 X2 Dual Core 4600+, 2 gigs of RAM, GeForce 7900 GS video card), I get a draw time of 1 and a flip time of 45 with 1,000 triangles whether debug is on or off. The flip time definitely increases a lot relative to the draw time as I add triangles.

This is kind of weird: when I set the program to use 10,000 triangles, things start getting really wonky. DrawTime varies wildly from 8 to the high 300s with the FlipTime hovering in the 200-400 range. I took out the while/wend and all the Print commands and timed just the Flip portion of the code: with 10,000 triangles, it varies from as low as 79ms to as high as 450ish, mostly staying in the 400 area, but often slipping into the 150s. That's odd.

Flip -1, Flip 0 and Flip 1 don't show much difference, neither does fiddling with the refresh rate in the Graphics command.

SpaceAce


Dreamora(Posted 2006) [#7]
No, that's actually not that odd after all. Because you most likely don't batch the 2D stuff, i.e. draw it texture by texture, the drawing order has a very large impact on the draw time needed.


Fry Crayola(Posted 2006) [#8]
The instant you mentioned Blitz3D, it all clicked for me.

Placing quads and textures was never a lengthy exercise, but RenderWorld was. Sometimes I forget that Max2D uses 3D stuff.


H&K(Posted 2006) [#9]
"just telling the card what we want, and then flip is waiting for the card to do it"
I'm getting dead good at guessing these, aren't I? I should win a prize ;)


ImaginaryHuman(Posted 2006) [#10]
Sometimes the graphics card doesn't start drawing all of the things you've told it to draw until you try to Flip, which attempts to flush the graphics queue first before flipping. See glFlush() for example, as a way of getting the card to finish drawing sooner in OpenGL.
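
One way to get a per-section timing despite the queueing is to force the card to finish before reading the clock. Note that glFlush() only pushes queued commands to the card, while glFinish() blocks until they have completed, which is what a timing measurement needs. The sketch below assumes the OpenGL Max2D driver is selected; the 100-rectangle count is just an example value.

```blitzmax
SuperStrict

' Assumption: the OpenGL Max2D driver is in use, so GL calls are valid
SetGraphicsDriver GLMax2DDriver()
Graphics 800, 600, 0, 0

While Not KeyHit(KEY_ESCAPE)
	Cls
	glFinish()	' make sure the Cls is done before timing starts

	Local start:Int = MilliSecs()
	For Local count:Int = 1 To 100
		DrawRect 10, 10, 600, 480
	Next
	glFinish()	' block until the card has actually drawn the rectangles
	Local drawTime:Int = MilliSecs() - start

	Local flipStart:Int = MilliSecs()
	Flip 0
	Local flipTime:Int = MilliSecs() - flipStart

	Print "Draw Time: " + drawTime + "  Flip Time: " + flipTime
Wend
```

With the glFinish() calls in place, most of the cost should move from the Flip figure to the Draw figure. Stalling the card like this slows the program down overall, so it is only useful for profiling.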

The flip, which is the copy from the backbuffer or the hardware pointing to the next buffer, etc, usually should take the same amount of time each frame. If it's not, then Flip() is doing more than just switching the buffer into view. Also bear in mind the wait for the vertical blank comes into play and can waste lots of time in some situations.

I don't entirely understand what Dreamora was saying about having to upload textures etc... presumably that's only when you run out of texture memory.


Leiden(Posted 2006) [#11]
Doesn't Flip just tell the program to copy the backbuffer to the frontbuffer? And Flip with 0 just means draw to the frontbuffer immediately?


Dreamora(Posted 2006) [#12]
AD: On OpenGL that's right. The DX7 one is set to use the DX texture manager. I don't know if that keeps all the stuff alive in VRAM if you don't use a texture for a few flips.

But even then, a texture rebind isn't a "cheap" thing either (isn't it the most "expensive" state switch in OpenGL?)


Fry Crayola(Posted 2006) [#13]
Is it a concern that this is rather slow? 100 rectangles and a framerate of 3-4 fps doesn't seem so great to me. Is this just down to me rebuilding the world each frame, rather than just moving the polygons around, or is there something else?

On other, more powerful machines I do get better framerates but I'm struggling to see why this particular computer is throwing up stuff that would make Freescape blush, even if it is an entirely hypothetical and unused situation. SpaceAce above managed to get a framerate of about 50-55fps using his beast of a machine and 1000 triangles.


Dreamora(Posted 2006) [#14]
1000 rects shouldn't be a problem, I get even higher framerates using my old notebook and my unoptimized chaotic particle system.

But there are 2 things that have a heavy impact due to the 2D in 3D way:

1. The larger the image, the more time it takes. So you should try to use textures of the same size as real 2D images would be (64x64, 128x128 at most) if you use them on sprites and not just as a single background image.

2. If possible, make sure that all drawing commands for the same image are grouped together. But don't waste too much time on organizing that. The same goes for SetScale etc. The fewer state changes you have, the better the performance.
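
The grouping advice in point 2 can be tried out with a rough comparison like the one below. It is a sketch only: MakeImage is a hypothetical helper that builds solid-colour images in code so no files are needed, and the draw and frame counts are arbitrary. Whether the difference is measurable depends on the driver and card.

```blitzmax
SuperStrict
Graphics 800, 600, 0, 0

Const NUM_DRAWS:Int = 500
Const NUM_FRAMES:Int = 60

' Hypothetical helper: a solid-colour 64x64 image built in code
Function MakeImage:TImage(argb:Int)
	Local img:TImage = CreateImage(64, 64)
	ClearPixels LockImage(img), argb
	UnlockImage img
	Return img
End Function

Local imgA:TImage = MakeImage($FFFF0000)
Local imgB:TImage = MakeImage($FF0000FF)

' Interleaved: the image changes on every single draw call
Local start:Int = MilliSecs()
For Local frame:Int = 1 To NUM_FRAMES
	Cls
	For Local i:Int = 0 Until NUM_DRAWS
		DrawImage imgA, i, 10
		DrawImage imgB, i, 100
	Next
	Flip 0
Next
Print "Interleaved: " + (MilliSecs() - start) + " ms"

' Grouped: all draws of one image, then all draws of the other
start = MilliSecs()
For Local frame:Int = 1 To NUM_FRAMES
	Cls
	For Local i:Int = 0 Until NUM_DRAWS
		DrawImage imgA, i, 10
	Next
	For Local i:Int = 0 Until NUM_DRAWS
		DrawImage imgB, i, 100
	Next
	Flip 0
Next
Print "Grouped: " + (MilliSecs() - start) + " ms"
```

Both passes time the drawing together with Flip 0, since the cost of the draw calls only shows up at the flip.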


As a side note: if possible, use OpenGL or the user-created DX9 driver. The DX7 driver is 30-60% slower than those two, and the more you push through it, the worse its performance gets!


simonh(Posted 2006) [#15]
It's slow because each rectangle is being sent to the video card one by one. This is a waste of GPU power - if you were to send all the rectangles together in one go, the GPU could draw them all in a fraction of the time it takes to draw them all separately.


simonh(Posted 2006) [#16]
Here's the GL code for DrawRect in glmax2d.bmx:

		glBegin GL_QUADS
		glVertex2f x0*ix+y0*iy+tx,x0*jx+y0*jy+ty
		glVertex2f x1*ix+y0*iy+tx,x1*jx+y0*jy+ty
		glVertex2f x1*ix+y1*iy+tx,x1*jx+y1*jy+ty
		glVertex2f x0*ix+y1*iy+tx,x0*jx+y1*jy+ty
		glEnd

As you can see, only a single quad is drawn between glBegin and glEnd. If you were to fit more glVertex commands in there, you could get a speed gain.
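
As a minimal sketch of that idea, the loop below submits 100 untextured quads inside one glBegin/glEnd pair instead of calling DrawRect 100 times. It assumes the OpenGL Max2D driver is selected and that it has already set up a pixel-coordinate projection; it also bypasses Max2D's own state tracking (scale, rotation, handle), so treat it as an experiment rather than a drop-in replacement for DrawRect. The quad size and positions are example values.

```blitzmax
SuperStrict

' Assumption: OpenGL Max2D driver, which leaves a pixel-coordinate projection active
SetGraphicsDriver GLMax2DDriver()
Graphics 800, 600, 0, 0

While Not KeyHit(KEY_ESCAPE)
	Cls

	Local start:Int = MilliSecs()

	' One glBegin/glEnd pair for the whole batch of quads
	glBegin GL_QUADS
	For Local i:Int = 0 Until 100
		Local x0:Float = 10 + i * 5
		Local y0:Float = 10 + i * 5
		Local x1:Float = x0 + 50
		Local y1:Float = y0 + 50
		glVertex2f x0, y0
		glVertex2f x1, y0
		glVertex2f x1, y1
		glVertex2f x0, y1
	Next
	glEnd

	Flip 0
	Print "Batch + Flip: " + (MilliSecs() - start) + " ms"
Wend
```

No images are drawn here, so texturing is never switched on. If this were mixed with DrawImage calls, the texture state would need handling as well.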

Using vertex arrays would offer another speed gain, replacing all the glVertex calls with a single pointer to an array.


Fry Crayola(Posted 2006) [#17]
How does this talk of textures translate to the Max2D commands?

In my current project, I'm drawing a panel made up of nine separate images - 8 border pieces (4 corners, 4 stretched sides) and one central rectangle (which can be just a rectangle, or an image drawn using DrawImageRect).

These are 5x5 pixel images.

How would I maximise speed with this? 8x8 images? 4x4? I presume that the system will create the quad and then apply the image as a texture to that quad, as in Blitz3D?