Fast writing to a screen buffer
| ||
Hi, I'm basically trying to write a 'list' of 'pixels' to the screen buffer. After doing a LockBuffer, is using WritePixelFast the fastest possible way? (What exactly does LockBuffer do?) Is there a way to find the address of the buffer and write directly? I'm assuming that WritePixel is slower, but why have both? Also, after initialising the screen to 1024,768,16, do I need to call LockedFormat() to find out the r,g,b format of the pixels? Mine says 5,6,5, but must I check in case other people's computers say otherwise? Am I right in thinking alpha (whatever it is) is not used on 16-bit screens? I'm using B+. Thanks in advance, Marg
| ||
If you're using WritePixelFast, you don't need to use LockedFormat(). WritePixelFast is not fast enough for real-time effects by any means; the only really fast way is using B+.
| ||
The fastest way to write to a buffer would probably be using LockedPixels and a CopyBank loop to transfer a line at a time from a native Blitz array []. LockedFormat is for use with LockedPixels, not LockBuffer. For your situation I would test whether the mode is 5,6,5 and do a tidy exit if it is not; I would guess only one form of legacy Matrox user is going to suffer. Optional 640x480 support will give you the most dramatic speedup, of course.
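A rough sketch of the line-at-a-time copy in BlitzPlus, for anyone who wants something to start from. It uses a bank rather than an array as the source, since CopyBank works on banks. The names W, H and frame are made up for illustration, and the assumption that LockedFormat() returns 1 for 5,6,5 is from memory of the BlitzPlus docs, so check it before relying on it.

```blitzbasic
; BlitzPlus only: build the frame in a RAM bank, then copy it to the back buffer row by row.
Const W = 640
Const H = 480
Graphics W, H, 16
SetBuffer BackBuffer()

frame = CreateBank(W * H * 2)          ; hypothetical off-screen frame, 2 bytes per pixel

While Not KeyHit(1)                    ; Esc quits
    ; ... change the frame here, e.g. PokeShort frame, (y * W + x) * 2, colour ...

    LockBuffer BackBuffer()
    If LockedFormat() <> 1             ; assumption: 1 means 16-bit 5,6,5
        UnlockBuffer BackBuffer()
        RuntimeError "This program needs a 5,6,5 display mode."
    EndIf
    pixels = LockedPixels()            ; bank that maps the locked buffer
    pitch = LockedPitch()              ; bytes per screen row, can be more than W * 2
    For y = 0 To H - 1
        CopyBank frame, y * W * 2, pixels, y * pitch, W * 2
    Next
    UnlockBuffer BackBuffer()
    Flip
Wend
End
```

The copy goes row by row because the pitch of the locked buffer is not guaranteed to equal the width times two; if it does match, a single CopyBank of the whole frame would do the same job.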
| ||
I am having the same problem. Even using a locked buffer and ReadPixelFast on a 320x240 image buffer, my frame rate creeps below 2-3 fps. Here is what I'm doing:
1) Make a simple project with a light, camera, and a cube at 320x240 resolution
2) Put the following code in the main loop (just a single loop using ReadPixelFast):
    LockBuffer BackBuffer()
    For x = 0 To GraphicsWidth()
        For y = 0 To GraphicsHeight()
            thisPixel = ReadPixelFast(x, y, BackBuffer())
        Next
    Next
    UnlockBuffer BackBuffer()
Why is it that a single pass through all pixels at such a low resolution would completely kill my frame rate, when I've seen so many cool demos with copper effects and other effects, likely done by modifying a buffer, which run just fine? For any effect like that you have to loop through all pixels at least once with ReadPixel and/or WritePixel. The effect I want to do will need at least 3 such loops per render. Is there anything I can do to speed it up?
| ||
You are reading outside the screen, I think. You should change that to:
    LockBuffer BackBuffer()
    For x = 0 To GraphicsWidth() - 1
        For y = 0 To GraphicsHeight() - 1
            thisPixel = ReadPixelFast(x, y, BackBuffer()) And $FFFFFF
        Next
    Next
    UnlockBuffer BackBuffer()
Having said that, this isn't the fastest way. skidracer explains a faster solution above.
| ||
Can anybody explain to me what locked pixels are and why they are faster? I've heard of locking a buffer, but not locking pixels... Also, you can't always use banks, arrays, etc., because for some things you have to grab the back buffer pixels to modify them. I must be missing something. What is it?
| ||
    LockBuffer BackBuffer()
    For x = 0 To GraphicsWidth()
        For y = 0 To GraphicsHeight()
            thisPixel = ReadPixelFast(x, y, BackBuffer())
        Next
    Next
    UnlockBuffer BackBuffer()
For one thing, you are calling the GraphicsHeight() function on every pass of your x loop (which would be 320 times). What a waste of processing time. You only need to find out the height of the screen once. It certainly would be a lot faster to call it one time before your loop and save it to a variable. Your code would look something like:
    LockBuffer BackBuffer()
    height = GraphicsHeight() - 1
    For x = 0 To GraphicsWidth() - 1
        For y = 0 To height
            thisPixel = ReadPixelFast(x, y, BackBuffer())
        Next
    Next
    UnlockBuffer BackBuffer()
And that's about as fast as ReadPixelFast is going to get. Reads from the graphics card are pathetically slow, but writes are extremely fast. So, if you can store the image in a bank or array (or even another image buffer, as long as you aren't doing any reading from it) and then copy it to the BackBuffer() with your changes, you will see a huge performance increase.
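To illustrate the "keep it in an array and only write" idea, here is a minimal sketch that should also work in plain Blitz3D, since it needs nothing beyond WritePixelFast. The pix array and the gradient fill are hypothetical stand-ins for whatever image you are actually working on.

```blitzbasic
; Keep the working image in a Dim array and only ever WRITE to the back buffer.
Graphics 320, 240, 0, 2                ; windowed 320x240
SetBuffer BackBuffer()

Dim pix(319, 239)                      ; hypothetical working copy, one ARGB value per pixel

; Fill the array with a simple gradient so there is something to show.
For y = 0 To 239
    For x = 0 To 319
        pix(x, y) = $FF000000 Or ((x And 255) Shl 16) Or (y And 255)
    Next
Next

; Modify pix() as much as you like in RAM, then push it out in one write-only pass.
LockBuffer BackBuffer()
For y = 0 To 239
    For x = 0 To 319
        WritePixelFast x, y, pix(x, y)
    Next
Next
UnlockBuffer BackBuffer()
Flip

WaitKey
End
```

This only helps when the pixels originate in your own code. It does not remove the need to read the back buffer when the source is a RenderWorld result.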
| ||
LockedPixels is faster because Blitz gives you direct access to the block of memory used for the buffer (in the form of a bank). The downside is that you have to work out what pixel format is used in that memory block and convert your colours to it yourself. Complicated, but not that complicated. Read/WritePixelFast, on the other hand, do all that work for you, which is also why they are slower.
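For the 5,6,5 case mentioned earlier in the thread, "working out the pixel format" mostly comes down to a bit of shifting. A small sketch, with Pack565 being a made-up helper name rather than a built-in command:

```blitzbasic
; Pack 8-bit r, g, b values (0-255 each) into one 16-bit 5,6,5 pixel:
; red in the top 5 bits, green in the middle 6 bits, blue in the low 5 bits.
Function Pack565(r, g, b)
    Return ((r Shr 3) Shl 11) Or ((g Shr 2) Shl 5) Or (b Shr 3)
End Function

white = Pack565(255, 255, 255)         ; $FFFF
red   = Pack565(255, 0, 0)             ; $F800
```

The packed value is what you would hand to PokeShort when writing into the bank. A 5,5,5 screen needs different shifts, which is why the format has to be checked first.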
| ||
Would it be any faster to create a texture with the 256 flag and then use ReadPixel with it?
| ||
Are LockedFormat and LockedPixels undocumented commands/functions? I can't find them in the online docs. (And if they are undocumented, where can I find documentation?)
| ||
Wolron - I just looked in the online docs. LockedFormat and LockedPixels are in BlitzPlus only. That's too bad! I really needed the extra speed for my project, and it has to be in 3D! Now that I'm dealing with this issue, it would be nice if we had commands for:
- RenderWorld to any buffer (not just the back buffer)
- RenderWorld to a bank (this would be sweet and would make things like what I'm trying to do soooo fast!)
along with the LockedFormat/LockedPixels commands that are already in BlitzPlus!
FYI: I've written an anaglyph rendering engine (the 3D that can be seen through red/blue 3D glasses). So far it does work in real time, but only at about 10 FPS, with some really wicked optimizations in there. I'm being killed by the ReadPixelFast section; with that code removed, the frame rate increases 10+ fold. The engine is complete, but a bit too slow for any serious projects. Hopefully we will see some of the above requests in the future and I can get this thing to run at 30 FPS or so fullscreen. Anyway, below is a screenshot of the anaglyph engine running. Of course, seeing it in real time (through red/blue glasses) is a much better experience than a still image.
| ||
I would suggest rendering with a black background into 256x256 viewports (left adjacent to right), then using 2 CopyRects after each RenderWorld to place the renders into 2 textures created with flags=256, then composing the final back buffer using a scene with the 2 renders set to additive blend mode, with each slide set to the relevant filter colour.
| ||
Thanks, I'll give that a try. I'm pretty new to this per-pixel stuff, but it's getting to be pretty entertaining.
| ||
For skidracer: let's say I'm doing a starfield with stars all over the screen. Would it be best to use LockedPixels to return the bank address? Is this just like the start of an array, the array being the hidden screen? Then use PokeShort( bank, offset ) for each star (16 bits)? Not sure about CopyBank. For general sprites, would the normal sprite plotting commands be fastest? Thanks, Marg
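A minimal BlitzPlus sketch of what that starfield could look like, poking each star straight into the locked back buffer. The star count and the sx/sy arrays are made up for illustration, and note that PokeShort needs the value to write as its third argument. White ($FFFF) is used because it needs no format-specific packing on a 16-bit screen.

```blitzbasic
; BlitzPlus only: plot stars directly into the locked back buffer with PokeShort.
Const STARS = 500                      ; hypothetical star count
Graphics 1024, 768, 16
SetBuffer BackBuffer()

Dim sx(STARS - 1)
Dim sy(STARS - 1)
For i = 0 To STARS - 1
    sx(i) = Rand(0, 1023)
    sy(i) = Rand(0, 767)
Next

While Not KeyHit(1)                    ; Esc quits
    Cls
    LockBuffer BackBuffer()
    pixels = LockedPixels()            ; bank that maps the locked buffer
    pitch = LockedPitch()              ; bytes from one row to the next
    For i = 0 To STARS - 1
        ; byte offset = row * pitch + column * 2 bytes per pixel
        PokeShort pixels, sy(i) * pitch + sx(i) * 2, $FFFF
    Next
    UnlockBuffer BackBuffer()
    Flip
Wend
End
```

The offset uses the pitch rather than the screen width, because a row in the locked buffer can be longer than 1024 * 2 bytes.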
| ||
"the array being the hidden screen?" Yes, but the concepts of the memory being locked, and sitting at the other end of an AGP bus with caches that prefer sequential to random write accesses (and are several times worse for reads), seem a bit much to get into. A few bold claims that probably would stand up to testing:
* The LockedPixels / LockBuffer command itself will be the biggest cripple, as it blocks the processor while waiting for the graphics card to finish (for example, the most recent Flip or Cls command issued).
* Once you get enough stars on the screen, running the entire star field in RAM from a CreateBank and copying it sequentially in place of your Cls will probably become the superior performer.