Message-ID: <3377240C.1B82@silesia.top.pl>
Date: Mon, 12 May 1997 16:07:08 +0200
From: Michal
MIME-Version: 1.0
To: djgpp AT delorie DOT com
Subject: Re: Alignment
References: <199705120300 DOT NAA14525 AT solwarra DOT gbrmpa DOT gov DOT au>
Content-Type: text/plain; charset=iso-8859-2
Content-Transfer-Encoding: 7bit
Precedence: bulk

Leath Muller wrote:
> Aaah, custom code for fixed texture sizes; I have an affine mapper which
> can run a lot faster than 9 cycles/pixel - if I just do affine mapping, I
> can map at around 5 cycles per pixel. I also use any size textures; there
> is no restriction on the width and height - although the larger the texture
> the larger the cache to store it (naturally)
>
Actually no. I put the texture offset and the deltas, which are constant
for the whole triangle but differ from triangle to triangle, into the
inner loop. Does your inner loop do Gouraud shading, and does it use more
than 8 bits for the fractional parts of u, v, the light value and their
deltas? Maybe your inner loop is unrolled? In my inner loop I have to
calculate 3 addresses (including the pixel address, which is just an
increment but still changes the register), and that makes it impossible
to run at that speed (if I unroll it) because of AGI stalls (I don't know
if I've typed that right): an address has to be ready 1 clock earlier,
otherwise it costs a one-clock delay.

>
> Chris Hecker wrote a series of texture mapping articles in Game Developer
> Magazine and is the best source on doing fast texture mapping that I have
> seen to date...
>
Never heard of him; almost all the code I use is my own invention.

>
> No, my lighting isn't perspective correct as the point of Gouraud shading
> is all about... ie: 2 fdivs per scan line and you can't notice the difference.
You can't notice it only in static frames; with movement it's easily seen
(see Tomb Raider). 2 fdivs?? Only one fmul: since you already have 1/z,
it takes just 1 fmul to get the right light value every 8 pixels.

> I also did lighting in the second loop because I have to do 3 different
> table lookups and it was easier to do it in a second loop so as to have use
> of more registers. Also, it's advantageous in respect to MMX processing, as
> I can't use MMX stuff and FPU stuff effectively and efficiently together.
Are you using MMX?

> I think we are using subtly different methods for custom engines, so we are
> going to have different methods of course...
>
My lighting is perspective-correct Gouraud shading. If your lighting
isn't perspective correct, using a second loop makes sense.

>
> No, I load a cache line every affine loop;
Aren't you using an unrolled affine loop (for 8 pixels)?

> currently 16 pixels per scan line.
> And because the cache load is intertwined with other stuff happening, it
> loads the cache line for free thus no cache misses.
>
As you load only 32 bytes, you can't be sure that all of the addresses
your inner loop is going to use will be in the cache, so it only
decreases the number of cache misses.

>
> Ok, here is the key: "I calculate deltas for first 8 pixels every scanline"
> Which makes your method a _lot_ slower for obvious reasons... now,
> the method I use I only have to calculate the deltas once every POLYGON... :)
> Once I have these, the inner loop screams because I never again have to
> calculate the deltas... I think I have the deltas at around 96 odd cycles
> for the U, V, R, G and B calculations; thus my overhead is around 96 cycles
> per polygon...
>
How can you calculate deltas for the whole triangle when they're
different for each 8 pixels of every line in the triangle?

>
> You're doing stuff per scanline, I do stuff per polygon; huge difference...
>
I don't understand the idea of calculating deltas once per triangle in
perspective-correct mapping. By deltas I mean here the deltas used for
linear interpolation of u, v and the light value every 8 pixels.

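To make the two kinds of "deltas" in this exchange concrete, here is a
minimal sketch in C of perspective-correct mapping with linear
interpolation every 8 pixels. It is not either poster's actual code. The
names ScanlineSetup, perspective_span and SPAN are hypothetical, doubles
stand in for the fixed-point values a real inner loop would use, and the
light value is left out (it would be handled like u and v). The steps of
1/z, u/z and v/z are constant for a planar triangle, while the u and v
deltas used inside each 8-pixel run have to be recomputed per run.

```c
#include <assert.h>
#include <stddef.h>

#define SPAN 8  /* pixels between exact perspective evaluations (assumed) */

/* Quantities that are linear in screen space along a scanline. */
typedef struct {
    double inv_z;       /* 1/z at the first pixel of the line */
    double u_over_z;    /* u/z at the first pixel */
    double v_over_z;    /* v/z at the first pixel */
    double d_inv_z;     /* per-pixel steps; constant for a planar triangle */
    double d_u_over_z;
    double d_v_over_z;
} ScanlineSetup;

/* Writes count texture coordinates: exact at every SPAN-th pixel,
   linearly interpolated in between. */
void perspective_span(const ScanlineSetup *s, size_t count,
                      double *u_out, double *v_out)
{
    double iz = s->inv_z, uz = s->u_over_z, vz = s->v_over_z;

    assert(iz != 0.0);
    double z = 1.0 / iz;                 /* one divide for the line start */
    double u0 = uz * z, v0 = vz * z;

    size_t x = 0;
    while (x < count) {
        size_t n = (count - x < SPAN) ? (count - x) : SPAN;

        /* Step the linear quantities to the end of this run. */
        iz += s->d_inv_z * (double)n;
        uz += s->d_u_over_z * (double)n;
        vz += s->d_v_over_z * (double)n;

        z = 1.0 / iz;                    /* one divide per run */
        double u1 = uz * z, v1 = vz * z;

        /* These are the deltas that differ for every run of every line. */
        double du = (u1 - u0) / (double)n;
        double dv = (v1 - v0) / (double)n;

        double u = u0, v = v0;
        for (size_t i = 0; i < n; i++) { /* affine inner loop */
            u_out[x + i] = u;
            v_out[x + i] = v;
            u += du;
            v += dv;
        }
        u0 = u1;
        v0 = v1;
        x += n;
    }
}
```

In this arrangement the only per-polygon constants are the steps of 1/z,
u/z and v/z; whether that is what "deltas once every POLYGON" refers to
in the quoted text is not something the thread settles.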
I've read something about MMX and discovered that the first MMX
instruction after an FPU instruction, and vice versa, causes a very long
delay. Because of this it wouldn't make any sense to use both the FPU and
MMX in a perspective-correct inner loop. If you wanted to use MMX for
perspective-correct mapping you would have to do the division in integer,
and I don't think it can be faster.

Is there a mailing list that talks about texture mapping?

I have one more problem in texture mapping; maybe you (or somebody else
who is reading this) can help me. I use doubles to represent the
triangle's vertices on the screen. I do this to prevent the triangle from
shaking up and down (as in Tomb Raider or Magic Carpet). Let's say this
is about affine mapping (it's simpler than perspective correct). Now the
problem: how to calculate the deltas for edge and line interpolation of
u and v so that they prevent the pattern inside the triangle from having
texels from the wrong side of the texture. I know that I can use margins
(I don't know if that's the proper word) both in the texture coordinates
assigned to the triangle's vertices and in the texture itself. But since
I want my textures to repeat in the triangle (only some of them), and I
have 64x64 textures, which in low resolutions with small triangles have
really big deltas, I can't use both of them. I've solved this problem
when using integers to represent the triangle's vertices, and even with
doubles, but then the pattern inside the triangle was shaking (not the
triangle).

I'm preventing the triangle from shaking like this:

dx21 - the delta x on the edge between vertices 1 and 2
dx13 - likewise for the edge between vertices 1 and 3
dx23 - likewise for the edge between vertices 2 and 3
x1 is one side of the triangle
x2 is the other side

So:

dx21 = (x1-x2)/(y1-y2)
dx13 = (x1-x3)/(y1-y3)
dx23 = (x2-x3)/(y2-y3)
x1 = x1
x2 = x1

Here is how I modify x1 and x2 to prevent the triangle from shaking in
the y axis (x is automatic when using doubles):

x1 -= (y1-int(y1))*dx21
x2 -= (y1-int(y1))*dx13

now:
dv13,du13,dv,du = ????

I use the same method for modifying the values of u and v (at the highest
vertex) for edge interpolation.
I am NOT saying that I can't calculate the deltas; I'm asking whether
somebody has met a problem like this and knows the solution. Maybe there
are some documents that talk about it, or maybe you know how this problem
is solved in commercial products?
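
One possible way to fill in the "????" for the affine case, offered as a
sketch under assumptions rather than as the method any commercial
product uses: treat u and v as planes over the screen, compute their
gradients once per triangle from the three vertices, and evaluate the
plane at the integer pixel where each span starts. An affine function
over a triangle takes its extreme values at the vertices, so a pixel
sampled inside the triangle stays inside the vertex range of u and v, up
to rounding. The names Vertex, Gradients, SpanStart, affine_gradients and
span_start are hypothetical, and the sketch rounds up to the first pixel
at or after the edge, which differs from the int(y1) truncation used
above.

```c
#include <assert.h>

typedef struct { double x, y, u, v; } Vertex;
typedef struct { double dudx, dudy, dvdx, dvdy; } Gradients;
typedef struct { int x, y; double u, v; } SpanStart;

/* Smallest integer >= d, without needing the math library. */
static int ceil_to_int(double d)
{
    int i = (int)d;               /* truncates toward zero */
    return ((double)i < d) ? i + 1 : i;
}

/* Gradients of u and v across the screen; constant for the triangle. */
Gradients affine_gradients(const Vertex *a, const Vertex *b, const Vertex *c)
{
    double bx = b->x - a->x, by = b->y - a->y;
    double cx = c->x - a->x, cy = c->y - a->y;
    double area = bx * cy - cx * by;   /* twice the signed area */

    assert(area != 0.0);               /* degenerate triangles not handled */
    double inv = 1.0 / area;

    Gradients g;
    g.dudx = ((b->u - a->u) * cy - (c->u - a->u) * by) * inv;
    g.dudy = ((c->u - a->u) * bx - (b->u - a->u) * cx) * inv;
    g.dvdx = ((b->v - a->v) * cy - (c->v - a->v) * by) * inv;
    g.dvdy = ((c->v - a->v) * bx - (b->v - a->v) * cx) * inv;
    return g;
}

/* First pixel of scanline y on the edge leaving `top` with slope dxdy,
   with u and v evaluated exactly at that integer pixel position. */
SpanStart span_start(const Vertex *top, const Gradients *g,
                     double dxdy, int y)
{
    double x_edge = top->x + ((double)y - top->y) * dxdy;
    SpanStart s;

    s.x = ceil_to_int(x_edge);
    s.y = y;
    s.u = top->u + ((double)s.x - top->x) * g->dudx
                 + ((double)y - top->y) * g->dudy;
    s.v = top->v + ((double)s.x - top->x) * g->dvdx
                 + ((double)y - top->y) * g->dvdy;
    return s;
}
```

Within a span, u and v then advance by dudx and dvdx per pixel. A real
rasterizer would step s.u and s.v incrementally down the edge instead of
re-evaluating the plane on every scanline; the direct form is used here
only because it is easier to check.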