Author Topic: INFO: Draw calls not geometry as a bottleneck in NWN.  (Read 2867 times)

Legacy_OldTimeRadio

  • Hero Member
  • *****
  • Posts: 2307
  • Karma: +0/-0
INFO: Draw calls not geometry as a bottleneck in NWN.
« on: April 24, 2014, 02:56:36 am »


               

Sometime around 2010, I started getting the idea that maybe it was time for me to either give up modding and, say, learn a new language, or find a new engine to mod with.  Since I knew it wasn't going to be easy to just put the game down, I decided to devise a little test which I was sure the Aurora engine would fail: I'd see what the limits of the engine really were and then I could hang my departure on that.  "It wasn't me giving up, I tried to accomplish X and Y and they just couldn't be done so I had no other choice!" was the reasoning I was trying to set up.


 


So I picked a pretty high bar, almost from the start.  Importing Fallout 3 placeables.  Turned out that was pretty damned easy.  So I raised the bar higher and tried importing skinmesh creatures from Titan Quest and F3.  Nope, worked fine.  I didn't bother to polish them off but, technically, there was nothing prohibiting it.  Then, just to get it over with, I set the bar insanely high: Import the entire town of Megaton from Fallout 3 into a tileset as one giant tile group.  While it didn't come out perfect, there was a day about six weeks later where I was walking through Megaton with my NWN character in amazement. 


 



 


It got even crazier from there.  If you guys have seen the Spelljammer merchant video of mine from X-Fire (produced by Silverblade), that's about where I had to admit defeat. 


 


That's full-on 3D render-quality art running smoothly in NWN on a mediocre NVidia graphics card.


 


I was confused as hell, BTW, about how all this could be done.  Took me a long time to figure out why.  I've posted about it before but, in a nutshell, it's a combination of not using shadows, compiling your models (preferably with the Bioware internal model compiler tool) and making them static.


 


However, there was still the question of why it was so common to find myself (on the same testing machine) sometimes bogged down in areas that other people had created with relatively normal-looking custom content or even official content.


 


I think I've finally come to a decently defensible conclusion as to why that is, too: draw calls.  As in, too many of them.  With the help of gDEBugger (an OpenGL debugger), I whipped up this quick video showing how an individual video frame of NWN is put together, piece by piece, to hopefully raise awareness about draw calls and how, as CC makers, we want to reduce the number of draw calls as much as possible in our creations.


 



 


You can learn more about texture atlases here or by Googling around.  There's a nice free texture atlas generator (which I mention at the link in the previous sentence) here.  I don't think I got this down to working under GMax, but I did contact the creator and he was kind enough to give me permission to do so.  If someone is hot to trot about that kind of reworking, let me know.


 


Oh, for anyone who either understands my little video about them or who already knows about them, I'd love to hear ideas about how draw calls might be reduced.



               
               

               
            

Legacy_Shadooow

  • Hero Member
  • *****
  • Posts: 7698
  • Karma: +0/-0
« Reply #1 on: April 24, 2014, 03:25:02 am »


               

So that's why TNO is so slow! (Sorry Pstemarie, I'm not a fan of TNO '-_-')


 


Anyway, so combining similar meshes is a good practice, right? What about that lighting issue that Zaharatrustra revealed? Doesn't this conflict?



               
               

               
            

Legacy_henesua

  • Hero Member
  • *****
  • Posts: 6519
  • Karma: +0/-0
« Reply #2 on: April 24, 2014, 03:31:30 am »


               

I agree with you, OTR, that draw calls are the problem. I raised it in a thread once because it was the bottleneck I ran up against when I was learning how to model and make games using Unity3d.


 


How do you solve this for a tileset like TNO? It seems like it would be a monumental effort to fix TNO. I think collapsing meshes in each tile would go a long way. But what about automating the creation of texture atlases? Is this possible?


 


(I would love it if someone would take this on.)



               
               

               
            

Legacy_OldTimeRadio

  • Hero Member
  • *****
  • Posts: 2307
  • Karma: +0/-0
« Reply #3 on: April 24, 2014, 04:15:19 am »


               

@Shadooow - Other than the shadows (see here, search down to "Shadows are not just worse than you thought"), yeah.  What gets you is when every tile has a bunch of different meshes, each with different textures.  But there's a trade-off between the variability of individual tiles and the ability to optimize larger things like groups.  If you made an entire 16x16 tile group where, say, the visible ground (not the WOK) is unsliced, the entire visible ground is laid down in 1 draw call.  I believe I did that with the ground (and possibly other things) on the Megaton proof of concept.  As far as the lighting goes, that's an entirely different issue.


 


BTW, notice how the video I did of the tileset was just a relatively zoomed-in portion of one?  With unlocked cameras and large view distances, people wind up multiplying the work that has to be done per frame.  Or adding a gazillion placeables.  Each of those gets one draw call, as well.  I don't want to unnecessarily inculpate placeables, though.  View distance is more of a problem, IMO.


 


@Henesua - Yeah, this does come up a lot in things like Unity, and for anyone who's developing for a mobile platform, too, where draw calls are sometimes at a fixed premium.  "How do you solve this for a tileset like TNO?"  That's a great question.  I'm not so sure there is an easy answer.  Like, maybe it can be partially optimized, but unless you're willing to chuck Features and Terrain, you're still going to have the issue...with any tileset.  I agree that examining meshes for optimization on each tile (and if you really want to get fancy, probably on each group) could yield useful results.  And that consolidation could be automated.  As far as automating texture atlas creation goes, and I say this as a lover of automation when it yields something useful, I don't think you could meaningfully automate that process and be happy with what came out the other end.  Again, not without tossing Features and Terrain to the wind.  But I really don't know.


 


I'm actually hoping (someday) to see what I could do in the way of a tileset, either Titan Questy or Fallouty, which used just 3 x 3 or 4 x 4 groups and which would be mostly bare, but which would have single-mesh/single-texture (atlased, probably) placeables which would cover the area (so, 3 x 3 or 4 x 4) and "fit" that group's walkmesh.  Again, you have to give up some of the customization but I'd hope to make it back with single placeable faux "groups".  I'm trying to kind of go through the mental exercise of how much awesome you could squeeze out of something so limited and I'm not done yet.  It could either be limited & pretty nifty or embarrassingly limited, I dunno.



               
               

               
            

Legacy_BelowTheBelt

  • Hero Member
  • *****
  • Posts: 699
  • Karma: +0/-0
« Reply #4 on: April 24, 2014, 05:51:21 am »


               

How big is the file for Megaton compared to, say, the TNO tileset?  Would it be feasible to create and include lots of these 'artsets' in a module? 



               
               

               
            

Legacy_OldMansBeard

  • Full Member
  • ***
  • Posts: 245
  • Karma: +0/-0
« Reply #5 on: April 24, 2014, 08:59:49 am »


               

Interesting.


 


If two meshes in a model have different bitmaps, would combining the bitmaps into one larger one, just so that the meshes could be combined, ever be worthwhile? What would be the trade-off?


 


Can you imagine a software tool (CM4 ?) that fiercely optimised models to minimise draw calls, and if so, what would you want it to do?


               
               

               
            

Legacy_Pstemarie

  • Hero Member
  • *****
  • Posts: 4368
  • Karma: +0/-0
« Reply #6 on: April 24, 2014, 01:03:18 pm »


               


So that's why TNO is so slow! (Sorry Pstemarie, I'm not a fan of TNO '-_-')


 


Anyway, so combining similar meshes is a good practice, right? What about that lighting issue that Zaharatrustra revealed? Doesn't this conflict?




 


*Blowing a BIG raspberry at ShaDoOoW*


 


I guess I'm just fortunate because I have NO issues with TNO as far as speed goes. Yes, it takes slightly longer for the area to load but, once loaded, the area runs smoothly. However, part of this may be the way in which I build. I try to minimize placeables in outside areas and prefer to keep my area sizes under the 16x16 suggested limit. 


 


That being said, this is most assuredly a great angle for a discussion and I'd love to see if some of the concepts being discussed here actually pan out. 


 


However, one thing we all need to remember is that NWN was designed over a decade ago. Rendering technology was not even close to what it is now and I can't believe that the more modern GPUs and CPUs we have today can't handle anything we have (and will have) thrown at them through the NWN engine. Time and again the engine has proven more robust than anyone can imagine and certainly has proven itself more than capable of meeting our needs.



               
               

               
            

Legacy_Estelindis

  • Hero Member
  • *****
  • Posts: 935
  • Karma: +0/-0
« Reply #7 on: April 24, 2014, 02:00:09 pm »


               


Interesting.


 


If two meshes in a model have different bitmaps, would combining the bitmaps into one larger one, just so that the meshes could be combined, ever be worthwhile? What would be the trade-off?


 


Can you imagine a software tool (CM4 ?) that fiercely optimised models to minimise draw calls, and if so, what would you want it to do?




 


What about an automated shadowbox creator?  Turn off shadows on rendered meshes.  Create duplicate, simplified, "fiercely optimised," non-rendered meshes that cast shadows, and, wherever possible, combine them with each other.  Would this help or hurt?  Would it even be possible?


 


I must say: when I started working on tilesets, I didn't see the point of shadowboxes at all, but I've since been won over to them.  As it stands, they're currently what I plan to do with my elven treetop city tileset.  The shadowboxes just don't need to be anything close to as complicated as the visible meshes, particularly since some visible meshes need to have extra faces solely in order to texture them in the way that you want, not because the geometry needs them per se.  This wouldn't be an issue for shadowboxes, which can be much simpler.



               
               

               
            

Legacy_MerricksDad

  • Hero Member
  • *****
  • Posts: 2105
  • Karma: +0/-0
« Reply #8 on: April 24, 2014, 02:04:56 pm »


               


How big is the file for Megaton compared to, say, the TNO tileset?  Would it be feasible to create and include lots of these 'artsets' in a module? 




This is along the lines of one of the CCC ideas I posted last year -- "Fantastic Locations". If we relied on locations that were not so generic, and much more specific, then play-time in normal modules could be reduced to just fighting in specific combat-oriented areas with a very specific shape, much like the layout of late 1990s Final Fantasy games. The "rooms" were very complex looking, and unique. They also cut out a lot of travel time between places by omitting every single little "what is between these two areas".


A single tileset can hold many thousands of individual tiles, and an equal number of groups. It would certainly be feasible to make your entire world out of a single tileset. I'd done just that in 2006. Or you could still group certain bits together within individual tilesets. There is no best way to do that, except in considering what is required by your own module.


 


Edit: as for my first paragraph:


 


I fully realize that a lot of us (including myself) like to have large expansive worlds like they had in Baldur's Gate. And I know Rolo and team are playing around with that system which makes use of just the opposite of what I am suggesting. But there is certainly no reason we can't use both. I would!



               
               

               
            

Legacy_MerricksDad

  • Hero Member
  • *****
  • Posts: 2105
  • Karma: +0/-0
« Reply #9 on: April 24, 2014, 02:12:17 pm »


               


What about an automated shadowbox creator?  Turn off shadows on rendered meshes.  Create duplicate, simplified, "fiercely optimised," non-rendered meshes that cast shadows, and, wherever possible, combine them with each other.  Would this help or hurt?  Would it even be possible?


 


I must say: when I started working on tilesets, I didn't see the point of shadowboxes at all, but I've since been won over to them.  As it stands, they're currently what I plan to do with my elven treetop city tileset.  The shadowboxes just don't need to be anything close to as complicated as the visible meshes, particularly since some visible meshes need to have extra faces solely in order to texture them in the way that you want, not because the geometry needs them per se.  This wouldn't be an issue for shadowboxes, which can be much simpler.




 


For my trees script I built both a concave and a convex hull creator. Both are fairly fast, and I think a little logic to pair them would make a really good basis for a generic shadow builder. For many objects in a common tile scene, I think the concave hull system would work great if it was optimized down edge-wise. The basic math alone already keeps the things that cause shadow artifacts from happening at all. The only issue I can see is in what order we make the script group objects. How do we determine which one should be casting a shadow on another? Where is that math? With that, I could create the logic to automatically group certain objects (by some kind of weight system), then merge their meshes, and then simply create their unit's shadow hull.
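For readers who have not met hull algorithms before, here is a minimal Python sketch of a convex hull. It is not the script described above, which is not shown in this thread; it uses Andrew's monotone chain algorithm and only handles the 2D top-down footprint of a group of vertices, which is a simplification, since a real shadow mesh would need a 3D hull. The function names and sample points are made up for illustration.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the 2D convex hull in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower chain left to right, dropping points that do not turn left.
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper chain right to left in the same way.
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Hypothetical footprint: four corners plus an edge point and an interior point.
footprint = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
print(convex_hull(footprint))
```

The interior and collinear points are discarded, which is the property that makes a hull attractive as a simplified shadow caster: far fewer edges than the visible mesh.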


               
               

               
            

Legacy_MerricksDad

  • Hero Member
  • *****
  • Posts: 2105
  • Karma: +0/-0
« Reply #10 on: April 24, 2014, 02:17:02 pm »


               

If you haven't already, check out the texture atlases used by these games:


 


Fate


Kingsroad (facebook game)


I'll add more...



               
               

               
            

Legacy_Estelindis

  • Hero Member
  • *****
  • Posts: 935
  • Karma: +0/-0
« Reply #11 on: April 24, 2014, 02:21:49 pm »


               


For my trees script I built both a concave and a convex hull creator. Both are fairly fast, and I think a little logic to pair them would make a really good basis for a generic shadow builder. For many objects in a common tile scene, I think the concave hull system would work great if it was optimized down edge-wise. The basic math alone already keeps the things that cause shadow artifacts from happening at all. The only issue I can see is in what order we make the script group objects. How do we determine which one should be casting a shadow on another? Where is that math? With that, I could create the logic to automatically group certain objects (by some kind of weight system), then merge their meshes, and then simply create their unit's shadow hull.




 


Speaking as someone who has absolutely no idea how to go about such a thing, that sounds... good?



               
               

               
            

Legacy_henesua

  • Hero Member
  • *****
  • Posts: 6519
  • Karma: +0/-0
« Reply #12 on: April 24, 2014, 02:49:30 pm »


               

OldMansBeard: If you can automate UV mapping, then it seems to me that combining meshes that use the same texture would be useful.


 


And what about automating texture atlases? Perhaps you could count textures which are smaller than or equal to 512 x 512, and if you reach 4, these could be combined into 1 texture at 1024 x 1024, and each of the meshes that uses those textures could be remapped to the new texture.
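A rough Python sketch of that idea follows: four 512 x 512 textures placed in a 2 x 2 grid inside one 1024 x 1024 atlas, with each mesh's UVs rescaled into its cell. Everything here is hypothetical (the function names, the texture names, the grid layout), and it deliberately refuses UVs outside 0..1, because a mesh that tiles its texture cannot be remapped into an atlas cell this simply. It also ignores the padding usually needed between cells to avoid bleeding at lower mip levels.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    name: str
    x: int     # pixel offset of the cell along the U axis
    y: int     # pixel offset of the cell along the V axis
    size: int  # edge length of the (square) source texture in pixels


def pack_grid(names, tile=512, atlas=1024):
    """Assign equally sized square textures to atlas cells, row by row."""
    per_row = atlas // tile
    capacity = per_row * per_row
    if len(names) > capacity:
        raise ValueError(f"{len(names)} textures do not fit in {capacity} cells")
    return {
        name: Placement(name, (i % per_row) * tile, (i // per_row) * tile, tile)
        for i, name in enumerate(names)
    }


def remap_uv(u, v, placement, atlas=1024):
    """Map a UV from the old texture's 0..1 space into the atlas's 0..1 space."""
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        raise ValueError("tiled UVs (outside 0..1) cannot be atlased this way")
    scale = placement.size / atlas
    return (placement.x / atlas + u * scale, placement.y / atlas + v * scale)


# Hypothetical texture names, for illustration only.
layout = pack_grid(["stone01", "wood01", "cloth01", "metal01"])
print(layout["wood01"])
print(remap_uv(0.5, 0.5, layout["wood01"]))
```

The centre of the second texture ends up at (0.75, 0.25) in the atlas, i.e. the centre of the second cell. Whether the V offset counts from the top or the bottom of the image depends on the exporter's convention, so that would need checking against the actual model format.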



               
               

               
            

Legacy_MerricksDad

  • Hero Member
  • *****
  • Posts: 2105
  • Karma: +0/-0
« Reply #13 on: April 24, 2014, 02:53:34 pm »


               

Where is the cut-off point for usefulness? The ones in Kingsroad are massive and basically encompass an entire area.


Edit: I'm thinking each specific group could have its own single texture file.



               
               

               
            

Legacy_OldTimeRadio

  • Hero Member
  • *****
  • Posts: 2307
  • Karma: +0/-0
« Reply #14 on: April 24, 2014, 03:06:51 pm »


               

@BelowTheBelt - The proof of concept Megaton hak is about 130 megs.  The version I'm using is not as optimized as I thought and appears to be put together like an ordinary tile group.  I had optimized at least the ground (making it all one mesh) in another version but I had an extremely unfortunate event happen with my Max 5 which left me scrambling to recover what I could.


 


You can download the Megaton proof of concept here.  Just create a new 16 x 16 area and place the "Crater, Town 01 (16x16)" group.


 




 


@OldMansBeard - Well, what I'd really like to accomplish with this post is to get 5-10 more people downloading gDEBugger and playing around with it, so they can peer-review my assertions and also see what their experience provides in the way of ideas on how to approach the issue.


 


"If two meshes in a model have different bitmaps, would combining the bitmaps into one larger one, just so that the meshes could be combined, ever be worthwhile?"  I'm assuming you mean "if two meshes in the same model", in which case the answer is a qualified "yes".  It totally depends on what that model is.  If it's a tile with lots of meshes on it which use the same texture, then absolutely.  However, what if it's a tile with just 3 different meshes with 3 different textures?  I suppose those meshes could be consolidated into one mesh and their textures into one texture, but one would have to look at how much savings you were actually getting on the deal.  The larger/more complex the mesh is, the better the savings are.  If you had a 3 x 3 sailing ship group all sliced up into 10 meter chunks, you're going to have at least 3 different draw calls (one for each chunk) for meshes which, since they're not the WOK, don't need to be sliced.  Three chunks for that example is (IMO) a lowball estimate.
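To make the counting concrete, here is a small Python sketch. It assumes the simple model used in this thread, one draw call per mesh, and estimates the best case after merging every mesh that shares a texture. The tile contents and texture names are invented, and the model ignores anything that would block a merge in practice (animated or non-static meshes, shadow settings, environment mapping).

```python
from collections import defaultdict


def draw_calls_as_is(meshes):
    """Assumed model: every mesh costs one draw call."""
    return len(meshes)


def merge_by_texture(meshes):
    """Group mesh names by texture; each group could become one merged mesh."""
    groups = defaultdict(list)
    for mesh_name, texture in meshes:
        groups[texture].append(mesh_name)
    return dict(groups)


# Hypothetical tile: (mesh name, texture name) pairs.
tile = [
    ("floor", "stone01"),
    ("wall_n", "stone01"),
    ("wall_e", "stone01"),
    ("crate", "wood01"),
    ("barrel", "wood01"),
    ("banner", "cloth01"),
]

merged = merge_by_texture(tile)
print("before:", draw_calls_as_is(tile))  # one call per mesh
print("after: ", len(merged))             # one call per distinct texture
for texture, names in merged.items():
    print(f"  {texture}: {', '.join(names)}")
```

In this made-up tile the count drops from 6 to 3 without touching any textures. Going below 3 would require atlasing the three textures into one, which is where the judgment call about savings comes in.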


 


Since I deal more with creatures than tilesets, let me give an illustration which is a little less ambiguous or qualified.  Below are four creatures.  From back to front: Beggar (non-dynamic NPC), Dynamic NPC with a conventional robe, Male Dynamic NPC with full-body robe, and Female Dynamic NPC with full body robe and cloak.  Here's the draw call breakdown for them:


 


[Image 1TMzOL9.jpg: draw call breakdown for the four creatures described above]


 


NOTE: These numbers may double with environment mapping turned on.  In that case, I've noticed a silvered (envmapped) version of the mesh being drawn in one draw call and then the textured version on the next.


 


A big part of this focus on draw calls isn't so much how one object affects NWN gameplay but how the large number of meshes impacts performance overall.  I'm a big fan of stuffing as much as I can get away with into the game so, NPC-wise, I've looked at things which affect performance, like how much overhead pathing consumes in NPC-dense situations.  In the same way that EffectCutSceneGhost() relieves CPU overhead, the above screenshot indicates that skin-meshed NPCs (especially completely skinmeshed NPCs) would consume dramatically fewer draw calls to put on the screen in comparison to the non-dynamic but still part-based ones we have now.


 


"What would be the trade-off?"  In situations where multiple meshes on the same object used the same texture (and there was no problem with consolidating them), I'm guessing there wouldn't be.  However, in situations where you were pushing to consolidate textures (say, for a tile), then you're going to trade off extra texture memory (worst case: single, unique texture for each tile) for lower draw calls.  There are a lot of variables in that, though, like how big an area is involved, how much stuff is in it (also using up texture memory), and how much texture memory is available on the video card that's displaying it.
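A back-of-the-envelope Python sketch of that worst case follows. It assumes uncompressed 32-bit RGBA textures at 512 x 512 with a full mipmap chain (which adds roughly one third on top of the base level); those sizes and the count of 8 shared textures are assumptions chosen for illustration, not measurements from NWN. Compressed texture formats would shrink both figures, but the ratio between them would stay the same.

```python
MIB = 1024 * 1024


def texture_bytes(width, height, bytes_per_pixel=4, mipmaps=True):
    """Approximate memory for one uncompressed texture."""
    base = width * height * bytes_per_pixel
    # A full mip chain adds roughly one third of the base level.
    return base * 4 // 3 if mipmaps else base


def total_bytes(texture_count, size=512):
    """Memory for a number of equally sized square textures."""
    return texture_count * texture_bytes(size, size)


# Worst case from the post: a unique texture for every tile of a 16 x 16 area.
unique_per_tile = total_bytes(16 * 16)

# Hypothetical alternative: the same area built from 8 shared textures.
shared = total_bytes(8)

print(f"unique texture per tile: {unique_per_tile / MIB:.1f} MiB")
print(f"8 shared textures:       {shared / MIB:.1f} MiB")
```

Under these assumptions the unique-per-tile case needs about 341 MiB against about 11 MiB for the shared set, which gives a feel for how quickly texture memory is spent when draw calls are bought by giving every tile its own texture.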


 


"Can you imagine a software tool (CM4 ?) that fiercely optimised models to minimise draw calls, and if so, what would you want it to do?"  I can, but what I'd really like is for experts like you and others to become familiar with the situation and come up with your own approaches.  I am extremely time-limited in real life right now and I'd really like to see what conclusions people with more experience than I have (especially when it comes to tilesets) come to after looking at the situation themselves.


 


To be honest, I'm mainly trying to get the word out so that people (including me) might implement better practices going forward.  I haven't really spent much time thinking about how to retro-mod the existing content to be more efficient.  The only two exceptions are NPC optimization and VFX optimization.  Some of the Bioware effects create a mass of chunks and other things which I suspect momentarily blow up the draw call count for the frames in which they exist.  Most of those effects are almost unnoticeable (given how briefly they exist) and at some point I might try to take an axe to them.