Author Topic: NWScript profiling observations  (Read 326 times)

Legacy_Zarathustra217

  • Sr. Member
  • ****
  • Posts: 322
  • Karma: +0/-0
NWScript profiling observations
« on: July 23, 2012, 12:25:28 pm »


               Inspired by various other topics (here and here) I started more systematically profiling functions to get a better idea of how they actually performed. I'll share it here in the hope it may help others - and that they perhaps will contribute with similar too.

Testing setup

The testing setup was our main PW module (~900 areas, ~3000 scripts) using NWNx and the profiling plugin. The testing computer was a Intel i7 920 running Windows 7.

Functions were run in their own scripts, generally running only them where possible. The scripts were tested by running the script 1000 times in a loop (using the TMI plugin to allow for that), and this process was repeated 10 times in small intervals. This was done by the following script:

void main()
{
    int nCount=0;
    while(nCount<1000)
    {
        ExecuteScript(scriptname,OBJECT_SELF);
        nCount++;
    }
}
           

Of course, the profiling isn't always accurate, meaning that the results shouldn't be considered objective, but rather indicative. If you have comments on how to improve the testing setup, I would be very interested in hearing your ideas.

Anyway, here goes:

GetModule()

I used this function to have a sort of baseline, assuming this function was very fast. The script looked as follows:

'test_getmodule'

void main()
{
    GetModule();
}

Results:

test_getmodule - 3 msec for 10000 calls =
0.0003 ms/call

Remarks:

Safe to say, this function is fast - to the point that the performance impact can barely be measured.

ExecuteScript()

Tested this functing using two scripts, the first running the second, being the test_getmodule above. The other script looked as follows

'test_exec'

void main()
{
    ExecuteScript("test_exec_2", OBJECT_SELF);
}

Results:

test_exec - 3049 msec for 10000 calls =
0.3 ms/call

Remarks: I was somewhat surprised to see the performance impact here given that searching for a file is usually quite fast. However, the performance impact is likely affected negatively to a significant degree by the (all too) large number of scripts in the module that was used as testing setup. It may be that the performance impact is much less in smaller modules.

GetHasSpellEffect()

Tested this to compare cycling effects and checking for a spell effect with GetEffectSpellType. The scripts were run on a creature that had the spell applied as well as a few others.

'test_spellfx_1'

void main()
{
    if(GetHasSpellEffect(SPELL_AID))
    {

    }
}

'test_spellfx_2'

void main()
{
    effect eTemp=GetFirstEffect(OBJECT_SELF);
    while(GetIsEffectValid(eTemp))
    {
        if(GetEffectSpellId(eTemp)==SPELL_AID)
        {
            return;
        }
        eTemp=GetNextEffect(OBJECT_SELF);
    }
}

Results:

test_spellfx_1 1 msec for 10000 calls =
0.00001 ms/call
test_spellfx_2 139 msec for 10000 calls = 0.00139 ms/call

Remarks:

GetHasSpellEffect is very fast, apparently faster than GetModule(). Consequently, there's no reason to cycle effects even if you are looking for several spell effects at once. You might even consider making custom spell IDs to apply effects you use and test for frequently.

Script Size

Not a function per se, but wanted to see how script size alone affected CPU cycles. I did this by making a script that made a creature run every AI event if a variable was set, but not setting it enabled. The script size ended up being 172kb compiled.

'test_script_size'

#include "rl_inc_ai_events";

void main()
{
    if(GetLocalInt(OBJECT_SELF, "SOMETHINGRANDOM"))
    {
        AIOnAttacked();
        AIOnDamaged();
        AIOnHeartBeat();
        AIOnDeath();
        AIOnRest();
        AIOnSpellCastAt();
        AIOnPerception();
        AIOnConversation();
        AIOnUserDefinedEvent();
        AIOnSpawn();
    }
}
}

(Note: The creature had no variables set on it. The event functions points to modified versions of the default AI events.)

Results:

test_spellfx_1 1857 msec for 10000 calls =
0.1857 ms/call

Remarks:

This is a bit thought provoking given the very large AI scripts and how frequently they are run.

User-defined events

Again, not really a function, but wanted to test this no less. This was done by making a script that signaled a user defined event on a creature that had a custom OnUserDefined script specified. The interest in this function was spawned by noticing that both ExecuteScript and script size had a noticeable impact on performance, so using user defined events could help reduce size by moving functions into other scripts without having to use the ExectueScript command.

(Note: It's unknown whether the profiler will actually capture the true performance impact of this command since SignalEvent() does not wait for the result process to finish before the script finishes. Others with better insight in the NWN core would be better at telling what applies here than me.)

'test_ud_event'

void main()
{
    SignalEvent(OBJECT_SELF, EventUserDefined(5000));
}

'rl_ai_onuser'

void main()
{
    int nEvent=GetUserDefinedEventNumber();
    if(nEvent==5000)
    {
        GetModule();
    }
    return;
}

Results:

test_ud_event 187 msec for 10000 calls =
0.00187 ms/call
rl_ai_onuser 1743 msec for 10000 calls = 0.1743 ms/call

Remarks:

It seems that it's the GetUserDefinedEventNumber that's being the performance hog here, which is a bit surprising. It may be possible to overcome that with setting a variable on the object that pass the event number rather. Further testing would be needed to assess that.

Update: Did the further testing and it turns out that it wasn't GetUserDefinedEventNumber but the very script being run at all. An empty user defined event script yielded more or less same results.
               
               

               


                     Modifié par Zarathustra217, 23 juillet 2012 - 12:12 .