Author Topic: Efficiency thread  (Read 480 times)

Legacy_Shadooow

  • Hero Member
  • *****
  • Posts: 7698
  • Karma: +0/-0
Efficiency thread
« on: April 09, 2014, 09:32:59 am »


               

Starting this thread for advice and questions about scripting efficiency.


 


I assume this might also turn into a question about the method used to generate the results, since in the past several scripters, such as FunkySwerve, The Krit, and LightFoot8, used different methods than the one I use now. Their results were surprisingly different, so I expect those who got different results will argue about the reliability of this method. Well, go on.


 


Previously, to compare efficiency, people used the built-in profiler, the NWNX profiler, or even a special NWNX plugin called nwnx_time. I went in a different direction, using my insertion sort algorithm and the TOO MANY INSTRUCTIONS error as a base. I take my most efficient algorithm, add the code I want to compare, and make two versions. Then I create a list of 550 numbers and try to sort it. If at least one algorithm prints "SUCCESS", I add one more number, until it throws TMI (this gives me the maximum number of elements sorted by one of the algorithms). Then I remove numbers until the second algorithm passes (which gives me the maximum number of elements sorted by the second algorithm). Then I compare the difference.
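The search procedure described above can be sketched outside the game as a toy model in Python. Everything numeric here is an assumption made for illustration: the instruction budget, the fixed overhead, and the per-step costs are invented, and the sort is modeled as a worst-case insertion sort. It only shows why one extra instruction inside the inner loop changes the largest list size that still finishes.

```python
# Toy model of the TMI-based comparison. All numbers are illustrative
# assumptions, not measured NWN values.
INSTRUCTION_BUDGET = 131_072  # assumed per-script instruction cap


def instructions_used(n, per_step_cost, fixed_overhead=100):
    """Worst-case insertion sort performs n*(n-1)/2 inner-loop steps."""
    inner_steps = n * (n - 1) // 2
    return fixed_overhead + per_step_cost * inner_steps


def max_sortable(per_step_cost, start=550):
    """Find the largest list size that still fits into the budget."""
    n = start
    # Shrink while the current size overflows the budget ("throws TMI").
    while n > 0 and instructions_used(n, per_step_cost) > INSTRUCTION_BUDGET:
        n -= 1
    # Grow while the next size still fits ("prints SUCCESS").
    while instructions_used(n + 1, per_step_cost) <= INSTRUCTION_BUDGET:
        n += 1
    return n


variant_a = max_sortable(per_step_cost=20)  # hypothetical cheaper inner loop
variant_b = max_sortable(per_step_cost=21)  # one extra instruction per step
print(variant_a, variant_b, variant_a - variant_b)
```

The difference printed at the end plays the role of the "556 vs 556" or "531 vs 522" comparisons in this post: it reflects instruction counts only, not how long each instruction takes.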


 


As a first try, I tried to confirm the previous knowledge about OBJECT_SELF, as there were two different results in the past: The Krit believed it is better to assign OBJECT_SELF to a variable, while Lightfoot8 says that is not true. So I made my own test based on my algorithm, once with OBJECT_SELF assigned to a variable and once without - see image.


 


efficiency_objself.jpg


The result is that both algorithms sorted exactly the same number of elements, 556. This means that The Krit is wrong. Given how many iterations the script performs, even a minor cost from using the OBJECT_SELF macro would be very noticeable. But it is not, so the only explanation is that the OBJECT_SELF macro is very efficient and there is no need to assign it to a variable (which, if done inside a loop, would surely be somewhat less efficient).


 


This of course doesn't mean that you have to go back and rewrite all scripts where you assigned OBJECT_SELF to a variable; in 99% of cases the difference is zero. This is a theoretical issue rather than a practical one.


 


 


Another test I made was to confirm FunkySwerve's discovery that GetWaypointByTag is less efficient than GetObjectByTag. I believed this until now, but my test disproved it! See image:


efficiency_gobtgwbt.jpg


The result is that sort25 (the one with GetWaypointByTag) was able to sort 531 elements before TMI, while sort24 (GOBT) managed only 522. A huge difference, I say. However, I admit that testing is somewhat difficult in this case, as the result might differ with more areas or more objects in the module. I've tested it twice (with the same result) in these environments:

1: Clean module - single area, around 40 placeables, 1 PC, 2 waypoints.

2: Clean module - single area, around 40 placeables, 1 PC, 40 waypoints.


 


I guess it would be better to also test this with the waypoint test_wp in the hundredth area rather than in the first one. So I draw no conclusion yet and am waiting for confirmation/explanation of these results.



               
               

               
            

Legacy_leo_x

« Reply #1 on: April 09, 2014, 11:39:06 am »


               

I'd hypothesize from this that the performance difference between the two, in terms of NWScript instruction count, is because GetObjectByTag has a second default parameter that is being pushed on to the stack even when not supplied by the user.
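A rough way to picture that hypothesis is the following Python sketch. It assumes a simplified cost model in which every argument, whether written by the scripter or filled in as a default, costs one push instruction, and the call itself costs one more. The model and the function name `call_cost` are illustrative assumptions, not something taken from the compiler.

```python
def call_cost(explicit_args, default_args=0):
    """Instructions for one built-in call under this simplified model:
    one push per argument (supplied or defaulted) plus the call itself."""
    return explicit_args + default_args + 1


# GetWaypointByTag(sTag): one argument, nothing defaulted.
gwbt = call_cost(explicit_args=1)
# GetObjectByTag(sTag): one argument written, one defaulted parameter pushed.
gobt = call_cost(explicit_args=1, default_args=1)

extra_per_call = gobt - gwbt
print(gwbt, gobt, extra_per_call)
```

Under this model the extra push is paid on every call, so a loop that calls the function many times reaches the instruction limit somewhat earlier, without saying anything about how fast either function runs inside the engine.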


 


It doesn't really say anything about the difference between GetObjectByTag and GetWaypointByTag as implemented in the game engine.  Logically, the latter should be faster because it vastly cuts down the search space, but it looks like NWN keeps an array of CNWSTagNode structs that include a tag and an object id.  So GetObjectByTag can just iterate through, never touching the GameObject array.  In the case of GetWaypointByTag it has to do all the work GetObjectByTag does and has to look up the GameObject to ensure that it's really a waypoint.  It might not be a huge overhead, but if you got a cache miss or something, that would hurt.


 


Edit: I've thought more about this and since you're going to use that waypoint object anyway and thus you'll have to pay the cost of looking up its GameObject regardless, I really doubt there is any real difference between the two.  Maybe I'm missing something tho. 



               
               

               


                     Edited by leo_x, 09 April 2014 - 11:06.
                     
                  


            

Legacy_Shadooow

« Reply #2 on: April 09, 2014, 12:18:43 pm »


               

Hmm, you are right. The reason GWBT threw TMI later is that second parameter for GOBT.


 


There is also the question of how relevant TMI is as a measure of efficiency. From my earlier tests it seems there is a correlation between the number of instructions and instruction speed (that is, in a few cases a script still threw TMI sooner even though its instruction count was definitely lower). But I can't be sure how strong this correlation was.


 


The question therefore is: when should a scripter be concerned about built-in function speed at all? It is always a matter of milliseconds, which is not noticeable unless the script hits TMI. I really wonder, because I would like to think I'm an experienced scripter, and yet I was never concerned about the speed of particular functions like GOBT/GWBT.


 


Or what is efficiency - the speed of script execution, or how many instructions a script can perform?



               
               

               
            

Legacy_henesua

« Reply #3 on: April 09, 2014, 04:25:12 pm »


               Some questions


Do all instructions have the same execution duration?

If not, how much of a difference is required to be significant?


In terms of machine code instructions (which I assume we are discussing), there are probably significant differences in number depending on which NWScript function is used (as your results suggest). Has anyone studied this before and shared results?


............

I know that your post implies these questions, ShaDoOoW. I wanted to clarify and amplify them because I am curious.
               
               

               
            

Legacy_FunkySwerve

« Reply #4 on: April 09, 2014, 11:08:14 pm »


               


Some questions


Do all instructions have the same execution duration?




Not remotely. That's the problem with this way of testing 'efficiency'. All you're testing is instruction counts.


 


This should be fairly simple to demonstrate, though it's pretty obvious on the face of it if you've done this testing before. It's incredibly easy to blow past a TMI limit with a loop that you won't even feel in the module - something like this:


 


int nX;
for (nX = 0; nX < 2000000000; nX++) {SetLocalInt(GetModule(), "Test", nX);}


 


Now compare to a script that does object creation but does not TMI. You can get a huge processing hiccup with no TMI. Of course, that may not be the greatest example, since it conflates other performance issues like object insertion with varying cpu costs of instructions, but it does handily point out that TMI simply does not measure performance. It's only a safeguard that bioware put in to prevent runaway scripts.
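To make the distinction concrete, here is a small Python sketch with invented numbers. The per-call instruction counts and the per-call times are purely hypothetical placeholders, chosen only to show that a script can use many instructions while costing little time, and the other way around.

```python
# Hypothetical wall-clock cost per call in microseconds (invented values).
COST_US = {"SetLocalInt": 2, "CreateObject": 5000}
# Hypothetical VM instructions spent per call (invented values).
INSTRUCTIONS = {"SetLocalInt": 6, "CreateObject": 8}


def profile(function_name, calls):
    """Return (total instructions, total microseconds) for a loop of calls."""
    return calls * INSTRUCTIONS[function_name], calls * COST_US[function_name]


cheap = profile("SetLocalInt", 20000)   # many instructions, little time
heavy = profile("CreateObject", 200)    # few instructions, a lot of time
print("SetLocalInt loop :", cheap)
print("CreateObject loop:", heavy)
```

With these made-up figures the first loop is far closer to an instruction limit, while the second one is the loop a player would actually feel.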


 


Funky



               
               

               
            

Legacy_FunkySwerve

« Reply #5 on: April 09, 2014, 11:12:20 pm »


               


As a first try, I tried to confirm the previous knowledge about OBJECT_SELF, as there were two different results in the past: The Krit believed it is better to assign OBJECT_SELF to a variable, while Lightfoot8 says that is not true. So I made my own test based on my algorithm, once with OBJECT_SELF assigned to a variable and once without - see image.


 




I can confirm the results of your tests, if not the manner of testing. Acaos had heard the same thing, though I don't know if from the same source. He tested and disconfirmed. OBJECT_SELF vs assignment makes no difference in terms of efficiency.


 


Funky


               
               

               
            

Legacy_Lightfoot8

« Reply #6 on: April 14, 2014, 07:18:19 am »


               

Some questions

Do all instructions have the same execution duration?
If not, how much of a difference is required to be significant?

In terms of machine code instructions (which I assume we are discussing), there are probably significant differences in number depending on which NWScript function is used (as your results suggest). Has anyone studied this before and shared results?

............
I know that your post implies these questions, ShaDoOoW. I wanted to clarify and amplify them because I am curious.


No, not all instructions have the same duration. I feel that Funky did a good job of showing that with the example of SetLocalInt vs CreateObject.

How much of a difference is required to be significant? LoL, I think we have been fighting to find an answer to that question since the first day I posted on the forums here. I doubt that we will come up with a concise answer in this thread over and above what we already have from the others.

I also agree that this thread is more about instruction count than it is about efficiency.

So when we talk about instruction count in NWN, what are we talking about?

We are of course talking about VM instructions, or Virtual Machine instructions. The Virtual Machine is the part of the NWN engine that executes compiled NWN scripts.

Every NWN scripting statement can take anywhere from 0 to a couple hundred VM instructions to execute. You may already be questioning how useful a statement that uses 0 VM instructions can be. Well, outside of compiler directives, which are not really statements, I can only think of one type of statement that has 0 VM instructions. That type is constant declarations,
example: const int nStable = 1;

But since this type of statement does not generate any code and cannot be placed in a code block, you can view it as a compiler directive if you like. After all, it acts more like a directive than a statement anyway.

Getting back on track.

Let's look at a script to see why there was no difference between Shadooow's results using OBJECT_SELF vs a var with OBJECT_SELF assigned to it.

Here is a simple script that does, well, nothing. It simply has everything stripped away except what we want to look at.

void main()
{
  object oSelf = OBJECT_SELF;
  oSelf;
  OBJECT_SELF;
}
 
And here is the compiled code generated from the above script, color coded to match the statement that generated it (decompiled with NWN Explorer and trimmed down to just what is needed).
 
 08  T 0000004D
 0D  JSR fn_00000015
 13  RETN
 15  RSADDO
 17  CONSTO 00000000
 1D  CPDOWNSP FFFFFFF8, 0004
 25  MOVSP FFFFFFFC
 2B  CPTOPSP FFFFFFFC, 0004
 33  MOVSP FFFFFFFC

 39  CONSTO 00000000
 3F  MOVSP FFFFFFFC

 45  MOVSP FFFFFFFC
 4B  RETN
 
 
The number at the beginning of each line is simply the offset into the file in hex. The first 8 bytes, which are not shown by NWN Explorer, describe the file: they are "NCS V1.0", the basic header saying that this is an .ncs version 1.0 file. That explains why NWN Explorer shows all .ncs files starting at offset 0x08 instead of 0x00.
 
Line 0x08, just like the first 8 bytes, is also not code or a VM instruction. It simply states the size of the .ncs file; in this case the file is 0x4D bytes long. That includes the first 8 bytes not shown at the top, all the way through the return that starts at 0x4B. The return instruction is two bytes long, making the last byte in the file 0x4C and giving us a file that is 0x4D bytes long (0x00 - 0x4C).
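The header layout and the size arithmetic can be checked with a few lines of Python. This is only a sketch under stated assumptions: the 8-byte signature is spelled `NCS V1.0` here, the size record is taken to be one marker byte (0x42, as shown in the longer listing later in this thread) followed by a big-endian 32-bit size, and the helper name `read_ncs_size` is made up for the example.

```python
import struct

# Assumed layout: 8-byte signature, then a 5-byte size record
# (marker byte 0x42 followed by a big-endian 32-bit file size).
header = b"NCS V1.0" + bytes([0x42]) + struct.pack(">I", 0x4D)


def read_ncs_size(data):
    """Split the first 13 bytes into signature, marker byte and file size."""
    signature = data[:8]
    marker, size = struct.unpack(">BI", data[8:13])
    return signature, marker, size


signature, marker, size = read_ncs_size(header)
first_instruction = 8 + 5      # signature plus size record
last_byte = 0x4B + 2 - 1       # the final RETN starts at 0x4B and is 2 bytes long
print(signature, hex(marker), hex(size), hex(first_instruction), hex(last_byte))
```

The first real instruction lands at 0x0D and the last byte at 0x4C, which matches a stated file size of 0x4D.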
 
That brings us to the first VM instruction in the file, at offset 0x0D.
 
0D  JSR fn_00000015
 
When this first VM instruction is fed into the VM, several things happen that make it necessary to quickly explain a couple of parts of the VM.  The first is the Instruction Pointer (IP).  The IP points to the instruction to be executed by the VM.  It would be pointing at 0x0D when it executed this instruction.  Any time an instruction is executed, the IP is increased by the length of the instruction being executed, so that the IP then points to the next instruction.  So our IP is 0x0D before this instruction is executed.  As soon as the JSR starts to execute, the IP is increased to 0x13 to point at the RETN instruction.
 
Boy, going to be a quick script...  Well, that brings me to the next part that I need to explain.  The NWN VM has an array that is allocated as a return stack.  It holds the places for the IP to be restored to any time a return is executed.
 
OK, back to explaining this first instruction.  JSR is the mnemonic (shorthand) for Jump to SubRoutine.
00000015 is the operand (parameter, if you want to look at it as a function) that the JSR code is going to operate on.  So we have a VM instruction that tells the VM to jump to the instruction at offset 0x15.  This instruction adds the current value of the IP to the return stack. The current value is 0x13, pointing at the RETN instruction, since this instruction has already started executing.  It also moves 0x15 into the IP so that the instruction at 0x15 will be the next instruction executed.  And this, by the way, is the jump into the "main" function.  Just about every NWN script will start this way.
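The interplay of the IP and the return stack can be modeled in a few lines of Python. The offsets come from the listing above; the instruction lengths (6 bytes for JSR, 2 for RETN) are inferred from the gaps between offsets, and the helper names are invented for this sketch.

```python
# address -> (mnemonic, jump target, length in bytes)
program = {
    0x0D: ("JSR", 0x15, 6),
    0x13: ("RETN", None, 2),
    0x4B: ("RETN", None, 2),   # the RETN that closes main()
}


def run_jsr(ip, return_stack):
    mnemonic, target, length = program[ip]
    assert mnemonic == "JSR"
    ip += length               # the IP already points at the next instruction
    return_stack.append(ip)    # remember where to continue after the call
    return target              # execution continues inside the subroutine


def run_retn(return_stack):
    return return_stack.pop()  # restore the IP saved by the matching JSR


return_stack = []
ip_in_main = run_jsr(0x0D, return_stack)
saved = list(return_stack)
ip_after_return = run_retn(return_stack)   # what the RETN at 0x4B does
print(hex(ip_in_main), [hex(a) for a in saved], hex(ip_after_return))
```

After the JSR the IP is 0x15 and the return stack holds 0x13, so the RETN at the end of main sends execution back to the RETN at 0x13.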
 
Skipping the RETN at offset 0x13 for now, since it is not being executed yet anyway.  It is simpler to just follow the course of execution.  So our next instruction being executed is at 0x15.
 
      
15  RSADDO
 
By the color you can tell that this is part of the VM instructions for the object oSelf = OBJECT_SELF; statement.  Well, we have hit the time when I have to explain another part of the VM in order to have this statement make sense.
 
VM Stack:  A stack is basically a scratch pad for a program to store data in.  It is called a stack because of the way the data is stored.  When something is added to a stack, it is viewed as throwing the data on top of the stack.  When data is removed from the stack, it is viewed as being pulled or popped from the top of the stack.  However, since the NWN VM lacks the traditional Push and Pop commands, that may be a bit confusing.  Let's just say that the stack consists of two things:
1) a data array of dwords that the code uses to store data.
2) a Stack Pointer (SP) that points to the top of the stack.
 
This brings us back to RSADDO (Reserve Object Space on Stack). It is the code issued for the "object" part of our statement.  It increases the stack pointer by 4 and types this dword in the stack as an object data type.  It also gives the object the initial value of OBJECT_INVALID.  Any time the label oSelf is used in our script, it is this location, just now added to the stack, that will be accessed.  The next instruction is:
 
17  CONSTO 00000000
 
This one is Place Constant Object ID Onto the Stack.  This is the code issued by "OBJECT_SELF".  It increases the stack pointer by 4 and puts the ID for OBJECT_SELF into the new top dword on the stack.  At this point we have two dwords on the stack.  The stack pointer is pointing at the value on top, which is the ID of the object this script is running on.  Under that is the reserved dword for the oSelf object.  .... next instruction.
 
 
1D  CPDOWNSP FFFFFFF8, 0004
 
This instruction is Copy Down Stack Pointer.  It takes two operands. The first one, FFFFFFF8 (or -8), is the location to copy the top of the stack to.  The second one is how many bytes to copy off the top of the stack.  So we are copying the top 4 bytes on the stack (the top dword) into the location 8 bytes down in the stack.  This is the dword just under the dword on top of the stack, i.e. the dword that has been reserved for our oSelf label.  So the statement reads: copy the OBJECT_SELF constant just pushed onto the stack into the dword reserved as the oSelf label.  This, by the way, is the instruction that was issued by the "=" assignment operator.  ....
 
 
25  MOVSP FFFFFFFC
 
OK, let's look at our stack before I explain what this instruction does.  So far our stack has 2 dwords on it that have been added by the code.  The bottom one is the dword that was reserved for the oSelf var, currently holding the ID of the object that this script is running on.  The top one also holds the same object ID from when the OBJECT_SELF constant was pushed onto the stack.  Well, guess what: the top dword was just scratch paper.  We now hit the ; end of the statement.  Clean-up time.  Our VM instruction is Move SP FFFFFFFC, or Move SP -4.  This decreases the SP by 4, effectively removing the top dword from the stack.  This leaves us with only one dword that this code has placed on the stack, that of our oSelf object.
 
oSelf is still on the stack because its scope or name space has not yet run out.  As soon as it does, it will also be removed.

 
 2B  CPTOPSP FFFFFFFC, 0004
 
Here we have the code that is generated by the oSelf part of the next statement.  It is Copy To Top of Stack.  It has two operands, -4 and 4.  So we start at byte -4 from the top of the stack (that would be the oSelf reserved dword) and copy 4 bytes to the top of the stack.  This instruction also increases the stack pointer by the number of bytes copied to the stack.  So we once again have two dwords on the stack that we have added, and oSelf has been placed on the top of the stack for use.  Well, guess what: we don't use it for anything!!!

 33  MOVSP FFFFFFFC
 
; end of statement: cleanup!  Decrease SP by 4, removing that copy of oSelf we copied to the top of the stack for use.

 
 
 39  CONSTO 00000000
 3F  MOVSP FFFFFFFC
 
Here is the OBJECT_SELF; statement.  Pretty much the same as above.  Just instead of moving (copying) oSelf from a stack location to the top of the stack, we are pushing a constant onto the stack, and again doing nothing with it and removing it.

 45  MOVSP FFFFFFFC

Well, here we are at last at the closing brace '}' of our main function. Sadly, this is the end of the scope for our oSelf var; it just has no meaning after this. Yep, that is right: decrease the SP by another 4, removing oSelf from the stack.
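The stack walkthrough so far can be replayed with a small Python model. It is a sketch, not the real VM: each list entry stands for one dword, the object ID is a placeholder string, and the helper names simply mirror the mnemonics from the listing.

```python
OBJECT_INVALID = "OBJECT_INVALID"
SELF_ID = "id of OBJECT_SELF"   # placeholder for the caller's object ID

stack = []                      # one list entry models one dword (4 bytes)


def rsaddo():
    stack.append(OBJECT_INVALID)          # reserve a slot for an object variable


def consto_self():
    stack.append(SELF_ID)                 # push the OBJECT_SELF constant


def cpdownsp(offset, size):
    assert size == 4                      # this sketch only copies single dwords
    stack[len(stack) + offset // 4] = stack[-1]


def cptopsp(offset, size):
    assert size == 4
    stack.append(stack[len(stack) + offset // 4])


def movsp(offset):
    del stack[len(stack) + offset // 4:]  # a negative offset shrinks the stack


PROGRAM = [
    ("object oSelf = OBJECT_SELF;", [(rsaddo, ()), (consto_self, ()),
                                     (cpdownsp, (-8, 4)), (movsp, (-4,))]),
    ("oSelf;",                      [(cptopsp, (-4, 4)), (movsp, (-4,))]),
    ("OBJECT_SELF;",                [(consto_self, ()), (movsp, (-4,))]),
    ("}",                           [(movsp, (-4,))]),
]

trace = []
for statement, instructions in PROGRAM:
    for op, args in instructions:
        op(*args)
        trace.append((statement, op.__name__, list(stack)))
    print(f"{statement:30} {len(instructions)} instruction(s), depth {len(stack)}")
```

Grouping the instructions by statement shows that each bare use costs two instructions, while the declaration plus the final clean-up at the closing brace account for five.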

 
 
 4B  RETN
 That closing brace also gives an implied return statement.  This instruction simply loads the IP with the value last placed onto the return stack.  Of course, it also removes that value from the stack.  If you remember from above, that value was 0x13.  Therefore the next instruction executed is:
 
13  RETN
 
Popping yet another return value off the return stack and letting execution return to whatever code started running this script to begin with.
 
Well, it is getting late and I need to wrap this up.  Just to summarize: both OBJECT_SELF and an object var use two VM instructions every time they are used.  Assigning OBJECT_SELF to a var has an extra setup overhead of 5 VM instructions for assignment and destruction.
 
Really not much of an overhead, but you never know when those 5 instructions may make a difference.  lol
 
Hope I did not ramble too much.
Heading off for some rest now. 
L8
               
               

               
            

Legacy_Shadooow

« Reply #7 on: April 20, 2014, 05:51:41 pm »


               


Not remotely. That's the problem with this way of testing 'efficiency'. All you're testing is instruction counts.


 


This should be fairly simple to demonstrate, though it's pretty obvious on the face of it if you've done this testing before. It's incredibly easy to blow past a TMI limit with a loop that you won't even feel in the module - something like this:


 


int nX;
for (nX = 0; nX < 2000000000; nX++) {SetLocalInt(GetModule(), "Test", nX);}


 


Now compare to a script that does object creation but does not TMI. You can get a huge processing hiccup with no TMI. Of course, that may not be the greatest example, since it conflates other performance issues like object insertion with varying cpu costs of instructions, but it does handily point out that TMI simply does not measure performance. It's only a safeguard that bioware put in to prevent runaway scripts.


 


Funky




You are right.


 


There are two efficiency fields.


 


1) Number of instructions.


2) Speed of the functions/script.


 


My code only proves the first, and using TMI for speed comparison is not possible. Though in some cases, like OBJECT_SELF, the result says something about speed too: if it were a function, it would take many more instructions and cause TMI sooner, but it didn't. Lightfoot8's analysis proves it perfectly.


 


Still, TMI is a good method for comparing scripting techniques. Scripters often write "it can be simplified to...", but often they are wrong and their simplification takes more instructions than the original. The number of instructions is not a negligible factor.


 


The clearest example of this is compound conditions using 'logical or' or 'logical and'. It may be nice that they fit on one line, but they take more instructions to execute.



               
               

               
            

Legacy_Lightfoot8

« Reply #8 on: April 20, 2014, 06:39:51 pm »


               

The clearest example of this is compound conditions using 'logical or' or 'logical and'. It may be nice that they fit on one line, but they take more instructions to execute.




If I understand you correctly, you are saying that :


void main()
{
  int a,b,c;
  if (a == b )
  {
    if (a==c)
    {
      "do this";
    }
  }
}


would compile with fewer instructions than:


void main()
{
  int a,b,c;
  if (a==b && a==c)
  {
    "do this";
  }
}


I myself see them as the same thing, even in the number of instructions executed.

I will explain that a little better now and in the process prove myself either right or wrong. It would not be the first time the VM proved me wrong if it does.

I'll be back with comparisons of the two compiled codes in a bit.


edit: if you place ) right after b you get 'B)' oop.



               
               

               
            

Legacy_Shadooow

« Reply #9 on: April 20, 2014, 06:49:26 pm »


               

Yes, that's it. It is based on the TMI results with my sort algorithm. The difference in the number of iterations in my loop when I used logical OR/AND was overwhelming.


 


also code like


 


int something(int a, int b)
{
 if(a > 0) return TRUE;
 return b > 0;
}


 


should be better than


 


int something(int a, int b)
{
 return a > 0 || b > 0;
}



               
               

               
            

Legacy_MrZork

« Reply #10 on: April 20, 2014, 07:08:15 pm »


               

They are different when testing using the same value sets for a and b? Not a test to do with random a and b in each run unless the test is run many thousands of times, otherwise getting lucky can have an impact. Interesting.



               
               

               
            

Legacy_Shadooow

« Reply #11 on: April 20, 2014, 07:12:55 pm »


               


They are different when testing using the same value sets for a and b? Not a test to do with random a and b in each run unless the test is run many thousands of times, otherwise getting lucky can have an impact. Interesting.




nvm


 


Well, let's see what Lightfoot can dig out.



               
               

               


                     Edited by Shadooow, 20 April 2014 - 06:13.
                     
                  


            

Legacy_Lightfoot8

« Reply #12 on: April 20, 2014, 08:40:41 pm »


               

First, let's look at the two if statements:
void main()
{
  int a,b,c;
  if (a == b )
  {
    if (a==c)
    {
      "do this";
   }
  }
}

 

it compiles to:


   Spoiler
   



 

Here is the code with comments.



   Spoiler
   


 


Analysis:


4 instructions executed before the first 'If' statement is executed.
JSR and adding the vars.

3 instructions after the 'If' statements.
Removing the vars and return, return.

12 instructions
if a==b and a==c are both TRUE.

9 instructions
if only a==b is TRUE

5 instructions
if only a==c is TRUE or both are FALSE.


 


 


Here is the first half.   be back with the second half.   


 



               
               

               
            

Legacy_Lightfoot8

« Reply #13 on: April 20, 2014, 09:47:08 pm »


               Hmm,  This is an eye opener.   NWScript compiler lets me down again.  
 
Second script.
 
void main()
{
  int a,b,c;
  if (a==b && a==c)
  {
       "do this";
  }
 
}
 
compiles to:

   Spoiler
   



with comments:

   Spoiler
   



Analysis:
4 instructions executed before the first 'If' statement is executed.
JSR and adding the vars.

3 instructions after the 'If' statements.
removing the vars and return, return.

12 instructions // two of them are the code body so in effect 10 instructions
if a==b and a==c are both TRUE.

10 instructions
if only a==b is TRUE

6 instructions
if only a==c is TRUE or both are FALSE.


Summary:
The double if statement is better by only one instruction in both cases where the result ends up being FALSE. They are equal if the result ends up being TRUE.

One thing I will point out here: this result came with only two comparisons separated by an &&. There should be a gain in instruction savings the longer the comparison chain gets, mainly due to the fact that there will only be one if wrapper around the TRUE clause, and therefore only one jump over the existent/nonexistent FALSE clause.

Either way, it is not the result I expected. For now I would have to look at instruction count on a case-by-case basis with regard to comparison chains.
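The counts from the two analyses can be put side by side with a short Python sketch. The instruction counts are copied from the posts above; the probabilities passed to `expected_cost` are hypothetical and only illustrate how the average cost depends on how often each comparison succeeds.

```python
# Instructions spent in the conditional part, as reported in the analyses.
COUNTS = {
    "both TRUE":      {"nested if": 12, "&&": 12},
    "only a==b TRUE": {"nested if": 9,  "&&": 10},
    "a==b FALSE":     {"nested if": 5,  "&&": 6},
}


def expected_cost(form, p_first, p_second):
    """Average instruction count, given the chance that each comparison is TRUE."""
    weights = {
        "both TRUE":      p_first * p_second,
        "only a==b TRUE": p_first * (1 - p_second),
        "a==b FALSE":     1 - p_first,
    }
    return sum(weights[case] * COUNTS[case][form] for case in COUNTS)


for case, counts in COUNTS.items():
    print(f"{case:15} nested={counts['nested if']:2} and={counts['&&']:2}")

nested_avg = expected_cost("nested if", 0.5, 0.5)   # hypothetical 50/50 odds
and_avg = expected_cost("&&", 0.5, 0.5)
print(nested_avg, and_avg, and_avg - nested_avg)
```

Because the two forms differ by one instruction only when the overall result is FALSE, the average gap can never exceed one instruction per evaluation for a two-comparison chain.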
               
               

               
            

Legacy_Lightfoot8

« Reply #14 on: April 21, 2014, 02:11:17 am »


               

For Shadooow, let's take this script that uses custom functions.    This will also let us show how custom functions are placed into the script.

 



int Something1(int a, int b );
 
int Something2(int a, int b )
{
   return a > 0 || b > 0;
}
 
void main()
{
  Something1(5,6);
  Something2(6,5);
}
 
int Something1(int a,int b )
{
   if(a > 0) return TRUE;
   return b > 0;
}


Now, I placed one function above the main body and the other after main with a header (prototype) above, just to show the difference.  Also keep in mind that the functions here are custom functions.   Internal functions are not handled the same way.

 

Here is the compiled code.

 



00000008 42 00000119              T 00000119
0000000D 1E 00 00000008           JSR fn_00000015
00000013 20 00                    RETN
00000015 02 03                    RSADDI
00000017 04 03 00000006           CONSTI 00000006
0000001D 04 03 00000005           CONSTI 00000005
00000023 1E 00 00000028           JSR fn_0000004B
00000029 1B 00 FFFFFFFC           MOVSP FFFFFFFC
0000002F 02 03                    RSADDI
00000031 04 03 00000005           CONSTI 00000005
00000037 04 03 00000006           CONSTI 00000006
0000003D 1E 00 0000007C           JSR fn_000000B9
00000043 1B 00 FFFFFFFC           MOVSP FFFFFFFC
00000049 20 00                    RETN
0000004B 03 01 FFFFFFFC 0004      CPTOPSP FFFFFFFC, 0004
00000053 04 03 00000000           CONSTI 00000000
00000059 0E 20                    GTII
0000005B 1F 00 0000002C           JZ off_00000087
00000061 04 03 00000001           CONSTI 00000001
00000067 01 01 FFFFFFF0 0004      CPDOWNSP FFFFFFF0, 0004
0000006F 1B 00 FFFFFFFC           MOVSP FFFFFFFC
00000075 1D 00 0000003C           JMP off_000000B1
0000007B 1B 00 FFFFFFFC           MOVSP FFFFFFFC
00000081 1D 00 00000006           JMP off_00000087
00000087 03 01 FFFFFFF8 0004      CPTOPSP FFFFFFF8, 0004
0000008F 04 03 00000000           CONSTI 00000000
00000095 0E 20                    GTII
00000097 01 01 FFFFFFF0 0004      CPDOWNSP FFFFFFF0, 0004
0000009F 1B 00 FFFFFFFC           MOVSP FFFFFFFC
000000A5 1D 00 0000000C           JMP off_000000B1
000000AB 1B 00 FFFFFFFC           MOVSP FFFFFFFC
000000B1 1B 00 FFFFFFF8           MOVSP FFFFFFF8
000000B7 20 00                    RETN
000000B9 03 01 FFFFFFFC 0004      CPTOPSP FFFFFFFC, 0004
000000C1 04 03 00000000           CONSTI 00000000
000000C7 0E 20                    GTII
000000C9 03 01 FFFFFFFC 0004      CPTOPSP FFFFFFFC, 0004
000000D1 1F 00 00000014           JZ off_000000E5
000000D7 03 01 FFFFFFFC 0004      CPTOPSP FFFFFFFC, 0004
000000DF 1D 00 00000016           JMP off_000000F5
000000E5 03 01 FFFFFFF4 0004      CPTOPSP FFFFFFF4, 0004
000000ED 04 03 00000000           CONSTI 00000000
000000F3 0E 20                    GTII
000000F5 07 20                    LOGORII
000000F7 01 01 FFFFFFF0 0004      CPDOWNSP FFFFFFF0, 0004
000000FF 1B 00 FFFFFFFC           MOVSP FFFFFFFC
00000105 1D 00 0000000C           JMP off_00000111
0000010B 1B 00 FFFFFFFC           MOVSP FFFFFFFC
00000111 1B 00 FFFFFFF8           MOVSP FFFFFFF8
00000117 20 00                    RETN
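As a cross-check of the listing, the two function bodies can be traced with a tiny interpreter written in Python. This is a rough sketch under assumptions drawn from this thread: jump operands are treated as offsets relative to the instruction's own address (0x5B + 0x2C = 0x87, for example), JZ consumes the value it tests, and GTII and LOGORII replace their two operands with one result. The caller's setup (RSADDI, the two CONSTI pushes, JSR, and the final MOVSP) is left out because it is identical for both functions, and the unreachable instructions at 0x7B, 0x81, 0xAB and 0x10B are omitted.

```python
# address -> (mnemonic, operand); jump operands are raw relative offsets.
CODE = {
    0x4B: ("CPTOPSP", -4), 0x53: ("CONSTI", 0), 0x59: ("GTII",),
    0x5B: ("JZ", 0x2C), 0x61: ("CONSTI", 1), 0x67: ("CPDOWNSP", -16),
    0x6F: ("MOVSP", -4), 0x75: ("JMP", 0x3C),
    0x87: ("CPTOPSP", -8), 0x8F: ("CONSTI", 0), 0x95: ("GTII",),
    0x97: ("CPDOWNSP", -16), 0x9F: ("MOVSP", -4), 0xA5: ("JMP", 0x0C),
    0xB1: ("MOVSP", -8), 0xB7: ("RETN",),
    0xB9: ("CPTOPSP", -4), 0xC1: ("CONSTI", 0), 0xC7: ("GTII",),
    0xC9: ("CPTOPSP", -4), 0xD1: ("JZ", 0x14), 0xD7: ("CPTOPSP", -4),
    0xDF: ("JMP", 0x16), 0xE5: ("CPTOPSP", -12), 0xED: ("CONSTI", 0),
    0xF3: ("GTII",), 0xF5: ("LOGORII",), 0xF7: ("CPDOWNSP", -16),
    0xFF: ("MOVSP", -4), 0x105: ("JMP", 0x0C), 0x111: ("MOVSP", -8),
    0x117: ("RETN",),
}
# Instruction lengths in bytes, read off the gaps between addresses.
LENGTH = {"CPTOPSP": 8, "CPDOWNSP": 8, "CONSTI": 6, "MOVSP": 6,
          "JZ": 6, "JMP": 6, "GTII": 2, "LOGORII": 2, "RETN": 2}


def run(entry, a, b):
    """Run one function body; return (result, instructions executed)."""
    stack = [0, b, a]          # return slot, then arg 2, then arg 1 on top
    ip, count = entry, 0
    while True:
        op, *args = CODE[ip]
        count += 1
        nxt = ip + LENGTH[op]
        if op == "RETN":
            return stack[0], count
        if op == "CONSTI":
            stack.append(args[0])
        elif op == "CPTOPSP":
            stack.append(stack[len(stack) + args[0] // 4])
        elif op == "CPDOWNSP":
            stack[len(stack) + args[0] // 4] = stack[-1]
        elif op == "MOVSP":
            del stack[len(stack) + args[0] // 4:]
        elif op == "GTII":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(int(lhs > rhs))
        elif op == "LOGORII":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(int(bool(lhs) or bool(rhs)))
        elif op == "JMP":
            nxt = ip + args[0]
        elif op == "JZ" and stack.pop() == 0:
            nxt = ip + args[0]
        ip = nxt


print("Something1(5,6):", run(0x4B, 5, 6))
print("Something2(6,5):", run(0xB9, 6, 5))
print("Something1(0,5):", run(0x4B, 0, 5))
print("Something2(0,5):", run(0xB9, 0, 5))
```

If the assumed semantics are right, the early-return version at 0x4B should execute fewer instructions than the || version at 0xB9 for the same inputs, which lines up with the TMI observations reported earlier in the thread.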

 

And again with comments.   

 



// Standard entry.   Script is 0x119 long and jump to main function at 0x15.
00000008 42 00000119              T 00000119
0000000D 1E 00 00000008           JSR fn_00000015

// Return to whatever called us.
00000013 20 00                    RETN



//Here is the main function. See Note 1.  
  //Something1(5,6); See Note 2.  
    //Reserve space on the stack for the return value.
    00000015 02 03                    RSADDI
    // Push Arg 2.  the Constant integer 6 is put on top of the stack.
    00000017 04 03 00000006           CONSTI 00000006
    // Push Arg 1.  The Constant integer 5 is put on top of the stack.
    0000001D 04 03 00000005           CONSTI 00000005
    // Jump to subroutine at 0x4B.  This is where Something1 was encoded.
    00000023 1E 00 00000028           JSR fn_0000004B
    // remove the return value from the stack. see Note 3.
    00000029 1B 00 FFFFFFFC           MOVSP FFFFFFFC




  //Something2(6,5);  See note 2
    // Push/reserve storage for an integer return value.
    0000002F 02 03                    RSADDI
    //push Arg 2.  a constant integer of 5 onto the top of the stack.
    00000031 04 03 00000005           CONSTI 00000005
    //push Arg 1. a constant integer of 6 onto the top of the stack.
    00000037 04 03 00000006           CONSTI 00000006
    // Jump to the subroutine at 0xB9. Where Something2 was encoded.
    0000003D 1E 00 0000007C           JSR fn_000000B9
    // Remove the return value from the stack.
    00000043 1B 00 FFFFFFFC           MOVSP FFFFFFFC

  //Return from Main function.
  00000049 20 00                    RETN
// end of main

//function Something1
  //if(a > 0) return TRUE;  
    // Copy to top of stack: stack pointer -4, length 4.
    // In English, we are making a copy of arg 1 'a' and putting it on the top of the stack.
    0000004B 03 01 FFFFFFFC 0004      CPTOPSP FFFFFFFC, 0004
    // push  constant 0 onto the stack.
    00000053 04 03 00000000           CONSTI 00000000
    // Do a greater-than comparison with the top two integers on the stack,
    // leaving the result in their place.
    00000059 0E 20                    GTII
    // If the result is FALSE jump over the TRUE case to 0x87.
    0000005B 1F 00 0000002C           JZ off_00000087

 // True case
   //Return TRUE;
    //push the constant 1 (TRUE) onto the stack.
    00000061 04 03 00000001           CONSTI 00000001
    // Copy the top of the stack down to SP - 16, 4 bytes long.
    // In English, move the 1 we just pushed onto the stack into the dword reserved
    // for the function's return value.
    00000067 01 01 FFFFFFF0 0004      CPDOWNSP FFFFFFF0, 0004
    //  Decrease the stack pointer by 4, removing the 1 we just pushed onto it.
    0000006F 1B 00 FFFFFFFC           MOVSP FFFFFFFC
    //  Jump to the function  exit code at 0xB1
    00000075 1D 00 0000003C           JMP off_000000B1

    // This is the normal exit code for the TRUE case.
    // It will never execute because of the code that was emitted for the return statement.
    0000007B 1B 00 FFFFFFFC           MOVSP FFFFFFFC
    00000081 1D 00 00000006           JMP off_00000087

  //Return b >0;
    // Copy Arg 2 to the top of the stack.  
    00000087 03 01 FFFFFFF8 0004      CPTOPSP FFFFFFF8, 0004
    // push 0 onto the stack.
    0000008F 04 03 00000000           CONSTI 00000000
    // Do a greater-than comparison on the two integers on the top of the stack,
    // leaving only the result on the top in their place.
    00000095 0E 20                    GTII
    //   move the result into the space reserved for the return value.
    00000097 01 01 FFFFFFF0 0004      CPDOWNSP FFFFFFF0, 0004
    // Remove the result from the top of the stack.  SP -4.
    0000009F 1B 00 FFFFFFFC           MOVSP FFFFFFFC
    // Jump to exitcode.  
    000000A5 1D 00 0000000C           JMP off_000000B1

    // This is the normal cleanup code for the statement.
    // It will not execute because the return statement has already taken care of it.
    000000AB 1B 00 FFFFFFFC           MOVSP FFFFFFFC

    // exit clean up.   Remove function Args from the stack.  SP - 8
    000000B1 1B 00 FFFFFFF8           MOVSP FFFFFFF8
    000000B7 20 00                    RETN
//End of something1




//function Something2
  000000B9 03 01 FFFFFFFC 0004      CPTOPSP FFFFFFFC, 0004
  000000C1 04 03 00000000           CONSTI 00000000
  000000C7 0E 20                    GTII
  000000C9 03 01 FFFFFFFC 0004      CPTOPSP FFFFFFFC, 0004
  000000D1 1F 00 00000014           JZ off_000000E5
  000000D7 03 01 FFFFFFFC 0004      CPTOPSP FFFFFFFC, 0004
  000000DF 1D 00 00000016           JMP off_000000F5
  000000E5 03 01 FFFFFFF4 0004      CPTOPSP FFFFFFF4, 0004
  000000ED 04 03 00000000           CONSTI 00000000
  000000F3 0E 20                    GTII
  000000F5 07 20                    LOGORII
  000000F7 01 01 FFFFFFF0 0004      CPDOWNSP FFFFFFF0, 0004
  000000FF 1B 00 FFFFFFFC           MOVSP FFFFFFFC
  00000105 1D 00 0000000C           JMP off_00000111
  0000010B 1B 00 FFFFFFFC           MOVSP FFFFFFFC
  00000111 1B 00 FFFFFFF8           MOVSP FFFFFFF8
  00000117 20 00                    RETN
// END of Something2
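
The operands in the listing can be checked by hand. Below is a minimal Python sketch, assuming (as the addresses in the listing suggest) that stack offsets are 32-bit two's-complement values and that the operand of JSR, JMP and JZ is relative to the address of the jump instruction itself. The helper names to_signed and jump_target are made up for this illustration.

```python
def to_signed(word):
    """Interpret a 32-bit operand such as 0xFFFFFFFC as a signed offset."""
    return word - 0x100000000 if word & 0x80000000 else word


def jump_target(address, operand):
    """Target of a JSR/JMP/JZ located at `address` (assumes a relative operand)."""
    return address + to_signed(operand)


print(to_signed(0xFFFFFFFC))          # MOVSP FFFFFFFC        -> -4
print(to_signed(0xFFFFFFF0))          # CPDOWNSP FFFFFFF0     -> -16
print(hex(jump_target(0x23, 0x28)))   # JSR at 0x23           -> 0x4b (Something1)
print(hex(jump_target(0x3D, 0x7C)))   # JSR at 0x3D           -> 0xb9 (Something2)
print(hex(jump_target(0x5B, 0x2C)))   # JZ at 0x5B            -> 0x87
```

Every target computed this way matches the off_/fn_ label the disassembler printed next to the instruction.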

Note:


1   Yes, the main in this case is still the first thing compiled into the script, even though the functions are declared above the main.  The reason is simple.  If the compiler placed every function into the code as it ran across it, every function in the source would end up in the .ncs file.  This would be a problem whenever large include files were added to the script: with the include sitting above the main, every function in the include would end up in the .ncs file.  That, however, does not happen.  When the main is compiled, the functions that are needed are logged, then added to the compiled code as needed.
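
The behaviour described in note 1 can be pictured as a walk over the call graph that starts at main. This is only a Python illustration of the idea, not the actual compiler: the call graph below, the unused function UnusedIncludeHelper and the helper functions_to_emit are all hypothetical.

```python
def functions_to_emit(call_graph, entry="main"):
    """Return the functions reachable from `entry`, in the order they are first needed."""
    emitted = []
    pending = [entry]
    while pending:
        name = pending.pop(0)
        if name in emitted:
            continue  # already logged, do not emit twice
        emitted.append(name)
        pending.extend(call_graph.get(name, []))
    return emitted


# Source order: the include's helper and both functions come before main.
call_graph = {
    "UnusedIncludeHelper": [],   # hypothetical function from an include, never called
    "Something1": [],
    "Something2": [],
    "main": ["Something1", "Something2"],
}

print(functions_to_emit(call_graph))  # ['main', 'Something1', 'Something2']
```

The unused helper never appears in the output, and main comes first even though it is last in the source, which is the layout seen in the disassembly.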


 


2   When a function is invoked, three things happen:


    1)   If it has a return value (it is not a void function), space is reserved on the stack to hold the return value.


          This reserved spot will be on the top of the stack once the function has finished running.


 


    2)  Any arguments that the function has will be pushed onto the stack in reverse order.


         Arg 2 will be pushed before Arg 1, etc.  The arguments will be removed from the stack by the function.


 


    3) A JSR (Jump to SubRoutine) will be executed to run the function.
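
The three steps in note 2 can be sketched with a Python list standing in for the stack, one list element per 4-byte cell. This is a simplified model, not the real virtual machine: the helper call is made up for illustration, and something1 writes its result straight into the return slot instead of pushing it first and using CPDOWNSP as the real code does.

```python
def something1(stack):
    """Callee: stack[-1] is arg 1 (a), stack[-2] is arg 2 (b), stack[-3] is the return slot."""
    a = stack[-1]
    b = stack[-2]
    result = 1 if a > 0 else int(b > 0)
    stack[-3] = result   # store the return value in the reserved slot
    del stack[-2:]       # MOVSP FFFFFFF8: the function removes its own arguments


def call(stack, function, *args):
    stack.append(0)              # 1) RSADDI: reserve space for the return value
    for arg in reversed(args):   # 2) push the arguments in reverse order
        stack.append(arg)
    function(stack)              # 3) JSR: run the function


stack = []
call(stack, something1, 5, 6)
print(stack)  # [1] : only the return value is left, on top of the stack
```

After the call, the caller still has to deal with that one remaining cell, which is what the MOVSP FFFFFFFC following each JSR in the main function does.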


 


3  Yes, once again our value on the stack is just being removed without anything happening to it.   In this case it is the return value from the function.   If the code were more like    nRvalue  = Something1(5,6);   then the value would have been copied into the location reserved for nRvalue before being removed from the stack.
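
The difference between the two cases in note 3 can be sketched in Python, again with a list standing in for the stack. The layout is an assumption made for illustration: nRvalue is taken to be a local sitting directly below the return value, so the copy would be a CPDOWNSP with offset -8. The function names are hypothetical.

```python
def discard_result(stack):
    """Something1(5,6);  the return value is simply dropped."""
    stack.pop()               # MOVSP FFFFFFFC


def assign_result(stack):
    """nRvalue = Something1(5,6);  copy the value down first, then drop it."""
    stack[-2] = stack[-1]     # assumed CPDOWNSP FFFFFFF8, 0004 into nRvalue
    stack.pop()               # MOVSP FFFFFFFC


plain = [1]                   # just the return value on top
discard_result(plain)
print(plain)                  # []

with_local = [0, 1]           # nRvalue (still 0), then the return value
assign_result(with_local)
print(with_local)             # [1] : nRvalue now holds the result
```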


 


 


 


 EDIT: Running out of time.  I will finish up with the comments in Something2 at a later time.