Author Topic: Which is 'Cleaner' and why?  (Read 1078 times)

Legacy_meaglyn

« Reply #30 on: May 29, 2012, 03:07:01 pm »


               BTW, Lightfoot, what are you using to disassemble the compiled script?
               
               

               
            

Legacy_Lightfoot8

« Reply #31 on: May 29, 2012, 11:58:54 pm »


               

meaglyn wrote...


BTW, Lightfoot, what are you using to disassemble the compiled script?


nwnexplorer_163

The opcodes are listed on this web page. According to Skywing, he based it on Torlack's original work, though I do not know who Torlack is. I am sure someone around here does.

As for really digging into the VM, I use OllyDbg version 1.10.


Hmm, I was going to comment on the one part of your post I disagreed with, but it seems you have removed that part. Oh well.
               
               

               


                     Edited by Lightfoot8, 29 May 2012 - 11:59.
                     
                  


            

Legacy_meaglyn

« Reply #32 on: May 30, 2012, 04:26:51 pm »


               

Lightfoot8 wrote...

nwnexplorer_163


Thanks! I didn't realize nwnexplorer could do that...

Hmm, I was going to comment on the one part of your post I disagreed with, but it seems you have removed that part. Oh well.



I don't think I actually removed anything from my post. I just added the edits with some clarification...
               
               

               
            

Legacy_Lightfoot8

« Reply #33 on: May 30, 2012, 05:13:05 pm »


               
meaglyn wrote...


Hmm, I was going to comment on the one part of your post I disagreed with, but it seems you have removed that part. Oh well.



I don't think I actually removed anything from my post. I just added the edits with some clarification...



You didn't. I just misread what you wrote the first time. It sounded like you were saying that constants were allocated a memory location on the stack. After reading it again, that is not what you said. So no disagreement, just bad reading on my part.
               
               

               
            

Legacy_Axe_Murderer

« Reply #34 on: May 30, 2012, 08:42:41 pm »


               I'll throw in my 2 cents worth.

To answer about the repeated lines: the ones with a semicolon at the end are called prototypes. They are almost always optional. In this case, I put them in to provide some help in the script editor. When you edit a script in NWN and it #include's a library, all the functions, constants, and variables defined in the library show up in the help panels on the right side of the script editor while you are authoring the script, even if the script you're working on is another library script. All the entries from all the libraries it #include's will appear there. The entries you see in bold typeface come from libraries included by the script you are editing, including the libraries those libraries in turn include, and so on, so there's a chain effect in play. The non-bold entries are standard core things which are always available to every script, independent of any libraries the script may or may not be #including.

This built-in editor help lets you click or double-click one of the entries in the help list. If you single-click, the help panel displays a block of code comments which typically describe the function, constant, or variable and how to use it: parameter limits or values and so forth. So the information is available during your edit session, right where and when you need it. If you double-click an entry, the editor copies it directly into your script at the current cursor location so that you need not type it in. Helpful for avoiding spelling errors or typos, particularly on entries with long or weird names.

Anyway, the block of comments that gets displayed in the editor's help panel when you click, say, a bold-faced function name in the list comes directly out of the library source file. The way the editor finds the comment block to display is to look in the library for the function's prototype line. When it finds that, it copies all the consecutive comment lines that immediately precede the prototype in the file, and that's what it displays. If no prototype is found for the function in the library, as when the function has a definition but no associated prototype line, then no comments will be displayed in the editor help panels and, in fact, the function name itself will not appear in the help list at all. That is useful for hiding functions designed to be used only by other functions within the library, not for general-purpose use in any ol' script outside of it. Every function in a library can be used anywhere; there is no way to prevent their use. But keeping a function out of the help list by leaving off the prototype and help comments is a means of discouraging the use of functions not really designed for general-purpose use, but which are still required by other functions in the library. On the other hand, you could specify a prototype and explicitly tell people not to use it for general purposes right there in the help comment block to accomplish the same thing...maybe more effectively even. Trade-off, of course. Is it better that they don't (easily) know about the function, or is it better that they know about it and have been clearly instructed not to use it?



Now a little more about prototypes and their purpose. The function itself must have a definition, also called an implementation. This defines the code which the function executes when you call it from some other place in the script. It also identifies the number and types of parameters (pieces of information) that the function needs to be given in order to do its thing. And it specifies the name of the function and the type of data it computes. Before you can use a function within a script, it must already have been defined, so that the compiler can understand what the line which calls the function means. In order to understand the function call, and therefore ensure it is being used correctly, the compiler must already know that the function exists, what its name is, and the datatypes of every one of its parameters.

Function definitions consist of two parts. The first part is the header. It specifies the function name, the type of data it returns, and the name plus datatype of each parameter it needs to operate, known as its parameter list. The second part is the function body. It consists of all the lines of code which implement the function's behavior and computations, based on the parameters, and it is surrounded by curly braces {} to delineate it. In NWN it is illegal syntax to declare or define a function within another function's body; they have to be defined completely separately. Everything the compiler needs to know in order to translate a call to a function is present in the function header. When compiling a call to some function, the compiler is unaware of and unconcerned with the function's body. Only the header information is needed to determine whether the function is being called correctly. This is handy because it provides for the existence of prototypes.

The prototype is just a function header with a semicolon at the end. I like to call it by its old name, the function declaration. A main purpose of this line is to provide some convenience in organizing your script. Since functions have to be known before they can be used, you cannot call a function from your main() function, or anywhere else for that matter, unless it is physically defined above that point somewhere in the script. The prototype allows you to "declare" the existence of a function to the compiler before its definition is known, before the function's body is ever encountered. Since the prototype is nothing more than a copy of the function's header, it contains all the information the compiler needs in order to translate a line of code which calls the function. This allows the scripter to put the prototype line(s) at the top of the script to announce that the function definition(s) exist somewhere in the script, then have the main function code block, then have the function definition(s) appear at the bottom. Some people like the main script function to be the first code block you see in a script. Some like it at the end. And a few don't care if it's buried somewhere in the middle amidst all the function definitions. Anyway, that is basically what prototypes are typically used for. Combine it with the way the NWN script editor handles its online help, and it makes sense to always have a prototype line, with a comment block describing the associated function immediately preceding it, in your library scripts for every function defined in there that you want to appear in the editor's online help when that library is #included by the script being edited. Personally, I like to keep the function prototype, comment block, and definition all located together within my library scripts, as my previous example shows.
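
A minimal sketch of the prototype-at-top, definition-at-bottom layout described above. It is written in plain C rather than NWScript so it can be compiled anywhere, on the assumption that the declaration rules carry over as the post describes. The function names CountVowels and VowelsInGreeting are made up for illustration.

```c
#include <string.h>

/* Prototype (declaration): the function header followed by a semicolon.
   CountVowels is a hypothetical helper, not from any NWN library. */
int CountVowels(const char *text);

/* This caller appears before CountVowels is defined. It compiles only
   because the prototype above has already announced the function. */
int VowelsInGreeting(void)
{
    return CountVowels("Hello there");
}

/* Definition (implementation): the same header, now with a body. */
int CountVowels(const char *text)
{
    int count = 0;
    for (; *text != '\0'; text++) {
        if (strchr("aeiouAEIOU", *text) != NULL) {
            count++;
        }
    }
    return count;
}
```

Deleting the prototype line and leaving everything else in place should make the compiler reject the call inside VowelsInGreeting, which is the ordering problem the prototype exists to solve.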

Note that it is absolutely vital that the function header and function prototype match up precisely should you choose to use prototypes. If you go in and change a function header, to add or remove a parameter for example, or change its return type, or even its name, and there exists somewhere the prototype line for that function, which you neglect to change in exactly the same way, you will not be able to compile your script. One reason I keep them as close together as I can. Easier maintainability.

There is also a technical advantage that prototypes provide which resolves a difficult coding dilemma...albeit an extremely rare one. Suppose you want to write two functions A and B such that A needs to call B and B needs to call A in order for them both to operate correctly. How could you write that? If you define A before B, the compiler will get to the line inside A where B is called and choke because it doesn't recognize B yet; it hasn't encountered B's definition. If you then put B first, you get the same problem when it gets to the line inside B where A is called. It seems there is no way to compile such code. This is the only situation where a prototype is actually required. Normally they are entirely optional. In this case, you could prototype B before A's definition, then have B's definition appear after A's, and the compiler will know what B is all about when it gets inside A and comes across a call to it, since it will have already seen B's prototype. Since A can now be compiled before B's definition is encountered, by the time the compiler gets to the line inside B where A is called, A will have already been compiled and is known. So the translation can now succeed because of the prototype. The technique is called a "forward declaration" or sometimes a "forward definition", a "forward prototype", or even just a "forward". There's lots of confusing terminology around this stuff. I personally prefer the declaration/definition or declaration/implementation terminology cuz it makes the most sense to me. One declares existence, the other defines or implements the behavioral code. Some say prototype/declaration, prototype/definition, definition/implementation, etc... Some even refer to the function header in its definition as its declaration. In most cases it's very clear, when any distinction is made, what is being referred to.
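
The A-calls-B, B-calls-A case can be sketched like this, again in plain C as a stand-in for NWScript. IsEven and IsOdd are hypothetical names playing the roles of A and B, and the sketch assumes n is never negative.

```c
/* Forward declaration of B, so that A can call it before B is defined. */
int IsOdd(int n);

/* A: calls B. Assumes n >= 0. Returns 1 for true, 0 for false. */
int IsEven(int n)
{
    if (n == 0) {
        return 1;
    }
    return IsOdd(n - 1);
}

/* B: calls A. By this point A has already been compiled and is known. */
int IsOdd(int n)
{
    if (n == 0) {
        return 0;
    }
    return IsEven(n - 1);
}
```

Swapping the two definitions does not remove the need for a prototype; it only changes which function has to be declared ahead of time.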

Therefore, the library code I posted earlier could have been written like this:
[quote]
// Blah blah comments describing the function
int HasDoneThisBefore( object oPC )
{ return GetLocalInt( oPC, GetTag( OBJECT_SELF ));
}

// Blah blah comments describing the function
void SayThisOnce( object oPC, string sObservation)
{
  // blah blah function definition
}
[/quote]
Functionally this is absolutely equivalent. Syntactically it is probably the simplest format. It might even be considered more readable. However by doing it without prototypes, neither HasDoneThisBefore nor SayThisOnce would appear in the editor help when authoring a script which #include's the library. And that was the whole point of making the library to begin with. To make these functions easy to use anywhere. You would have to remember the functions exist, their names, and their parameters in order to use them in such a script. Or you'd have to open up the library and look it up to remind yourself and know for sure. Even though the comments are still there.



Onto variables...
Variables get allocated when defined (e.g. int X;) and released when the block they are defined in ends (i.e. when the "}" is encountered). It may seem dumb to define a variable inside a loop, but if you come from old school like me where memory was often more scarce or significant than CPU performance, you learn there are times when it can make sense. Here's a quick example:
[quote]
void main()
{ // 80 ints defined here
  for( x = 0; x < LIMIT; x++ )
  { // 80 strings defined here
    // 5 lines of code here
  }
  // 600 lines of code here
}
[/quote]
Now you look at this and you might be tempted to move those 80 string variables out of the loop to just after the 80 ints are defined. Why recreate 80 strings every time through that loop when you can create them all once and just reuse them over and over? Actually, I can identify two good reasons not to do that. First, look at it from a memory usage perspective. The 80 ints exist from the start of the script until the end. That's when the block they are defined within ends ("}"). The strings exist only while the loop runs; then the block they are defined in ends (i.e. the loop's "}") and they disappear until the next iteration. When all iterations are done, they are no longer taking up space. Space that those 600 lines may need. And strings are expensive memory-wise. Move them out and you hold onto all that space for the rest of the script, when you know ahead of time you won't reference any of it.
Now suppose the value given to the constant called LIMIT is like 3. So the loop only iterates 3 times. If you move those strings outside the loop, your memory requirement for the whole script, during most of its execution period anyway, is elevated significantly to hold onto 80 strings it will never use again. All to save a few cpu cycles out of a 3 iteration loop. Could be significant in a script that runs frequently. I'm not knocking it, just sayin there is at least one good reason not to. Now if LIMIT is set to 1000 it could be a whole different story. It is important to evaluate the expected behavior of the script in general operation conditions to make an informed value judgement about whether or not you get any bang outta doing stuff like that.

Coming from the background I do, I have always tended to define variables as close to the point of first use as possible. Then I back them out of loops later when testing shows that the code where I've done that is dragging and could use a CPU boost. So I err in favor of low memory use. It's a habit. Also, I find code much easier to read when the variables are defined very close to where they are used. Nothing worse than trying to read through some long loop to determine what it's actually doing, and stumbling across a variable whose value is defined three pages away but never used until now. Might be more efficient, but it can also hamper readability greatly.

The second good reason not to move them has to do with how they are being used. When you move them out of the loop, they maintain their contents between loop iterations. Leave them inside and they get re-initialized to the empty string every iteration. Now if the algorithm requires them all to start out the loop as empty strings, then by moving them outside the loop you have eliminated the automatic blanking that redefining them inside gives you. So for the algorithm to now work correctly once again, you must add 80 assignment statements inside the loop, to be performed on every iteration, in order to return the values of the strings back to their expected initial value. And you've now lost the CPU gain you were going for by moving them outside in the first place. In fact, you've made the overall performance of the loop and by extension the script worse by a lot. And that's the code which repeats that you're messing with. Where a performance hit will be felt the worst.
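
A small C sketch of that second point, under the assumption that a loop-local string behaves in NWScript the way an initialized loop-local buffer does in C. The function names and the 64-byte scratch buffer are invented for the example.

```c
#include <string.h>

/* Scratch buffer declared inside the loop: it starts out empty on every pass. */
int LongestInside(int passes)
{
    int longest = 0;
    for (int i = 0; i < passes; i++) {
        char scratch[64] = "";              /* re-initialized each iteration */
        strcat(scratch, "ab");
        if ((int)strlen(scratch) > longest) {
            longest = (int)strlen(scratch);
        }
    }
    return longest;
}

/* Same loop with the buffer moved outside: contents carry over between passes. */
int LongestOutside(int passes)
{
    int longest = 0;
    char scratch[64] = "";                  /* initialized once only */
    for (int i = 0; i < passes; i++) {
        if (strlen(scratch) + 2 < sizeof scratch) {   /* guard against overflow */
            strcat(scratch, "ab");
        }
        if ((int)strlen(scratch) > longest) {
            longest = (int)strlen(scratch);
        }
    }
    return longest;
}
```

With three passes the first version never sees more than 2 characters, while the second grows to 6. To get the first behavior back after moving the buffer out, an explicit reset has to be added inside the loop, which is the extra per-iteration work the post warns about.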

There are very few blanket "good performance" rules in programming. You always have to find the trade-offs and make the decision which works best the greatest majority of the time. Testing is the best way to identify bottlenecks. Not sifting through source code armed with a theory.
               
               

               


                     Edited by Axe_Murderer, 30 May 2012 - 08:17.
                     
                  


            

Legacy_Axe_Murderer

« Reply #35 on: May 30, 2012, 09:13:10 pm »


               As far as this bit goes:
[quote]
{ int x = 1;
  { int x = 2;
  }
}
[/quote]
While it is interesting and informative to know how it works, if you ever see anything where different variables in the same function are defined to use the exact same name, dump the script right away. Whoever wrote it has no more sense than a parent who gives all his children the same name. You never see it. It's too dumb to contemplate. The script is not trustworthy.
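
For reference, the "how it works" part can be checked with a short C sketch, assuming NWScript scopes nested blocks the same way C does. ShadowDemo and seen_inside are names made up for the example.

```c
/* The inner x is a second, separate variable that hides the outer one
   until the inner block's closing brace. */
int ShadowDemo(void)
{
    int x = 1;
    int seen_inside = 0;
    {
        int x = 2;            /* shadows the outer x */
        seen_inside = x;      /* reads the inner x */
    }
    /* Here x refers to the outer variable again, still 1. */
    return seen_inside * 10 + x;
}
```

The function returns 21: the inner block saw 2, and the outer x was left untouched at 1.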
               
               

               


                     Edited by Axe_Murderer, 30 May 2012 - 08:13.
                     
                  


            

Legacy_Lightfoot8

« Reply #36 on: May 30, 2012, 09:48:32 pm »


               

Variables get allocated when defined (e.g. int X;) and released when the block they are defined in ends (i.e. when the "}" is encountered). It may seem dumb to define a variable inside a loop, but if you come from old school like me where memory was often more scarce or significant than CPU performance, you learn there are times when it can make sense. Here's a quick example:


[quote]
void main()
{ // 80 ints defined here
  for( x = 0; x < LIMIT; x++ )
  { // 80 strings defined here
    // 5 lines of code here
  }
  // 600 lines of code here
}
[/quote]


Your second point, about them having to be re-initialized every iteration, is a good one, and something I had not thought about.

The first point, however, could be handled this way:



[quote]
void main()
{ // 80 ints defined here
    {
        // 80 strings defined here
        for( x = 0; x < LIMIT; x++ )
        {
            // 5 lines of code here
        }
    }
    // 600 lines of code here
}
[/quote]

EDIT: Oops, had the bracket in the wrong spot.
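
The bare-block idea can be tried out in plain C, as a sketch only. CountLabelChars and the "item-%d" labels are hypothetical, and whether the NWN VM actually releases the space at the inner closing brace is the claim made earlier in the thread, not something this code demonstrates.

```c
#include <stdio.h>

/* Adds up the lengths of the labels "item-0", "item-1", ... built in a loop. */
int CountLabelChars(int limit)
{
    int total = 0;
    {   /* bare block: scratch is declared once and lives only until the matching brace */
        char scratch[32];
        int i;
        for (i = 0; i < limit; i++) {
            /* snprintf returns the number of characters the label needs */
            total += snprintf(scratch, sizeof scratch, "item-%d", i);
        }
    }
    /* scratch is out of scope here, before the rest of the function runs */
    return total;
}
```

The buffer is created once rather than on every pass, yet it is no longer in scope for whatever follows the block, which is the compromise between the two layouts.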
               
               

               


                     Edited by Lightfoot8, 30 May 2012 - 09:15.
                     
                  


            

Legacy_Axe_Murderer

« Reply #37 on: May 31, 2012, 12:01:16 am »


               Yes. Well done.

Both those examples are pretty unrealistic though. I was trying to make a point about thinking beyond the structure into the realm of usage. Not that structure is not still very important, but situations exist that are not always apparent until you examine usage as well.
               
               

               


                     Edited by Axe_Murderer, 30 May 2012 - 11:54.
                     
                  


            

Legacy_WhiZard

« Reply #38 on: May 31, 2012, 12:48:30 am »


               

Axe_Murderer wrote...
While it is interesting and informative to know how it works, if you ever see anything where different variables in the same function are defined to use the exact same name, dump the script right away.


Okie doke, dumping x2_s0_blckblde. Oops, now my characters cannot cast Black Blade of Disaster.
               
               

               
            

Legacy_Axe_Murderer

« Reply #39 on: May 31, 2012, 01:13:38 am »


               lol. And that one isn't very trustworthy because of it, is it?

Alternatively, fix it yourself. It's always an unintentional bug, I assure you. Nobody does it on purpose.
               
               

               


                     Edited by Axe_Murderer, 31 May 2012 - 11:58.
                     
                  


            

Legacy_Leurnid

« Reply #40 on: May 31, 2012, 08:24:31 am »


               

Axe_Murderer wrote...

As far as this bit goes:

[quote]
{ int x = 1;
  { int x = 2;
  }
}
[/quote]
While it is interesting and informative to know how it works, if you ever see anything where different variables in the same function are defined to use the exact same name, dump the script right away. Whoever wrote it has no more sense than a parent who gives all his children the same name. You never see it. It's too dumb to contemplate. The script is not trustworthy.


[quote]
{ int GeorgeForeman = 1;
  { int GeorgeForeman = 2;
  }
}
[/quote]

http://en.wikipedia...._Foreman#Family