[Home] [Downloads] [Search] [Help/forum]


Register forum user name Search FAQ

Gammon Forum

[Folder]  Entire forum
-> [Folder]  MUSHclient
. -> [Folder]  General
. . -> [Subject]  profiling for optimization

profiling for optimization

It is now over 60 days since the last post. This thread is closed.     [Refresh] Refresh page


Posted by Fiendish   USA  (2,514 posts)  [Biography] bio   Global Moderator
Date Thu 13 May 2010 05:32 PM (UTC)

Amended on Thu 13 May 2010 05:35 PM (UTC) by Fiendish

Message
I've bumped into a performance wall recently, where the Lua plugins I want to make are limited less by what I can write than by the speed at which the script interpreter can process. I have a strong feeling that some optimizations can be done to improve performance, but, to paraphrase the old saying, optimizing before profiling is stupid, because you don't know what you're doing until you know what you're doing.

So what do people do when they need to figure out the specific choke points in their scripts? It seems that standard profilers just won't work most of the time without a significant amount of effort, because scripts require special MUSHclient-specific data structures and routines.

Is there a possibility of profiling mode for plugins?

https://github.com/fiendish/aardwolfclientpackage
[Go to top] top

Posted by Nick Gammon   Australia  (22,973 posts)  [Biography] bio   Forum Administrator
Date Reply #1 on Thu 13 May 2010 09:53 PM (UTC)

Amended on Tue 26 Nov 2013 03:31 AM (UTC) by Nick Gammon

Message
To answer the direct question first, I added profiling by putting the following code at the end of the script part of my mapping plugin:


-- set up profiling

local profile = {}
local myGetInfo = GetInfo -- make local for speed and efficiency

-- stub function to profile an existing function

local function make_profile_func (fname, f)
  profile [fname] = { func = f, elapsed = 0, count = 0 }
 
  return function (...)
     local start = myGetInfo (232) -- high-precision timer
     profile [fname].count = profile [fname].count + 1
     local results = { f (...) }  -- call original function
     -- calculate how long it took
     profile [fname].elapsed = profile [fname].elapsed + myGetInfo (232) - start
     return unpack (results)  -- return results 
  end -- function
end -- make_profile_func

-- replace world functions by profiling stub function

for k, f in pairs (world) do
  if type (f) == "function" then
    world [k] = make_profile_func (k, f)
  end -- if
end -- for

require "pairsbykeys"

-- show profile, called by alias

function show_profile (name, line, wildcards)

  print (string.rep ("-", 20), "Function profile - alpha order", string.rep ("-", 20))
  print ""
  print (string.format ("%25s %8s %8s", "Function", "Count", "Seconds"))
  print ""
  for k, v in pairsByKeys (profile) do
    if v.count > 0 then
      print (string.format ("%25s %8i %8.4f", k, v.count, v.elapsed))
    end -- if
  end -- for
  print ""
  
  local t = {}
  for k, v in pairs (profile) do
    if v.count > 0 then
      table.insert (t, k)
    end -- if used
  end -- for
  
  table.sort (t, function (a, b) return profile [a].elapsed > profile [b].elapsed end )
  
  print (string.rep ("-", 20), "Function profile - time order", string.rep ("-", 20))
  print ""
  print (string.format ("%25s %8s %8s", "Function", "Count", "Seconds"))
  print ""
  for _, k in ipairs (t) do
      print (string.format ("%25s %8i %8.4f", k, profile [k].count, profile [k].elapsed))
  end -- for
  print ""
  
end -- show_profile


Inside the <aliases> ... </aliases> part I added an alias to call the function that displays the results:


<alias
   match="profile"
   enabled="y"
   sequence="100"
   script="show_profile"
  >
</alias>  



What the code does is replace each world.xxx function in the script space with a "stub" that works out how long that function took to run (using the high-resolution timer) and also how often it is called.

The "profile" alias prints out the current profile information. The results look like this, after walking around a bit:


-------------------- Function profile - alpha order --------------------

                 Function    Count  Seconds

               CallPlugin       24   0.0076
          ColourNameToRGB    11411   0.0842
                  DoAfter        1   0.0000
           DoAfterSpecial        1   0.0000
                  GetInfo      361   0.0029
        GetNoteColourFore        1   0.0000
              GetPluginID        1   0.0000
            GetPluginInfo        7   0.0000
        GetPluginVariable       24   0.0004
              GetVariable        5   0.0000
              IsConnected        2   0.0000
        IsPluginInstalled        1   0.0000
                     Send        1   0.0011
        SetNoteColourFore        2   0.0000
                SetStatus        4   0.0004
                     Trim        1   0.0000
         WindowAddHotspot    14147   0.1644
           WindowCircleOp    25452   0.3548
             WindowCreate      181   0.2555
        WindowDragHandler      180   0.0017
               WindowFont        2   0.0000
           WindowFontInfo      717   0.0243
               WindowInfo      181   0.0026
               WindowLine    30708   0.4177
           WindowPosition        1   0.0000
             WindowRectOp     1432   0.0094
               WindowShow      180   0.0022
               WindowText      721   0.0055
          WindowTextWidth     1437   0.0107
             WorldAddress        1   0.0000
                WorldPort        1   0.0000

-------------------- Function profile - time order --------------------

                 Function    Count  Seconds

               WindowLine    30708   0.4177
           WindowCircleOp    25452   0.3548
             WindowCreate      181   0.2555
         WindowAddHotspot    14147   0.1644
          ColourNameToRGB    11411   0.0842
           WindowFontInfo      717   0.0243
          WindowTextWidth     1437   0.0107
             WindowRectOp     1432   0.0094
               CallPlugin       24   0.0076
               WindowText      721   0.0055
                  GetInfo      361   0.0029
               WindowInfo      181   0.0026
               WindowShow      180   0.0022
        WindowDragHandler      180   0.0017
                     Send        1   0.0011
                SetStatus        4   0.0004
        GetPluginVariable       24   0.0004
              GetVariable        5   0.0000
            GetPluginInfo        7   0.0000
           WindowPosition        1   0.0000
        SetNoteColourFore        2   0.0000
               WindowFont        2   0.0000
                  DoAfter        1   0.0000
              IsConnected        2   0.0000
           DoAfterSpecial        1   0.0000
                     Trim        1   0.0000
              GetPluginID        1   0.0000
        IsPluginInstalled        1   0.0000
             WorldAddress        1   0.0000
        GetNoteColourFore        1   0.0000
                WorldPort        1   0.0000


In this particular case, you can see that WindowLine and WindowCircleOp have the highest execution time (and are also called the most often).

You could use something like this (in each plugin) to see if there is a particular function call that takes longer than you expect. You could adapt these lines:


for k, f in pairs (world) do
  if type (f) == "function" then
    world [k] = make_profile_func (k, f)
  end -- if
end -- for


... to also profile other functions (eg. string functions).

Having presented that (which you could usefully turn into a module if you are going to use it everywhere), I have to say that in testing my various GUI demonstrations I didn't notice any slowdown, even with the mapper, and other windows all being displayed at once.

For example: Playing a MUD game with a MMO-style look to it

and: Playing a MUD game with MUSHclient

Also note that the profiling code I describe above will itself slow the execution of the plugin down, naturally.

Some tips that might help keep things running smoothly:


  • Don't use the Repaint function - that makes the client do a lot of work, when Redraw will usually work just as well

  • Use Lua variables instead of MUSHclient variables, where possible

  • If you are writing to an SQLite database, writing more than once thing at once, put them in a transaction

  • Don't unnecessarily create or draw to miniwindows, if you can just re-use an existing one. Once drawn to, the window is just "blitted" to the output window.

  • Functions like ColourNameToRGB should not really be called frequently (as I did above) because the colours should be converted from names to numbers once, not every time through an inner loop.

  • Be cautious of using a design that tracks the movement of the mouse over every pixel (as others have done) but rather use hotspots as they were intended to be used.

- Nick Gammon

www.gammon.com.au, www.mushclient.com
[Go to top] top

Posted by Twisol   USA  (2,257 posts)  [Biography] bio
Date Reply #2 on Thu 13 May 2010 11:49 PM (UTC)

Amended on Thu 13 May 2010 11:51 PM (UTC) by Twisol

Message
*jaw drops* That's awesome. I definitely think that should become a module.

EDIT: Except it would be easier to use debug.sethook to set a call-hook. Time to roll up the sleeves...

'Soludra' on Achaea

Blog: http://jonathan.com/
GitHub: http://github.com/Twisol
[Go to top] top

Posted by WillFa   USA  (525 posts)  [Biography] bio
Date Reply #3 on Fri 14 May 2010 02:23 AM (UTC)
Message
Quote:

Functions like ColourNameToRGB should not really be called frequently (as I did above) because the colours should be converted from names to numbers once, not every time through an inner loop.


Use the colour_names table, as Lua table lookups are really cheap.
[Go to top] top

Posted by Nick Gammon   Australia  (22,973 posts)  [Biography] bio   Forum Administrator
Date Reply #4 on Fri 14 May 2010 10:03 PM (UTC)
Message
Here is the above code, tidied up a bit, and turned into a module (save as profile.lua in your "lua" subdirectory):


--[[

profile.lua

Written by Nick Gammon
Date: 15th May 2010

Designed to do function profiling of various inbuilt functions, such as the ones in the 
world table, and various other ones like sqlite3, progress, etc.

To use:

require "profile" 

This will replace all functions in the tables WANTED_TABLES (below) with "stub" functions
that will time how many times they are called, and for how long, using the high-precision
timer (accessed by GetInfo (232)).

To access the results, simply look up the contents of the profile table (which is a local variable
below) or use this alias:

<aliases>
 <alias
   match="profile"
   enabled="y"
   sequence="100"
   script="show_profile"
  >
 </alias>  
</aliases>
  


--]]


-- set up profiling

local profile = {}
local myGetInfo = GetInfo -- make local for speed and efficiency

local WANTED_TABLES = { "world", "utils", "bit", 
                        "sqlite3", "progress", "rex", 
                        "io", "lpeg" } 
                       
-- stub function to profile an existing function

local function make_profile_func (fname, f)
  local t = { elapsed = 0, count = 0 }
  profile [fname] = t
 
  -- this is the replacement function
  return function (...)
     t.count = t.count + 1          -- invocations
     local start = myGetInfo (232)  -- high-precision timer
     local results = { f (...) }    -- call original function
     t.elapsed = t.elapsed + myGetInfo (232) - start  -- time taken
     return unpack (results)   
  end -- function
end -- make_profile_func

-- replace world functions by profiling stub function

local max_name_length = 0

for _, tbl in ipairs (WANTED_TABLES) do
  for k, f in pairs (_G [tbl] ) do
    if type (f) == "function" then
      local name = tbl .. "." .. k
      if tbl == "world" then
        name = k
      end -- if
      max_name_length = math.max (max_name_length, #name)
      _G [tbl] [k] = make_profile_func (name, f)
    end -- if
  end -- for
end -- for each wanted table

require "pairsbykeys"

-- show profile, called by alias

function show_profile (name, line, wildcards)

  local function heading (s)
    print (string.rep ("-", 20), "Function profile - " .. s .. " order", string.rep ("-", 20))
    print ""
    print (string.format ("%" .. max_name_length .. "s %12s %12s %12s", 
           "Function", "Count", "Seconds", "Average"))
    print ""
  end -- heading
  
  heading "alpha"
  for k, v in pairsByKeys (profile) do
    if v.count > 0 then
      print (string.format ("%" .. max_name_length .. "s %12i %12.4f %12.6f", 
             k, v.count, v.elapsed, v.elapsed / v.count))
    end -- if
  end -- for
  print ""
  
  local t = {}
  for k, v in pairs (profile) do
    if v.count > 0 then
      table.insert (t, k)
    end -- if used
  end -- for
  
  -- sort into descending order by elapsed time
  table.sort (t, function (a, b) return profile [a].elapsed > profile [b].elapsed end )
  
  heading "time"
  for _, k in ipairs (t) do
      print (string.format ("%" .. max_name_length .. "s %12i %12.4f %12.6f", 
             k, profile [k].count, profile [k].elapsed, profile [k].elapsed / profile [k].count))
  end -- for
  print ""
  
end -- show_profile


Now all you have to do is:


require "profile"


... and add the alias to call show_profile.

This version also handles more tables (see WANTED_TABLES above), and shows the average time take for calling each function.

- Nick Gammon

www.gammon.com.au, www.mushclient.com
[Go to top] top

Posted by Fiendish   USA  (2,514 posts)  [Biography] bio   Global Moderator
Date Reply #5 on Sat 15 May 2010 05:41 PM (UTC)

Amended on Sat 15 May 2010 05:42 PM (UTC) by Fiendish

Message
Nick Gammon said:

To answer the direct question first, I added profiling by putting the following code at the end of the script part of my mapping plugin:
Cool stuff. I'll take a look at it more closely later.

Nick Gammon said:

Some tips that might help keep things running smoothly:

*Don't use the Repaint function - that makes the client do a lot of work, when Redraw will usually work just as well

*Use Lua variables instead of MUSHclient variables, where possible

*If you are writing to an SQLite database, writing more than once thing at once, put them in a transaction

*Don't unnecessarily create or draw to miniwindows, if you can just re-use an existing one. Once drawn to, the window is just "blitted" to the output window.

*Functions like ColourNameToRGB should not really be called frequently (as I did above) because the colours should be converted from names to numbers once, not every time through an inner loop.

*Be cautious of using a design that tracks the movement of the mouse over every pixel (as others have done) but rather use hotspots as they were intended to be used.


All good advice. Unfortunately the bogdowns I'm worried about tend to be caused by looping through large amounts of data a bunch of times with significant table creation, insertion, and deletion. I think I need to find a primer on Lua optimization.

https://github.com/fiendish/aardwolfclientpackage
[Go to top] top

Posted by Twisol   USA  (2,257 posts)  [Biography] bio
Date Reply #6 on Sat 15 May 2010 06:41 PM (UTC)
Message
Roberto Ierusalimschy's article on optimization from Lua Programming Gems can be found online, as a matter of fact:

http://www.lua.org/gems/sample.pdf

'Soludra' on Achaea

Blog: http://jonathan.com/
GitHub: http://github.com/Twisol
[Go to top] top

Posted by WillFa   USA  (525 posts)  [Biography] bio
Date Reply #7 on Sat 15 May 2010 08:47 PM (UTC)
Message
And instead of Nick's code, there's http://luaprofiler.luaforge.net/ that was written by the Kepler Project.
[Go to top] top

Posted by Nick Gammon   Australia  (22,973 posts)  [Biography] bio   Forum Administrator
Date Reply #8 on Sat 15 May 2010 11:08 PM (UTC)
Message
Fiendish said:

All good advice. Unfortunately the bogdowns I'm worried about tend to be caused by looping through large amounts of data a bunch of times with significant table creation, insertion, and deletion. I think I need to find a primer on Lua optimization.


Without knowing more about the data, it is possible switching to SQLite databases may help. For example, in my mapping plugin, instead of storing thousands of rooms as Lua table entries, I use the database instead. Were I to attempt to re-do things like spellups, or inventory plugins, I would be looking at doing a similar thing there too.

- Nick Gammon

www.gammon.com.au, www.mushclient.com
[Go to top] top

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).

To show them in your local time you can join the forum, and then set the 'time correction' field in your profile to the number of hours difference between your location and UTC time.


23,120 views.

It is now over 60 days since the last post. This thread is closed.     [Refresh] Refresh page

Go to topic:           Search the forum


[Go to top] top

Quick links: MUSHclient. MUSHclient help. Forum shortcuts. Posting templates. Lua modules. Lua documentation.

Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.

[Home]


Written by Nick Gammon - 5K   profile for Nick Gammon on Stack Exchange, a network of free, community-driven Q&A sites   Marriage equality

Comments to: Gammon Software support
[RH click to get RSS URL] Forum RSS feed ( https://gammon.com.au/rss/forum.xml )

[Best viewed with any browser - 2K]    [Hosted at HostDash]