-.- --. .-. --..

Mysterious extra calls to String#size in Rubinius, or is it?

This is a story of that time I banged my head against the table.

I was writing a procedure for finding out the number of times the method String#size was called during the runtime of a snippet. When the snippet was run using Rubinius, it resulted in a count of n+1 where n was the result under MRI (both 2.0). After fiddling with the code for a bit, I realized that the “require” call in Rubinius internally calls the method String#size. I forgot that much of Rubinius is written in Ruby.

Run 1

The following is a simpler version of the call counter I implemented.

# counter.rb

# For readability, we will have the incrementer code inside
# a lambda. More improvements, if made in the future, will
# confined to this block
CALL_COUNT_INCREMENT = lambda do
  @counter ||= 0
  @counter += 1
end

# For now, we'll hard-code the method we want to inspect.
method_to_be_checked = "String#size"

# The modification we make to the String#size code where
# we shove in the call to the CALL_COUNT_INCREMENT lambda
class String
  alias_method :old_size, :size

  def size
    CALL_COUNT_INCREMENT.call
    self.old_size
  end
end

# This file is 'required' before the snippet we want to measure.
# We will use  the at_exit {} block to specify the number of times
# the method was called during the runtime.

at_exit {
  puts "String#size was called #{@counter} times"
}

The snippet I wanted to measure was in another file and the above counter.rb file was required in the snippet file:

# snippet.rb

require_relative "checker.rb"

(1..100000).each do |x|
  x.to_s.size if x.odd?
end

When the file is run using ruby snippet.rb, it results in the following output:

MRI: String#size was called 50000 times.
rbx: String#size was called 50001 times.

Note that the snippet has nothing to do with threading-related errors.

After some digging around and blaming the interpolation (#{}) literal initially, I later realized I was wrong. I added some print statements here and there and realized that the @counter was getting incremented right before the (1..100000).each {} block.

Run 2

Instead of including the counter.rb file, I copied the contents and pasted them into snippet.rb and removed the require clause. Boom!

MRI: String#size was called 50000 times.
rbx: String#size was called 50000 times.

Just as it should be.

Are we there yet?

I did see a call to #size method in the puts method and to test the effect of that call, I added another call to puts inside the at_exit{} block like so:

at_exit {
  puts "String#size was called #{@counter} times"
  puts "String#size was called #{@counter} times"
}

Output:

(Rubinius)

String#size was called 50000 times
String#size was called 50001 times

(MRI)

String#size was called 50000 times
String#size was called 50000 times

In Rubinius, IO#puts has a call to the String#suffix? method which has a call to String#size and so, our counter is incremented after the first puts call. Refer to Rubinius to the IO and String classes.

So, what’s the point?

I always wondered why tools like ruby-prof or rblineprof were unavailable for Ruby implementations other than MRI and how difficult it would be to port those to other implementations. Now I guess the above behaviour can be cited as one reason. I now realize that performance monitoring applications are not at all straight-forward and that these kinds of subtleties should be kept in mind.

Also, this is fun.