-.- --. .-. --..

Ruby, Trace, Leave, Oh my!

Ruby has introduced the TracePoint API with the release of Ruby 2.0. Earlier, there was a similar tool that let you track operations in any Ruby program called set_trace_func. However, using it was really slow and ugly. This new TracePoint API aims to replace it.

What is “tracing”? Why is set_trace_func slow? Why is TracePoint better?

To understand how these two commands work, we need to understand how the compiler compiles a Ruby program and generates code that the VM then executes. The program that we will be using is:

class A
  def to_string
    to_s
  end
end

A.new.to_string

Dissassembly

Let’s have a look at the output of the compiler for the Ruby program mentioned above. To get the disassembled output, the command to run is:

method = <<-EOF
class A
  def to_string
    to_s
  end
end

A.new.to_string
EOF

puts RubyVM::InstructionSequence.new(method).disasm

The output of the command is:

== disasm: <RubyVM::InstructionSequence:<compiled>@<compiled>>==========
0000 trace            1                                               (   1)
0002 putspecialobject 3
0004 putnil
0005 defineclass      :A, <class:A>, 3
0009 pop
0010 trace            1                                               (   7)
0012 getinlinecache   19, <ic:0>
0015 getconstant      :A
0017 setinlinecache   <ic:0>
0019 send             :new, 0, nil, 0, <ic:1>
0025 send             :to_string, 0, nil, 0, <ic:2>
0031 leave
== disasm: <RubyVM::InstructionSequence:<class:A>@<compiled>>===========
0000 trace            2                                               (   1)
0002 trace            1                                               (   2)
0004 putspecialobject 1
0006 putspecialobject 2
0008 putobject        :to_string
0010 putiseq          to_string
0012 send             :"core#define_method", 3, nil, 0, <ic:0>
0018 trace            4                                               (   5)
0020 leave                                                            (   2)
== disasm: <RubyVM::InstructionSequence:to_string@<compiled>>===========
0000 trace            8                                               (   2)
0002 trace            1                                               (   3)
0004 putself
0005 send             :to_s, 0, nil, 24, <ic:0>
0011 trace            16                                              (   4)
0013 leave                                                            (   3)

Note that the disassembled output is just a human-readable output.

The lines starting with a number represent a single VM instruction each. These instructions are implemented inside a file called insns.def. The lines that start with == disasm represent what are known as labels. These labels are useful in identifying which Ruby method call or instruction the corresponding set of VM instructions pertain to. One stack-frame in Ruby corresponds to the set of instructions between — and including — a trace and leave. These two instructions are appended to ends of the stackframe by the compiler, and they perform certain specific functions.

Observe the trace instructions here and the number after that instruction. This number is the decimal representation of a Ruby event. Erm…event, you say?

Events and trace

Ruby emits an event every time the VM’s state changes. The full list of events in Ruby (version 2.1.0-p0) consists of 25 events and are defined inside of ruby.h. However, the set of events that corresponding to the example Ruby program above are a subset of the following set:

1
2
3
4
5
6
7
8
9
10
11
12
/* traditional set_trace_func events */

 #define RUBY_EVENT_NONE      0x0000
 #define RUBY_EVENT_LINE      0x0001
 #define RUBY_EVENT_CLASS     0x0002
 #define RUBY_EVENT_END       0x0004
 #define RUBY_EVENT_CALL      0x0008
 #define RUBY_EVENT_RETURN    0x0010
 #define RUBY_EVENT_C_CALL    0x0020
 #define RUBY_EVENT_C_RETURN  0x0040
 #define RUBY_EVENT_RAISE     0x0080
 #define RUBY_EVENT_ALL       0x00ff

Each of the RUBY_EVENT_* constants correspond to a hexadecimal number. Using this list, we can understand what the trace instruction arguments in the disassembled output correspond to.

In the above example, there are a total of 8 trace instructions. Let’s go through them one-by-one.

If you were thinking that the trace instruction might be an event emitting routine, it’s kind of the reverse in reality. When ever the VM encounters a trace event, all the event hooks that were defined for that event get executed. Open up the file insns.def and locate the trace instruction definition. There’s just one call — well, ignoring all the Dtrace-specific methods — which is, EXEC_EVENT_HOOK.

All trace instructions with an argument of 1 correspond to a RUBY_EVENT_LINE event. This event is indicates that a new line of Ruby code has been encountered. All event hooks registered for the event RUBY_EVENT_LINE are executed when the VM encounters this particular trace event.

You can see that, in each of the labels, there is a minimum number of one RUBY_EVENT_LINE event. An argument of 2 corresponds to the event RUBY_EVENT_CLASS and an argument of 4 corresponds to the event RUBY_EVENT_END. That means, inside the class:A label, this is what happens:

  1. Event signalling a class entry.
  2. Event signalling a new ruby line that has been encountered.
  3. Event signalling that the end statement for the class definition has been reached.

Finally, in the last label (to_string), the trace arguments of 8 and 16 correspond to RUBY_EVENT_CALL and RUBY_EVENT_RETURN.

leave

The leave VM instruction simply pops the stack frame and is essential to start of execution of the next instruction. There will be only one leave per instruction sequence but there can be multiple trace instructions as we already saw in the example above.

Usage

The events emitted by the trace instruction can be utilized by some tools like set_trace_func and TracePoint. Both set_trace_func and TracePoint API register event hooks that just call the Proc that you pass in.

Examples of TracePoint and set_trace_func usage can be seen at the RubyDoc website.

Here lies the reason why set_trace_func is slow. As mentioned earlier, trace instruction simply executes all the events that are registered. Both set_trace_func and TracePoint API register event hooks based on the configuration your provide to them. The usage of these two libraries can be found on the Rubydoc website. The way you use them differs slightly. In the case of TracePoint, the event to listen to is specified upfront — that said, if no event is specified, the library adds a hook pertaining to all events — and in the case of set_trace_func a proc is passed as an argument. This argument gets called every time a Ruby event occurs. That’s right, for every event, the argument gets called. In addition to that, the binding information when the event is instantiated is also computed and that slows down the tracing even more. In the case of TracePoint though, you get access to a tracepoint object that can be used to fetch the data that’s required and it won’t compute the extra information unless it’s requested. This not only improves the API, it also improves the performance of tracing.

Disabling trace

The trace instruction generation can be disabled by adding the following lines to the program and eval-ing the code under question. For example, in the case of the program at the beginning, to disable trace instruction, this does it:

RubyVM::InstructionSequence.compile_options = {
  trace_instruction: false
}

method = <<-EOF
class A
  def to_string
    to_s
  end
end

A.new.to_string
EOF

puts RubyVM::InstructionSequence.new(method).disasm

The output of this program would be:

== disasm: <RubyVM::InstructionSequence:<compiled>@<compiled>>==========
0000 putspecialobject 3                                               (   1)
0002 putnil
0003 defineclass      :A, <class:A>, 0
0007 pop
0008 getinlinecache   15, <is:0>                                      (   7)
0011 getconstant      :A
0013 setinlinecache   <is:0>
0015 opt_send_simple  <callinfo!mid:new, argc:0, ARGS_SKIP>
0017 opt_send_simple  <callinfo!mid:to_string, argc:0, ARGS_SKIP>
0019 leave
== disasm: <RubyVM::InstructionSequence:<class:A>@<compiled>>===========
0000 putspecialobject 1                                               (   2)
0002 putspecialobject 2
0004 putobject        :to_string
0006 putiseq          to_string
0008 opt_send_simple  <callinfo!mid:core#define_method, argc:3, ARGS_SKIP>
0010 leave
== disasm: <RubyVM::InstructionSequence:to_string@<compiled>>===========
0000 putself                                                          (   3)
0001 opt_send_simple  <callinfo!mid:to_s, argc:0, FCALL|VCALL|ARGS_SKIP>
0003 leave

As you can see, this does not have the trace instruction.

Summary

There are a lot of tooling support features that are being added to the latest Ruby. In fact, there are new TracePoint events added to Ruby 2.2.0 that were not present on 2.1.0. Tooling is a really important part for the growth and stability of a language. TracePoint’s C-API helps authors who write profilers like stackprof, ruby-prof to use a much simpler API and support more internal events which are accessible only via the C-API. And there has been a steady improvement on the quality and performance of the profilers which is always a good thing.