
Binding my time: a Heisenbug story

2026-02-17

I recently debugged a failing test where the underlying issue was trivial, but the investigation was surprisingly gnarly, because inspecting the bug with a debugger changed its behavior[1].

What?

I've created a reproduction of the issue below, but the actual class and methods have been modified, so don't get hung up on the exact semantics of the class. The gist of it is that we have a Rails model with a method that checks some preconditions, returns early if the preconditions are met, and otherwise goes on to perform additional logic.

# == Schema Information
#
# Table name: orders
#
#  id                   :bigint           not null, primary key
#  canceled_at          :datetime
#  delivered_at         :datetime
class Order < ApplicationRecord
  def update_status
    return if canceled_at.present? || delivered_at.present?

    # process rest of order, potentially setting the canceled_at or
    # delivered_at field.
  end
end

The failing test indicated that the delivered_at field was not being updated when we expected it to be, so I added a debugger breakpoint inside the method and reran the test.
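Roughly like this:

def update_status
  return if canceled_at.present? || delivered_at.present?

  debugger # breakpoint from ruby/debug

  # process rest of order, potentially setting the canceled_at or
  # delivered_at field.
end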

Nothing. The breakpoint didn't trigger and the test was still failing. I moved the breakpoint to the very top of the method and started stepping through the code, but the method immediately returned from the precondition check. Normal enough, but when I reran the test to check which of the two preconditions had triggered the early return, things got really weird.

(ruby) canceled_at.present?
false

(ruby) delivered_at.present?
false

With some trepidation, I updated the code and ran again:

def update_status
  should_return = canceled_at.present? || delivered_at.present?
  debugger
  return if should_return

  # process rest of order
end

(ruby) canceled_at.present?
false

(ruby) delivered_at.present?
false

(ruby) should_return
true

Oh dear.

The chasm opened up. I was staring into the abyss, and the abyss was staring back at me. What had seemed like a simple investigation into a failing test now had me questioning my sanity. How could (false || false) == true? How do you even debug that?

My mind was spinning.

We must have done something silly and overridden the || method somewhere in the code… But then how would anything work? And indeed, || isn't even a method that can be overridden; it's a built-in operator.
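(You can verify this yourself: the single pipe | is an ordinary operator method that classes can define, but || belongs to the grammar and won't even parse as a method name.)

# | is an ordinary operator method, so defining it is fine:
eval("def |(other); end")

# || is part of Ruby's grammar, not a method on any object, so this
# fails to parse at all:
begin
  eval("def ||(other); end")
rescue SyntaxError => e
  puts e.class # prints SyntaxError
end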

Okay, so there must be a ruby bug! But in the OR operator? C'mon… And in any case, running false || false or even canceled_at.present? || delivered_at.present? returned false, as expected.

While flailing around endlessly for a logical explanation, I was also instinctively poking around with the debugger, and while stepping deeper into the internals of Rails, I found a clue:

[34, 43] in ~/.rbenv/versions/3.3.6/lib/ruby/gems/3.3.0/gems/activerecord-8.1.2/lib/active_record/attribute_methods/read.rb
    34|       end
    35|
    36|       # This method exists to avoid the expensive primary_key check internally, without
    37|       # breaking compatibility with the read_attribute API
    38|       def _read_attribute(attr_name, &block) # :nodoc:
=>  39|         @attributes.fetch_value(attr_name, &block)
    40|       end
    41|
    42|       alias :attribute :_read_attribute
    43|       private :attribute
=>#0    ActiveRecord::AttributeMethods::Read#_read_attribute(attr_name="delivered_at", block=#<Proc:0x0000000127bf13c8 /Users/rollen/....) at ~/.rbenv/versions/3.3.6/lib/ruby/gems/3.3.0/gems/activerecord-8.1.2/lib/active_record/attribute_methods/read.rb:39
  #1    Order::GeneratedAttributeMethods#delivered_at at ~/.rbenv/versions/3.3.6/lib/ruby/gems/3.3.0/gems/activemodel-8.1.2/lib/active_model/attribute_methods.rb:273
  # and 33 frames (use `bt' command for all frames)
(ruby) @attributes.fetch_value(attr_name, &block)
2026-02-15 01:56:45.107254000 UTC +00:00

A-ha! So the delivered_at field is not nil, but actually has a date-time set. This explains why should_return is true, but then why would delivered_at.present? return false?

Some more poking around in the debugger pointed to an answer:

[2, 11] in ~/src/heisenbug/app/models/order.rb
     2|
     3| class Order < ApplicationRecord
     4|   def update_status
     5|     should_return = canceled_at.present? || delivered_at.present?
     6|
=>   7|     debugger
     8|
     9|     return if should_return
    10|
    11|     # do stuff
=>#0    Order#update_status at ~/src/heisenbug/app/models/order.rb:7
  #1    <main> at (heisenbug):2
  # and 31 frames (use `bt' command for all frames)
(rdbg) info locals    # command
%self = #<Order id: 1, canceled_at: nil, delivered_at: "2026-02-15 01:56:45...>
should_return = true
delivered_at = nil

So the delivered_at column does indeed contain a value, but the local variable delivered_at is nil. Some more flailing investigation uncovered the culprit later in the method[3]:

def update_status
  return if canceled_at.present? || delivered_at.present?

  # …
  delivered_at = schema.delivered_at
  if delivered_at < 5.minutes.ago
    # …
  end
  update_columns(delivered_at: delivered_at)
  # …
end

New theory: the local variable that we define later in the method must somehow be hoisted by the debugger into the interactive scope, shadowing the method call even before the actual assignment to delivered_at has run. To test the theory, I renamed the delivered_at local variable and re-ran the debugger:

(ruby) delivered_at
2026-02-15 01:56:45.107254000 UTC +00:00
(ruby) canceled_at.present? || delivered_at.present?
true
(ruby) should_return
true
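(For reference, the renamed version looked something like this; fetched_delivered_at is a placeholder, not the actual name.)

def update_status
  return if canceled_at.present? || delivered_at.present?

  # …
  # renamed so it no longer shadows the delivered_at attribute reader
  fetched_delivered_at = schema.delivered_at
  if fetched_delivered_at < 5.minutes.ago
    # …
  end
  update_columns(delivered_at: fetched_delivered_at)
  # …
end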

Success! Debugger and code finally agreed, and I was quickly able to fix the behavior behind the broken test. But, of course, the story doesn't stop there; I desperately needed to know

WHY??

Confusing as the behavior was, it's important to note that it is not a bug or even unintended behavior in the ruby debugger. Indeed, though not explicitly called out, it is exemplified in the README of the debug repository:

$ cat target.rb                        # Sample program
require 'debug'

a = 1
b = 2
binding.break                          # Program will stop here
c = 3
d = 4
binding.break                          # Program will stop here
p [a, b, c, d]

$ ruby target.rb                       # Run the program normally.
DEBUGGER: Session start (pid: 7604)
[1, 10] in target.rb
      1| require 'debug'
      2|
      3| a = 1
      4| b = 2
=>    5| binding.break                 # Now you can see it stops at this line
      6| c = 3
      7| d = 4
      8| binding.break
      9| p [a, b, c, d]
     10|
=>#0    <main> at target.rb:5

(rdbg) info locals                     # You can show local variables
=>#0    <main> at target.rb:5
%self => main
a => 1
b => 2
c => nil
d => nil

Notice that even though execution has stopped on line 5, c and d already exist, initialized to nil. Dig deeper into the debugger code and you'll find that this behavior comes from Kernel#binding, a method that captures all local variables and other information from the current scope into an instance of the Binding class. Even more powerfully, a Binding can be passed to eval to evaluate arbitrary code within the context of that binding:

class User
  def initialize(name, position)
    @name = name
    @position = position
  end

  def get_binding
    binding
  end
end

user = User.new('Joan', 'manager')
template = '{name: @name, position: @position}'

# evaluate template in context of the object
eval(template, user.get_binding)
#=> {:name=>"Joan", :position=>"manager"}
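Incidentally, this is the same mechanism ERB uses to give templates access to your objects' variables (reusing the user instance from above):

require 'erb'

# ERB#result evaluates the compiled template in the given binding
ERB.new('Name: <%= @name %>, position: <%= @position %>').result(user.get_binding)
#=> "Name: Joan, position: manager"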

This is of course extremely powerful behavior for building a debugger. You can even build a very simple debugger that lets users evaluate code at any point in the program by sending the binding to irb[4]:

require 'irb'

def custom_debugger(context)
  IRB.setup(nil)
  workspace = IRB::WorkSpace.new(context)
  irb = IRB::Irb.new(workspace)
  IRB.conf[:MAIN_CONTEXT] = irb.context
  catch(:IRB_EXIT) { irb.eval_input }
end

def method_with_bug
  name = "My name"
  raise 'error!'
rescue
  custom_debugger(binding)
end

method_with_bug

Throw that code into debug.rb and run it, and you'll be placed in an irb session with the context from inside the method_with_bug method:

❯ ruby debug.rb
irb(main):001> name
=> "My name"

This only moves the question one step deeper: why does binding capture local variables that haven't been defined yet?

Unlike some other languages, ruby doesn't require local variables to be declared before they are used. Ruby also doesn't require parentheses to call methods, meaning that a bare identifier like foo can be either a reference to a local variable or a method call. If the Ruby runtime had to make that determination every time it encountered an identifier, performance would suffer, so Ruby instead decides at parse time: any identifier that appears as the target of an assignment in a scope becomes a local variable of that scope, created (and initialized to nil) when the scope is entered rather than when the assignment runs. The Ruby documentation provides another example making this behavior clear:

a = 0 if false # does not assign to a

p local_variables # prints [:a]

p a # prints nil
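The same behavior is visible through a Binding inside any method. A minimal sketch (demo and x are throwaway names):

def demo
  # x is already in the frame's local table, thanks to the parse-time rule...
  p binding.local_variables        # prints [:x]
  # ...but it holds nil until the assignment actually runs
  p binding.local_variable_get(:x) # prints nil
  x = 1
  p x                              # prints 1
end

demo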

So there's the full story: the delivered_at attribute was shadowed by the delivered_at local variable in our debugging scope, because binding (which ruby/debug uses to power its interactive sessions) captures all local variables, even ones that haven't been assigned yet. The variables themselves spring into existence at parse time, not at assignment, for performance reasons.
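One last practical note: inside the debugger, an explicit receiver or ActiveRecord's read_attribute would have cut straight through the shadowing, since neither can be mistaken for a local variable. A hypothetical session at the original breakpoint (values taken from the session above):

(ruby) delivered_at                    # resolves to the shadowing local
nil
(ruby) self.delivered_at               # explicit receiver forces the method call
2026-02-15 01:56:45.107254000 UTC +00:00
(ruby) read_attribute(:delivered_at)   # ActiveRecord's attribute reader
2026-02-15 01:56:45.107254000 UTC +00:00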

And the moral of the story is: don't shadow your variables, kids.

[1] This class of bug is called a Heisenbug, after the Heisenberg Uncertainty Principle[2], in which observing a particular aspect of a (quantum) system invariably alters its state.

[2] More accurately the Observer Effect, which is often conflated with the Heisenberg principle. The latter states that the product of the standard deviations of position and momentum has a lower bound, so if you try to reduce your uncertainty of one of the two, your uncertainty of the other will increase to compensate.

This sounds (and is!) mysterious, but my favorite way of conceptualizing it is to instead think of position/wave-length uncertainty, where the behavior is much more clear. If you think of the first extreme, an infinitely repeating periodic wave, it's very easy to calculate its wavelength, but what is its position, if it extends to infinity? And at the other extreme, if you think of squishing the wave down to a single point, it's very easy to locate it in space, but what is its wavelength?

And yes, this is a footnote to a footnote, and a physics digression in a coding article, but it's my blog and I do what I want, damn it! Now where was I?

[3] Sorry for burying the lede if you were reading this as an exercise and trying to guess the answer! In my defense, while shadowing an attribute with a local variable seemed perhaps slightly odious, I never expected that it would lead to this level of confusion while debugging.

[4] Code taken from this great video by @CarlosPohloddev
