Mary Rose Cook's notebook

The public parts of my notebook.

Systematic debugging

When I am debugging my code, I try to consciously go through the same process every time. I form a hypothesis about what the problem is. I run experiments to test it. If my experiments disprove it, I form a new one and run some new experiments. If my experiments support it to some nebulous level of satisfaction, I assume I have found the source of the bug.

Today, I followed a thread from an Alan Kay talk about the verbosity of programs to the Nile graphics language and started trying to learn OMeta, the language used for writing language parsers. As I began working, it occurred to me that debugging is almost indistinguishable from exploring a system. As I tried to learn how the OMeta grammar definition language works, I was really trying to learn the rules of the system. To do this, I was forming hypotheses and testing them. Which is to say: debugging is exactly the same as learning a system, except the system happens to be broken.

It occurred to me that a recapitulation of my early exploration of OMeta would be a good demonstration of the hypothesise/experiment technique.

I was reading this OMeta tutorial. It showed this example of an OMeta grammar:

ometa Integer {
  Dig =
    '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9',
  Process =
    Dig Dig*
}

Integer.matchAll('42', 'Process');

From the tutorial, I knew that the code in braces was a grammar definition. I knew the top rule in the grammar was Process. I knew the last line was JavaScript code that took the grammar and tried to parse 42 with it, starting from the Process grammar rule. I knew that the parse would succeed and return [2]. I stopped reading the tutorial and started exploring the system.

My first thought was that it was strange that [2] was returned. 42 is an integer. Why wasn’t the whole thing returned? My first hypothesis was that the first Dig matched the 4, the second Dig matched the 2, and the * meant “return this”, so only the second digit was returned. I tested this by removing the *.

Process =
 Dig Dig
input: 42

2 was returned. Note that, unlike the last time, the 2 was not wrapped in brackets. This disproved my hypothesis and also introduced the brackets as a new part of the system that I hadn’t noticed before.

My second hypothesis was not so much a hypothesis as a question. What happens if I add a letter to the end of the input to make it 42a? These types of questions are a little dangerous because they can be hard to distinguish from the act of just trying something random in the hope of fixing a bug. In this case, I wasn’t trying to fix a bug, so I was safe. And I feel like a question is legitimate if it is designed to gather more experimental evidence about the behaviour of the system.

I added the * back in and changed the input to 42a.

Process =
 Dig Dig*
input: 42a

The result was [2]. This suggested the parser didn’t have to consume the whole input and would just parse until its rule was satisfied. I took this as my new hypothesis.

This hypothesis was really two hypotheses. First, the parser would need to completely satisfy its rules in order to return something, and satisfaction of Dig Dig meant matching two digits. Second, the parser was OK with only consuming part of the input. I tested the first part of the hypothesis by parsing 4.

Process =
 Dig Dig*
input: 4

It returned []. This meant the first part of my hypothesis was wrong, because there was no match error even though only one digit was available. In fact, it violated another latent assumption I’d made: that * meant “the rest of the matching input”. I abandoned my working hypothesis. I thought for a second and remembered that, in regular expressions, * means zero or more. I removed the * from the Process rule and ran the grammar on the input 4.

Process =
 Dig Dig
input: 4

A “match failed” error was shown.
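The regular expression analogy can be checked in plain JavaScript. This is only a sketch of the analogy, not of OMeta itself: the names twoDigits and digitThenMore are made up for illustration, and a regular expression reports the whole matched text rather than the value of the last sub-expression.

```javascript
// Regular-expression counterparts of the two Process rules (names are hypothetical).
const twoDigits = /^\d\d/;      // like Dig Dig: a digit, then exactly one more digit
const digitThenMore = /^\d\d*/; // like Dig Dig*: a digit, then zero or more digits

// The input 4 only satisfies the starred version.
const twoDigitsOn4 = twoDigits.test('4');         // false
const digitThenMoreOn4 = digitThenMore.test('4'); // true

// Without a trailing $, the regex is also happy to stop before the end of the input.
const matchedPrefix = digitThenMore.exec('42a')[0]; // '42'

console.log(twoDigitsOn4, digitThenMoreOn4, matchedPrefix);
```

The first two results mirror the experiments on the input 4: a failure without the *, and a match with it.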

I put the * back in and ran the grammar on the input 4321.

Process =
 Dig Dig*
input: 4321

[3, 2, 1] was returned, supporting my latest hypothesis.

At this point, my mental model - my hypothesis about the whole system - had become more detailed and better substantiated.
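One way to write that mental model down is as a toy parser in plain JavaScript. This is a sketch of the rules I had inferred, not OMeta's implementation. Every name in it (dig, seq, star, parse) is hypothetical, and it returns matched characters as strings, so [2] appears as ['2'].

```javascript
// Toy model of the inferred rules. NOT OMeta's implementation; all names are hypothetical.
// A parser takes (input, pos) and returns { value, pos } on success or null on failure.

// Dig: match a single character between '0' and '9'.
const dig = (input, pos) =>
  pos < input.length && input[pos] >= '0' && input[pos] <= '9'
    ? { value: input[pos], pos: pos + 1 }
    : null;

// Sequence: run each parser in order; the result is the value of the LAST one.
const seq = (...parsers) => (input, pos) => {
  let result = null;
  for (const p of parsers) {
    result = p(input, pos);
    if (result === null) return null; // any failure fails the whole sequence
    pos = result.pos;
  }
  return result;
};

// Star: match zero or more times and collect the matched values in an array.
const star = (p) => (input, pos) => {
  const values = [];
  for (let r = p(input, pos); r !== null; r = p(input, pos)) {
    values.push(r.value);
    pos = r.pos;
  }
  return { value: values, pos };
};

// Parse from the start. Leftover input is allowed, as the 42a experiment suggested.
const parse = (parser, input) => {
  const result = parser(input, 0);
  if (result === null) throw new Error('match failed');
  return result.value;
};

const digDigStar = seq(dig, star(dig)); // Process = Dig Dig*
const digDig = seq(dig, dig);           // Process = Dig Dig

console.log(parse(digDigStar, '42'));   // [ '2' ]
console.log(parse(digDig, '42'));       // 2
console.log(parse(digDigStar, '42a'));  // [ '2' ]
console.log(parse(digDigStar, '4'));    // []
console.log(parse(digDigStar, '4321')); // [ '3', '2', '1' ]
```

Calling parse(digDig, '4') throws a “match failed” error, as in the experiment without the *. The model reproduces every result above, which supports the hypothesis without proving that OMeta works this way.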

I have more questions, but, at some point, I will run out of questions and will have proved my hypothesis about the system to that nebulous level of satisfaction and will start using my knowledge to build something. And, at some point, I will discover that this hypothesis is wrong and will start the hypothesise/experiment cycle again.
