Object Teams 2.1 Milestone 7 (finally) brings hot code replacement

As part of Juno Milestone 7, Object Teams has also delivered its Milestone 7.

The main new feature in this milestone: hot code replacement finally works when debugging OT/J or OT/Equinox applications.

What took us so long to make it work?

  • Well, hot code replacement didn’t work out of the box because our load-time weaver only worked the first time each class was loaded. When trying to redefine a class in the VM the weaver was not called and thus class signatures could differ between the first loaded class and the redefined version. The VM would then reject the redefinition due to that signature change.
  • Secondly, we didn’t address this issue earlier, because I suspected this would be pretty tricky to implement. When I finally started to work on it, reality proved me wrong: the fix was actually pretty simple 🙂
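To illustrate the first point, here is a minimal sketch using the standard java.lang.instrument API. It is not the actual Object Teams weaver, and nothing here says how our weaver is hooked into the VM or into Equinox. The class name SketchWeaver, its log field and the no-op weave step are assumptions made up for this example. The only real piece is the transform signature, where the classBeingRedefined argument is null on first load and non-null on redefinition.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical weaver (illustration only): treats first load and redefinition alike. */
class SketchWeaver implements ClassFileTransformer {

    /** Records which path each call took, so both paths are visible. */
    final List<String> log = new ArrayList<>();

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // null means "first load", non-null means "redefine or retransform"
        String phase = (classBeingRedefined == null) ? "load" : "redefine";
        log.add(phase + ":" + className);
        // Weave on BOTH paths: if redefinition skipped this step, the redefined
        // class could end up with a different signature than the loaded one.
        return weave(classfileBuffer);
    }

    /** Placeholder for real weaving (assumption): returns an unchanged copy. */
    static byte[] weave(byte[] original) {
        return original.clone();
    }
}
```

The point of the sketch is only that the same weave step runs in both phases, so the bytes handed to the VM have the same shape whether the class is loaded or redefined.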

In fact, the part that makes it work even in an Equinox setting is so generic that I proposed to migrate the implementation into either Equinox or PDE/Debug. Let’s see if there is interest.

Now when you debug any Object Teams application, your code changes can be updated live in the running debug target, no matter if you are changing teams, roles or base classes. Together with our already great debugging support, this makes debugging Object Teams programs even faster.

More new features can be found in the New&Noteworthy (accumulated since the Indigo release).

Builds are like real software – or even more so

Being a part-time release engineer for the Object Teams project, I can only agree with every word Kim writes about the job. I wish I could hire her for our project 🙂

She writes:

“Nobody in needs to understand how the build works, they just need to push a button. That’s great. Until the day before a release when your build fails with a cryptic message about unresolved dependencies. And you have no idea how to fix it. And neither does anyone else on the team.”

That puts a sad smile on my face and I’d like to add a little quality metric that seems cruel for today’s build systems, but might actually be useful for any software:

No software can be better than its worst error message.

One extreme I experienced was in a PDE/Build-ant-build which I had to set to verbose to get any useful answer, but then I had to find the relevant error message deeply buried in literally tens of megabytes of log output. It takes ages to browse that log file. Other tools rank towards the other end of the spectrum, saying basically “it didn’t work”.

Why is the worst error message relevant? When you hit that worst message it’s close to saying “game over”. Especially when working on a build I’ve come to the point time and again where all my creativity and productivity came to a grinding halt and for days or weeks I simply made zero progress because I had no idea why that system didn’t work and what it expected me to do to fix the thing. Knock-out.

Obviously I hate that state when I make no progress towards my goal. And typically that state is reached by poor communication from some framework back to me.

Real coolness

I know people usually don’t like to work on improving error messages, but please, don’t think good error messages are any bit less cool than running your software on Mars. On the one hand we try to build tools that improve developers’ productivity by a few percent, and then the tool will give answers that bring that very productivity down to zero. That’s – inconsistent.

I’m tempted to repeat the p2 story here. Many will remember the merciless dump of data from the sat solver that p2 gave in its early days. Some will consider the problem solved by now. Judge for yourself: what’s the worst-case time a regular Eclipse user will need to understand what p2 is telling him/her in one of its error messages?

The intention of this post is not to blame any particular technology. The list would be quite long anyway. It’s about general awareness (big words, sorry 🙂 ).

Consider the worst case

Again, why worst case? Because the worst case will happen. And it’s enough if it hits you once to easily cancel out all the time savings the tool otherwise brought to you.


Framework developers, tool smiths: let your software communicate with the user and let it be especially helpful when the user is in dire need of help.

One small contribution in this field I’d like to share with you: in the OTDT every error/warning raised by the compiler not only tries to precisely describe what’s wrong but it is directly linked to the corresponding paragraph in the language definition that is violated by the current code. At least this should completely explain why the current code is wrong. It’s a small step, but I feel a strong need for linking specific help to every error message.
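The idea of tying each message to the rule it enforces can be sketched in a few lines. This is a minimal illustration and not the OTDT’s actual data structure: the class LinkedDiagnostic, its fields and the paragraph number used below are all made up for this example.

```java
import java.util.Objects;

/** Hypothetical diagnostic (illustration only): cannot exist without a spec reference. */
final class LinkedDiagnostic {
    final String message;
    final String specParagraph; // made-up format, e.g. "9.9(z)"

    LinkedDiagnostic(String message, String specParagraph) {
        this.message = Objects.requireNonNull(message, "message");
        // Refusing a missing reference forces whoever adds an error
        // to also say which rule of the language definition it enforces.
        this.specParagraph = Objects.requireNonNull(specParagraph, "specParagraph");
    }

    /** Text shown to the user: what is wrong plus where the rule is defined. */
    String render() {
        return message + " (see language definition, paragraph " + specParagraph + ")";
    }
}
```

Making the reference mandatory in the constructor is a design choice of this sketch: an error message without a pointer to specific help simply cannot be created.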

But first, the software has to anticipate every single error that will occur in order to produce useful messages. That’s the real reason why creating complex software is so challenging. Be it a build system or the “real” software.

Be cool, give superb error messages!

Why I’m sometimes a bad bug reporter

OK, I use Eclipse for getting some work done. Eclipse is software so we know it contains bugs. Given that Eclipse is open source software, we all can only expect it to run smoothly if we diligently report back all errors we encounter (and provide steps on how to reproduce etc.). I know all that and I really want to be a good bug reporter because I really want a good experience using Eclipse because I really want to get the work done (and I may even want to sustain the impression that I’ve got everything under control and thus working with Eclipse is a delight).

A task

The other day, I was a very bad bug reporter, and only today I find some time to reason about what happened. This was my task: In preparing the initial contribution for the Object Teams Project I had to rename a bunch of things from org.objectteams to org.eclipse.objectteams. Simple, huh? Back in the days of emacs+bash I might have chosen to just use one big

find . -type f -exec /bin/sed -i -e "s/org\.objectteams/org.eclipse.objectteams/g" {} \;

and voilà, if Fortuna was with me, I might have been done at the return of that single command. But things were actually just a little bit more challenging: a few occurrences would have to remain unchanged, and while touching about every single file in our software I was going to also do some clean-up: renaming some packages too, fixing some copyright headers, etc. Also, I preferred to change one plug-in at a time, which would mean that all references to plug-ins not yet processed should stay unchanged, too. Plus a few more little deviations from the grand search-and-replace-globally.
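A scripted version of such a selective rename might look roughly like the following sketch. It is an assumption-laden illustration, not something I actually used: the class SelectiveRename, the method rename and the idea of passing a list of names to keep are all hypothetical. It only handles the text of one file and leaves directory traversal out.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (illustration only): rename with exceptions. */
class SelectiveRename {

    // org.objectteams plus any trailing dotted segments; the lookbehind keeps
    // us from matching in the middle of a longer name such as xorg.objectteams.
    private static final Pattern OLD =
            Pattern.compile("(?<![\\w.])org\\.objectteams((?:\\.\\w+)*)");

    /** Renames every occurrence unless it equals or lies below a kept name. */
    static String rename(String text, List<String> keep) {
        Matcher m = OLD.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String full = m.group();
            boolean untouched = keep.stream()
                    .anyMatch(k -> full.equals(k) || full.startsWith(k + "."));
            String replacement = untouched
                    ? full
                    : "org.eclipse.objectteams" + m.group(1);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling rename with the names of the plug-ins not yet processed in the keep list leaves references to them alone. Since the new name no longer matches the pattern, running it twice on the same text changes nothing the second time.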

OK, since I like to see myself as an Eclipse wizard, this was a nice challenge for its refactoring support. Plug-in by plug-in I renamed, I renamed packages with / or without subpackages, and after each step I wanted to see that compiler and PDE agree with all changes and signal that everything is still (or again) consistent, ready to be built, actually. Perhaps things started to go wrong when I estimated the effort as one or two hours. So, after a day or so, I wasn’t perfectly relaxed any more. My fault, should’ve known better about that estimate. BTW, one of the reasons it took so long was simply the size of my workspace in comparison to the power of my computer / hard-disk: every time I performed a rename with updates in non-Java files, I was nervously looking at the screen: “should I sit and wait for the preview page, or should I go to the kitchen, get a chocolate, coffee, just something?” I did some emailing in parallel, but let’s just keep this: due to those response times I wasn’t perfectly relaxed any more.

A story of bug reporting

What I was going to tell here is a story of bug reporting, because, as a safe bet, doing a real-life stress test on an Eclipse component should give you a good chance to discover and report a few bugs that have not yet been reported by others. And indeed, I was successful in discovering some bugs, in various components actually.

I think one of the first things that occurred was that the svn synchronize view would sometimes fail to open the compare editor; more precisely, I had to explicitly close the compare editor before comparing the next file. At first this really **** me off, because the error dialog was popping up in some kind of infinite loop. Fun!#$ Once I’d figured out how to work around this problem, it soon became a habit to just close the compare editor before clicking the next file. Next, the svn plugin made a refactoring fail, because it was trying to create a directory which the previous refactoring had already created. The most creative bug having to do with subversive was a chain reaction of first failing to undo a refactoring and then, while reporting this error, blocking the UI of Eclipse so I could only kill the Eclipse process, insert a new coin and start a new game.

I don’t intend to blame a particular component. For clean-up of license headers I have a little home-grown plugin that I just wanted to quickly install into the running Eclipse instance, so I went for the cool new feature to export/install into the host. Oops, my plugin depends on another plugin that only exists in the workspace but not in the host, install failed for good reasons. I removed the dependency and tried again. Installation still failed for the same reason: the ghost of this removed dependency prevented installation into the host Eclipse. Oh, I should have incremented the version or let a version qualifier do this automatically, of course. Tried again, still failed. Tried something so slightly different I cannot recall, from there on it worked. Can I reproduce the two or three different levels of failure? I didn’t even take the time to think of it. Well I would’ve been disappointed without a bug from p2 in this list 😉 .

PDE did its share by reporting overlapping text edits in plugin.xml and therefore disabling its refactoring participant. What the **** caused those overlapping text edits, and how do I re-enable the refactoring participant to give it one more chance to behave well?

The list could go on if only I could remember. Instead I was happy to finish this 1.5-hour task after 2.7 days, ready to submit our initial code contribution, wow!

Looking back, I / we missed a great opportunity: we could have identified plenty of bugs in various components of Eclipse. With only a few more days of debugging I might have been able to present steps to reproduce all those bugs. And, if triaged and fixed by the corresponding devs, this might have contributed to M6 containing fewer of those bugs that only occur in the real world, never during testing. I failed: I managed to submit only two bug reports, with very little information on how to reproduce them.

Lesson learned

Susan McCourt responded to an earlier bug report of mine in a very descriptive comment:

That is one of those things I’ve been meaning to fix forever, never wrote a
bug, and so keep forgetting to fix. And it seems like if I’m actually
[doing what triggers the bug], it’s because something is wrong, and so I again postpone
writing a bug.

Sure, when we hit a bug (or a bug hits us) we are always in some context of doing something challenging. Something that requires our mind to stay in focus. Something we want to get done.
Well, work isn’t perfectly linear, so we know how to react to interrupts. Bugs are such interrupts. Sometimes I like the challenge of isolating a bug etc. Sometimes I’m sufficiently relaxed when the bug occurs so I actually take the challenge. Sometimes the bug is sufficiently good-natured so making a small note and going back to the bug after the actual work is done is a perfect strategy. Some bugs, however, smell like depending on so many factors from your current context that reproduction an hour later seems extremely unlikely.

I think I have a solution to all this: given we don’t want to be distracted from our actual work, given also that some bugs need immediate care or they will escape our attempts to identify them, and given that some of the worst moments are when we start to isolate a bug and during that task a second bug stops us from working on the first bug etc., the only way to relentlessly follow all those tasks is to enable branching universes in your working environment. The frequent use of the word “task” may give a hint that I should finally start using Mylyn (I have no excuse for not doing so), but I would need a Mylyn that is able to capture full contexts: the complete state of my machine plus the full state of my brain. As a start I’m dreaming of always working in a virtual machine, and whenever something strange happens, I quickly (!!) create a snapshot of the virtual machine. Then I could first isolate (and fix 🙂 ) the bug that just occurred, and then go back to the exact point where I interrupted my work and act as if nothing had happened. Branching universes with the ability of backporting fixes between branches is what I need. Of course the clock needs to be reset when returning from a bug reporting / fixing branch.

Yeah, that’s why I can’t quite live up to my dreams of good participation in open source development: I haven’t figured out how to enable branching universes for my character. If anybody has a hint on how to do this, or any other strategy to not get overwhelmed between work and bug reporting, I’d really appreciate it.

And if I don’t smile while writing a bug report, please excuse me: I might just be terribly stressed because your bug interrupted my work on isolating another bug that stopped me from doing …

“Always look on the bright side of life…” 🙂

