The Object Teams Blog

Adding team spirit to your objects.


Book chapter published: Confined Roles and Decapsulation in Object Teams — Contradiction or Synergy?


I strongly believe that for perfect modularity, encapsulation in plain Java is both too weak and too strong. This is the fundamental assumption behind a book chapter that has just been published by Springer.

The book is:
Aliasing in Object-Oriented Programming. Types, Analysis and Verification
My chapter is:
Confined Roles and Decapsulation in Object Teams — Contradiction or Synergy?

The concepts in this chapter relate back to the academic part of my career, but all of my more pragmatic tasks since those days indicate that we are still very far away from optimal modularity, and both mistakes are found in real-world software: permitting access too openly and restricting access too strictly. More often than not, it’s the same software exhibiting both mistakes at once.

For the reader unfamiliar with the notions of alias control etc., let me borrow from the introduction of the book:

Aliasing occurs when two or more references to an object exist within the object graph of a running program. Although aliasing is essential in object-oriented programming as it allows programmers to implement designs involving sharing, it is problematic because its presence makes it difficult to reason about the object at the end of an alias—via an alias, an object’s state can change underfoot.

ISBN 978-3-642-36946-9

Aliasing, by the way, is one of the reasons why analysis for @Nullable fields is a thorny issue. If alias control could be applied to @Nullable fields in Java, much better static analysis would be possible.


How is encapsulation in Java too weak?

Java only allows protecting code, not objects

This manifests at two levels:

Member access across instances

In a previous post I mentioned that the strongest encapsulation in Java – using the private modifier – doesn’t help to protect a person’s wallet against access from any other person. This is a legacy from the pre-object-oriented days of structured programming. In terms of encapsulation, a Java class is a module utterly unaware of the concept of instantiation. This defect is all the more frustrating as better precedents (think of Eiffel) had been set before the inception of Java.
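In plain Java this looks as follows (a minimal sketch, names invented): any Person instance can freely manipulate the private wallet of any other Person, because access rights are granted per class, not per instance.

class Wallet {
    private int cents;
    void take(int amount) { cents -= amount; }
}

class Person {
    private Wallet wallet = new Wallet();

    void pickPocket(Person victim) {
        // Compiles without complaint: "private" is checked against the
        // enclosing class, so one Person may reach into another's wallet.
        victim.wallet.take(100);
    }
}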

A type system that is aware of instances, not just classes, is a prerequisite for any kind of alias control.

Object access meets polymorphism

Assume you declare a class as private because you want to keep a particular thing absolutely local to the current module. Does such a declaration provide sufficient protection? No. That private class may just extend another – public – class or implement a public interface. By using polymorphism (an invisible type widening suffices) an instance of the private class can still be passed around in the entire program – by its public super type. As you can see, applying private at the class level doesn’t make any objects private; only this particular class is. Since every class extends at least Object, there is no way to confine an object to a given module; by widening, all objects are always visible to all parts of the program. Put dynamic binding of method calls into the mix, and all kinds of “interesting” effects can be “achieved” on an object whose class is “secured” by the private keyword.
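A minimal Java sketch of this leak (names invented): the class is private, but its instances travel through the entire program via a public supertype.

class Outer {
    private static class Secret implements Runnable {   // the class is private ...
        public void run() { System.out.println("doing secret work"); }
    }

    public static Runnable leak() {
        return new Secret();   // ... but the object escapes by widening to Runnable
    }
}

// Anywhere else in the program:
//   Runnable r = Outer.leak();
//   r.run();   // dynamic binding still reaches the code of the "private" class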

The type system of OT/J supports instance-based protection.

Java’s deficiencies outlined above are overcome in OT/J by two major concepts:

Dependent types
Any role class is a property of the enclosing team instance. The type system allows reasoning about how a role is owned by this enclosing team instance. (read the spec: 1.2.2)
Confined roles
The possible leak by widening can be prevented by sub-classing a predefined role class Confined, which does not extend Object (read the spec: 7.2). A small sketch combining both concepts follows below.
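For a first impression, here is a rough OT/J sketch (all names invented; I’m only paraphrasing the syntax of OTJLD 1.2.2 and 7.2, so take it as a flavor, not as the chapter’s example):

public team class Vault {

    // (1) Dependent type: each Item belongs to exactly one Vault instance.
    public class Item {
        String description = "";
    }

    // (2) Confined role: Secret does not extend java.lang.Object, so it can
    //     never be widened and hence never leaks out of this team instance.
    protected class Secret extends Confined {
        String key = "classified";
    }
}

class Client {
    void demo() {
        final Vault v1 = new Vault();
        final Vault v2 = new Vault();
        v1.Item item = v1.new Item();   // role type anchored to the instance v1
        // v2.Item other = item;        // rejected: anchors v1 and v2 differ
        // There is no way to even mention type Secret here, let alone hold a reference.
    }
}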

For details of the type system, why it is suitable for mending the given problems, and why it doesn’t hurt in day-to-day programming, I have to refer you to the book chapter.


How is encapsulation in Java too strict?

If you are a developer with a protective attitude towards “your” code, you will make a lot of things private. Good for you, you’ve created (relatively) well-encapsulated software.
But when someone else is trying to make use of “your” code (re-use) in a slightly unanticipated setting (calling for unanticipated adaptations), guess what: s/he’ll curse you for your protective attitude.
Have you ever tried to re-use an existing class and failed, because some **** detail was private and there was simply no way to access or override that particular piece? If you’ve been in this situation before, you’ll know there are basically two answers:

  1. Give up, simply don’t use that (overly?) protective class and recode the thing (which more often than not causes a ripple effect: you want to copy one method and end up copying 5 or more classes). Object-orientation is strong on re-use, heh?
  2. Use brute force and don’t tell anybody (tools that come in handy are: binary editing the class file, or calling Method.setAccessible(true); see the reflection sketch below). I’m not quite sure why I keep thinking of Core Wars as I write this 🙂 .
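For illustration, option 2 can look as innocent as this little sketch (class and method names invented; note that a security manager or, on newer JVMs, module boundaries may still veto the call):

import java.lang.reflect.Method;

class BruteForce {
    static Object callHidden(Object target) throws ReflectiveOperationException {
        // Look up a private method and punch through its access control:
        Method hidden = target.getClass().getDeclaredMethod("computeInternal");
        hidden.setAccessible(true);
        return hidden.invoke(target);
    }
}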

OT/J opens doors for negotiation, rather than arms race & battle

The Object Teams solution rests on two pillars:

  1. Give a name to the act of disregarding an encapsulation boundary: decapsulation. Provide the technical means to punch tiny little holes into the armor of a class / an object. Make this explicit so you can talk and reason about the extent of decapsulation (read the spec: 2.1.2(c); a small callout sketch follows below).
  2. Separate technical possibility from questions of what is / should / shouldn’t be allowed. Don’t let technology dictate rules but make it possible to formulate and enforce such rules on top of the existing technology.
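To give a rough idea (base class, role and method names are all invented, and the syntax follows my reading of the OT/J language definition, so treat it as a sketch): a callout binding exposes exactly one private member to exactly one role, and the compiler reports this decapsulation with a configurable diagnostic.

class LegacyService {
    private String dumpInternalState() { return "internal"; }   // the member we need
}

public team class ReuseAdaptor {

    protected class Inspector playedBy LegacyService {
        // Callout binding to a private base method: explicit, local decapsulation,
        // flagged by the compiler with a configurable "decapsulation" diagnostic.
        String internalState() -> String dumpInternalState();
    }

    // Declared lifting: a LegacyService enters this team as its Inspector role.
    public String inspect(LegacyService as Inspector i) {
        return i.internalState();
    }
}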

Knowing that this topic is controversial, I leave it at this for now (a previous discussion in this field can be found here).


Putting it all together

  1. If you want to protect your objects, do so using concepts that are stronger than Java’s “private”.
  2. Using decapsulation where needed fosters effective re-use without the need for “speculative API”, i.e., making things public “just in case”.
  3. Dependent types and confined roles are a pragmatic entry into the realm of programs that can be statically analysed, with strong guarantees given by such analysis.
  4. Don’t let technology dictate the rules; we need to put the relevant stakeholders back into focus: module providers, application developers and end users have different views. Technology should just empower each of them to get their job done, and facilitate transparency and analyzability where desired.

Some of this may be surprising, some may sound like a purely academic exercise, but I’m convinced that the above ingredients, as supported in OT/J, enable the development of complex modular systems with an unprecedentedly low effective coupling.

FYI, here’re the section headings of my chapter:

  1. Many Faces of Modularity
  2. Confined Roles
    1. Safe Polymorphism
    2. From Confined Types to Confined Roles
    3. Adding Basic Flexibility to Confined Roles
  3. Non-hierarchical Structures
    1. Role Playing
    2. Translation Polymorphism
  4. Separate Worlds, Yet Connected
    1. Layered Designs
    2. Improving Encapsulation by Means of Decapsulation
    3. Zero Reference Roles
    4. Connecting Architecture Levels
  5. Instances and Dynamism
  6. Practical Experience
    1. Initial Comparative Study
    2. Application in Tool Smithing
  7. Related Work
    1. Nesting, Virtual Classes, Dependent Types
    2. Multi-dimensional Separation of Concerns
    3. Modules for Alias Control
  8. Conclusion

Written by Stephan Herrmann

March 30, 2013 at 22:11

The message is the message


I have been ranting about bad error messages, so in my own work, error messages had better be helpful. At least I try.

In the recent milestone 6 of the JDT, a significant novelty was actually mostly about the wording of a few error/warning messages issued by the null analysis of the JDT compiler. We actually had quite a discussion (no, I’m not expecting you to read all the comments in the bug 🙂 ).

Why did we go through all this instead of using the time to fix more bugs? Because the entire point of implementing more compiler warnings, and in particular of introducing annotation-based null analysis, is to help you better understand possible weaknesses of your code. This means our job isn’t done when the error message is printed on your screen, but only when you recognize why (and possibly how) your code could be improved.

So the game is: when you see one of the new compiler messages, ask: “what does it tell you?”

Intended “inconsistency”

Let’s start with an example that at first looks inconsistent:
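The screenshot is not reproduced here, but the code under discussion had roughly the following shape (a sketch only; the line numbers quoted below refer to the original screenshot, and @NonNull is assumed to be org.eclipse.jdt.annotation.NonNull):

void flowBased(boolean b) {
    String val1 = "OK";
    if (val1 != null)        // redundant null check (flow analysis)
        System.out.println(val1);
    if (val1 == null)        // comparison always yields false (flow analysis)
        return;
    if (b)
        val1 = null;         // from here on val1 is potentially null ...
    if (val1 != null)        // ... so this check is useful: no warning
        System.out.println(val1);
}

void declarationBased(boolean b) {
    @NonNull String val2 = "OK";
    if (val2 != null)        // redundant null check (by declaration)
        System.out.println(val2);
    if (val2 == null)        // comparison always yields false (by declaration)
        return;
    if (b)
        val2 = null;         // error: null type mismatch
    if (val2 != null)        // still judged by the declaration: redundant
        System.out.println(val2);
}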

Both methods basically have the same code, yet lines 14-17 are free of any warnings whereas the corresponding lines 24-27 have one warning and even an error. What does it tell us?

Here are some of the messages:

line 10 Redundant null check: The variable val1 cannot be null at this location
line 12 Null comparison always yields false: The variable val1 cannot be null at this location

Before bug 365859 the second method would show the same messages, giving no clue why, after the identical start, the same code produces different results later on. The initial improvement in that bug was to update the messages like this:

line 20 Redundant null check: The variable val2 is specified as @NonNull
line 22 Null comparison always yields false: The variable val2 is specified as @NonNull

Alright! Here lies the difference: in the first method, compiler warnings are based on the fact that we see an assignment with the non-null value "OK" and carefully follow each data-flow from there on. Non-null definitely holds until line 15, where potentially (depending on where b takes the control flow) null is assigned. Now the check in line 16 appears useful.

By contrast, the warnings in the second method tell us that they are not based on flow analysis, but on the mere fact that val2 is declared with type @NonNull String. This specification is in effect independent of location and flow, which has two consequences: the assignment in line 25 is now illegal; and since we can’t accept this assignment, line 26 still judges by the declaration of val2, which says: @NonNull:

line 25 Null type mismatch: required ‘@NonNull String’ but the provided value is null
line 26 Redundant null check: The variable val2 is specified as @NonNull

Communicate the reasoning

Three levels to a good error message:

  1. “You did wrong.”
  2. “Your mistake was …”
  3. “This is wrong because …”

By now you probably know what I think of tools that answer using (1). To go from (2) towards (3) we split one particular error message into three. Look at this:
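Again I can only sketch the code from the screenshot (variable names tmp1, tmp2 and unknown appear in the discussion below, the rest is assumed; line numbers refer to the original screenshot):

void messages(@Nullable String in) {
    @NonNull String tmp1 = null;       // line 31: provided value is null
    @NonNull String tmp2 = in;         // line 32: provided value is specified as @Nullable
    String unknown = in;               // line 33: no null specification ("legacy" type)
    @NonNull String tmp3 = unknown;    // line 34: provided value is inferred as @Nullable
}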

which gives these error messages:

line 31 Null type mismatch: required ‘@NonNull String’ but the provided value is null
line 32 Null type mismatch: required ‘@NonNull String’ but the provided value is specified as @Nullable
line 34 Null type mismatch: required ‘@NonNull String’ but the provided value is inferred as @Nullable

Line 31 is obvious.

Line 32 is flagged because in is declared as @Nullable, saying null is a legal value for in; since null is not legal for tmp2, the assignment is wrong.

In line 34 we are assigning a value that has no nullness specification; we say unknown has a “legacy” type. From that alone the compiler cannot decide whether the assignment in line 34 is good. However, using also the information from line 33, we can infer that unknown (probably) has type @Nullable String. In this particular case the inference is obvious, but the steps leading to such a conclusion can be arbitrarily complex.

What does this distinction tell you?

The error in line 31 is a plain violation of the specification: tmp1 is required to be nonnull, but the assignment definitely breaks that rule.

The error in line 32 denotes the conflict between two contradictory declarations. We know nothing about actual runtime values, but we can tell without any doubt that the code violates a rule.

Errors of the type in line 34 are reported as a courtesy to the user: you didn’t say what kind of variable unknown is, thus normally the compiler would be reluctant to report problems regarding its use, but looking a bit deeper the compiler can infer some missing information. Only in this category does it make sense to discuss whether the conclusion is correct. The inference inside the compiler might be wrong (which would be a compiler bug).

Sources of uncertainty

Of the previous messages, only the one in line 31 mentions a runtime fact; the remaining errors only refer to possibilities of null values where no null value is allowed. In these cases the program might actually work – by accident. Just like this program might work:

void maybe(Object val) {
   System.out.println(val.toUpperCase());
}

While this is not a legal Java program, a hypothetical compiler could produce runnable byte code, and if the method is invoked with an argument that happens to be a String, all is well – by accident.

While we have no guarantee that things would break at runtime, we know for sure that some rule has been broken and thus the program is rejected.

The following method shows a different kind of uncertainty:
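The method in question looked more or less like this (a sketch, names assumed, paralleling the safety2 example quoted next):

void safety1(String unspecified) {
    @NonNull String required = unspecified;   // no information whether this can be null
}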

What can we tell about this assignment? Well … we don’t know; it’s not definitely bad, but it’s not good either. What’s the problem? We need a @NonNull value, but we simply have no information whether unspecified can possibly be null or not. One of those legacy types again. After much back and forth we finally found that we have a precedent for this kind of problem: what does the compiler say to this snippet:

void safety2(List unspecified) {
    List<String> strings = unspecified;
}

Right, it says:

Type safety: The expression of type List needs unchecked conversion to conform to List<String>

meaning: we receive an argument with a type that lacks detailed specification, but we require such details on the left hand side of the assignment. Whether or not the RHS value matches the LHS requirement cannot be checked by the compiler. Argument unspecified has another kind of legacy type: a raw type. To gracefully handle the transition from legacy code to new code with more complete type specifications we only give a warning.

The same for null specifications:

line 41 Null type safety: The expression of type String needs unchecked conversion to conform to ‘@NonNull String’

In both cases, raw types and types lacking null specification, there are situations where ignoring this warning is actually OK: the legacy part of the code may well have been written with the full intention of conforming to the rule (of only putting strings into the list / using only nonnull values), but was not able to express this in the expected style (with type parameters / null annotations). Maybe the information is still documented, e.g., in the javadoc. If you can convince yourself that the code plays by the rules although not declaring to do so: fine. But the compiler cannot check this, so it passes the responsibility to you, along with this warning.

Tuning compiler messages

If you buy into null annotations, the distinction of what is reported as an error vs warning should hopefully be helpful out of the box. Should you wish to change this, please do so with care. Ignoring some errors can render the whole business of null annotations futile. Still we hope that the correspondence between compiler messages and configuration options is clear:

These options directly correspond to the messages shown above (problem locations and the option controlling each):

  • lines 31, 32: “Violation of null specification”
  • line 34: “Conflict between null annotations and null inference”
  • line 39: “Unchecked conversion from non-annotated type to @NonNull type”

Conclusion

The compiler does an awful lot of work trying to figure out whether your code makes sense: definitely, or maybe, or maybe not, or definitely not. We just decided it should try a little harder to also explain its findings. Still, these messages are constrained to be short statements, so another part of the explanation is of course our job: to educate people about the background and the rationale for the answers the compiler gives.

I do hope you find the messages helpful, maybe even more so with a little background knowledge.

The next steps will be: what’s a good method for gradually applying null annotations to existing code? And during that process, what’s a good method for reacting to the compiler messages so that from throwing code at the compiler and throwing error messages back we move towards a fruitful dialog, with you as the brilliant Holmes and the compiler your loyal assistant Watson, just a bit quicker than the original, but that’s elementary.

Written by Stephan Herrmann

April 15, 2012 at 01:53

Posted in Eclipse


Using git for an interactive video – Replay the JDT tutorial


When preparing the tutorial How to train the JDT dragon, Ayushman and I packed way more material than we could possibly teach in three hours’ time.

Luckily, Ayush came up with an idea that enables all of you to go through the exercise all on your own: instead of creating zip files with various stages of the example projects, we pushed everything to github. I apologize to those who were overwhelmed by how we pushed git onto the tutorial participants in Reston.

Now, as EclipseCon is over, here’s what you can do:

  • Fetch the slides: PDF or slideshare
  • Clone the material for the plain JDT part (git: URL), containing two projects for the first two exercises.
    • org.eclipsecon2012.jdt.tutorial1: traverse the Java Model starting from the element selected in the UI
    • org.eclipsecon2012.jdt.quickFix1: implement a quickfix for adding a missing null check
  • Clone the misc repo (git: URL), containing
    • handouts: instructions for the exercises
    • org.eclipsecon2012.jdt.populator: plug-in that was used in the tutorial to create sample content in the runtime workbench; showcases how to programmatically create Java files from String content
  • Clone the OT repo (git: URL), containing
    • AntiDemo: the OT/Equinox hello-world: prohibition of class names starting with “Foo”.
    • Reachability: a full blown reachability analysis in < 300 LOC.

The last exercise, “Reachability”, we couldn’t even show in Reston, but luckily this is the most detailed repo: in 13+1 commits I recorded the individual steps of adding inter-procedural reachability analysis piggy-back on top of the JDT compiler. If you replay these commits from the git repo you can watch me creating this little plug-in from scratch. Starting from step 5 you’ll be able to see the results when you fire up a runtime workbench: the analysis will print to stdout what methods it found during each full build.

Look at the commit comments which summarize what I would have explained in demo mode. Still the material is pretty dense as it explains three technologies at the same time:

  1. How does this analysis work?
  2. What points in the JDT do we hook onto?
  3. How is the integration achieved using Object Teams?

Speaking of integration: using the “Binding Editor” you can see all bindings in a tabular view:

Bindings of the Reachability plug-in

The right hand side of the table gives the full list of JDT elements we connect to:

  • classes that are extended using a role
  • methods/fields that are accessed externally via a callout binding
  • methods that are intercepted using a callin binding (after)

As a special bonus, step 14 in the git repo shows how to add problem markers for each unreachable method, so that the JDT will actually offer its existing quickfix for removing them:
Unreachable methods
(Yep, both methods are detected as unreachable: calling each other without being called from another reachable method doesn’t help).

I hope you enjoy playing with the examples. For questions please use one of the forums.

Written by Stephan Herrmann

April 7, 2012 at 15:20

New Honours, New Synergy


I’m flattered to see my name appear on one more Eclipse web-page:

Thanks to the JDT/Core team for accepting me as a new member!

Actually, I’ve been poking about the JDT source code since 2003, when we started implementing the Object Teams Development Tooling (OTDT) on top of the JDT. So it was only natural that I would stumble upon a bug in the JDT every once in a while (incl. one “great bug” 🙂 ). In this specific situation, reporting bugs and trying to help find solutions was more than just good Eclipse citizenship: the discussions in bugzilla also helped me to better understand the JDT sources and thus helped me develop the OTDT. And that’s what this post is about: synergy!

Personal Goals

Actually, watching the JDT/Core bug inbox provides a constant stream of cool riddles: spooky Java examples producing unexpected compiler/runtime results. Sometimes those are real fun to crack. Which compiler is right, which one isn’t? What’s the programmer trying to do? I’d say there’s no better way to test your Java knowledge 🙂

Also, watching this stream helps me stay informed, because every patch will eventually need to be merged into the Object Teams branch of the JDT/Core.

But I also have some pet projects that I’d like to actively push forward. Currently I’m thinking about these two:

  • Supporting inter-procedural null analysis
  • Making the compiler closer match the OSGi semantics

Inter-procedural null analysis

Naturally, the JDT/Core is a facilitator for tons of cool functionality in the IDE, but it hardly communicates directly with the user. Still, we have one channel where the compiler can actually talk to the developer and give advice that’s well worth your bucks: errors and warnings. Did you know that the compiler can immediately tell you what’s wrong about this piece of code:

void foo(String in) {
   if (in == null) new IllegalArgumentException("Null is not allowed here");
   // ... real work done here
}

(Yep, that was bug 236385; see the New&Noteworthy for how to enable it).

Many Java developers share the experience that one of the most frequent problems is also one of the most mundane: NPE. Those who are aware of this easily produce the opposite problem: cluttering their code with meaningless null-checks (many of which have no better action than the above IllegalArgumentException, which isn’t much better than throwing the NPE in the first place). The point is, we’d want to know which values can actually be null (and why!) so we write only the necessary null-checks, and for those we should think really hard about what would be a suitable action, right? (“This shouldn’t happen” is not an answer!)

Thus I’m happy I could get involved in some improvements of the compiler’s flow analysis. I also experienced that null analysis combined with aggressive optimization is a delicate issue – remember SDK 3.7M2a? Sorry about that, but in my defense I might add that bug 325755 had always been dormantly present, I only woke’em up. I must admit that for those 105 minutes my heartbeat was a bit above average 😉 .

Anyway, the current null analysis is still fundamentally limited: it knows nothing about the nullness of method arguments or method results. So it would be a really cool enhancement if we could feed such information into the compiler. Fortunately, the theory of how to do this (you may either think of improved type systems or of design by contract) is a well-explored subject. Actually, bug 186342 already has some patches in this direction. Since discussing in bugs with a long history is a bit tedious, I created this wiki page. Sadly, it’s not a technical difficulty that’s keeping us from immediately releasing that patch, but a stalled standardization process 😦 .
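Just to give an idea of what such a specification could buy us, here is a minimal sketch (annotation names are placeholders only, nothing is finalized yet):

@NonNull String orDefault(@Nullable String in) {
    if (in == null)
        return "<default>";   // handle the null case exactly once, here
    return in;
}

void client(@Nullable String maybe) {
    String result = orDefault(maybe);      // passing a possibly-null value is fine
    System.out.println(result.length());   // no null check needed: result declared @NonNull
}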

However, as outlined in the wiki, here comes another synergy: in case we can’t find a solution that will be sufficiently compatible with future standards, I can easily create an early-adopters release, so sorting out the details of implementation and usage can be done in parallel with the standardization process. How? I’d simply ship the compiler enhancement as an OT/Equinox-enabled plug-in, ready to be tried by early adopters.

OSGi-aware compiler

Some of you may have noticed (here, or here, or here, or here or one of the many duplicates) that the compiler’s concept of a (single, linear) classpath is not a good match for compiling OSGi code. I will soon write another wiki page with my current thinking about that issue. Stay tuned …

Comments?

For both issues I’d appreciate any comments / feedback. I’ll be checking updates of the bugs I listed, or the wiki page or … talk to me during ESE! Let’s move something together.

See you all in Ludwigsburg!
Stephan

Written by Stephan Herrmann

October 26, 2010 at 19:18