Thursday, June 29, 2006

constness of it all

As a designer of libraries and code that others use, I have come to an impasse with Java. I am convinced that it was originally designed for ultimate portability and application development, not for efficiency and library design. One sticking point that has bothered me since I got JDK 1.0 back in 1996 is the absence of const, which is commonplace in C/C++. The const keyword means something is not allowed to be changed (barring casts and void* hacks).

So if someone writes:

const char *p;

You know that p points to char data that cannot be changed through p (the pointer itself can still be reassigned). But const is not always this easy to understand; let's look at this example:

const char *const p = "text";


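This means that the characters cannot be changed through the pointer and the pointer itself (the memory location of the char data it holds) cannot be reassigned. Because such a pointer can never be reassigned, it has to be initialized where it is defined. You can complicate things further with: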
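To make the difference between the const positions concrete, here is a minimal sketch. The function names are hypothetical and exist only for illustration; the commented-out lines are the ones a compiler should reject.

```cpp
#include <cstddef>
#include <cstring>

// Pointer to const char: the characters are read-only through p,
// but p itself may be pointed at something else.
inline std::size_t lengthAfterReseat(const char *p, const char *other)
{
    // p[0] = 'x';   // error: the chars are const through p
    p = other;       // fine: the pointer itself is not const
    return std::strlen(p);
}

// Const pointer to mutable char: p cannot be reassigned,
// but the characters it points to can be written.
inline void upperFirst(char *const p)
{
    // p = 0;        // error: p itself is const
    p[0] = 'H';      // fine: the chars are not const
}

// Const pointer to const char: neither the pointer nor the
// characters can be changed through p.
inline char firstChar(const char *const p)
{
    // p[0] = 'x';   // error: the chars are const through p
    // p = 0;        // error: p itself is const
    return p[0];
}
```

Reading the declaration right to left helps: "p is a const pointer to a char that is const".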

const char *const get(const char *const) const;

Here a class member function accepts a pointer to char data that cannot be changed, held in a pointer that cannot be reassigned. It returns the same, and on top of that the trailing const means the function cannot change the object's data. A little much, but const can be a beautiful thing in the hands of an interface designer.
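A small sketch of what the trailing const buys you, using a hypothetical NameTable class that is not from any real library. The top-level const on the return type is dropped here because, for a pointer returned by value, the compiler ignores it anyway.

```cpp
#include <cstring>

// Hypothetical class, for illustration only.
class NameTable
{
public:
    NameTable() : hits_(0) {}

    // Read-only argument, read-only result, and a const method
    // that may not touch the data members.
    const char *get(const char *const key) const
    {
        // ++hits_;  // error: cannot modify a member in a const method
        return std::strcmp(key, "lang") == 0 ? "C++" : "";
    }

    // Non-const method: cannot be called on a const NameTable.
    void recordHit() { ++hits_; }

    int hits() const { return hits_; }

private:
    int hits_;
};
```

Given a `const NameTable table;`, calling `table.get("lang")` compiles, while `table.recordHit()` does not. That is exactly the guarantee an interface designer wants to hand out.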

Let's take a simple example of a class that is a global cache that you cannot change:

class GlobalCache
{
public:
    const ObjectData& get(const KeyName&) const;
};


Simple enough: you give it a reference to a key object, and it looks up and returns a constant ObjectData reference without changing its own contents. Passing data around by reference is efficient because no memory is allocated or copied in the process. The method is const (the last const in that declaration), so nothing is written to the cache object during that call. The most important const of them all is the first one: it means that you cannot change the data you get back in any way. Why is it important? As long as nothing else modifies the object, you don't need to synchronize access to it. In a multi-threaded environment you can have as many threads as you want reading that object without any locking.
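Here is one way the declaration above might be filled in. This is only a sketch: KeyName and ObjectData are stand-ins I made up (the real types could be anything), and the cache is assumed to be populated once, before any reader threads start.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins for the KeyName and ObjectData types.
typedef std::string KeyName;

struct ObjectData
{
    std::string payload;   // imagine something large here
};

class GlobalCache
{
public:
    // Assumption: the cache is filled once, up front, and never
    // written to afterwards.
    explicit GlobalCache(const std::map<KeyName, ObjectData>& entries)
        : entries_(entries) {}

    // Returns a read-only reference to the stored object; nothing is
    // copied, and because the method is const the cache itself is not
    // modified. std::map::at throws std::out_of_range for an unknown key.
    const ObjectData& get(const KeyName& key) const
    {
        return entries_.at(key);
    }

private:
    std::map<KeyName, ObjectData> entries_;
};
```

Two calls to `get` with the same key hand back references to the same stored object, and a line like `cache.get("a").payload = "x";` is rejected at compile time.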

There is no direct way to do this in Java. The closest thing is to return a copy of the ObjectData (and ObjectData can be huge):

class GlobalCache
{
    public ObjectData get(final KeyName key) {
        // Defensive copy: callers never get the internal instance
        return new ObjectData(someInternalObjectData);
    }
}

Java has to make a copy of this object to guarantee consistency. If you return the actual object, anyone can change it, which can have very unexpected results if the system is multi-threaded (object consistency is lost). Before you mention the final keyword, remember that it only means the reference cannot be reassigned; it makes no guarantees about the object itself.

Not having const means you need a much more complicated design to guarantee that the code behaves as intended (because you can never expect the users to do what is expected, they will always do what is not). This is one of the reasons Java code tends to bloat in the class count area, and it is never fun to have several classes doing what really one class should have done.

I am hoping that one of the upcoming Java versions implements const-like behavior (after all, 5.0 has template-like behavior)... More and more, Java is starting to look like C++. Funny how that happens.

2 comments:

Allan "Goldfish" Clark said...

Hi Alex;

That necessary copy, does that optimize using Copy-On-Write? Or is that also tossed out in a multicore architecture?

Alex Chachanashvili said...

It definitely still calls the constructor and tries to copy resources due to multithreading (since a thread may modify that object between the time of the call and the time of the write). It forces you to design classes from interfaces, which may be OK but gets to be a pain when using Java IDEs like Eclipse, where F3 takes you to the definition of the method and you get the interface first. Then you have to press F4 to get the hierarchy, open your class and locate the method you need inside. It's pretty annoying. While it may be nitpicking, it also increases the class count. I prefer the elegance of const and passing the reference explicitly, which signals to people that the object is only being viewed and that they should be aware of the multithreading issues that may occur.