Monday, July 18, 2011

Quoth The Donald... nevermore.

The Quote

In a somewhat famous statement, Donald Knuth (in a quote he later tried to pawn off on Tony Hoare) allegedly said: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil".
Over many years of software development, I have seen many people justify their poorly written code by arguing that "premature optimization" is bad and should never be done because Donald Knuth said so. Very often this was the response after their product fell over under a mild load.

As someone who has been involved in code optimization for over a decade, I can tell you what Donald meant: don't waste time squeezing maximum speed out of well written code; concentrate on getting it working well and deal with unexpected inefficiencies later. Profiling and tuning well written code is relatively easy; poorly written code presents a time consuming problem because the basics need to be fixed first.

He did not mean that you can just write poorly optimized code and try to justify your actions with a misunderstood quote. Maybe Donald realized the can of worms he had created and tried to distance himself from it; we will never know.


How it all begins

No one writes inefficient code willingly or knowingly. I have never met a developer who had bad intentions. What I have met are developers who convinced themselves that their coding skills were amazing and that they wrote amazing code; that they were a lot better than they really were.

Case 1

I worked with a developer once who told me that he "was something of a big deal around here". This was before his code proved unable to scale past 5 concurrent users and eventually required a complete rewrite (though not before his options vested; he made a lot of money and left to start another company, never learning how average a developer he was). He was not a bad guy, an average developer at best, but he had convinced himself that he was way above average; a legend in his own mind. Everyone praised him (mostly non-technical people). Since he was an average developer who thought he was amazing, he hired people for his group that he thought were average (when in fact they were barely able to write code). The codebase reflected it. When I looked through the code, I saw many basic problems (CS101 level issues): recursion without stopping conditions on all code paths, which caused an occasional race condition in the browser; many cases of the wrong collection type, like lists of name/value pairs doing O(N) lookups where a map/tree with O(log N) lookups would have been far more efficient (see the sketch below); places where he used loops inside loops, resulting in quadratic slowdowns where single loops were needed. When I asked him about these many basic problems? Quoth The Donald... nevermore.
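To make the collection point concrete, here is a minimal sketch of that lookup difference (the names and types are hypothetical, not from his codebase):

#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// O(N): every lookup walks the whole list of name/value pairs.
std::string findLinear(const std::vector<std::pair<std::string, std::string> >& pairs,
                       const std::string& name)
{
  for (std::size_t i = 0; i < pairs.size(); ++i)
    if (pairs[i].first == name)
      return pairs[i].second;
  return std::string();
}

// O(log N): std::map is a balanced tree, so find() walks one root-to-leaf path.
std::string findTree(const std::map<std::string, std::string>& pairs,
                     const std::string& name)
{
  std::map<std::string, std::string>::const_iterator it = pairs.find(name);
  return (it != pairs.end() ? it->second : std::string());
}

With thousands of entries and a lookup on every request, the linear version alone can drag a server down.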

Case 2

Another lead developer I worked with was constantly told, by people who did not know how to write code, how superior a developer he was, and he developed the mentality that he could do no wrong. He spent his development time trying out new technologies and then applying them to the current product, mostly unsure how they would work but convinced they were somehow cool. The result was a Frankenstein's monster of software. It had a 2-pass pre-build (making it insanely difficult to debug), 2 view rendering engines, 3 client presentation layer libraries, and 2 database interface libraries, neither of which worked well enough, so there were a lot of raw SQL statements as well. The database schema was so poorly designed that an average query required 2 inner joins and 2 outer joins (many queries were so poor that response times for a web page under a 1 user load would easily go over 30 seconds); it was a mess. Load tests with more than 5 concurrent users often never completed, and scaling was pretty much buying lots of hardware and hoping it was enough. When asked why he did not stick with one technology for each area, he said that he was looking for the right one but had not had time to go back and unify them. He saw no problems with 20-30 second reply times and claimed it just needed some tweaking. And of course: Quoth The Donald... nevermore.


What The World Needs Now

There are many more cases but you get the idea. Premature optimization is not writing efficient code up front (that is desirable); it is optimizing specific code without knowing the big picture. Premature optimization is not an excuse to write poor code or build on a poor design. It means: don't spend too much time on minor optimizations that yield negligible results.

Efficient, well written code in the early stages of development can be the difference between a successful project and a long, protracted code cleanup.

Poor Donald... we hardly knew you.

Monday, January 24, 2011

Running the code through PVS-Studio

I was generously offered a temporary license for PVS-Studio to do static analysis on my code, and the following are the results.


The Hardware


OS: Windows XP Professional x64 Edition (5.2, Build 3790) Service Pack 2
Processor: AMD64 (8 CPUs), ~2.7GHz
Memory: 4094MB RAM

Performance was not much of an issue and analysis took a few minutes per project, but I figured I should post the hardware used for reference.


The Install

Install was painless and it integrated itself into Visual Studio 2010 as a menu item. You can analyze a file, a project, or a whole solution. You can also load/save analysis results.


The Analysis


I analyzed most of the base libraries used by AObjectServer, then the base modules, and then the actual server code. I also analyzed the load test tool I use (which comes as a command line tool and a Win32 GUI). Overall it was 11 DLLs and 3 EXEs. Analysis per project took about 2-5 minutes (depending on the number of files, I suppose).


The Results

Results are divided into 3 levels.

Let's start with the Level 1 issues.

MESSAGE: V550 An odd precise comparison: - 1.0 == m_stateTimeLimit. It's probably better to use a comparison with defined precision: fabs(A - B) '<' Epsilon.
CODE:
#define INVALID_TIME_INTERVAL -0.1
return (INVALID_TIME_INTERVAL == m_stateTimeLimit || m_stateTimer.getInterval() < m_stateTimeLimit ? false : true);

FIX: return (0.0 > m_stateTimeLimit || m_stateTimer.getInterval() < m_stateTimeLimit ? false : true);
NOTE: This is definitely a good find.


MESSAGE: V550 An odd precise comparison: 0.0 == difftime (t). It's probably better to use a comparison with defined precision: fabs(A - B) '<' Epsilon.
CODE: return (0.0 == difftime(t) ? true : false);
FIX: Per the API reference, a return of 0 from difftime means the two times are equal; I ended up comparing the time_t values directly instead.
NOTE: This is a tougher one: the code tests whether difftime returns 0, which would mean the two times are the same. I did notice that this function was operating on time_t, which is defined on Windows as __int64, so comparing one time_t directly to another should be fine. I made this change, but I am not sure whether other operating systems use something else for time_t.
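For values that really are floating point, the comparison V550 recommends looks something like this (a minimal sketch; the EPSILON value is illustrative and would have to suit the data being compared):

#include <cmath>

// Compare doubles with a tolerance instead of operator==; exact equality
// rarely survives floating point arithmetic.
const double EPSILON = 0.000001;  // illustrative tolerance, not from my code

bool nearlyEqual(double a, double b)
{
  return std::fabs(a - b) < EPSILON;
}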

-----------

Next are the Level 2 issues. Almost all were of the V112 type, which flags usage of numbers that relate to memory sizes (like 4, 8, 32, etc.); all of these were false positives for me. A few of the messages were interesting.

MESSAGE: V401 The structure's size can be decreased via changing the fields' order. The size can be reduced from 32 to 24 bytes.
CODE: in dbghelp.h (part of the Microsoft SDK)
typedef struct _MINIDUMP_EXCEPTION_INFORMATION64 {
  DWORD ThreadId;
  ULONG64 ExceptionRecord;
  ULONG64 ContextRecord;
  BOOL ClientPointers;
} MINIDUMP_EXCEPTION_INFORMATION64, *PMINIDUMP_EXCEPTION_INFORMATION64;
FIX: None, it's part of the Microsoft SDK, but interesting nonetheless.
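To show what V401 means, here is a sketch with stand-in typedefs so it compiles anywhere (this is an illustration, not a change one could make to the SDK header). On a typical 64-bit build the 8-byte members force alignment padding around the 4-byte ones; ordering members from largest to smallest removes it:

#include <stdint.h>

typedef uint32_t DWORD;    // stand-ins for the Windows SDK types
typedef uint64_t ULONG64;
typedef int32_t  BOOL;

// As declared: 4 (ThreadId) + 4 padding + 8 + 8 + 4 (ClientPointers)
// + 4 trailing padding = 32 bytes.
struct AsDeclared {
  DWORD   ThreadId;
  ULONG64 ExceptionRecord;
  ULONG64 ContextRecord;
  BOOL    ClientPointers;
};

// Largest members first: 8 + 8 + 4 + 4 = 24 bytes, no padding at all.
struct Reordered {
  ULONG64 ExceptionRecord;
  ULONG64 ContextRecord;
  DWORD   ThreadId;
  BOOL    ClientPointers;
};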

MESSAGE: 2776 V801 Decreased performance. It is better to redefine the first function argument as a reference. Consider replacing 'const .. path' with 'const .. &path'.
FIX: I accidentally passed the object by value instead of by const reference. While not critical in this instance, since it was part of an initialization call, this message is very useful.
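The difference V801 flags, as a minimal sketch (function and parameter names are hypothetical):

#include <cstddef>
#include <string>

// Pass by value: a copy of the string is constructed on every call,
// even though the function never modifies it.
std::size_t lengthByValue(const std::string path)
{
  return path.size();
}

// Pass by const reference: binds directly to the caller's object, no copy,
// and the call sites do not change.
std::size_t lengthByReference(const std::string& path)
{
  return path.size();
}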

MESSAGE: 297 V112 Dangerous magic number 4 used
NOTE: The tool is fixated on numbers that may be associated with memory word sizes, like 4. In all my cases 4 represents an actual 4 and not a memory size (I use sizeof if I need word sizes). So this resulted in many false positives and probably should have been a Level 3 warning.

-----------

Now on to Level 3; here are the more interesting ones.

MESSAGE: V547 Expression 'findFromFront (p) >= 0' is always true. Unsigned type value is always >= 0.
CODE: AASSERT(this, findFromFront(p) >= 0);
FIX: The assert was not needed: findFromFront returns a size_t, which is always >= 0. I suspect this is leftover code from the changeover from int to size_t.
NOTE: Good to clean up dead code even if it is harmless.

MESSAGE: V547 Expression 'moveSize < 0' is always false. Unsigned type value is never < 0.
NOTE: These were very useful since they are a result of another porting project that replaced int with size_t and went from signed to unsigned. Most of the warnings were harmless and inside assert statements, but nevertheless, cleaner code is always better. Both V547 cases boil down to the same pattern, sketched below.
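A sketch of what such comparisons look like after an int to size_t changeover (variable names are hypothetical):

#include <cstddef>

void example(std::size_t pos, std::size_t moveSize)
{
  // Always true: an unsigned value can never be negative, so this
  // check no longer tests anything.
  if (pos >= 0) { /* dead check */ }

  // Always false: this branch became unreachable when the type went
  // from signed to unsigned.
  if (moveSize < 0) { /* never taken */ }
}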

MESSAGE: V509 The 'throw' operator inside the destructor should be placed within the try..catch block. Raising exception inside the destructor is illegal.
NOTE: Great catch. This was a result of a conversion from assert() to a macro which throws an exception, and the macro ended up inside a destructor, which is a bad thing.
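A sketch of the situation (the macro and class here are hypothetical stand-ins): if an exception escapes a destructor while the stack is already unwinding from another exception, the program terminates, so the throw has to be contained:

#include <stdexcept>

// Assert-style macro that throws instead of calling assert().
#define ATHROW_ASSERT(cond) \
  if (!(cond)) throw std::runtime_error("assertion failed: " #cond)

class Resource
{
public:
  ~Resource()
  {
    // Never let an exception escape a destructor; if one is thrown
    // during stack unwinding the program calls terminate().
    try
    {
      ATHROW_ASSERT(release());
    }
    catch (...)
    {
      // log and swallow; do not rethrow from a destructor
    }
  }

private:
  bool release() { return true; }  // placeholder cleanup
};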


The Summary

Overall I liked the tool. It did find a few non-critical issues with the server code, but to be fair, I have run most of my code through FlexeLint and BoundsChecker (until my license expired, that is). I also have Visual Studio warning level 4 turned on for debug builds, and that catches a lot of issues.

The main benefit of PVS-Studio is that it is good at finding issues that may affect porting 32-bit code to 64 bits.