Wednesday, December 29, 2004

One View To Rule Them All

This must be "The Daily WTF of the Year":

From "the global leader in providing supply chain execution and optimization solutions": One View to Rule Them All. See the underlying SQL-Code.

One of the most absurd things I have ever seen: subselects containing subselects containing subselects, which in turn contain aggregation functions. And the DISTINCT clause is certainly beneficial for performance - especially on a view that might return every potential row, with close to 400 columns!

The unfortunate maintenance programmer explains:
"I was further astounded to learn that timeouts on certain critical operations were *routine*! [...] Paging through the trace, I found a stored procedure which took 505306 milliseconds - that's 8.5 minutes (!!!) - to execute, at 45% server utilisation."

Tuesday, December 28, 2004

Digital Fortress

I read Dan Brown's Digital Fortress during the Christmas holidays. The book has an interesting plot, and the story takes some surprising twists, so it was in no sense boring. The problem is, it is quite flawed from a technical point of view - and this really spoils the reading experience for the average techie with some basic knowledge of cryptography.

I don't even want to start ranting about the unrealistic showdown, when a software worm takes down the NSA's security tiers one by one, and the agency's director decides to take the risk instead of simply shutting down the system. Or the fact that the leading character, IQ-170 wonder-mathematician Susan Fletcher, does not even grasp the most obvious connections. Or that their massively parallel miracle system goes up in flames due to overheating (no ventilation? No emergency shutdown? Using NMOS, or what? And no backups and no redundant datacenter?). Let's just examine one of the book's main areas of interest, namely cryptography:

The so-called Digital Fortress algorithm (which is able to resist brute-force code-breaking attempts) is published on the internet, encrypted BY ITSELF (the main storyline is about the chase for the cipher key). The highest bidder will receive the key, hence will own the algorithm. Now wait a second - in order to decrypt it, he needs to know the algorithm already, right? Makes you wonder how he should get his hands on the algorithm, as it is only available in its encrypted form. Or to put it the other way round: let's suppose this is all possible; then the bidder already HAS the algorithm in cleartext at that point in time - before he actually decrypts it. So there is no reason to purchase the key in the first place.

Another gemstone: "To TRANSLTR [the decryption machine] all codes looked identical, regardless of which algorithm wrote them". I am amazed how they decrypt something without any knowledge about the underlying algorithm. Anyway, no modern encryption standard depends on having to keep its algorithm secret. Keeping the key secret is the only thing that counts.

"Public Key Encryption" is being described as "software that scrambles personal e-mail message in such a way that they were totally unreadable. [...] The only way to unscramble the message was to enter the sender's pass-key". OK, this explanation is confusing, but the author also misses the main point here. Public/private keypairs resp. asymmetric encryption solves the problem of key distribution (e.g. for exchanging a short-lived, temporary symmetric key - symmetric encryption outperforms asymmetric encryption by far, hence is much more suitable for higher data volume and/or server applications), and enables sender authentication and message integrity. The receiver's public key is used for encryption by the sender, so that the counterpart private key (used for decryption) remains with the receiver exclusively. Accordingly the sender applies his private key for signing the message, so the receiver can verify the signature and the message's integrity using the sender's public key.

The author also mixes up key length and key space. He states: "... the computer's processors auditioned thirty million keys per second - one hundred billion per hour. If TRANSLTR was still counting [for 15 hours], that meant the key had to be enormous - over ten billion digits long." OK, that's 1.5 * 10^12 keys in 15 hours; even if we are talking about binary digits here, a 41-bit key space is enough to cover that range. Not "over ten billion digits". This also contradicts the statement that TRANSLTR breaks 64-bit keys in about 10 minutes.
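The arithmetic, for the record (a quick sketch using the book's own numbers):

// 30 million keys per second, running for 15 hours:
double keysTried = 30e6 * 3600 * 15;                    // ~1.6 * 10^12 keys
double bitsCovered = Math.log(keysTried) / Math.log(2);
System.out.println(bitsCovered);                        // ~40.6 - so 41 bits suffice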

In the book's foreword, the author thanks "the two faceless ex-NSA cryptographers who made invaluable contributions via anonymous remailers". Makes me wish they had proof-read it once.

Saturday, December 25, 2004

EPIC 2014

What will have happened to the news by the year 2014? An interesting look into the future - entertaining, although not very likely if you ask me.

Friday, December 17, 2004

Microsoft

Yesterday I attended a Microsoft marketing presentation. Yes, it was a cursory demo - that's inevitable when you get a first glance at Whidbey, Yukon, Avalon, Indigo and Longhorn, all within one afternoon.

What impressed me most (probably because I had not seen it before) was the forthcoming Visual Studio Team System. It is scheduled for about six months after Visual Studio 2005 (Whidbey), which means it will ship in about a year. I think Team System will substantially change the way we develop on the Microsoft platform. Yes, we have done component modeling, automated builds, static and dynamic code profiling, unit and load testing, and so on before, but this always used to require N different tools from N different vendors. This is the first time all of it gets integrated into one big suite. Visual Studio Team System licenses are costly, but after all, it is mainly aimed at enterprise application projects.

Talking about Microsoft marketing - admittedly, Microsoft has a notorious history of creating "Fear, Uncertainty and Doubt". Promising impossible shipping dates for products and product features was one common strategy, next to ruthless business tactics. IBM experienced this with OS/2, as did 3COM with LAN Manager. Fifteen years have gone by since those days, and Microsoft is still aggressive, but it also grew up. They have stayed at the top of software development companies, many say thanks to the fact that their former CEO was the only real nerd among the industry leaders, while his competitors were led by MBAs who just didn't understand the software business.

Microsoft has also pioneered the concept of "good enough" software - which is just another term for finding the optimal economic balance between code purity and practicality (there is nothing bad about "good enough" software; actually it has turned out to be the most successful approach for many shrink-wrapped products). In my experience, Microsoft has improved a lot on quality issues (see also their Trustworthy Computing campaign). Lately I met with several Microsoft consultants, and all of them were top-notch engineers. When you think about it, this does not come as a surprise: Microsoft hires the smartest of the smart. Now they are in the position to do so (50.000 job applications a month - this means they can invite the top 5% for interviews, and employ the top 1%). But even in the old days, Bill Gates always hired Triple-A developers. B-people are scared of A-people, so when put in charge, they tend to hire more B- or even C-people, dragging down your work force's qualification level.

Microsoft bashing is a common hobby among many. Some criticize their business behaviour (even the Department of Justice does from time to time) - one may agree or disagree with that. But I am referring to unreflective criticism, the kind that has the nature of a religious war (e.g. "Windows sucks, Linux rules"). It's funny to note that this often comes from people who are the least qualified to judge. Those are very rarely Triple-A people, and they just don't play in the same league as those who develop the next version of Windows or the next Microsoft development platform up in Redmond (well OK, there is always an exception to the rule: programming gods like Linus Torvalds, Bill Joy or James Gosling are allowed to complain about Microsoft ;-)). Some hacked their time away at college on university systems, hardly ever worked on a major real-life software project, but instead preferred to build up their little fiefdoms on archaic systems that no one else really cared about.

Now, badmouthing Microsoft may make the averagely talented developer look cool, at least in front of those who don't know any better. Here is my advice for all unsolicited Microsoft bashers: please, grow up! Welcome to the real world - in a professional corporate environment no one wants to hear your religious rants. Microsoft is here, and it is going to stay; you'd better get used to it.

An interesting fact is that our Microsoft consultants knew very well about the strengths and weaknesses of their products. For example, they never had a problem expressing their respect for cool J2EE features and the like. And they pointed out in which areas Microsoft still has to improve.

I have been working on Unix, Java and Microsoft platforms, and I appreciate all of them. I have the highest respect for their creators. But I am tired of B- and C-people who seriously think they are in any position to fire unsolicited flames at the efforts of really talented folks at the world's most successful software company, just for the sake of boosting their own crippled egos.

Sunday, December 12, 2004

Plan Your Architecture Before Choosing Your Technology

It should be obvious to every software project manager, but unfortunately it doesn't always seem to be: System architecture design PRECEDES technology decision-making. Premature, ill-fated technology decisions can bring whole projects down. Or, as Hank Rainwater states in his book "Herding Cats: A Primer for Programmers Who Lead Programmers":

"The magic bullet or golden hammer (whatever you want to call it) technology doesn't solve business problems, people do. Sure, you employ technology to implement a solution, but you are wasting time if you think buying the latest addon to your development environment is going to increase productivity."

[...]

"I encourage you to determine your architectural needs and plan a system before you choose a technology of implementation. You'll just have to do it all over again if the new whiz-bang tool doesn't pan out. You've heard it said many times: If you don't have time to do the job right, when will you have time to do it over again?"

Wednesday, December 08, 2004

Apache AXIS, Proxies And SSL

Some time ago I ported a subsystem from a proprietary XML-over-HTTP request/response format to webservices. The webservice client was done in Java. Now, as we were applying secure sockets (including our own local keystore for holding a client certificate), there used to be an issue with the Java Secure Socket Extension's default behaviour when tunnelling HTTP over a proxy. JSSE's default SSLSocketFactory expects a "HTTP/1.0 200 OK" response (this is hardwired!), but many proxies reply with "HTTP/1.1 200 OK" or "HTTP/1.1 200 Connection established". More on this issue on JavaWorld.

So we simply implemented our own SSLTunnelSocketFactory which would not be as restrictive. One can either attach it globally by invoking HttpsURLConnection.setDefaultSSLSocketFactory(), or on a per-connection basis via HttpsURLConnection.setSSLSocketFactory().
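The relevant part of such a factory looks roughly like this (a sketch, not our production code; the method name is mine):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

// Issue the CONNECT request and accept HTTP/1.0 as well as HTTP/1.1 replies.
private void doTunnelHandshake(Socket tunnel, String host, int port)
        throws IOException {
    OutputStream out = tunnel.getOutputStream();
    out.write(("CONNECT " + host + ":" + port + " HTTP/1.0\r\n\r\n")
            .getBytes("US-ASCII"));
    out.flush();

    // read the status line byte by byte - don't buffer past the headers,
    // the SSL handshake data follows right after them
    InputStream in = tunnel.getInputStream();
    StringBuffer status = new StringBuffer();
    int ch;
    while ((ch = in.read()) >= 0 && ch != '\n') {
        if (ch != '\r') status.append((char) ch);
    }
    if (status.toString().indexOf(" 200") < 0) {
        throw new IOException("Proxy refused tunnel: " + status);
    }

    // consume the remaining header lines up to the terminating empty line
    int prev = '\n', cur;
    while ((cur = in.read()) >= 0) {
        if (cur == '\r') continue;
        if (cur == '\n' && prev == '\n') break;
        prev = cur;
    }
}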

This worked just fine for HttpsURLConnections. But Apache Axis (1.1) is different. It comes with its own JSSESocketFactory implementation, which of course once again won't accept the proxy's HTTP/1.1 response. And it ignores the fact that we already installed our own SSLTunnelSocketFactory. I was about to patch Axis and roll out our own build when I came across a forum post which mentioned that Axis will accept other socket factories as long as they provide a public SocketFactory(Hashtable attributes) constructor. Actually, this constructor will never be invoked. It just needs to be there. And it works like a charm now.
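So the whole trick boils down to a few lines (my sketch; Axis 1.1 discovers the factory class reflectively and merely checks that this constructor exists):

// the magic constructor Axis looks for - never actually invoked
public SSLTunnelSocketFactory(java.util.Hashtable attributes) {
    this();   // just delegate to our default constructor
}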

Do you remember the last time you saw one of those "Three Mouseclicks To Create Your Webservice Client On (VS.NET | IBM WSAD)" presentations? Real life just ain't that easy.

Tuesday, December 07, 2004

On Software Development Methodologies

Tamir Nitzan on Joel on Software:

Lastly there's MSF. The author's [annotation: Joel Spolsky's] complaint about methodologies is that they essentially transform people into compliance monkeys. "our system isn't working" -- "but we signed all the phase exits!". Intuitively, there is SOME truth in that. Any methodology that aims to promote consistency essentially has to cater to a lowest common denominator. The concept of a "repeatable process" implies that while all people are not the same, they can all produce the same way, and should all be monitored similarly.

For instance, in software development, we like to have people unit-test their code. However, a good, experienced developer is about 100 times less likely to write bugs that will be uncovered during unit tests than a beginner. It is therefore practically useless for the former to write these... but most methodologies would enforce that he has to, or else you don't pass some phase. At that point, he's spending say 30% of his time on something essentially useless, which demotivates him. Since he isn't motivated to develop aggressively, he'll start giving large estimates, then not doing much, and perform his 9-5 duties to the letter. Project in crisis? Well, I did my unit tests. The rough translation of his sentence is: "methodologies encourage rock stars to become compliance monkeys, and I need everyone on my team to be a rock star".

Sunday, December 05, 2004

The Triumph Of Belief Systems Over Engineering

From www.zeitgeist.com (and nothing has changed ever since):
  In Computer Science there's... | While in Computer Scientology it's...
00000 John Von Neumann | L. Ron Hubbard
00001 Communications of the ACM | InformationWeek
00010 SMTP/MIME | Notes "Mail"
00011 SNMP | "E-meters"
00100 "Two Phase Commit" | "Automatic Data Replication"
00101 TCP/IP | IPX
00110 The Internet | Compu$erve (or, AOL)
00111 Usenix/LISA Conference | Novell World
01000 SecurID/SKey/SecureNetKey | RLA/ARA
01001 Distributed Systems | Windows95
01010 The World Wide Web | IBM/Lotus Notes
01011 Object Oriented Programming | Visual Basic
01100 Java | ActiveX
01101 Linux/NetBSD/FreeBSD | Windows/NT Server
01110 ACM TOPLAS | "Secrets of the Visual Basic Masters"
01111 GNU Public License | Patent Lawyers
10000 Lead Developers | "Empowered Managers"

Saturday, November 27, 2004

Hire The Right People

For Joel Spolsky, the #1 cardinal criterion for getting hired at Fog Creek is to be Smart, and to Get Things Done.


(BTW, why does blogger.com image upload only support JPEGs? (using "Picasa Hello"))

It's hard to find smart doers, but please, keep on searching for them. If you have the slightest doubt about whether a candidate fulfills both criteria, don't hire.

The most dangerous species are those who are smart, but don't get things done. First of all, they are harder to identify as such during the recruitment process. Checking their project portfolio helps (still, they are usually smart enough to fake it).

As opposed to not-so-smart doers, who cause operative mistakes (bad enough), smart no-doers are in the position to make strategic mistakes (even worse). Those are the kind of people who can talk their management into following doomed endeavors which will never result in any marketable product, sometimes putting the whole company at risk. Combined with weak social skills and put in charge, smart no-doers are the best way to get rid of your last smart doers.

About O/R Mappers

We all like object-oriented programming, right? SQL, well... it's cool to have SQL code generated. And your domain object model, too. Some Not-Invented-Here experts will even propose building a home-brewed O/R mapper. After all, they can do a better job than those Hibernate guys, can't they?

I know O/R mappers look tempting, and even more tempting is building one on your own, at least to some architecture astronauts. Things might look promising until... people actually start using it. But at that point in time, the investment has been made. No way back. And more than once, the developers who built the O/R mapper are not the same ones who actually use it, and the blame game is about to begin.

I have seen applications stall, projects fail, people quit and companies go south thanks to the overly optimistic usage of home-made, unproven O/R mappers. Building an O/R mapper is a long and painful road: avoid unnecessary database roundtrips, provide the right caching strategies, don't limit the developer ("but we were able to do subselects in SQL"), optimize SQL, support scalability, build a graphical mapping tool, and so on. Some people designing SOA / message-oriented systems just DON'T WANT their database model floating around within the whole application, but that's what is most likely going to happen.

Will it support all kinds of old legacy systems, including some bizarre, non-normalized database designs? And sometimes your programmers don't know the consequences of a virtual proxy being expanded. Only database profiling will show what goes on under the hood. "One Join" is certainly preferable to "N Selects", but the latter is probably what you will get once you traverse a 1:N relationship. Will your caching algorithm still work in concurrent/distributed scenarios? And what about reporting? Your report engine might require plain old resultsets, not persistence objects.
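To illustrate the roundtrip argument, here is a plain JDBC sketch (table and column names are hypothetical; con and customerId are assumed to exist):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// What a naive mapper does when traversing order -> items ("N Selects"):
//   SELECT ... FROM orders WHERE customer_id = ?      -- one select for the orders
//   SELECT ... FROM order_items WHERE order_id = ?    -- plus one select PER order
//
// The single-roundtrip alternative ("One Join"):
PreparedStatement stmt = con.prepareStatement(
    "SELECT o.order_id, i.product, i.quantity " +
    "FROM orders o JOIN order_items i ON i.order_id = o.order_id " +
    "WHERE o.customer_id = ?");
stmt.setInt(1, customerId);
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
    // one joined row per order item - the profiler shows a single statement
}
rs.close();
stmt.close();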

All those benefits the architects expected - they just don't turn out that way. "Too much magic", as one of our consultants put it. There must be a reason why accessing relational databases is done in SQL: it's just coherent. There is no OO equivalent that fits. There is no silver bullet.

Now, there are shades of grey, just as there are application scenarios where O/R mappers do make sense. I recommend considering an O/R mapper if

(1) You have full control over the database design (no old legacy database).
(2) Load and concurrency tests prove that the O/R mapper works in a production scenario.
(3) Your customer favors development speed over future adaptability.
(4) The O/R Mapper supports SQL execution (or a similar kind of query language).
(5) The O/R Mapper is a proven product, and not the pet project of an inhouse architect.

It is also important to distinguish standalone O/R mappers from container-controlled persistence (the latter was designed for running on a middle tier). J2EE Container-Managed-Persistence Entity Beans do make sense in a couple of scenarios (and then again they don't make sense in a lot of others, and J2EE architects will rarely ever recommend a 100% CMP EJB approach). And of course, Hibernate and others do a pretty good job on N-tier systems as well.
Summing up, I strongly agree with Clemens Vasters in most cases. He states:

I claim that the benefits of explicit mapping exceed those of automatic O/R mapping by far. There's more to code in the beginning, that's pretty much all that speaks for O/R. I've wasted 1 1/2 years on an O/R mapping infrastructure that did everything from clever data retrieval to smartt [sic] caching and we always came back to the simple fact that "just code the damn thing" yields far superior, more manageable and maintainable results.

Sunday, November 21, 2004

Real Geeks

Real Geeks don't go to www.amihot.com or www.amihotornot.com, they vote on www.amibiosornot.com. No matter whether it actually is AMI BIOS, or not.

Tuesday, November 16, 2004

Life After Microsoft

Tomorrow, Wednesday, November 17th, 7:00pm, the Upper-Austrian Workers Chamber will show "Life After Microsoft", a German TV documentary about former Microsoft employees who suffer from serious burnout syndrome.

Afterwards there will be a panel discussion. One panel member is the movie's director, Regina Schilling.

I have seen "Life After Microsoft" before. The Microsoft working ethics are quite demanding. Past achievements of long-time Microsofties don't count that much. You got to prove your commitment each day again.

On the one hand, I would like to experience such working conditions - it must be very stimulating. On the other hand, it also sounds a little bit scary. I remember one of those ex-Microsofties saying "I turned into a vegetable".

WSDL Binding Styles

IBM provides a good introductory document on the different WSDL binding styles. The most common are RPC/encoded and Document/literal. WS-I Basic Profile recommends Document/literal, which seems to be becoming the broadly accepted standard. This binding style should be sufficient in most cases, plus it provides the possibility of validating SOAP messages against their XSD schemas.

Well, there are always people who know better, e.g. certain government agencies that provide webservices, and somehow decided they had to use the completely uncommon RPC/literal binding style. This means that

(a) .NET 1.0 / 1.1 clients cannot access their webservice, as the .NET framework webservice implementation does not support RPC/literal. Admittedly, there is a workaround, but that's definitely not something for the average programmer who drags and drops the webservice's reference into Visual Studio.
(b) SOAP messages are cluttered with unnecessary type encoding information on each request/response.

Of course those are the same folks that publish hand-coded (hence erroneous) WSDL files.

Thursday, November 11, 2004

GETFIREFOX.COM

Recognize me by looking out for someone wearing one of those t-shirts.

Monday, November 08, 2004

Journey To The Past (14): Enterprise Applications (2002-today)

By the end of 2002 I returned to the multi-tier world, working on various enterprise application projects, mainly under J2EE and .NET (including .NET Enterprise Services). I split my time between consulting services, project management and programming.

Sunday, November 07, 2004

Journey To The Past (13): Wireless Systems (2001-2002)

Programming mobile phones was a real adventure. The segmented memory model of the 16-bit architecture implied the same 8086 / 80286 constraints known from ten years before. Besides the restrictions on system resources (which made me gain valuable know-how, even for my current work back in the client/server and multi-tier world), the development and debugging environments for embedded devices are something that takes getting used to. I was involved in several customer projects, mainly implementing man-machine interfaces and sometimes even low-level device drivers (e.g. for the Samsung SGH-A500 and Asus J100 phones).



In the old days, common practice was to rewrite whole applications from scratch for each new phone, depending on the underlying device drivers. We tried to improve that cumbersome approach by building a C++ framework for mobile phone applications, which would encapsulate device specifics and provide a feature-rich API.



As one of the senior programmers I was in charge of laying some of the framework's groundwork (GUI, non-preemptive scheduler, API design and similar topics), and I also wrote several tools that completed our developer workbench, e.g. a phone emulation environment for Windows, graphic and font conversion programs, and a language resource editor. I was also managing a Java 2 MicroEdition port project.

Saturday, November 06, 2004

Journey To The Past (12): Generic and Dynamic Hypertexts (2000-2001)

The topic of my second thesis was the implementation of a tool for generic hypertext creation in Java. The resulting system was cleanly designed from beginning to end, and still meets my somewhat higher standards of today. I developed the editor (an MVC approach using observer mechanisms for loose coupling), including features like HTML content editing for graph nodes, an automation engine for building concrete hypertexts from generic templates, and a hypertext runtime implemented using Java Servlets.


Friday, November 05, 2004

Journey To The Past (11): PC Banking (1999-2001)

Our department was also developing and maintaining a PC banking system. For several years, the frontend used to be a 16-bit Windows / MFC application. I led a developer team that ported the old client to Java, utilizing Swing, JDBC (Sybase SQLAnywhere), and an in-house application framework.



Besides the complex business logic and the need for an RDBMS-to-OOP data mapping, realizing a powerful graphical user interface in Java was the most difficult challenge. We had to build several of the more complex controls on our own, like data grids and navigation panels. The client runs under Windows as well as under Linux and MacOS.

The PC banking application is mainly aimed at business customers. It reached an installation base of about 5.000 companies at the end of 2002, with another 50.000 companies expected to upgrade within the following 12 months.

Journey To The Past (10): Visual Chat (1998)

My course curriculum for computer science included an assignment for a medium-scale project. I decided to build a chat program which would let the user move around in a 3D world and meet people at different locations. For several reasons I abandoned the idea of using standard protocols like IRC or VRML. Instead, I ramped up the complete solution (server and client) in Java, employing my own communication protocol and 3D engine. The client actually runs inside any internet browser, without the need to install additional software. This was a very valuable experience, covering issues like threading, synchronization and networking. It was my first pure OOP project (disregarding the previous Smalltalk experience), and I remember it as a great field for experimenting, without tight schedules or sealed specifications. Of course, when I look at the old code today, I can clearly see that I was still lacking some experience back then.



Anyway, Visual Chat's user community rose steadily to more than 150.000 by the end of 2002. There are dozens of other Visual Chat server installations on the internet today (it's freeware).

Thursday, November 04, 2004

Journey To The Past (9): Internet Banking and Brokering (1997-2001)

Internet banking was still in its infancy at the beginning of 1997. We designed and developed the internet banking and brokering solution for one of the largest Austrian bank groups. Our system consisted of a Java applet as the frontend; the business logic was running on Sun Solaris servers, and we integrated bank legacy systems like Oracle databases on Digital VAX, or CICS transactions on IBM mainframes. Later on, I was project lead for porting our solution to several other regional banks.





Over the years we replaced the proprietary middleware with a Java 2 Enterprise Edition application server, and kept session state on the server, which allowed for stateless clients. Java Server Pages would now produce HTML for client browsers and WML for WAP mobile phones.



By now, 500.000 people have subscribed to the internet banking and brokering service, accounting for an average of 100.000 logins per day.

Besides that, I was also responsible for the implementation of an internet ticketing service, including online payment.

Journey To The Past (8): Database Application Programming (1993-1996)

4th Generation Tools were especially en vogue in those days, and heavily applied at university, mainly for their prototyping capabilities. I developed a 4th Dimension application on the Apple Macintosh which helped university researchers enter survey data about companies' information technology infrastructure, calculated statistical indices based on alterable formulas, and finally produced some reports, with all kinds of pie and bar charts. I also employed 4th Dimension for my business informatics diploma thesis, when I implemented an information system for university institutes (managing employee, student and course data, with automated course enrollment).



We used SQL Windows for building a prototype of a bookstore database application under Windows 3.1, which came with a nice multiple document interface. Another prototype was done for a room resource planning system at university. And I did several freelance projects, mainly on MS Access, e.g. a customer relationship management and billing system for a local media company. We installed this application in a Windows for Workgroups / LAN Manager multi-user environment. It is still in use today.





My Access knowledge would also help me later on, during military service, when I could spend two out of eight months inside a warm office (while my comrades were being drilled outside in the cold Austrian winter), implementing a database solution for the Airforce Outpatient Department.

Wednesday, November 03, 2004

Journey To The Past (7): Object Oriented Programming (1992)

Getting to know object oriented programming was a major turning point. The new paradigm was overwhelming, and Smalltalk really enforced pure object-orientation. I spent long hacking nights at the university lab getting to know the Visualworks Smalltalk framework on an Apple Macintosh II, and somehow managed to hand in the final project on time: a graphical calendar application.

Tuesday, November 02, 2004

Journey To The Past (6): IBM 3090 Mainframe (1991)

Mainframe programming under MVS and PL/1 was exciting at first; it felt like stepping into the "big world" of business application development. Everyday experience turned out to be less fun in the end, when batch jobs were slowed down by senior university colleagues playing multi-user dungeon games, and each program printout meant waiting until the next morning to finally receive it.

Monday, November 01, 2004

Journey To The Past (5): The Age Of Atari (1988-1992)

I really fell in love with my first Atari. It was a 520ST, equipped with 512KB RAM (upgraded to 1MB for a tremendous amount of money as soon as I received payment from a summer job). GFA Basic was a mighty language. This was my introduction to GUI programming (Digital Research's GEM, an early Macintosh look-alike). One could invoke inline assembler code, so I bought a book about 68K assembler, and finally managed to run some performance-critical stuff in native mode.



But I also felt the lack of support for modules and data encapsulation in Basic, so I decided to learn C (sounds like a contradiction today, but hey, this was 1988) using Borland Turbo C, which came along with a great graphical development environment for GEM. Phoenix, on the other hand, was a relational database system that shipped with a very nice IDE. I learned about relational database modeling, and implemented some simple database applications.



Modula2 was the language of choice in my first university courses, but that wasn't too much of a change from the old Pascal days. Luckily, Modula2 compilers existed for the Atari ST as well, so I didn't have to spend my time at the always crowded university lab in front of those Apple Macs with 9-inch monitors.

In 1991 I purchased Atari's next-generation workstation, the Atari TT-030. It was equipped with a Motorola 68030 processor running at 32MHz, an 80MB hard disk and 8MB of RAM.



But Atari did not manage to make the TT a winner, while the PC was gaining more and more market share. Notwithstanding all sentimental reservations, I finally bought a 486-DX2 in 1993.

Friday, October 29, 2004

Journey To The Past (4): The PC (1987)

My school invested heavily in computer equipment, and bought brand new IBM-compatible PCs that came with MS-DOS 3.3. We started coding in Turbo Pascal, which was a fine environment for taking my first steps in structured programming.

Thursday, October 28, 2004

Journey To The Past (3): Commodore 128 (1986-1988)

The first home computer I ever possessed was a Commodore 128D. Starting with C64 Basic, I used to write endless text-based adventure games (until I ran out of memory), which mainly consisted of pages of prints and a prompt asking the user to choose one option for his next move. Later, I discovered Simon's Basic, an enhancement of C64 Basic which allowed for graphical on-screen operations, and started to get to know C128 Basic. I also managed to get CP/M running (the first real operating system for micros) and Turbo Pascal for CP/M, so I could do computer science assignments at home.


Tuesday, October 26, 2004

Curious Perversions In Information Technology

"The Daily WTF" is a collection of really kinky code snipplets posted by unfortunate maintenance programmers. The level of these code samples ranges between "unbelievable" and "absolutely hilarious". I actually felt better after skim-reading those postings. I thought I had seen the bad and the ugly, but hey, there is even worse out there.

So here is my own Top Ten List of Programming Perversions, encountered over the years. To stay fair, I am not going to post any code or other hints about their origin, so here is an anonymized version:

10. Declare it a Java List, but instantiate it as a LinkedList, then access all elements sequentially (from 0 to n - 1) by invoking List.get(int). Ehm - LinkedList, you know... a LINKED list? Here is what the doc says: "Operations that index into the list will traverse the list from the beginning or the end, whichever is closer to the specified index." Iterators, maybe (see the sketch after this list)?

9. Implement your own String.indexOf(String) in Java. Uhm, it does not work, and it does not perform. But hey, still better than Sun's library functions.

8. Draw a Windows GDI DIB (device-independent bitmap) by calculating each and every pixel's COLORREF and invoking SetPixel() on it. SetPixel also updates the screen synchronously. Yes, performance rocks - you can actually watch the painting process scanline by scanline. BitBlt(), anybody?

7. Write a RasterOp library which - in its most-called function (SetAtom(), the function responsible for setting a pixel by applying a bit mask and one or two bit-shifting operations) - checks for the current clipping rectangle and adjusts the blitting rectangle by creating some Rectangle objects and invoking several methods on them. Ignore the fact that more or less all simple RasterOps (like FillRect(), DrawLine() and the like) already define which region will be affected and offer a one-time opportunity to adjust clipping and blitting. Instead do it repeatedly for each pixel drawn, creating a performance penalty of about 5000%. Also prevent SetAtom() from being inlined by the compiler.

6. Custom ListBox control, programmed in C++. On every ListBox item's repaint event, create about ten (custom) string objects on the stack. Copy-assign senselessly to and from those strings (which all have the same character content), which leads to constant re-allocation of the strings' internal buffers. Score extra points by doing all of that on an embedded system. Really, a string's value-type assignment operator works differently from the reference-type assignment operator, which simply copies references? It might invoke all kinds of other weird stuff, like malloc(), memcpy(), free() and the like? How should you have known that?

5. Use a Java obfuscator to scramble string constants, which are hardcoded and scattered all over innermost loop bodies (what about constants?). This implies de-obfuscation at runtime (including some fancy decryption algorithm) each time the string is accessed. Yes, it is used for self-implemented XML parsing (see (2)) on megabytes of XML streams.

4. Convert a byte-buffer to a string under C++/MFC: Instantiate one CString for each byte, write the byte to the CString's buffer, and concatenate this CString to another CString using the "+"-operator.

3. Build your own Object-Relational Mapper, ignoring the fact that there are plenty of them freely available. Argue that this lowers the learning curve for programmers who don't know SQL (sure it makes sense to employ programmers who don't know SQL in the first place). When your (actually SQL-capable) programmers complain that your ORM is missing a working query language, tell them to load a whole table's content into memory and apply filters in-memory. Or: open a backdoor for SQL again (queries only). Let them define the mapping configuration in XML without providing any tool for automation or validation. Find out that your custom ORM just won't scale AFTER it has been installed in your customer's production environment. Finally conclude that you effectively preferred shooting yourself in the foot over using Hibernate.

2. The self-implemented Java XML parser on a JDK 1.3 runtime (huh, there are dozens of JAXP-compliant, freely available XML parsers out there). Highly inefficient (about 100 times slower than Apache Xerces) and actually not working (what are XML escape sequences again?).

1. Invent the One-Table-Database (yes, exactly what you are thinking of now...)
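And here is the iterator sketch promised in (10) - sequential access stays O(n) overall, no matter which List implementation is underneath (pre-generics Java, matching the JDK 1.3/1.4 era):

import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

List list = new LinkedList();
// ... fill the list ...

// no index-based get() calls, hence no repeated list traversals:
for (Iterator it = list.iterator(); it.hasNext(); ) {
    Object element = it.next();
    // process element
}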

Journey To The Past (2): Commodore CBM 8032 (1985-1986)

At high school there was a broad variety of microcomputers, namely an Apple II for the chemistry lab (I didn't like chemistry enough to sign up for the voluntary advanced classes, so I never had the chance to access it), a Radio Shack TRS-80, and several Commodore CBMs. We used the CBM (an upgraded Commodore PET model) in computer science classes, where we learned about some more advanced topics, and solved problems like waiting queue simulations, or simple text-based, keyboard-controlled racing games.

Monday, October 25, 2004

Journey To The Past (1): Sinclair ZX-80 (1984)

My first programming experiments go back to the year 1984. A schoolmate of mine owned a Sinclair ZX-80, an early 8-bit microcomputer that came with the famous Zilog Z80, 16K RAM and a rubber-like keyboard, and could be plugged into a TV set. I somehow managed to convince my friend to lend me his ZX-80 over the summer holidays, and started writing my first lines of Basic. The storage system was a tape recorder, simply connected with audio cables.

1984

It has been twenty years since my first contact with computers. Time for a journey to the past. I have spent some time digging for old pictures and screenshots, and composed a mini-series of hardware- and software-technologies that I have worked with over the years.

You see, that's why 1984 was not going to be like George Orwell's "1984". OK, someone else actually said that. For me, it all started with a Sinclair ZX-80...

From IEEE Spectrum

  • If an apparently serious problem manifests itself, no solution is acceptable unless it is involved, expensive and time-consuming.

  • Completion of any task within the allocated time and budget does not bring credit upon the performing personnel - it merely proves that the task was easier than expected.

  • Failure to complete any task within the allocated time and budget proves the task was more difficult than expected and requires promotion for those in charge.

  • Sufficient monies to do the job correctly the first time are usually not available; however, ample funds are much more easily obtained for repeated major redesigns.

A Luminary Is On My Side

I was reading Joel Spolsky's latest book last night, where I found another real gemstone: "Back to Basics", which shows some surprisingly (or probably not so surprisingly?) close analogies to what I tried to express just a short time ago. Joel Spolsky writes about performance issues on C strings, in particular the strcat() function, and concludes:

These are all things that require you to think about bytes, and they affect the big top-level decisions we make in all kinds of architecture and strategy. This is why my view of teaching is that first year CS students need to start at the basics, using C and building their way up from the CPU. I am actually physically disgusted that so many computer science programs think that Java is a good introductory language, because it's "easy" and you don't get confused with all that boring string/malloc stuff but you can learn cool OOP stuff which will make your big programs ever so modular. This is a pedagogical disaster waiting to happen.

Here is what I had to say some weeks ago:

The course curriculum here in Austria allows you to walk through a computer science degree without writing a single line of C or C++ (not to mention assembler).

[...]

What computer science professors tend to forget is that the opposite shift is likely to happen as well to those graduates once they enter working life (those who have only ever applied Java or C#). There is more than Java and C# out there, and this "backward" paradigm shift is a much more difficult one. Appointing inexperienced Java programmers to a C++ or C project is a severe project risk.


It's a nice comfort to know that my opinion matches the conclusions of a real software development celebrity. I admire Joel Spolsky's knowledge of the software industry's inner workings, and the same goes for his excellent writing style. His writings are perceptive and entertaining at the same time.

In another article, "The Guerrilla Guide to Interviewing", Joel states:

Many "C programmers" just don't know how to make pointer arithmetic work. Now, ordinarily, I wouldn't reject a candidate just because he lacked a particular skill. However, I've discovered that understanding pointers in C is not a skill, it's an aptitude. In Freshman year CompSci, there are always about 200 kids at the beginning of the semester, all of whom wrote complex adventure games in BASIC for their Atari 800s when they were 4 years old. They are having a good ol'; time learning Pascal in college, until one day their professor introduces pointers, and suddenly, they don't get it. They just don't understand anything any more. 90% of the class goes off and becomes PoliSci majors, then they tell their friends that there weren't enough good looking members of the appropriate sex in their CompSci classes, that's why they switched. For some reason most people seem to be born without the part of the brain that understands pointers. This is an aptitude thing, not a skill thing – it requires a complex form of doubly-indirected thinking that some people just can't do.

<humor_mode>Java is for wimps. Real developers use C.</humor_mode>

But there is something very true in that. So much harm is done by people who just don't understand the underpinnings of their code.

I have been quite lucky so far. Either I could choose my project team members (admittedly, from a quite limited pool of candidates), or I happened to work in an environment with sufficiently qualified people anyway. But I know of many occasions when the opposite happened. The candidates cheated on their resumes. Their presentation skills were good, so they obscured their technical incompetence - even to tech-savvy interviewers. Those interviewers just didn't dig deep enough. They were too busy presenting their company, talking 80% of the time. They should have asked the right questions instead, and spent more time listening carefully.

Joel is actually looking for Triple-A people only. That's kind of hard when you work in a corporate environment. I do not have any influence on the recruiting process itself, so there is no "Hire" or "No Hire" flag I could wave. In order to attract excellent people, I can only try to ensure a professional working environment within the corporate boundaries. Still, this is not what potential hires get to see when they go through the recruiting process.

But I did something else. I forwarded "The Guerrilla Guide to Interviewing" to the people in charge.

Wednesday, October 20, 2004

The StringBuffer Myth

Charles Miller writes about his assessment of StringBuffer usage in Java:

One of my pet Java peeves is that some people religiously avoid the String concatenation operators, + and +=, because they are less efficient than the alternatives.

The theory goes like this. Strings are immutable. Thus, when you are concatenating "n" strings together, there must be "n - 1" intermediate String objects created in the process (including the final, complete String). Thus, to avoid dumping a bunch of unwanted String objects onto the garbage-collector, you should use the StringBuffer object instead.

So, by this theory, String a = b + c + d; is bad code, while String a = new StringBuffer(b).append(c).append(d).toString() is good code, despite the fact that the former is about a thousand times more readable than the latter.

For as long as I have been using Java, this has not been true. If you look at StringBuffer handling, you'll see the bytecodes that a Java compiler actually produces in those two cases. In most simple string-concatenation cases, the compiler will automatically convert a series of operations on Strings into a series of StringBuffer operations, and then pop the result back into a String.

The only time you need to switch to an explicit StringBuffer is in more complex cases, for example if the concatenation is occurring within a loop (see StringBuffer handling in loops).


Charles compares those two approaches:

return a + b + c;

VS.

StringBuffer s = new StringBuffer(a);
s.append(b);
s.append(c);
return s.toString();


As Charles correctly points out, the Java compiler internally replaces the string concatenation operators with a StringBuffer, which is converted back to a String at the end. This looks like it yields the same result as using a StringBuffer directly.

But the Java bytecode that Charles analyzed in detail only tells half of the story. What he did not take a closer look at is what happens inside the call to the StringBuffer constructor which the compiler inserted. And that's where the real performance penalty strikes:

public StringBuffer(String str) {
    this(str.length() + 16);
    append(str);
}


The constructor only allocates a buffer for holding the original String plus 16 characters, not more. In addition, StringBuffer.append() only expands the StringBuffer's capacity to fit the next String being appended:

public synchronized StringBuffer append(String str) {
    if (str == null) {
        str = String.valueOf(str);
    }

    int len = str.length();
    int newcount = count + len;
    if (newcount > value.length)
        expandCapacity(newcount);

    str.getChars(0, len, value, count);
    count = newcount;
    return this;
}


That means constant re-allocation on each consecutive call to StringBuffer.append().

You - the programmer - know better than that. You might know exactly how big the buffer is going to be in its final state - or if you don't know the exact number, you may at least apply a decent approximation. You can then construct your StringBuffer like this:

return new StringBuffer(a.length() + b.length() + c.length())
    .append(a).append(b).append(c)
    .toString();


No repeated reallocation necessary - that means better performance and less work for the garbage collector. And that's where the real benefit of applying StringBuffer instead of the String concatenation operators lies.
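For completeness, here is the loop case Charles mentions, where an explicit StringBuffer pays off as well (a quick sketch; parts is assumed to be a String array):

// concatenation in a loop: the compiler creates a NEW StringBuffer
// (plus an intermediate String) on every single iteration...
String s = "";
for (int i = 0; i < parts.length; i++) {
    s += parts[i];
}

// ...while one explicit StringBuffer grows its buffer only occasionally:
StringBuffer buf = new StringBuffer();
for (int i = 0; i < parts.length; i++) {
    buf.append(parts[i]);
}
String t = buf.toString();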

Monday, October 18, 2004

Exceptional C++

I am currently reading Herb Sutter's Exceptional C++ and More Exceptional C++. Lately I have not been working on a lot of C++ projects (only maintaining some old libraries from time to time); I mostly code in C# these days. But reading Exceptional C++ keeps my C++ knowledge up to date, plus it trains my synapses in context switching. Some chapters are real puzzles. And I thought I knew about templates.

Other C++ books that I can highly recommend are listed on my Amazon C++ Listmania List.

Saturday, October 16, 2004

The Ultimate Geek Test

Eric Sink has assembled the ultimate geek test. It goes like this:

  • In vi, which key moves the cursor one character to the right?

  • What is the last name of the person who created Java?

  • True or false: In the original episode IV, Greedo shot first.

Threadsafe GUI Programming

When I was working on Visual Chat back in 1998, I faced a severe performance problem whenever the chat client received data from the server that caused some UI refreshing. The reason was that the TCP socket's receiving thread accessed Java AWT components directly, e.g. by setting a TextField's text, or by invoking Component.repaint().

I had not considered that while ordinary calls to Component.repaint() (made within the main AWT event dispatch thread) are merged into one big paint effort once the thread runs idle, posting repaint events from another thread into the AWT event dispatch thread's queue (see below) causes constant repainting - a heavy performance issue. Once I had figured that out, it was easy to fix: I posted a custom event to the AWT thread's queue, which caused a call to Component.repaint() within the correct thread context, namely the AWT thread.

A related issue is thread safety. AWT Components and .NET Controls, or rather their underlying native peers, are not re-entrant, which makes it potentially dangerous to access them from any thread other than the one they were created on.

When you think about it, it's quite obvious why thread safety cannot work here: when one or more windows are opened, the operating system associates them with the calling thread. Messages dedicated to the window will be posted to the thread's message queue. The thread then usually enters a message loop, where it receives user input, repainting and other messages dedicated to its windows. The message loop usually runs until the main window closes (signaled by the arrival of a WM_CLOSE message in the case of Microsoft Windows).

The message loop implementation dispatches the messages to so-called window functions, specific to each window class. A typical message loop looks like this under Microsoft Windows:

MSG msg;
while (GetMessage(&msg, NULL, 0, 0)) {
    TranslateMessage(&msg);
    DispatchMessage(&msg);
}


Modal dialogs are based on "inner" message loops that lie on the current call stack, and exit once the modal window closes.

Java AWT hides this mechanism from the programmer. As soon as a Java window is opened, the framework creates the AWT event dispatch thread, which gets associated with the window by the underlying operating system, and runs the message loop for this and all other Java windows. The main thread then blocks until the AWT event dispatch thread exits. The AWT event dispatch thread is also responsible for repainting: it collects all GUI refresh events, checks which regions have been invalidated, and triggers painting them.

Under .NET WinForms, Application.Run() starts the message loop on the current thread.

So when someone calls an AWT Component method (or an MFC or WinForms method, if you prefer) from another thread, this bypasses the thread that actually runs the message loop. Most of those methods are supposed to be called from the thread that holds the message loop, e.g. because member variables are not protected by synchronization against concurrent access. So you actually run the risk of destroying your Component's data.

So how can we spawn a worker thread that runs in parallel, hence keeps the GUI responsive, but can still display some visual feedback, e.g. a dialog containing a progress bar and a cancel button? We may always give Application.DoEvents() a try, but we might also want to invoke a stored procedure or a webservice, which blocks program flow for several seconds. Then we need some kind of inter-thread communication, for example posting events to the message loop. The registered window function/EventHandler can then update the progress bar state or react to a button click. Its code will run inside the expected thread context.

Some years later I had to maintain somebody's old Java code, which just happened to behave in the same flawed way as described above. In the meantime, Swing had introduced some convenience methods for invoking functionality inside the AWT event dispatch thread, which I was happy to apply, namely SwingUtilities.invokeLater() and SwingUtilities.invokeAndWait(). The .NET WinForms API provides Control.Invoke() and Control.BeginInvoke(). The .NET WinForms library will actually throw an exception if any Control members are accessed from a thread other than the one that created the Control, hence the one that runs the message loop.
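A minimal sketch of the Swing variant (callSlowWebservice() is a hypothetical blocking call, progressBar a field of the enclosing class):

import javax.swing.SwingUtilities;

// worker thread: do the blocking work OFF the event dispatch thread,
// then hand the UI update back to it
new Thread(new Runnable() {
    public void run() {
        final int percent = callSlowWebservice();
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                progressBar.setValue(percent);   // safe: runs on the EDT
            }
        });
    }
}).start();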

Ten Years Of Netscape

Netscape is celebrating the 10th anniversary of its first browser release these days. On October 13th, 1994, Netscape Navigator, available for Microsoft Windows, Apple Macintosh, and X Window System environments, could be obtained via anonymous FTP from ftp.netscape.com.



I remember the summer of 1994 was the first time I actually used Mosaic on an Apple Macintosh. I think it was not until early 1995 that I switched over to Netscape Navigator.

It's funny to note that Jim Clark and Marc Andreessen originally named their new company "Mosaic", but dropped that name after a dispute with the National Center for Supercomputing Applications (NCSA) at the University of Illinois, home of the original Mosaic browser, which had been developed under the lead of Marc Andreessen in 1993. Later on, some people seriously dared to doubt Andreessen's contributions to the Netscape browser development.

Jamie Zawinski, one of Netscape's original developers, still has his online diary up, recapitulating what working life looked like at Netscape in the summer of 1994. Later on, in 1998, a PBS documentary named "Code Rush" showed Netscape's struggle to keep its browser market share.

Thursday, October 14, 2004

Avoid Duplicate ADO.NET DataTable Events

Using ADO.NET DataSets, it's quite common to have some code like this:

dataset.datatable.Clear();
dataset.Merge(anotherDataset.datatable);


When your DataTable has some EventHandlers attached, e.g. due to DataBinding, those EventHandlers will be notified several times about value changes. Also, indices will be rebuilt more than once, and the like. This might result in a whole cascade of additional calls and lead to slow runtime performance.

This drawback can be avoided by batching two or more data manipulations using DataTable.BeginLoadData() / DataTable.EndLoadData():

try {
    dataset.datatable.BeginLoadData();
    dataset.datatable.Clear();
    dataset.Merge(anotherDataset.datatable);
}
finally {
    dataset.datatable.EndLoadData();
}


Another performance gain can be achieved by setting DataSet.EnforceConstraints to false (when not needed), which avoids constraint checks. In most of my .NET projects so far, it turned out to be better to let the database take care of constraint violations. Setting EnforceConstraints inside Visual Studio's XSD designer can be tricky - I didn't find the corresponding property in the DataSet's property grid. But EnforceConstraints can be set manually within the XSD file itself:

<xs:element name="MyDataSet" msdata:IsDataSet="true"
msdata:EnforceConstraints="False">


By the way, user interface controls bound to DataSets / DataTables resp. to IListSource-implementations are being repainted asynchronously on value changes in the underlying model. So luckily, several consecutive value changes lead to one repaint event only.

Wednesday, October 13, 2004

Bad Boy Ballmer

I just finished reading "Bad Boy Ballmer" by Frederic Alan Maxwell. The book was partly entertaining on one hand, on the other hand it didn't provide too much of new information to the well-informed reader. And: the usual Microsoft-bashing, dispensable to say the least. I wonder when unreflected Microsoft-bashing will finally go out of fashion.

Steve Ballmer protects his private life more than Bill Gates (Gates was and still is successfully promoted as the company's harmless chief nerd to the broader public by Microsoft's PR, while within the industry his business tactics are known to be merciless).

"Bad Boy Ballmer" also contains several hilarious bloopers on technology terms, I will just mention two of them:

Quote:
"By 2002 one could buy a Dell computer with 20 gigabytes - 20 billion bytes - of memory for less than eight hundred dollars. Given that IBM's 1981 Personal Computer had between zero and fortyeight thousand bytes of memory, a single, low-end 2002 Dell computer contains more than the memory of all the two hundred thousand IBM PCs sold in their first year combined."

Huh? The author seems to be confusing the memory of a 1981 IBM PC with the hard disk space of a 2002 Dell PC here. The first IBM PCs came with 16kB or 64kB of memory, expandable to 256kB. 4GB is the current addressing limit of the Win32 flat memory model (although it can be extended up to 64GB using technologies like Physical Address Extension (PAE) on high-end Windows server versions). Dell's standard PCs of 2002 were typically equipped with 256MB or 512MB of memory.

Quote:
"Andreesen and Clark jointly established Netscape, its powerful search engine an outgrowth of Mosaic to the point that the University of Illinois threatened to sue."

Mosaic and Netscape - search engines? You can't be serious.

Did no one tech-savvy proof-read this? Hmmm... the book WAS ENTERTAINING after all. ;-)

Monday, October 11, 2004

AllowSetForegroundWindow

Starting with Windows ME/2000, applications not running in the foreground cannot bring their windows to the front. You usually see their taskbar button blink when they try to do so, e.g. by invoking SetForegroundWindow().

Unfortunately, one of our older applications was actually split into two processes, which communicate with each other using TCP sockets. Both processes have a graphical user interface. As customers (who never really saw two applications, just one GUI) started to switch from Windows NT to 2000/XP, they noticed that newly created windows were overlapped or completely hidden by others. As a temporary workaround (which was only available under XP), they started the application in NT compatibility mode.

The remedy came with a new API function, AllowSetForegroundWindow(): the foreground process can allow any other process to bring windows to the front. All it needs to know is the target process's id. And of course timing is crucial - once the invoking process is not the foreground process any more, it cannot grant the foreground right to other processes either. Clearly one has to check for the proper Windows version before invoking AllowSetForegroundWindow(), e.g. using GetVersion().
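
From managed code this boils down to a P/Invoke declaration plus the version check. A rough sketch (the helper class and method names are mine, and I am using Environment.OSVersion instead of raw GetVersion()):

using System;
using System.Runtime.InteropServices;

class ForegroundHelper {
    [DllImport("user32.dll")]
    static extern bool AllowSetForegroundWindow(int dwProcessId);

    // Grants the foreground right to a partner process.
    // Only works as long as we are the foreground process ourselves.
    public static void GrantForegroundRight(int partnerProcessId) {
        // AllowSetForegroundWindow() is available on the NT line from
        // Windows 2000 (version 5.0) upwards; Windows ME would need
        // a separate check on the 9x line.
        if (Environment.OSVersion.Platform == PlatformID.Win32NT &&
            Environment.OSVersion.Version.Major >= 5) {
            AllowSetForegroundWindow(partnerProcessId);
        }
    }
}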

Sunday, October 10, 2004

WinForms UserControl DataBinding

Last week I worked on a UserControl which provided a bindable property. Our GUI developer bound it to a DataTable's column, but at runtime, while binding the property worked just fine, the bound DataRow's RowState was set to "Modified" every time - even when no user input had occurred on the control. In this case the RowState was expected to stay "Unchanged".

I spent hours searching the internet until I finally found the reason why: there seems to be an undocumented naming convention inside .NET DataBinding. One must provide a public event of the following kind...

public event EventHandler PropertynameChanged;

... and fire the event when the bound property is set:

private object propertyname;

public object Propertyname {
    get {
        // return the property value
        return propertyname;
    }
    set {
        // set the property value, then fire the event
        // (but only in case the value actually changed)
        if (!object.Equals(propertyname, value)) {
            propertyname = value;
            if (PropertynameChanged != null) {
                PropertynameChanged(this, EventArgs.Empty);
            }
        }
    }
}
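
For completeness, this is roughly how such a property ends up being bound on the form (property, table and column names are placeholders):

// binds myUserControl.Propertyname to the "MyColumn" column;
// without the PropertynameChanged event, every binding push
// marked the underlying DataRow as "Modified"
myUserControl.DataBindings.Add("Propertyname",
    dataset.Tables["MyTable"], "MyColumn");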


Hello, can someone at MSDN please update the DataBinding documentation? Or did I miss something?

Saturday, October 09, 2004

Suffering From The Not-Invented-Here Syndrome

Amazingly (and I have been watching this for years), there are still organizations out there that try to re-invent the wheel as often as they can. I am talking about people who happily keep on building their own project-specific windowing toolkits, standard GUI controls, collection implementations, XML parsers, HTTP protocol implementations, object-relational mapping layers, report engines, content management systems, workflow systems, you name it.

Typically those organizations are either departments within larger corporations, which in some way or another gained a jester's license while the corporation swallows the costs of their doomed endeavours, or small startup vendors that managed to convince a clueless customer of the absolute necessity of their futile product - customers who just don't know they could get a better product for a fraction of the price, or even for free, from somewhere else, without the pain of being the vendor's guinea pig for an unproven technology that no one else uses.

Sometimes it's cool to be an early adopter. But then let me be an early adopter of something innovative that really fits my needs, not of the 47th clone of another "have-been-there-before" product from the little ISV next door.

Please, stop this insanity. Let's do what we were supposed to do: Develop Applications. Leave the other stuff to the experts...

Thursday, October 07, 2004

The Fallacy Of Cheap Programmers

Robin Sharp writes about "The Fallacy Of Cheap Programmers":

A lot of research on programmer productivity shows that the best programmers are up to 20 times more productive than the worst programmers. If the income differential between the best and worst programmers is even 5 times, it means employers are getting incredible value for money hiring the best programmers.

Why then don't companies hire a few very good programmers and leave the rest flipping Big Macs? There is one very good reason: the psychology of managers. Managers simply can't believe that one programmer can be as productive as 3, let alone 5 or even 20 times. Managers believe that productivity is a management issue.

They believe that simply by re-organising their human resources it is they who can gain leaps in productivity, and reap the rewards. But the reality of management, as we all know, is that most projects are late and over budget.


I could not agree more, Robin.

Tuesday, October 05, 2004

Not Everyone Seems To Love The Mac

Watch "Crash Different"!

Pat Helland Sings "Bye Bye Mr. CIO Guy"

Pat Helland (Microsoft Architect) sings "Bye Bye Mr. CIO Guy" on Channel9. Guess who plays the guitar: Mr. COM / Mr. Indigo himself, Don Box.

Object-Relational Technologies = Vietnam?

Ted Neward posted a critical article regarding O/R Mapping Tools. He states: "Object-relational technologies are the Vietnam of the Computer Science industry."

Is this why Microsoft slipped ObjectSpaces from Whidbey to WinFS? If not even Microsoft can get it right, do you seriously think the little ISV next door can? On the other hand, Hibernate seems to do a good job at it (although I have not tried it in detail yet), and I have also applied container-managed entity EJBs under certain scenarios in some J2EE projects, where they worked just fine.

Still, it seems that projects using O/R mapping often harvest far more problems than benefits - especially those that follow an undifferentiated 100% O/R approach. Might it be that people listen too much to pied-piper consultants? Or even worse, some think they must implement their own O/R mapping layer. Don't burn your fingers; this is by no means a trivial task.

The following quote from Ted's article sounds like an accurate summary of a real project I know of (luckily, I have only been watching it from the outside):

Both major software vendors and project teams (building their own O-R layer) fall into the same trap. With object-relational technologies, products begin by flirting with simple mappings: single class to single table, building simple Proxies that do some Lazy Loading, and so on. "Advisers" are sent in to the various project teams that use the initial O-R layer, but before long, those "advisers" are engaging in front-line coding, wrestling with ways to bring object inheritance to the relational database.

By the time of the big demo, despite there being numerous hints that people within the project team are concerned, project management openly welcomes the technology, praising productivity gains and citing all sorts of statistics about how wonderful things are going, including "lines of code" saved, how we were writing far more useful code than bugs. Behind the scenes, however, project management is furious at the problems and workarounds that have arisen, and frantically try every avenue they can find to find a way out: consultants, more developers, massive code reviews, propping up the existing infrastructure by throwing more resources at it, even supporting then toppling different vendors' products as a means of solving the problem.

Nothing offers the solution the team needs, though: success, a future migration path, or at the very least, a way out preserving the existing investment. Numerous "surprises" (such as the N+1 query problem thanks to lazy-loading proxies or massive bandwidth consumption thanks to eager-loading policies) make the situation more critical.

Finally, under new management (who promise to fix the situation and then begin by immediately looking to use the technology in other projects), the team seizes on a pretext, ship the code and hand it off to system administrators to deploy, and bring the developers home to a different project. Not a year later, the project is cancelled and pulled from the servers, the project's defeat complete in all but name.
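
The N+1 query problem mentioned above is particularly easy to run into. Just to illustrate (a pseudo-code fragment against a hypothetical, Hibernate-style mapper API - all names made up):

// 1 query to fetch all orders...
IList orders = session.Find("from Order");
foreach (Order order in orders) {
    // ...plus 1 additional query per order, because the lazy-loading
    // proxy behind order.Customer fires its own SELECT on first access
    Console.WriteLine(order.Customer.Name);
}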

Monday, October 04, 2004

Knowing Java But Not Having A Clue?

Some years ago, I had the pleasure of programming some archaic embedded systems, namely mobile phones, based on a 16-bit CPU with a segmented memory model (comparable to the old 8086 memory model). We wrote an application framework (windowing system, scheduler, network stack API and the like) using a C++ compiler/linker (which had just introduced some brand-new C++ features like templates in its latest version ;-) ). This framework represented the technological base for several customer projects.

Our department was growing, so we employed some recent university graduates. The course curriculum here in Austria allows you to walk through a computer science degree without writing a single line of C or C++ (not to mention assembler).

We didn't really let them touch any core stuff, but still: the new hires started coding the way they were used to from their Java assignment classes - highly inefficiently, producing one memory leak after the other, not knowing about the consequences of their code, especially on a system with very limited resources.

Some of them didn't know the implications of a copy constructor invocation (heck, they didn't even know they would invoke a copy constructor on a value assignment - they simply thought they would be working on references, as they were used to). One time a new hire allocated and freed memory at such a high rate that we faced serious performance and memory fragmentation problems just from scrolling through some listbox items (several hundred mallocs and frees during a single keypress).

And just recently I spent half an hour or so explaining the difference between stack and heap to our current summer intern. In his Java days at university he never had to worry about that...

For someone who has been programming C++ for several years, it is not too hard to switch to a managed environment that provides garbage collection and the like, as Java or .NET do. OK, there are some new concepts, but mostly it's a lot of convenience.

What computer science professors tend to forget is that the opposite switch is likely to happen as well once those graduates (who have only ever applied Java or C#) enter working life. There is more out there than Java and C#. And this "backward" paradigm shift is a much more difficult one. Appointing inexperienced Java programmers to a C++ or C project is a severe project risk.

Sunday, October 03, 2004

Enzo's World

Enzo's WebLog went live: Enzo's World

The J2EE Tutorial, Second Edition

I don't receive any commission for this, but "The J2EE Tutorial, Second Edition" (Addison-Wesley) is an excellent book which provides a good overview of all the technologies that lie under the hood of J2EE 1.4.


Wednesday, September 29, 2004

Awful Performance

Some time ago, a colleague of mine asked me to profile a collection of libraries which had been provided by another vendor. One function received data over a TCP socket and did some XML parsing. It performed poorly (100% CPU usage for several seconds, even on small amounts of data).

In this case we easily managed to improve performance by a factor of up to 30000: workloads that took seconds or even minutes before now finish within a few milliseconds. Here is what I did:

  • Applying a standard XML parser instead of byte-wise self-implemented parsing.

  • Buffering read data in byte chunks that grow dynamically by powers of two, instead of concatenating single bytes to an MFC CString (the latter implied constant reallocation of CString's internal buffer) - see the sketch after this list.

  • One library was written in Java, and its bytecode had been obfuscated. This included string constants which were used inside the innermost loop of the XML parsing algorithm (these strings actually contained XML element names). Each time one of those strings was referenced at runtime, the inverse obfuscation algorithm was invoked on it, in order to "decrypt" the string back to its original content. The performance impact was devastating - especially as the "decryption" algorithm was not exactly what I would call lightweight.

  • Removing a memory violation that the original vendor had circumvented by delivering a debug version of one library (the debug version always protects the method stack with some trailing bytes).
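
Regarding the second item, the buffer fix looked roughly like this (a simplified sketch - class and member names are made up, and the real code was C++/MFC, but the idea translates directly):

using System;

class GrowableBuffer {
    byte[] data = new byte[1024];
    int length = 0;

    public void Append(byte[] chunk, int count) {
        int required = length + count;
        if (required > data.Length) {
            // double the capacity until the new chunk fits, so n appended
            // bytes cause only O(log n) reallocations instead of O(n)
            int capacity = data.Length;
            while (capacity < required) {
                capacity *= 2;
            }
            byte[] grown = new byte[capacity];
            Array.Copy(data, grown, length);
            data = grown;
        }
        Array.Copy(chunk, 0, data, length, count);
        length += count;
    }
}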

Tuesday, September 28, 2004

Joel On Software Book

I just received Joel Spolsky's new book from Amazon. The book mainly represents a "best of" Joel's website. I have been a loyal reader of Joel's weblog for several years and know most of his postings, which are full of deep insight into the business of software development. The Joel On Software book is a unique collection of articles and a lot of fun to read. I guess this clearly determines how I will spend the next couple of nights: reading with a flashlight under the blanket.

Monday, September 27, 2004

Pointy-Haired Boss Under Consulting Spell?

I mean, it certainly makes sense to get help from external consultants - e.g. from IBM when your department migrates its middleware to IBM WebSphere, or from Microsoft if you are specifically interested in .NET Enterprise Services hosted by a COM+ application server. And I even understand the need for all those SAP or Baan consulting firms.

What I just don't get is why companies are so eager to spend weeks explaining their business domain to a complete stranger (= a consultant), who in exchange will charge them huge amounts of money, either for a substrate of some outdated Gartner or McKinsey papers, or for pushing his own proprietary software solution, which doesn't even come close to solving the technological problems the customer is facing.

I am talking about those guys who became consultants simply by printing their job title on a business card.

Why don't companies instead give their own talented people some time and resources, and let them figure it out? Or as Wally would say: "Just because we pay inexperienced strangers to tell us how to do our jobs, that doesn't mean we're morons!"

First Posting

Welcome to Arno's Weblog. This blog is about software development, or to be more specific: the daily joy and pain of work as an ordinary software engineer.

While I don't expect too many visitors to read this, some people might find the content interesting or amusing. I am also planning to collect links to articles on more prominent software development blogs that might be of general interest.

By the way, I am not a native English speaker, so please forgive the mistakes that are certainly going to occur in my writing.