Failing The Turing Test: 2009

Wednesday, June 24, 2009

Composite WPF ("Prism") DelegateCommand.CanExecuteChanged Memory Leak

I was happy to hear that the Microsoft Patterns and Practices Team has picked up this Composite WPF ("Prism") bug report that I submitted several weeks ago. From the issue description:

When profiling my application I noticed that plenty of EventHandlers had never been deregistered from DelegateCommand's CanExecuteChanged-Event. So those EventHandlers were never garbage-collected, which caused a severe memory leak.

As registering CanExecuteChanged-EventHandlers is done outside application code scope I had expected them to be deregistered automatically as well. At this point I thought this might as well be a ThirdParty WPF control issue, but digging further I read a blog post stating that "WPF expects the ICommand.CanExecuteChanged-Event to apply WeakReferences for EventHandlers". I had a look into RoutedCommand, and noticed it uses WeakReferences as well.

Now this is no showstopper for us as we are very early in the development cycle, and we simply patched DelegateCommand in the meantime by using WeakReferences for its CanExecuteChanged-EventHandlers.

As far as I know the issue has been fixed already, and it is currently going through testing.
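To illustrate the kind of patch described in the issue, here is a minimal sketch of a command that keeps its CanExecuteChanged handlers behind WeakReferences. This is not the actual Prism code or the official fix. The class name WeakHandlerCommand is made up for this example, and it deliberately does not implement ICommand so that it compiles without any WPF reference.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class for illustration only; not Prism's DelegateCommand.
public class WeakHandlerCommand
{
    private readonly Action _execute;
    private readonly Func<bool> _canExecute;

    // Handlers are held weakly, so the command never keeps a subscriber alive.
    private readonly List<WeakReference> _handlers = new List<WeakReference>();

    public WeakHandlerCommand(Action execute, Func<bool> canExecute)
    {
        if (execute == null) throw new ArgumentNullException("execute");
        _execute = execute;
        _canExecute = canExecute;
    }

    public event EventHandler CanExecuteChanged
    {
        add
        {
            if (value == null) return;
            lock (_handlers) _handlers.Add(new WeakReference(value));
        }
        remove
        {
            lock (_handlers)
            {
                // Drop the given handler and, while we are at it, any dead entries.
                _handlers.RemoveAll(w =>
                {
                    var handler = w.Target as EventHandler;
                    return handler == null || handler == value;
                });
            }
        }
    }

    public bool CanExecute() { return _canExecute == null || _canExecute(); }

    public void Execute() { if (CanExecute()) _execute(); }

    public void RaiseCanExecuteChanged()
    {
        var live = new List<EventHandler>();
        lock (_handlers)
        {
            // Walk backwards so RemoveAt does not shift unvisited entries.
            for (int i = _handlers.Count - 1; i >= 0; i--)
            {
                // Take a strong reference first, so the handler cannot be
                // collected between the check and the invocation.
                var handler = _handlers[i].Target as EventHandler;
                if (handler == null) _handlers.RemoveAt(i);
                else live.Add(handler);
            }
        }
        live.Reverse(); // restore subscription order
        foreach (var handler in live) handler(this, EventArgs.Empty);
    }
}
```

One caveat of this design: because only a weak reference to the delegate is stored, the subscriber has to keep its own strong reference to the handler. Otherwise the handler may be collected while it is still needed.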

Wednesday, June 17, 2009

NHibernate Criteria-Query: Child Collection Not Populated Despite FetchMode.Join When Criteria Exists For Child Table

Example taken from NHibernate bug report #381:


session.CreateCriteria(typeof(Contact))
   .Add(Expression.Eq("Name", "Bob"))
   .SetFetchMode("Emails", FetchMode.Join)
   .CreateCriteria("Emails")
      .Add(Expression.Eq("EmailAddress", "Bob@hotmail.com"))
   .List();

The resulting SQL will include a Join to Emails as expected, the resultset returned by the database is OK, but within the object model Contact.Emails is not going to be populated with data. Which means once Contact.Emails is being accessed in code, lazy loading occurs, which probably was not the coder's intention. This is not the case when

CreateCriteria("Emails")
      .Add(Expression.Eq("EmailAddress", 
      "Bob@hotmail.com"))

is omitted.

The bug report was closed without a fix, but contained a comment that "According to Hibernate guys this is correct behavior" and a link to Hibernate bug report #918.

To me that does not sound completely implausible. Hibernate's interpretation of this criteria tree is that the Email criterion is meant to narrow down the Contact parent row, not the Email child rows. HQL queries act just the other way around. Under HQL, additional Join-With or Where expressions can limit which child rows are loaded into the child collection. I know that HQL - in contrast to Criteria queries - does not apply the fetching strategy defined in the mapping configuration. But with an explicit FetchMode.Join I would have expected the Criteria query to do the same.

Apparently under Criteria API this can be worked around by applying an outer Join (which of course is somewhat semantically different):

session.CreateCriteria(typeof(Contact))
   .Add(Expression.Eq("Name", "Bob"))
   .CreateCriteria("Emails", JoinType.LeftOuterJoin)
      .Add(Expression.Eq("EmailAddress", "Bob@hotmail.com"))
   .List();

Which seems kind of inconsistent compared to the inner join scenario, and there is even a Hibernate bug report on that.

What I would recommend anyway: If the goal is to narrow the parent data, but then fetch all the children, why not apply an Exists-subquery for narrowing, and in the same query fetch-join all children without further narrowing? Or, if you prefer lazy loading, simply define fetch="subselect" on the association in the mapping.

On a related topic, eagerly joining several child associations has the drawback that the resultset consists of a Cartesian product over all children - lots of rows with duplicate data. Let's say there are three child associations A, B and C with 10 rows each for a given parent row. Joining all three associations will blow up the resultset to 1 x 10 x 10 x 10 = 1000 rows, when only 1 + 10 + 10 + 10 = 31 rows would be needed.
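The arithmetic generalizes to any number of child associations. The following small sketch computes both numbers for one parent row. The class and method names are made up for this example, and it assumes outer joins, so an empty child collection still contributes one row to the joined resultset.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of (N)Hibernate.
public static class JoinRowCount
{
    // Rows returned for ONE parent row when all child associations are
    // eagerly joined in a single statement (product of the child counts).
    public static long JoinedRows(params int[] childCounts)
    {
        return childCounts.Aggregate(1L, (rows, count) => rows * Math.Max(count, 1));
    }

    // Rows actually needed: the parent row plus each child row exactly once.
    public static long SeparateRows(params int[] childCounts)
    {
        return 1 + childCounts.Sum(count => (long)count);
    }
}
```

JoinRowCount.JoinedRows(10, 10, 10) yields 1000, while JoinRowCount.SeparateRows(10, 10, 10) yields 31. Adding a fourth association with 10 rows raises the first number to 10000, but the second only to 41.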

And while those duplicates will only lead to duplicate references in the object model (and not to duplicate instances), and even those duplicate references can be eliminated again by using Maps or Sets for child collections, these Joins impose a severe performance and memory penalty on the database and ADO.NET level.

Of course one could simply issue N single select statements, one for each table, with equivalent where-clauses. But that implies N database roundtrips as well. Not so good.

The means to avoid this are Hibernate Criteria- and HQL-MultiQueries. Gabriel Schenker has posted a really nice article on MultiQueries with NHibernate.


Tuesday, June 16, 2009

My Amazon Listmania Lists

I had nearly forgotten about them, and was surprised to see that over time nearly 25,000 people have viewed my Amazon Listmania Book Recommendation Lists. Hey, I should have received a commission! ;-) Subtle hint: The J2EE list is a little bit outdated by now.

Friday, June 12, 2009

Visualizing TFS Work Items With Treemaps

Microsoft Team System / Team Foundation Server is a really nice line of products. Besides version control we heavily rely on TFS Work Items for organizing development tasks. One of our largest projects is conducted using Scrum, so we are utilizing Conchango's Scrum Plugin for Team System, plus Conchango Taskboard for Sprint planning. Taskboard is better suited than the general purpose Work Item lists and forms that are part of Visual Studio Team Explorer. Let's compare.

Visual Studio Work Item list:

Conchango Taskboard:

From a certain project size on the Visual Studio Work Item lists just won't scale and end up with heaps of data that one can scroll through forever. Don't get me wrong, those lists are sufficient for standard tasks, but they are cumbersome for gaining insight into the project's big picture. And Conchango Taskboard is for Sprint planning and Product Backlog maintenance only. The Conchango Scrum Plugin does have a set of really nice reports though.

So this is where I decided to build my own little solution, which would be based on rendering Work Item data into Treemaps. This week I hacked out a little prototype in my spare time (working title "Aurora"):

(this is an old screenshot that still misses labeling the treemap blocks)
This configuration example provides an overall impression of the sample project's progress: Green tasks are done, blue tasks are not done, and their size represents their complexity. And this is by no means limited to Scrum projects, it works for all kinds of TFS project templates.

Three simple input parameters are all it takes:

Work Item type (e.g. Product Backlog Item, Sprint Backlog Item, Bug)
Work Item attribute defining Treemap size (e.g. Storypoints or any other numeric data, or none in case all items should be rendered with equal size)
Work Item attribute defining Treemap color (e.g. State, Sprint ID, etc)

Plus an optional query string for narrowing the list of items.

This approach probably allows visualizing about 70% of the reports I could think of. I am still wondering how to implement the missing 30%, as they cannot be covered that easily. For instance I want to group Area Paths with equal prefixes by rendering them with the same color. Or simplify the creation of queries (can't expect everyone to know WIQL by heart). And I don't want to over-complicate things either. Any suggestions regarding those matters are highly welcome! Another requirement is to let the user define the color mapping. And item hierarchies are still missing, too (that's the "Tree" in "Treemaps" after all).

By the way, I am using woopef's WPF TreeMap control, thanks a lot for making it publicly available. I am also going to open-source Aurora once it provides basic functionality and reaches a certain level of stability, most likely on CodePlex.

Monday, June 08, 2009

Vanilla Data Access Layer Library 0.6.0 Released

I have just updated Vanilla DAL on Sourceforge. Release 0.6.0 is still in Alpha state, and comes with improved automatic transaction management. I now chose an approach similar to System.Transactions.TransactionScope. Of course nothing as sophisticated: Vanilla DAL's TransactionScope is a simple IDisposable object that consists of a thread-bound transaction and does some reference-counting. The transaction will be committed when the last TransactionScope is disposed, or rolled back in case of any exception during execution:


    using (accessor.CreateTransactionScope()) 
    {
        accessor.Update(new UpdateParameter(northwindDataset.Customers));
        accessor.ExecuteNonQuery(new NonQueryParameter(new Statement("DeleteTestCustomers")));
    }

My main problem was how to find out whether the current call to IDisposable.Dispose() happens within the process of exception unwinding. Several people have recommended Marshal.GetExceptionPointers(), which is the only working solution I have found so far. But I consider this a semi-hack. Any better ideas?
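For comparison, here is a minimal sketch of thread-bound reference counting that sidesteps the exception-detection problem with an explicit Complete() call, which is the convention System.Transactions.TransactionScope uses. This is a swapped-in technique, not how Vanilla DAL works. RefCountedScope and FakeTransaction are made-up names, and the "transaction" is just an object recording its outcome.

```csharp
using System;

// Stand-in for a real database transaction; only records what happened to it.
public sealed class FakeTransaction
{
    public string Outcome = "active";
}

// Hypothetical scope for illustration; not Vanilla DAL's TransactionScope.
public sealed class RefCountedScope : IDisposable
{
    // Thread-bound state shared by all nested scopes on the same thread.
    [ThreadStatic] private static FakeTransaction _current;
    [ThreadStatic] private static int _depth;
    [ThreadStatic] private static bool _failed;

    private readonly FakeTransaction _transaction;
    private bool _completed;
    private bool _disposed;

    public RefCountedScope()
    {
        if (_depth == 0)
        {
            // The outermost scope starts the transaction.
            _current = new FakeTransaction();
            _failed = false;
        }
        _depth++;
        _transaction = _current;
    }

    public FakeTransaction Transaction { get { return _transaction; } }

    // Called as the last statement of the using block. If an exception
    // skips this call, the scope counts as failed.
    public void Complete() { _completed = true; }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;

        if (!_completed) _failed = true; // one failed scope dooms the whole transaction
        _depth--;
        if (_depth > 0) return;          // inner scope: leave the decision to the outermost one

        _transaction.Outcome = _failed ? "rolled back" : "committed";
        _current = null;
    }
}
```

The price of this design is one extra line per using block, and a forgotten Complete() silently rolls back. In return, Dispose() never has to guess whether it runs during exception unwinding.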

By the way, Luis Ramirez has written a nice article on Codeproject.Com, comparing Vanilla DAL to plain ADO.NET, Microsoft Data Access Application Block and SqlNetFramework.

Sunday, June 07, 2009

Code Coverage Analysis With QmCover

My cousin has developed a graphical code coverage analysis tool for the GNU toolchain. Now I am not really up-to-date regarding the current state of code coverage tools in Unix land (during my old Solaris days all I needed was vi and gcc), but QmCover certainly looks cool!

Saturday, May 30, 2009

Implementing Equals And HashCode In Hibernate POCOs/POJOs

Nearly everybody who uses (N)Hibernate stumbles over this question sooner or later: How to implement Equals() and (Get)HashCode() in Hibernate POCOs/POJOs?

What's the problem? At first sight a natural approach seems to be to use the object's primary key in the Equals and HashCode implementations - simply delegate the calls to the primary key type. But many primary keys are AutoIncrement / Identity columns which the database sets when inserting rows. So the primary key's value becomes available only after that - which is too late in most cases. E.g. if there is more than one newly created object (Hibernate state "transient") at one point in time, all these objects would appear to be "equal". Additionally, when a newly created object has been added to a child collection of type Map or Set, it will be treated as a duplicate of the next newly created object that comes along, so one of the two gets lost. Undoubtedly this is unwanted behavior.

Moreover this breaks the Equals/HashCode contract, because the HashCode value changes once the primary key has been set by the database. This may lead to all kinds of weird things, e.g. Maps / IDictionaries whose implementations depend on HashCode will not be able to find their objects any more.
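Both effects are easy to reproduce without any database. The sketch below uses a made-up Customer class whose Equals / GetHashCode delegate to an int Id, where 0 stands for "not yet assigned", and simulates the insert by setting the Id by hand.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity for illustration: equality is based on the surrogate key only.
public class Customer
{
    public int Id { get; set; } // 0 until the database assigns a value

    public override bool Equals(object obj)
    {
        var other = obj as Customer;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public static class SurrogateKeyDemo
{
    public static void Main()
    {
        var first = new Customer();
        var second = new Customer();
        var customers = new HashSet<Customer>();

        Console.WriteLine(customers.Add(first));      // True
        // Both objects are transient (Id == 0), so the set sees a duplicate.
        Console.WriteLine(customers.Add(second));     // False

        // Simulate the insert: the database assigns the primary key.
        first.Id = 42;

        // The hash code has changed, so the set looks in the wrong place.
        Console.WriteLine(customers.Contains(first)); // False
        Console.WriteLine(customers.Count);           // 1
    }
}
```

The set still holds the object, as the count shows, but it can no longer be found by a lookup.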

So what to do about it? Articles on hibernate.org and the holy book of Hibernate "Java Persistence With Hibernate" recommend applying business keys instead of surrogate keys in Equals / HashCode. Business key values are unique too, plus they are available BEFORE the data has been inserted into the database.

I used to follow that advice in former projects, but at times found it difficult to determine which would be my business keys. Sometimes they just don't exist in the application domain.

Plenty of postings point out similar problems. Some people came up with quite sophisticated solutions, e.g. a combined surrogate / business key approach. In case an object is newly created and has not been assigned its database surrogate key yet, it will apply a business key. The business key values and hashcodes will be cached and re-used for the lifetime of the object, even after inserting it into the database table. Hence the Equals / HashCode contract is not broken.

This should work, but seems unnecessarily complicated to me. Reading hibernate.org's famous 109-article once again, I figured that my current project falls into the first of the following three categories:

We don't have composite keys, and there is no need to compare objects from different Hibernate sessions. We do have multiple new instances in Sets, and collections must stay intact after saving. Which means I can omit any custom Equals / HashCode implementations altogether, and can go along with the default reference-based implementations (instead of value-based). And it works like a charm!


Thursday, April 23, 2009

Codegeneration With CodeSmith

It has been a while since I blogged about the sad state of NHibernate tool support in .NET land, compared with Hibernate Tools for Eclipse.

I then discovered Tom DuPont's CodeSmith templates for NHibernate, which were a nice starting point. I added some functionality, e.g. support for one-to-one associations, self-referring tables, and of course I had to adapt the output to our needs. By now our development team can create:

NHibernate POCOs
NHibernate mapping configuration
DAOs (yeah I know some folks will complain at this point that with Hibernate there is no need for DAOs - but I mean DAOs in a Spring HibernateDAOSupport sense), DAO-Interfaces
Additionally - and although I generally try to avoid CRUDY Service interfaces - I provided the option for generating entity-based Service implementations and interfaces (but hey, at least they are based on DTOs). We have some situations where this is necessary
DTOs, incl. validation attributes
Mapping code for NHibernate POCO <=> DTO conversion
Spring.NET XML-configuration for DAOs and Services
Unit Test Templates

All of this is generated based on the database schema information plus some SqlServer extended properties on a table- and column-level. Those properties may define name aliases, whether associations are bidirectional or not, which validations to apply, how to do the DTO-assembling, which DB-columns to ignore or which .NET type to map to (if it differs from the default type), stuff like that. They are optional, so without extended properties we end up with a default generation style.

CodeSmith also comes with a nice integration into Visual Studio. So our developers just right-click the code generation project configuration, choose some database tables and which code fragments to generate, and seconds later the generated components are either added or updated within Visual Studio. Some of those fragments are split into partial classes and are then attached as code-behind files, so one can easily overwrite those parts in follow-up code generations without losing hand-crafted code. Sweet! This is saving my development team tons of time.

Tuesday, March 31, 2009

Programming Contests

After last year's Cubido C# Pirates Programming Competition I am considering entering the Catalysts Coding Contest this time. That one is about implementation speed, so I was wondering which development environment to use. Most of my programming lately has been in C#, so Visual Studio would be a natural choice. But I consider Eclipse to be a more productive IDE. And there is still the option of a really radical approach - applying LISP for example (oh well, two months left for intensive LISP training then).

Saturday, February 28, 2009

Teaching C++

I recently taught two sessions of C++ programming to a group of mobile computing students. As my C++ knowledge was getting a little bit rusty, this kind of refresher was really welcome. I was also half sick, and found it hard to concentrate, but in the end all of my students passed their exam, so it didn't seem to have been THAT bad...

Wednesday, January 21, 2009

Stressed!

Lots of work currently, and my kids keep me busy on the weekends. So unfortunately no more postings this month.