Wednesday, June 17, 2009

NHibernate Criteria-Query: Child Collection Not Populated Despite FetchMode.Join When Criteria Exists For Child Table

Example taken from NHibernate Bugreport #381:


session.CreateCriteria(typeof(Contact))
.Add(Expression.Eq("Name", "Bob")
.SetFetchMode("Emails", FetchMode.Join)
.CreateCriteria("Emails")
.Add(Expression.Eq("EmailAddress",
"Bob@hotmail.com"))
.List();



The resulting SQL will include a Join to Emails as expected, the resultset returned by the database is OK, but within the object model Contact.Emails is not going to be populated with data. Which means once Contact.Emails is being accessed in code, lazy loading occurs, which probably was not the coder's intention. This is not the case when

CreateCriteria("Emails")
.Add(Expression.Eq("EmailAddress",
"Bob@hotmail.com"))


is omitted.

The bug report was closed without fix, but contained a comment that "According to Hibernate guys this is correct behavior" and a link to Hibernate bug report #918.

To me that does not sound completely implausible. Hibernate's interpretation ot this criteria tree is that the EMail-criteria is meant to narrow down die Contact parent-row, not the EMail child-rows. HQL queries act just the other way around. Under HQL, additionaly Join-With- or Where-expressions can limit which child rows are loaded into the child collection. I know that HQL - in contrast to Criteria queries - does not apply the fetching strategy defined in the mapping configuration. But with an explicit FetchMode.Join I would have expected Criteria query to do the same.

Apparently under Criteria API this can be worked around by applying an outer Join (which of course is somewhat semantically different):

session.CreateCriteria(typeof(Contact))
.Add(Expression.Eq("Name", "Bob")
.CreateCriteria("Emails", JoinType.LeftOuterJoin)
.Add(Expression.Eq("EmailAddress",
"Bob@hotmail.com"))
.List();


Which seems kind of inconsistent compared to the inner join scenario, and there is even a Hibernate bug report on that.

What I would recommend anyway: If the goal is to narrow the parent data, but then fetch all the children, why not apply an Exists-subquery for narrowing, and in the same query fetch-join all children without further narrowing. Or, if you prefer lazy loading, simply define fetchmode="subselect" on the association.

On a related topic, eagerly joining several child associations has the drawback that the resultset consists of a cartesian product over all children - lots of rows with duplicate data. Let's say there are three child associations A, B and C with 10 rows each for a given parent row, joining all three associations will blast up the resultset to 1 x 10 x 10 x 10 = 1000 rows, when only 1 + 10 + 10 + 10 = 31 rows would be needed.

And while those duplicates will only lead to duplicate references in the object model (and not to duplicate instances), and even those duplicate references can be eliminated again by using Maps or Sets for child collections, these Joins impose a severe performance and memory drawback on the database resp. ADO.NET level.

Of course one could simply issue N single select statements, one for each table, with equaivalent where-clauses. But that implies N database roundtrips as well. Not so good.

The means to avoid this are Hibernate Criteria- and HQL-MultiQueries. Gabriel Schenker has posted a really nice article on MultiQueries with NHibernate.

More Hibernate postings: