Tuesday, July 26, 2005

Please, Do Not String-Concatenate SQL-Parameters

No reason to be surprised that code like this one not only exists in old legacy applications, but also still is being produced on a daily basis - as long one of the most widespread German .NET developer magazines publishes articles like that:

Sonderzeichen in SQL-Kommandos unter ADO.NET
INSERT Murks? Möglicherweise haben Sie es noch gar nicht bemerkt, aber wenn Sie mithilfe von SQL-Anweisungen Texte in Datenbanken eintragen, spielt deren Zusammensetzung eine entscheidende Rolle. Bestimmte Sonderzeichen verwirren ADO.NET, und der SQL-Aufruf kann scheitern. Mit einem einfachen Trick sichern Sie sich ab.

The author then suggests a string-replacement approach in order to manually escape apostrophs when using ADO.NET'S SQLServer Provider. For those of you who are interested, a single ' will be escaped to '' (SQLServer's usual escaping schema for apostrophs).

So he argues apostrophs might brake SQL statements, e.g. on statements like this:

string sql = "select * from dotnet_authors where clue > 0 or name <> '" + yournamegoeshere + "'";

and instead recommends something like:

string sql = "select * from dotnet_authors where clue > 0 or name <> '" + MyCoolEscapingMethod(yournamegoeshere) + "'";

Now if yournamegoeshere contains an apostroph, yes the former command will fail. But who on earth would voluntarily code SQL like this anyway?

Here is how developers with some common sense (as opposed to overpaid consultants who have no idea about real-life projects) are going to handle that: Obviously we prefer to externalize out SQL-code somewhere else, e.g. in our own XML-structure, and embed that XML-file in our assembly. We won't leave SQL lying inside C# code, so we will never even feel the urge to merge those two. And we keep our hands off VS.NET's SQL Query wizard - it just produces unreadable and unmaintainable code. We don't really need a wizard to assemble SQLs, now do we? (if we would, we'd better better get back to Access). And of course, and this is the main point here, we take advantage of parameter placeholders, so ADO.NET will take care of expanding them properly into the statement (and SQLServer will receive the strongly typed values and at the same time can prevent SQL injection) .

select * from dotnet_authors where clue > 0 or name <> @yournamegoeshere

How was our expert here going to insert DataTime values (which format pattern to use?), or GUIDs, or BLOBS, or... and with is string-concatenation approach he opens all doors for SQL-injection. Plus he locks in on specific SQLServer collation settings (they'd better never change in the future).

So please, don't bother with solutions for problems that don't even exist.