Arno's Software Development Weblog

Friday, May 23, 2008

Victory!

As luck would have it I won the Cubido C# Pirates Programming Competition for the best Battleship Game Algorithm last week, and took home an XBox 360 Premium Edition as first prize.

In general my algorithm pretty much stayed the same since I last blogged about it a while ago, but some fine-tuning brought the average winning shot count down to about 38 (compared to 39 shots achieved by the initial version) - that is the number of shots to sink five ships of size 2, 3 (2 times), 4 and 5 on a 10x10 battlefield.

Tuning was absolutely necessary as 17 algorithms scored under 39 shots at the end (around 80 players took part over all). I had submitted several slightly different versions, hence the victory was manifold. My best algorithm variation attained 38.19 shots, with the closest competitor finishing at 38.33.

The final competition was a public event (a big thank you goes to the kind folks at Cubido), and many contestants dropped by at the Cubido Headquarters and watched the progress on a large scoreboard (which was also available online). The battle lasted nearly four hours. 10,000 games had been played eventually (each algorithm was required to finish 10,000 identical battlefields). With results so close this large number was a must to avoid any "accidental winner", a winner that would have benefited by certain random number distributions (as a matter of fact during the preliminary rounds, which only consisted of 50 games each, my algorithms scored at between 34.5 and 40.5 shots, so that's quite a spread).

As I have promised before, I am going to talk about the ideas behind the algorithm and post the complete sourcecode over the next few days.

Monday, April 07, 2008

Battleship Game Algorithm Competition

I am one of the C# Pirates taking part in Cubido's Battleship Game Algorithm Competition. Dozens of computer programs battling it out against each other - in batch mode on the server (current ranking), as well as viewable in any Silverlight-enabled web browser.

Last week I attended Microsoft Austria's Big Days 2008 and found out about the contest - two nights of coding later, and Captain Alex (namesake of my older son) reached the top position. Captain Ahab, the old swashbuckler, is still a tough competitor though.


And by the way kudos to Cubido. What a great idea! This is the kind of stuff that excites most software developers. And their Silverlight client does have really cool visuals.


I am going to blog about the Captain Alex algorithm once the contest has ended (it is quite simple).

Friday, March 14, 2008

Performance-Tuning DrawCircle()

A while ago a friend of mine asked me an interesting programming question: "How would you implement a DrawCircle() function?". "I'd loop and invoke Sin() and Cos() in order to obtain coordinates" I answered. "OK", he replied, "and how would you make it fast?".

This kept me thinking. So why not give it a try right here and now! Let's start with something like this (C#, .NET 2.0):

private static void DrawCircle1(int radius)
{
double angle = 0;

do
{
double rad = angle * Math.PI / 180.0;
double x = Math.Cos(rad);
double y = Math.Sin(rad);

DrawPixel(x, y);

angle++;
} while (angle < 360);
}

Sure enough, a circle appears on the screen. It looks weird - the line is dotted, not solid. Right, incrementing the angle by 1 each time might imply advancing too fast. An increment value better suited is the angle between two pixels on the circle. Something like this:

private static void DrawCircle2(int radius)
{
double angle = 0;

double step = Math.Asin(1.0 / radius) * 180.0 / Math.PI;

do
{
double rad = angle * Math.PI / 180.0;
double x = Math.Cos(rad) * radius;
double y = Math.Sin(rad) * radius;

DrawPixel(x, y);

angle += step;
} while (angle < 360);
}

This works, so that's something to start with. To get an idea about its speed, let's ignore the call to DrawPixel() for a second (and assume that we have a lightning-fast DrawPixel() implementation at that point - and by lightning-fast I mean an inlined function or macro that just does something like an indexed array lookup, some pointer arithmetic and one or two bitwise operations, not slow stuff like SetPixel(), Bitmap.SetPixel() or BufferedImage.setRGB()).

Some profiling tells us that his function can calculate the pixels for 16,000 circles (of radius 100) per second.

Even with floating point units, floating point operations are expensive. Sure we can build up a large lookup table with pre-calculates Cos and Sin values, and interpolate in between lookup table values. I used to do that on other occasions, e.g. in my 3D chat project. But here there is no way of telling in advance how precise the lookup table needs to be, and with increasing precision the table will grow in size too. It is a valid tuning measure nonetheless, but we can expect the floating point unit to provide a similar kind of optimization already wired in hardware.

Another way of speeding up the algorithm is to take advantage of the fact that a circle is symmetric. Wouldn't it be enough to calculate one quarter of the circle, and then just mirror it on the x- and y-axis. But wait - it's what about calculating only one eighth of the circle, and mirroring it on the diagonal, too? No sooner said than done:

private static void DrawCircle3(int radius)
{
double angle = 0;

double step = Math.Asin(1.0 / radius) * 180.0 / Math.PI;

do
{
double rad = angle * Math.PI / 180.0;
double x = Math.Cos(rad) * radius;
double y = Math.Sin(rad) * radius;

DrawPixel(x, y);
DrawPixel(x, -y);
DrawPixel(-x, y);
DrawPixel(-x, -y);
DrawPixel(y, x);
DrawPixel(y, -x);
DrawPixel(-y, x);
DrawPixel(-y, -x);

angle += step;
} while (angle <= 45);
}

This gives us something like the expected 8-fold speedup (120,000 circle calculations per second to be exact). Still it seems like a dead end. Runtime complexity O(N/8) still is O(N). We need a different approach.

Let's search for a way to draw a circle without the need of invoking Sin() and Cos(). What does a circle function look like? It's

x^2 + y^2 = 1

so:

y = sqrt(1 - x^2)

Let's also take advantage of the fact that we know that x is always incremented while looping. How do we know that? Because we just calculate one eighth of the circle. We are starting at coordinate (x = 0, y = radius), hence on each step will either move right or move right and down.

private static void DrawCircle4(int radius)
{
double x = 0;
double y = radius;

do
{
DrawPixel(x, y);
DrawPixel(x, -y);
DrawPixel(-x, y);
DrawPixel(-x, -y);
DrawPixel(y, x);
DrawPixel(y, -x);
DrawPixel(-y, x);
DrawPixel(-y, -x);

x++;
//cast to get rid of decimal places
y = (int)((Math.Sqrt(1.0 - Math.Pow(x / radius, 2)) * radius));
} while (x <= y);
}

100,000 circles per second. It even got slower! The calls to Sqrt() and Pow() more than eat up what has been gained by avoiding Sin() and Cos().

Still this might open a new perspective. Having a look at two neighboring pixels on the circle, what does make them neighbors? Their x- and y-coordinates differ by 1, either x does, or y, or both. Starting at a well-known pixel, e.g. (x = 0, y = radius) we could move along the circle just be deciding whether we increment x and leave y as it is, or decrement y and leave x as it is, or increment x and decrement y.

Once more the circle function:

x^2 + y^2 = 1

which equals to:

x^2 + y^2 - 1 = 0

Drawing a perfect circle on a discrete field of pixels, there is always a deviation, an error. An error e that can be expressed like this:

x^2 + y^2 - 1 = e

The pixel we draw is either inside the perfect circle or outside. When we have moved outside the perfect circle, we have to pull back to the inside, and vice versa. In each iteration, we must keep e as small as possible.

How does e change on each iteration? Let's assume we increment x and decrement y, so the next pixel is at (x + 1, y - 1):

e = x^2 + y^2 - 1
nextE = (x + 1)^2 + (y - 1)^2 - 1 = (x^2 + 2x + 1) + (y^2 - 2y + 1) - 1

The difference between nextE and e is influenced by the following factors:

dx = (x^2 + 2x + 1) - x^2 = 2x + 1
dy = (y^2 - 2y + 1) - y^2 = -2y + 1

This is how nextE looks like when x is incremented and y is decremented:

nextE = e + dx + dy

If we decide only to increment x and leave y as it is, nextE would be:

nextE = e + dx

And in case we only decrement y, nextE is:

nextE = e + dy

At this point we just have to figure out whether e stays smallest by moving along the x-axis only, or the y-axis only, or both.

Wow, no need to invoke Sqrt() or Pow(), no need for floating point functions or values, we can do it all with integers only. Note that we could even avoid the multiplications by a factor of 2 simply by writing (x + x) and (- y - y), but compilers normally replace that by a left-shift anyway. Also, we do not invoke Math.Abs(), but get rid of minus signs on our own.

private static void DrawCircle5(int radius)
{

int e = 0;
int x = 0;
int y = radius;

do
{
DrawPixel(x, y);
DrawPixel(x, -y);
DrawPixel(-x, y);
DrawPixel(-x, -y);
DrawPixel(y, x);
DrawPixel(y, -x);
DrawPixel(-y, x);
DrawPixel(-y, -x);

int dx = 2*x + 1;
int dy = -(2*y) + 1;

int e1 = e + dx;
int e2 = e + dx + dy;
int e3 = e + dy;

e1 = e1 < 0 ? -e1 : e1;
e2 = e2 < 0 ? -e2 : e2;
e3 = e3 < 0 ? -e3 : e3;

if (e1 <= e2 && e1 <= e3)
{
x++;
e += dx;
}
else if (e2 <= e1 && e2 <= e3)
{
y--;
x++;
e += dx + dy;
}
else
{
y--;
e += dy;
}
} while (x <= y);

}

1,067,000 circle calculations a second. It cannot possibly get better, can it?

Yes it can. Similar to DrawCircle4(), when we always incremented x, we can assume here that we will never enter the last else-block in DrawCircle5().

And since we are always moving to the right incrementing x, the delta of e depends entirely on whether we also move down and decrement y. The change in the error value caused by decrementing y is (-2*y + 1), so let's choose to decrement y each time e is greater or equal to y, as this will move e closer to 0 again.

All of this allows us to get rid of several statements, and some local variables too. Also, let's do the left-shifting explicitly this time instead of multiplying by 2 (as mentioned above, we may expect our compiler to take care of that, but I have experienced cases when that did not happen, e.g. when writing -2*y instead of -(2*y)).

private static void DrawCircle6(int radius) 
{
int e = 0;
int x = 0;
int y = radius;

do
{
DrawPixel(x, y);
DrawPixel(x, -y);
DrawPixel(-x, y);
DrawPixel(-x, -y);
DrawPixel(y, x);
DrawPixel(y, -x);
DrawPixel(-y, x);
DrawPixel(-y, -x);

if (e < y)
{
e += (x<<1) + 1;
}
else
{
e += (x - y + 1)<<1;
y--;
}
x++;
} while (x <= y);

}

3,200,000 circles per second. It is now 200 times faster than at the beginning.

Side note: The numbers presented were measured in debug mode. In release mode DrawCircle1(), DrawCircle2() and DrawCircle3() are faster compared to debug mode by a factor around 10, DrawCircle5() and DrawCircle6() are faster by a factor around 2, while DrawCircle4() is just insignificantly faster. Anyway as you can tell from those numbers, DrawCircle5() and DrawCircle6() still clearly outperform all other variants in release mode as well.

I guess this is a good example that there is always room for improvement, and the solution that comes to mind first probably is not the quickest.

By the way, if you know an even faster algorithm (and I am sure several of them exist, maybe with optimizations only possible at assembler level), please drop me a line, and I will post a follow-up.

Wednesday, February 27, 2008

Windows Forms DataBound-Controls On TabPages

I stumble over this issue every once in a while: Windows Forms controls will only data-bind when they are visible or have been visible once - this can become an issue when controls are located on a TabPage that has never been shown. The control's bound value then has never been pushed into the UI.

Especially validation-code that checks on properties like TextBox.Text is vulnerable to this - e.g., no matter what the bound value is, TextBox.Text returns an empty string as long as it has not been visible (anyway, in my opinion validation code should always work on model values, not on view properties).

One way to work around this is to check on Control.Created before validating a control's content.

Monday, January 21, 2008

.NET Webcasts

I am currently watching lots of .NET Webcasts. Some of the best can be found at:


Especially Mike Taulty's screencasts are great. They have the highest entropy - tons of information in a relatively short amount of time. He exactly delivers all the facts I need to know - straight to the point.

I also try to save time when watching screencasts by switching to double playback speed in MediaPlayer.


With some concentration its possible to do so without missing any piece of information.

Sunday, December 23, 2007

Arno's Software Development Bookshelf

Time for a picture of my software development bookshelf. It's basically divided into three sections: (A) Software development technology (by far the largest part), (B) Software development project management and (C) The history of the software industry. Please click on the image to zoom in.

Wednesday, November 28, 2007

Famous Birthdays

My son Thomas was born one week ago, on November 21st. We are very happy and grateful he is doing fine. And thanks for all the well-wishing.

As he is giving my wife and me a little break right now, I did a quick search on which famous people share the same birthday with him. For November 21st the most admissible to me seems to be Voltaire.



For my older son Alexander's birthday (August 6th) it would be his namesake Alexander Fleming.



And finally regarding my birthday, December 26th, all I came up with first was Mao Zedong, who was born in 1893. Gulp. I had to dig further.

Luckily I found a much more admirable celebrity, and at the same time a founding-father of my profession: Charles Babbage.



Yep, that's definitely better. ;-)

Thursday, November 08, 2007

Switching Source Control Providers Under Visual Studio

Visual Studio 2005 let's you switch between different source control providers (e.g. Team Foundation Server and another SCC provider for Subversion or CVS) during runtime under Tools / Options / Source Control / Plug-In Selection. But if you still have to maintain some old .NET 1.1 code in Visual Studio 2003, you may be out of luck.

E.g. I recently had to access an old Sourcesafe repository, but since I had installed Team Foundation Server, Visual Studio 2003 would try to connect to the TFS server using a Sourcesafe URI. Not a good idea. And there is not way to change the provider back to VSS temporarily from within Visual Studio 2003.

Luckily I found this nice little tool SCCSwitch which did the job.

Sunday, November 04, 2007

Poker Calculator

My friend and colleague Josef developed a pokerhand odds calculator in Java, and I asked him to throw it into an applet so I could host it on this blog as well. Here we go:



Great stuff, Josef! Handling the input is very easy, just click the cards of choice for the input field having the focus (yellow background), and press "Calculate" to compute the odds, resp. to finish dealing out. You can also undo/redo any action.

The odds calculator uses a brute force approach. It looks quite fast, considering the fact it needs to process millions of combinations before the flop, and finishes in about a second on my PC. You can take a closer look at the code as well - it's open source.

My blog stylesheet won't resize the content area's width, that's why the applet is cut off on the right side, otherwise my navigation panel would have disappeared (I'll probably ask Josef to let his LayoutManager do some resizing instead, so the applet will fit within smaller screen estate as well). You can see it here in full size.

The applet requires Java 5, which - if missing - should be installed automatically on MSIE, I hope also on Firefox (for Mozilla browsers I had to replace the <embed> tag by an <applet> tag in order for it to run at all inside a blogspot.com page).

And by the way Josef, here are the odds for our game last week, when you went all in. Sorry to mention that 84% sometimes still is not enough. ;-)



Update: Some proxies seem to filter away <object> tags altogether, so I replaced it with an <applet> tag for all browsers.

Wednesday, October 31, 2007

How To Trim A .NET Application's Memory Workingset

.NET developers who have monitored their application's memory consumption in Windows Taskmanager, or slightly more sophisticated performance monitors like PerfMon, might have noticed the effect when memory usage slowly rises, and hardly ever drops. Once the whole operating systems starts to run low on memory, the CLR finally seems to give back memory as well. And as some people have noted, memory usage also goes down once an application's main window is minimized.

First of all it's important to note that by default the Windows Taskmanager only shows the amount of physical memory acquired. There is another column for displaying virtual memory usage, but it's not visible originally. So when physical memory usage drops, it's not always necessarily the CLR returning memory, but probably physical memory being swapped out to disk.

So memory consumption drops at some point in time - just probably too late. Those symptoms give us a first clue that we are not dealing with memory leaks here (of course memory leaks are more unlikely to happen in managed environments than in unmanaged ones, still it's possible - e.g. static variables holding whole trees of objects that could otherwise be reclaimed, or that EventListener that should have been unregistered but wasn't). Also, whatever amount of native heap the CLR has allocated, the size of the managed heap within that native heap is a whole different story. The CLR might just have decided to keep native memory allocated even if it could be free'd after a garbage collection pass.

And this does not look like a big deal at first glance - so what if the CLR keeps some more memory than necessary, as long as it's being returned once in a while? But the thing is, the CLR's decisions on when the right moment for freeing memory has arrived (or for that matter, the OS swapping unused memory pages to disk) might not always coincide with the users' expectations. And I have also seen Citrix installations with plenty of .NET Winforms applications running in parallel, soaking up a lot more resources than necessary, hence restraining the whole system.

Some customers tend to get nervous when they watch a simple client process holding 500MB or more of memory. "Your application is leaking memory" is the first thing they will scream. And nervous programmers will profile their code, unable to find a leak, and then start invoking GC.Collect() manually - which not only doesn't help, but is a bad idea generally speaking.

Under Java there is a way to limit the default maximum heap size (the default value depends on the Java VM), which can be overridden by passing the "-Xmx" commandline parameter to the runtime. Once the limit is reached, the garbage collector will be forced to run once more, and if that doesn't help any more either, an OutOfMemoryError is thrown. This might be bad news for the Java application, but at least it will not bring down the whole system.

I don't know of a counterpart to "-Xmx" in the .NET world. Process.MaxWorkingSet property allows for limiting the physical memory a process may occupy. I have read several postings recommending this approach to keep the whole .NET memory footprint low, but I am not so sure, plus setting Process.MaxWorkingSet requires admin privileges - something that application users will not (and should not) have.

A better choice is the Win32 API function SetProcessWorkingSetSize() with two special paramater values: -1.

From MSDN:

BOOL WINAPI SetProcessWorkingSetSize(
__in HANDLE hProcess,
__in SIZE_T dwMinimumWorkingSetSize,
__in SIZE_T dwMaximumWorkingSetSize
);

If both dwMinimumWorkingSetSize and dwMaximumWorkingSetSize have the value (SIZE_T)-1, the function temporarily trims the working set of the specified process to zero. This essentially swaps the process out of physical RAM memory.


And the good news: this does not require the user to have admin rights. By the way, SetProcessWorkingSetSize is what's being invoked when an application window is minimized, which explains the effect described above.

Of course there is quite an performance penalty associated with that approach as well, with lots of page faults following right after the call, but once all the pages required have been reloaded from the pagefile, the physical memory usage is limited to the bare minimum. And all that unused memory - as long as it is not being accessed, it will not be reloaded into physical memory. The same is true for .NET Assemblies which have been loaded, but are not currently used.

Obviously Windows' virtual memory implementation can not always swap out unused memory as aggressively. And it's my guess that what might hinder it furthermore is the constant relocation of objects within the native heap caused by garbage collection (which means a lot of different memory pages are being accessed over time, hence hardly ever paged to disk).

A Timer can be applied for repeated invocations of SetProcessWorkingSetSize(), with a reasonable interval between two calls of maybe 15 or 30 minutes (this depends heavily on the kind of application and its workload). Another possibility is to check from time to time on the physical memory being used, and once a certain amount has been reached the call to SetProcessWorkingSetSize() will occurr. A word of warning though - I do not advocate to invoke it too often either. Also, don't set the minimum and maximum working sizes (let the CLR take care of that), just use the -1 parameter values in order to swap out memory, after all that's what we are trying to achieve.

The complete code:

[DllImport("kernel32")]
static extern bool SetProcessWorkingSetSize(IntPtr handle, int minSize, int maxSize);

SetProcessWorkingSetSize(Process.GetCurrentProcess().Handle, -1, -1);


Anyway, our Citrix customers are happy again, and no one has ever screamed "Memory leak!" since we implemented that workaround.

Thursday, October 25, 2007

Five Easy Ways To Fail

Joel Spolsky describes the most common reasons for software projects to go awry in his latest articel "How Hard Could It Be? Five Easy Ways to Fail".

As kind of expected, a "mediocre team of developers" comes up as number one, and as usual Joel Spolsky describes it much more eloquently than I ever could:

#1: Start with a mediocre team of developers

Designing software is hard, and unfortunately, a lot of the people who call themselves programmers can't really do it. But even though a bad team of developers tends to be the No. 1 cause of software project failures, you'd never know it from reading official postmortems.

In all fields, from software to logistics to customer service, people are too nice to talk about their co-workers' lack of competence. You'll never hear anyone say "the team was just not smart enough or talented enough to pull this off." Why hurt their feelings?

The simple fact is that if the people on a given project team aren't very good at what they do, they're going to come into work every day and yet--behold!--the software won't get created. And don't worry too much about HR standing in your way of hiring a bunch of duds. In most cases, I assure you they will do nothing to prevent you from hiring untalented people.


I tend to question the four other reasons he mentions though (mainly estimating and scheduling issues). Don't get me wrong, he surely got his points, but I would rank other problem fields higher than that, lack of management support, amateurish requirements analysis or suffering from the NIH-syndrome among them.

Monday, October 15, 2007

Hints And Pitfalls In Database Development (Part 5): The Importance Of Choosing The Right Clustered Index

In database design, a clustered index defines the physical order in which data rows are stored on disk (Note: the most common data structure for storing rows both in memory and on disk are B-trees, so the term "page" can also be interpretated as "B-tree leaf node" in the following text, although it's not necessarily a 1:1 match - but you get the point). In most cases the default clustered index is the primary key. The trouble starts when people don't spend any further thought and stick with that setting no matter whether the primary key is a good choice for physical ordering or not...

File I/O happens at a page level, so reading a row implies that all other rows stored within the same physical disk page are read as well. Wouldn't it make sense to align those rows together which are most likely to be fetched en bloc too? This limits the number of page reads, and avoids having to switch disk tracks (which would be a costly operation).



So the secret is to choose an attribute for clustering which causes the least overhead for I/O. Those rows that are most likely going to be accessed together should reside within the same page, or at least in pages next to each other.

Usually an auto-increment primary key is a good choice for a clustered index. Rows that have been created consecutively will then be stored consecutively, which fits in case they are likely to be accessed along with each other as well. On the other hand if a row contains a date column, and data is mainly being selected based on these date values, this column might be the right option for clustering. And for child rows it's probably a good idea to choose the foreign key column referencing the parent row for the table's clustered index - a parent row's child rows can then be fetched in one pass.

I work on a project that applies unique identifiers for primary keys. This has several advantages, the client being able to create primary keys in advance among them. But unique identifier primary keys are a bad choice for a clustered index, as their values disperse more or less randomly, hence the physical order on disk will be just as random. We have experienced a many-fold performance speedup by choosing more suitable columns for clustered indexing.

Previous Posts:

Wednesday, October 10, 2007

Software Development Projects: The Road To Success

Friday, October 05, 2007

Fun With WinDbg

I did some debugging on an old legacy reporting system this week, using WinDbg. The reporting engine terminated prematurely after something like 1000 printouts.

After attaching WinDbg and letting the reporter run for half an hour, a first chance exception breakpoint hit because of this memory access violation:

(aa8.a14): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=665b0006 ecx=7c80ff98 edx=00000000 esi=00000000 edi=00000000
eip=665a384f esp=0012bdc4 ebp=00000005 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
*** ERROR: Symbol file could not be found. Defaulted to export symbols for GEEI11.dll -
GEEI11!FUNPower+0x15f:
665a384f 668b7804 mov di,word ptr [eax+4] ds:0023:00000004=????


Trying to access address 0x0000004 ([EAX+4]), one of the reporting DLLs was obviously doing some pointer arithmetics on a NULL pointer. The previous command was a call to GEEI11!WEP+0xb47c, which happened to be the fixup for GlobalAlloc:

665a3849 ff157c445b66 call dword ptr [GEEI11!WEP+0xb47c (665b447c)]
665a384f 668b7804 mov di,word ptr [eax+4] ds:0023:00000004=????


GlobalLock takes a global memory handle, locks it, and returns a pointer to the actual memory block, or NULL in case of an error. According to the Win32 API calling conventions (stdcall), EAX is used for 32bit return values.

The reporting engine code calling into GlobalLock was too optimistic and did not test for a NULL return value.

The next question was, why would GlobalLock return NULL? Most likely because of an invalid handle passed in. Where could the parameter be found? At the ESI register - it was the one pushed onto the stack before the call to GlobalAlloc, thus must be the one and only function parameter, and it is callee-saved, so GlobalAlloc had restored it in its epilog.

665a3848 56 push esi
665a3849 ff157c445b66 call dword ptr [GEEI11!WEP+0xb47c (665b447c)]

0:000> r
eax=00000000 ebx=665b0006 ecx=7c80ff98 edx=00000000 esi=00000000 edi=00000000
eip=665a384f esp=0012bdc4 ebp=00000005 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246


As expected, ESI was 0x00000000, and GetLastError confirmed this as well:

0:000> !gle
LastErrorValue: (Win32) 0x6 (6) - Das Handle ist ungültig.
LastStatusValue: (NTSTATUS) 0xc0000034 - Der Objektname wurde nicht gefunden.


Doing some further research, I found out that the global memory handle was NULL because a prior invocation of GlobalAlloc had been unsuccessful. Again, the caller had not checked for NULL at that point. And GlobalAlloc failed because the system had run out of global memory handles, as there is an upper limit of 65535. The reporting engine leaked those handles, neglecting to call GlobalDelete() on time, and after a while (1000 reports) had run out of handles.

By the way, I could not figure out how to dump the global memory handle table in WinDbg. It seems to support all kinds of Windows handles, with the exception of global memory handles. Please drop me a line in case you know how to do that.

Now, there is no way to patch the reporting engine as it's an old third party binary, so the solution we will most likely implement is to restart the engine process after a while, so all handles are free'd by terminating the old process.

Saturday, September 29, 2007

Disdain Mediocrity

Ben Rady writes:

I do know some traits that good software developers all seem to share. One of them is a healthy disdain for mediocrity.

Good developers cannot stand sloppiness (in software, anyway). Apathy, haste, and carelessness send shivers down their spines. They may disagree on the best way to do things, but they all agree that things should be done the best way. And they’re constantly looking and learning to find exactly what the best way is. They realize that seeking it is an ever-changing, lifelong quest.


I couldn't agree more...