Thursday, October 26, 2006

C# Anonymous Delegates: Your Stack or Mine?

I want to take a moment to call out what I think is an interesting part of my previous post: a curious capability of anonymous delegates. To illustrate, here's an example:

using System;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        // 'num' lives on the main thread's stack.
        int num = 0;
        Console.WriteLine("initial num=" + num);

        Thread thread = new Thread(
            delegate()
            {
                // Yet this delegate running on a different thread with
                // its own stack has access to 'num' as well.
                num = 42;
            });

        thread.Start(); // Start the worker thread.
        thread.Join();  // Wait until the worker thread has finished.

        Console.WriteLine("final num=" + num);
    }
}



It's pretty simple. This program declares and initializes a variable named num, creates and starts a thread that will set num to a different value, then waits for the thread to complete. In the program's output we see that num has indeed been set to 42. Here's the view from the command line:


F:\tmp>csc Program.cs
Microsoft (R) Visual C# 2005 Compiler version 8.00.50727.42
for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.


F:\tmp>Program.exe
initial num=0
final num=42


Okay, that's as expected. But wait--num lives on the main thread's stack. How is it that the second thread we've created has access to num? num isn't on the second thread's stack and it would be a pretty scary thing if the second thread had direct access to the first thread's stack. All sorts of mayhem could ensue. Providing access to another thread's stack isn't exactly the sort of thing we want to happen in managed code.


Chicanery?


As it turns out, though it appears that num is a stack variable in the Main method, it is not. What we're experiencing here is a convenience provided by the compiler. Or rather, a trick. A sleight-of-hand. An illusion. num doesn't actually live on the stack.


In fact, the compiler has made num a field in a compiler-generated class so that it lives on the heap, where the delegate can reach it. Here's the compiled code, courtesy of .NET Reflector:


 


[CompilerGenerated]
private sealed class <>c__DisplayClass1
{
    // Methods
    public <>c__DisplayClass1() { }
    public void <Main>b__0()
    {
        this.num = 0x2a;
    }

    // Fields
    public int num;
}


The compiler has given our anonymous delegate method a name: it's <Main>b__0() on this compiler-generated class. As a member of this class it obviously has access to num. The illusion is completed by fitting out the Main method so that it has an instance of <>c__DisplayClass1 to use:


private static void Main(string[] args)
{
    Program.<>c__DisplayClass1 class1 = new Program.<>c__DisplayClass1();
    class1.num = 0;
    Console.WriteLine("initial num=" + class1.num);

    Thread thread1 = new Thread(new ThreadStart(class1.<Main>b__0));
    thread1.Start();
    thread1.Join();

    Console.WriteLine("final num=" + class1.num);
}


So now our Main method has an instance of an object on the heap and can set num to zero, since num is a public field on the compiler-generated class.


And so, both threads are accessing a variable--not on the stack, but on the heap and accessible to both threads. This helpful behind-the-scenes work by the C# compiler allows us to keep the simplicity of the original code in this example; otherwise, we'd need to implement something like what the compiler has done for us. 
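To make "something like what the compiler has done for us" concrete, here is a minimal hand-written sketch of the same arrangement. The names NumHolder, SetNum, ManualClosure, and Run are made up for illustration; they are not what the compiler generates.

```csharp
using System;
using System.Threading;

// Hypothetical hand-written stand-in for the compiler-generated class.
sealed class NumHolder
{
    public int Num;

    // Plays the role of the anonymous delegate's body.
    public void SetNum()
    {
        Num = 42;
    }
}

static class ManualClosure
{
    // Runs the worker thread and returns the value it left in the holder.
    public static int Run()
    {
        NumHolder holder = new NumHolder(); // the object lives on the heap
        holder.Num = 0;

        Thread thread = new Thread(new ThreadStart(holder.SetNum));
        thread.Start(); // Start the worker thread.
        thread.Join();  // Wait until the worker thread has finished.

        return holder.Num;
    }

    static void Main()
    {
        Console.WriteLine("final num=" + Run()); // prints "final num=42"
    }
}
```

Both threads reach Num through the same heap object, so neither one needs access to the other's stack.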


 



[Edit: Fix up formatting.]

Wednesday, October 18, 2006

Ctrl-C and the .NET console application

I occasionally find myself writing a .NET console application, usually a one-off program used to test code that I'm working on. C# seems well-disposed to writing such programs. For example, I might need an application to listen for messages from another application, whether over sockets, HTTP, or another protocol--the sort of application that would listen for and process incoming messages, continuing forever. It's the sort of app you simply run from a console window and when you're done give it the old Ctrl-C. Just hit Ctrl-C or Ctrl-Break on the console window and the application quickly goes away.

Now, there's actually something a little sinister going on here... when the app goes away it actually exits rather abruptly. There’s no exception thrown, no catch or finally blocks executed--it just (poof) goes away. This isn’t often a problem for me, but if my program modifies some system state while it’s running and needs to revert that state on exit, well, then it is an issue.

As a simple example consider a program that logs incoming messages. It may have a file open and log messages as they come in. (Or it could alter a registry setting or some other system state to indicate its presence only to revert that setting on exit. Get the idea?) If the app doesn’t terminate normally, how can it clean up after itself?

First Try: Simple, but broken

To make this a bit more concrete consider this example code. The tricky bit is that when you press Ctrl-C to terminate the program there is no clean up being done. Specifically, the log file isn't flushed to disk. When I run this program and hit Ctrl-C the log file is left entirely empty. This is because the program hasn't written enough to the file to flush the file buffer to disk when the program is interrupted. Remember, this program isn't exiting normally when we use Ctrl-C to interrupt it. In this case, the text "Handling messages." never appears in the log.

using System;
using System.IO;
using System.Threading;

namespace Example
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                using (FileStream fs = new FileStream("sample.log", FileMode.Append, FileAccess.Write, FileShare.None))
                using (StreamWriter logWriter = new StreamWriter(fs))
                {
                    Console.WriteLine(
                        "This is the basic program. It doesn't call the clean-up code if you use Ctrl-C, "
                        + "and the log file isn't flushed to disk.");
                    ProcessIncomingMessages(logWriter);
                }
            }
            finally
            {
                Console.WriteLine("Clean-up code invoked.");
            }
        }

        static void ProcessIncomingMessages(TextWriter writer)
        {
            writer.WriteLine("Handling messages.");
            Thread.Sleep(Timeout.Infinite); // Well, pretend it processes messages, okay?
        }
    }
}
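The empty log file comes down to buffering: a StreamWriter holds written text in memory until its buffer fills or it is flushed. Here is a minimal sketch of that effect, assuming a MemoryStream as a stand-in for the log file. BufferDemo and Run are names made up for illustration, and the real program's FileStream adds a buffer of its own.

```csharp
using System;
using System.IO;

static class BufferDemo
{
    // Returns the byte count of the underlying stream before and after Flush.
    public static long[] Run()
    {
        using (MemoryStream ms = new MemoryStream())
        {
            StreamWriter writer = new StreamWriter(ms);
            writer.WriteLine("Handling messages.");

            long before = ms.Length; // the text is still in the writer's buffer
            writer.Flush();          // push the buffered text to the stream
            long after = ms.Length;

            return new long[] { before, after };
        }
    }

    static void Main()
    {
        long[] lengths = Run();
        Console.WriteLine("before flush: " + lengths[0] + " bytes");
        Console.WriteLine("after flush: " + lengths[1] + " bytes");
    }
}
```

With the writer's default buffer size, a line this short should not reach the stream until Flush is called, which is why an abrupt exit leaves nothing behind.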



Console.CancelKeyPress Event


As you might guess, there is a way to handle Ctrl-C and Ctrl-Break: the Console's CancelKeyPress event. By providing a delegate you can have the runtime notify you when a "cancel key" has been pressed. Here I've made two changes:

  • Added an anonymous delegate to clean up if Ctrl-C or Ctrl-Break is pressed. As you can see, it flushes the writer and closes the file stream.
  • Moved the fs and logWriter variables outside their using statements so that both the delegate and the finally block can access them.

If you read this code from top to bottom, it may seem odd that the code in the CancelKeyPress delegate uses logWriter and fs before either variable is assigned a non-null reference. What is really happening is that we're declaring the code for the delegate inline and handing it off to the CancelKeyPress event; it isn't executed where its position in the source suggests. In this case it's only executed when Ctrl-C or Ctrl-Break is pressed.


static void Main(string[] args)
{
    FileStream fs = null;
    StreamWriter logWriter = null;
    try
    {
        Console.WriteLine(
            "This shows use of the cancel handler.");
        Console.CancelKeyPress += delegate
        {
            Console.WriteLine("Clean-up code invoked in CancelKeyPress handler.");
            if (logWriter != null)
                logWriter.Flush();
            if (fs != null)
                fs.Close();
            // The application terminates directly after executing this delegate.
        };

        fs = new FileStream("sample.log", FileMode.Append, FileAccess.Write, FileShare.None);
        logWriter = new StreamWriter(fs);
        ProcessIncomingMessages(logWriter);
    }
    finally
    {
        Console.WriteLine("Clean-up code invoked in finally.");
        if (logWriter != null)
            logWriter.Flush();
        if (fs != null)
            fs.Close();
    }
}
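The point about the delegate body not running where it appears in the source can be seen in isolation. This is a small sketch with made-up names (DeferredDelegate, Describe, Run), using a string in place of the file objects:

```csharp
using System;

static class DeferredDelegate
{
    delegate string Describe();

    // Invokes the same delegate before and after the captured variable is assigned.
    public static string[] Run()
    {
        string text = null;

        // The body below is only stored here; nothing in it runs yet.
        Describe describe = delegate
        {
            return text == null ? "text is null" : "text is " + text;
        };

        string before = describe(); // 'text' is still null at this point
        text = "assigned";
        string after = describe();  // the same delegate now sees the new value

        return new string[] { before, after };
    }

    static void Main()
    {
        foreach (string line in Run())
            Console.WriteLine(line);
        // prints "text is null" and then "text is assigned"
    }
}
```

The delegate reads the variable each time it is invoked, not when it is declared, which is why the null checks in the handler above are worthwhile.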

Consolidating Clean Up Code


Here I've opted to put the clean-up code in one place rather than duplicating it in the finally block. I've declared my own delegate type CleanUpMethod so I have a tidy, parameter-less delegate that I can also call from the finally block.


delegate void CleanUpMethod();

static void Main(string[] args)
{
    FileStream fs = null;
    StreamWriter logWriter = null;

    string cleanUpLocation = "handler.";
    CleanUpMethod cleanUp =
        delegate
        {
            Console.WriteLine("Clean-up code invoked in " + cleanUpLocation);
            if (logWriter != null)
                logWriter.Flush();
            if (fs != null)
                fs.Close();
        };

    try
    {
        Console.WriteLine(
            "This shows use of a single, no-param clean-up handler.");
        Console.CancelKeyPress +=
            delegate
            {
                cleanUp();
                // The application terminates directly after executing this delegate.
            };

        fs = new FileStream("sample.log", FileMode.Append, FileAccess.Write, FileShare.None);
        logWriter = new StreamWriter(fs);
        ProcessIncomingMessages(logWriter);
    }
    finally
    {
        cleanUpLocation = "finally.";
        cleanUp();
    }
}


But why put the clean-up code in a delegate instead of its own method? One very helpful aspect of using an anonymous delegate in this example is that the delegate code has access to the local variables in this method. If you use Lutz Roeder's .NET Reflector to take a look at this program you will see the substantial amount of work that the compiler does for you under the covers in order to provide easy access to the local variables from the delegate method:


private static void Main(string[] args)
{
    ConsoleCancelEventHandler handler1 = null;
    Program.<>c__DisplayClass3 class1 = new Program.<>c__DisplayClass3();
    class1.fs = null;
    class1.logWriter = null;
    class1.cleanUpLocation = "handler.";
    class1.cleanUp = new Program.CleanUpMethod(class1.<Main>b__0);
    try
    {
        Console.WriteLine("This shows use of a single, no-param clean-up handler.");
        if (handler1 == null)
        {
            handler1 = new ConsoleCancelEventHandler(class1.<Main>b__1);
        }
        Console.CancelKeyPress += handler1;
        class1.fs = new FileStream("sample.log", FileMode.Append, FileAccess.Write, FileShare.None);
        class1.logWriter = new StreamWriter(class1.fs);
        Program.ProcessIncomingMessages(class1.logWriter);
    }
    finally
    {
        class1.cleanUpLocation = "finally.";
        class1.cleanUp();
    }
}

[CompilerGenerated]
private sealed class <>c__DisplayClass3
{
    // Methods
    public <>c__DisplayClass3()
    {
    }

    public void <Main>b__0()
    {
        Console.WriteLine("Clean-up code invoked in " + this.cleanUpLocation);
        if (this.logWriter != null)
        {
            this.logWriter.Flush();
        }
        if (this.fs != null)
        {
            this.fs.Close();
        }
    }

    public void <Main>b__1(object, ConsoleCancelEventArgs)
    {
        this.cleanUp();
    }

    // Fields
    public Program.CleanUpMethod cleanUp;
    public string cleanUpLocation;
    public FileStream fs;
    public StreamWriter logWriter;
}



I'm pleased the C# compiler does so much work just to make my life easier. :)




Thursday, August 24, 2006

Red Flag

A red flag. It’s a warning. An alert. An indication of danger. A notification that something is amiss. There are red flags in the code we work on and the processes we follow. But do we see them? I missed a red flag recently. It happened like this:

I had this curious bug I was trying to fix. The behavior suggested that it was most likely corrupted or uninitialized memory. That’s what intuition borne of experience was telling me, anyway. Randomly timed incorrect behavior in code that was processing a static stream of data. The input data was constant from one run to the next, the bits flowing through the code always the same, but the end result varied pretty much randomly in where and when it failed.

This suggested to me that we were processing someone else’s data or uninitialized data (which is really just someone else’s data from within the same process).

This body of C++ code was unfamiliar to me, so I found myself picking the brains of a coworker who had been around a while. In discussing the bug I found myself looking over his shoulder as he scrolled through some of the code in question, and he commented on a variable assignment that wasn’t used later in the function.

It was one of those pfft moments. “Been there, done that, seen it a million times.” A thoughtless assignment statement that someone typed in but then lost their train of thought. It looked something like this:


void fn()
{
    size_t cbBase;
    void* pvData;

    if (get_value("base", &cbBase, &pvData))
    {
        store_data("base", cbBase, pvData);

        size_t cbExtended;
        void* pvDataExtended;

        if (get_value("extended", &cbExtended, &pvDataExtended))
        {
            store_data("extended", cbExtended, pvDataExtended);
            cbBase = cbExtended;
        }
    }
}


And quickly we moved on to discuss what might really be wrong with the code. And that quickly I’d dismissed the red flag.

In a world where most of the code that I interact with is not my own, where dozens of changes wrought by numerous hands happen over a period of years, can I really pass off a small, unexplained assignment like the one above as an innocuous error? Any moderately complex code base will transmogrify over the years. Initial errors may indeed be simple coding issues that we wish had been corrected by code review, but over time source code changes not randomly but with specific intent. And with any luck you have both bug reports and a source code revision system on which you can rely to find that intent.

The red flag, of course, was the meaningless assignment statement. More than a day later, as I waded through diffs of check-ins from ages past, I ran across the rationale for the assignment. In a previous check-in an attempt was made to correct some bad behavior. A previous version of the code looked more like this:


void fn()
{
    size_t cbBase;
    void* pvData;

    if (get_value("base", &cbBase, &pvData))
    {
        store_data("base", cbBase, pvData);

        size_t cbExtended;
        void* pvDataExtended;

        if (get_value("extended", &cbExtended, &pvDataExtended))
        {
            store_data("extended", cbExtended, pvDataExtended);
            cbBase = cbExtended;
        }

        if (cbBase < MINIMUM_EXPECTED_DATA_SIZE)
        {
            backfill_missing_extended_data();
        }
    }
}


Ah, the unexplained assignment was orphaned by a previous check-in. In an effort to correct a particular problem a developer had removed code but left behind an ineffective assignment. Interestingly--partly because I like a tidy ending--the bug the developer was fixing was strongly related to the bug I was pursuing. The original author’s intent for the assignment, it turns out, was probably not


cbBase = cbExtended;


But


cbBase += cbExtended;


I reintroduced the missing code and patched up the assignment to find that, very conveniently, my bug was fixed as well. In the end, yes, it was incorrectly initialized data. It just wasn't where I expected to look.

Funny thing, those red flags. They’re hard to see. Where have you seen them lately? (Or not?)

Monday, July 17, 2006

Disclaimer

All postings are provided "AS IS" with no warranties and confer no rights. Opinions expressed herein are those of their respective authors.

Nobody is obligated either to read this publication or to leave comments. Don't be a jerk. Comments may be disabled or moderated. Offensive comments, spam, and misrepresentation will likely not be tolerated. I reserve the right to edit, ignore, or delete any comment without notice.

Though you may suspect a post discusses your code, it does not. Perhaps some posts here discuss issues that may appear in real product code, but none of them show or discuss your code. Relax.

Curt Nichols
Contents are copyright Curtis Nichols, all rights reserved.

Friday, December 02, 2005

On Writing Specs

[This post is reproduced from a previous blog of mine, originally published August 20, 2005. I'm moving it here for good measure.]

I recently re-read Joel Spolsky's series on Painless Functional Specifications. It's a few years old now, but is still a pleasant reminder of what I think are the best reasons to write specs:

  1. Writing a spec forces us to think about what the software does. We have the opportunity to declare what the software does, how it does it, and why it does it. Should we need to change the design, it is preferable to do so up front, while we're spending cycles thinking about these things. Why? Because changing the design before construction begins is much less expensive than changing actual code, whether it be during the construction of the product, during testing, or after release. The easiest, cheapest, and certainly funnest time to fix design errors is before we've spent time and money constructing software based on a faulty idea of what the product does.
  2. Writing a spec provides us with a means of communicating about and refining what the product should be. We can share this document with product management, development, quality assurance, and even marketing. With input from all of the above, we can come to an agreement that the software described in the specification is indeed something that is, all at once, a) useful to the intended users, b) saleable to the intended users, c) feasible to construct, and d) testable. In order to satisfy these goals we can discuss the spec and adjust it accordingly. Change and compromise are expected.
  3. Writing a spec gives us a recorded document from which we can derive a list of things to do while we construct the product. We could attach dates to the items in that list, do some resource balancing and call it a schedule, if we're the sort who create schedules. Managing software construction and testing tends to be much easier to do when we have a list of tasks that need to be completed in order to deliver the product.

I realize that not everybody believes in doing some or all "design" work up front, but my experience tells me that every project of any complexity needs some sort of spec before the construction work starts in earnest. Call it what you will and record it in whatever form you wish, but do yourself the favor of writing a functional specification.