Thursday, February 15, 2007

C#: Type issues with ref parameters

I've witnessed a few instances where programmers have tripped up on types when using C#'s ref parameters. Here's a post illustrating this. The temptation is to think that because an instance of a derived type can be assigned to a base type reference, the same conversion should work with C#'s ref and out parameters.

It doesn't. It's simply not type safe.

The Problem

Here are the types from the above-mentioned post; I've added the FavoriteColor property for later discussion:

    public class Contact
    {
    }

    public class Recipient : Contact
    {
        public string FavoriteColor { get { return "Alice Blue"; } }
    }

For discussion's sake, here's the declaration of the method in question:

    bool FetchContact(ref Contact contact, uint row);

FetchContact's signature indicates that it can return a Contact via the contact parameter.

And now the code the poster wants to use:

    Recipient recipient = new Recipient();
    FetchContact(ref (Contact)recipient, row);

It certainly appears that the calling code desires to get a Recipient from the call to FetchContact, but that's not within the interface contract of FetchContact. FetchContact returns a Contact via its contact ref parameter.

Try, Try Again

Now, it's true that the following fragment of code works; it adheres to the interface contract specified by FetchContact:

    Recipient recipient = new Recipient();
    Contact contact = recipient;
    Fetcher.FetchContact(ref contact, row);

Well, of course that works: we've modified the code to pass a ref Contact, as specified by the method signature. But why doesn't the first version?
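If the caller still needs a Recipient after the call, one option is to check the runtime type of whatever came back. The following is only a sketch: the post does not show how Fetcher is implemented, so the Fetcher class here is a hypothetical stand-in (it is assumed to hand back a Recipient for row 0 and a plain Contact otherwise), and Caller.FavoriteColorAt is a made-up helper name.

```csharp
using System;

public class Contact { }

public class Recipient : Contact
{
    public string FavoriteColor { get { return "Alice Blue"; } }
}

// Hypothetical stand-in for the real Fetcher, which the post does not show.
public static class Fetcher
{
    public static bool FetchContact(ref Contact contact, uint row)
    {
        // Assumption for illustration: row 0 holds a Recipient, other rows a plain Contact.
        contact = (row == 0) ? new Recipient() : new Contact();
        return true;
    }
}

public static class Caller
{
    // Returns the favorite color if the fetched contact is a Recipient, otherwise null.
    public static string FavoriteColorAt(uint row)
    {
        Contact contact = new Recipient();
        if (!Fetcher.FetchContact(ref contact, row))
            return null;

        // 'as' yields null instead of throwing when the object is not a Recipient.
        Recipient recipient = contact as Recipient;
        return recipient == null ? null : recipient.FavoriteColor;
    }
}
```

The variable handed to FetchContact is still a Contact, so the method's contract is respected; the narrowing back to Recipient happens afterwards, where the caller can handle the case in which it fails.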

Type Safety

Let's try a different tack. To simplify, ignore for the moment that we're looking at a ref parameter. Instead, consider it an out parameter. And we'll also assume that there are more types derived from Contact, for example:

    public class Sender : Contact
    {
    }

Now consider again: what type is FetchContact returning here through its out parameter?

    Contact contact = recipient;
    Fetcher.FetchContact(out contact, row);

The correct answer is "We're not sure." FetchContact could return a Contact, a Recipient, a Sender, or any other type derived from Contact. The problem with the original code is that it assumes FetchContact returns a Recipient and tries to coerce the type. This would not be type safe, as we don't know that the object returned is actually a Recipient.

Consider what happens when FetchContact is (legally) implemented like this:

    bool FetchContact(out Contact contact, uint row)
    {
        contact = new Sender();
        return true;
    }

The object passed back to the caller is obviously a Sender, not a Recipient. The only way for the caller to treat the returned value as a Recipient is to break type safety, and C# generally does not let one do this.
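To see C# guarding type safety in practice, here is a small sketch built around the same implementation. The method is made static on a hypothetical Fetcher class so the example is self-contained, and Demo.DowncastIsRejected is an invented name. An explicit downcast compiles, but it is checked at run time and fails when the object turns out to be a Sender.

```csharp
using System;

public class Contact { }

public class Recipient : Contact
{
    public string FavoriteColor { get { return "Alice Blue"; } }
}

public class Sender : Contact { }

public static class Fetcher
{
    // The same legal implementation as above, made static for this sketch.
    public static bool FetchContact(out Contact contact, uint row)
    {
        contact = new Sender();
        return true;
    }
}

public static class Demo
{
    // Returns true if the downcast to Recipient was rejected at run time.
    public static bool DowncastIsRejected()
    {
        Contact contact;
        Fetcher.FetchContact(out contact, 0);
        try
        {
            // Explicit casts between reference types are checked by the runtime.
            Recipient recipient = (Recipient)contact;
            return recipient == null;
        }
        catch (InvalidCastException)
        {
            // The object is a Sender, so it can never be viewed as a Recipient.
            return true;
        }
    }
}
```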

Hypothetically Speaking...

What if C# did let us ignore type safety? Let's suppose that this code could compile and execute:

    Recipient recipient = new Recipient();
    Fetcher.FetchContact(out (Contact)recipient, row);

    string fave = recipient.FavoriteColor;

Now FetchContact has returned a Sender that we've coerced into a Recipient reference. Not good. It might execute right up until we use the FavoriteColor property. Think about it. FetchContact returns an instance of Sender, which we're treating as a Recipient. But the Sender type doesn't have a FavoriteColor property. What would happen then? It would surely crash or return invalid data, just as we would see in C++.

C# prefers type safety and doesn't allow this to happen. That's a good thing.

But it's a ref, Not an out Parameter

I believe this is the cause of the confusion around this issue. ref parameters are both in and out parameters. No one would mind if we passed a Recipient instead of a Contact as an in parameter, so why can't we do it here? Because ref parameters, like out parameters, have a stricter requirement: you cannot assume the returned type is more derived than the parameter's declared type. To do so would not be type safe. The parameter in question must be a ref Contact.
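The contrast between the two directions can be sketched side by side. The names Demo, Describe, Replace and Run are invented for this illustration; only the Contact, Recipient and Sender types come from the discussion above.

```csharp
using System;

public class Contact { }
public class Recipient : Contact { }
public class Sender : Contact { }

public static class Demo
{
    // By-value ("in") parameter: the method only reads the reference it was given.
    public static string Describe(Contact contact)
    {
        return contact.GetType().Name;
    }

    // ref parameter: the method may also replace the caller's variable.
    public static void Replace(ref Contact contact)
    {
        contact = new Sender();
    }

    public static string Run()
    {
        Recipient recipient = new Recipient();

        // Fine: a Recipient converts implicitly to Contact when passed by value.
        string before = Describe(recipient);

        // Replace(ref recipient);  // rejected by the compiler: the argument must be a Contact variable

        Contact contact = recipient;
        Replace(ref contact);       // contact now refers to a Sender; recipient is untouched
        string after = Describe(contact);

        return before + " -> " + after + ", recipient is still " + Describe(recipient);
    }
}
```

Passing by value only ever moves a reference into the method, so widening to Contact is harmless. A ref argument also moves a reference back out, and that is the direction in which a Sender could end up in a variable declared as Recipient.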

But This Works in Other Languages

This may sometimes work in languages that allow you to subvert type safety (most famously C and C++). It will only work so long as you are lucky about your class layouts or your assumptions remain true. When either your class layouts or your assumptions change, your code will no longer work.

Here's an example in C++. When I run it in the debugger, it crashes on line 20 because of the type coercion done on line 41. This should not come as a surprise.

    1 #include <string>
    2 using std::string;
    3
    4 class Contact
    5 {
    6 public:
    7     virtual ~Contact() { }
    8 };
    9
   10 class Recipient : public Contact
   11 {
   12 public:
   13     Recipient()
   14         : m_favoriteColor("Alice Blue")
   15     {
   16     }
   17
   18     string GetFavoriteColor() const
   19     {
   20         return m_favoriteColor;
   21     }
   22
   23 private:
   24     int m_otherStuff[256];
   25     string m_favoriteColor;
   26 };
   27
   28 class Sender : public Contact
   29 {
   30 };
   31
   32 void FetchContact(unsigned row, Contact** ppContact)
   33 {
   34     *ppContact = new Sender();
   35 }
   36
   37 int _tmain(int argc, _TCHAR* argv[])
   38 {
   39     Recipient* pRecipient;
   40
   41     FetchContact(0, (Contact**)&pRecipient);
   42
   43     string fave = pRecipient->GetFavoriteColor();
   44
   45     delete pRecipient;
   46
   47     return 0;
   48 }

 Hope that helps.

1 comment:

Ashok said...

Thanks for posting about this - this is a language design issue that's easy to gloss over; thanks for bringing attention to this limitation.

Similar (but not identical) restrictions are associated with passing generic types to functions in Java.