Posts Tagged C#

Serializing Exceptions to XML

kick it on DotNetKicks.com

Exceptions are fundamental to languages like Java and C#. They’re suppose to make error-handling easier than dealing with return codes, which is more common in earlier languages like C or C++. But many times, Exceptions will arise that were unanticipated, and will need to be reviewed by a developer to possibly make changes to the responsible code. It is therefore common practice to instrument some logging mechanism to record Exceptions where necessary.

Since XML has become so ubiquitous, XML is an obvious choice to represent the data structure of an Exception in. Any other format like JSON may work just fine, but more people are familiar with XML. Thus, recording an Exception as XML is a common means of capturing unrecognized errors in programs.

But what’s the easiest way to serialize an Exception into XML? If you’ve ever tried using the .NET 2.0 XML serializer—XmlSerializer class in System.Xml.Serialization—you’ll quickly find out that it’s not possible without a workaround. Any object implementing IDictionary, or any object with a member that implements IDictionary (e.g. the property Exception.Data) cannot be serialized. For example, try this in a console application:

new XmlSerializer(typeof(Exception))
    .Serialize(Console.Out, new Exception());

And you’ll see:

An unhandled exception of type ‘System.InvalidOperationException’ occurred in System.Xml.dll
Additional information: There was an error reflecting type ‘System.Exception’.

(Why isn’t IDictionary serializable? Maybe someone knows. Or, JFGI with q=”idictionary+serialize“.)

Even if XmlSerializer could serialize an Exception with its IDictionary Data property, the XML it generates isn’t necessarily what we’d want. Take, for example, this simple struct:

public struct Message
{
	public string Sender = "Chris";
	public DateTime Timestamp = DateTime.Now;
}

And serialize it using XmlSerializer. The resulting output is a full-fledged XML document:

<?xml version="1.0" encoding="IBM437"?>
<Message
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Sender>Chris</Sender>
  <Timestamp>2008-09-10T10:14:00.117-07:00</Timestamp>
</Message>

If the Exception is simply being serialized to be added to an existing document or we’re just interested in the XML element that would represent the Exception, the declaration and its namespaces are extraneous. Also, what the hell is that encoding?! What—UTF-8 isn’t good enough for you, XmlSerializer?

Instead of using XmlSerializer, the new System.Xml.Linq API (new as of .NET 3.5) can be used very easily. By rolling our own serialization method using the new XML API, we can also control what information gets serialized—and what doesn’t. Capture all the important information, and don’t capture any of the unimportant information. For most situations, the “important information” is a short list. It’s typically the Exception’s:

  • Type
  • .Message
  • .StackTrace
  • .InnerException
  • .Data collection

In our XML data of the serialized Exception then, we’ll only include these members as XML elements, and if one of these members is missing from an Exception, instead of an empty element (e.g. <Message />, we’ll omit the node from the XML data to keep the data (size) small and tight.

The Exception we’re aiming for should look something like this:

<System.ArgumentException>
	<Message>URI is relative.</Message>
	<StackTrace>
		at ConsoleApp...
	</StackTrace>
</System.ArgumentException>

Using the new XML API is straight-forward. Since we want the root element to be the Exception’s Type, we create an XElement with the Type as its name:

XElement root = new XElement(exception.GetType().ToString());

To add the Exception’s Message as a child:

if (exception.Message != null)
{
	root.Add(new XElement("Message", exception.Message));
}

Note that if there are any unsanitary characters (characters that need escaping: <, >, &, ', and ") in exception.Message, the new XElement automatically escapes them.

Next, the StackTrace:

if (exception.StackTrace != null)
{
	root.Add(new XElement("StackTrace", exception.StackTrace));
}

One may wonder, “Why is the StackTrace checked for null?” Here’s why: Exceptions that are thrown will always have a StackTrace, but if an Exception is instantiated but not thrown, the StackTrace has no value (is null):

[Test] // Passes
public void NewExceptionHasNullStackTrace()
{
	Assert.IsNull(new Exception().StackTrace);
}

Now for that pesky Data member whose IDictionary Type is so loathed by XmlSerializer.

// Data is never null; it's empty if there is no data

if (exception.Data.Count > 0)
{
	root.Add
	(
		new XElement("Data",
			from entry in exception.Data.Cast<DictionaryEntry>()
			let key = entry.Key.ToString()
			let value = (entry.Value == null) ?
							"null" : entry.Value.ToString()
			select new XElement(key, value))
	);
}

If there are any items in the Data collection, a Data element is created, and each element of the Data collection are added into the XML for the Exception as children elements of Data:

<System.Exception>
	...
	<Data>
		<Uri>/assets/images/logo1.jpg</Uri>
		<StatusCode>404</StatusCode >
	</Data>
	...
</System.Exception>

The last element to add to this serialized Exception is its InnerException. To do this, we’ll recursively run the InnerException through the same process that the original Exception goes through. If you take a look at the full source code at the bottom of this post, you’ll see that the small snippets throughout this post are taken from a class that encapsulates these procedures and strongly-types this resulting XML data as an ExceptionXElement, inheritting from XElement.

if (exception.InnerException != null)
{
	root.Add
	(
		new ExceptionXElement
			(exception.InnerException, omitStackTrace)
	);
}

Once we have finished populating our root XElement with subelements, the XML markup can be retrieved using the ToString() method:

Console.WriteLine(root.ToString());

And the output:

<System.ArgumentException>
	<Message>URI is relative.</Message>
	<Data>
		<Uri>/assets/images/logo1.jpg</Uri>
	</Data>
	<StackTrace>
		at ConsoleApp...
	</StrackTrace>
</System.ArgumentException>

There you have it. Check out how this is all wrapped up in the full source below.

using System;
using System.Collections;
using System.Linq;
using System.Xml.Linq;

/// <summary>Represent an Exception as XML data.</summary>
public class ExceptionXElement : XElement
{
	/// <summary>Create an instance of ExceptionXElement.</summary>
	/// <param name="exception">The Exception to serialize.</param>
	public ExceptionXElement(Exception exception)
		: this(exception, false)
	{ }

	/// <summary>Create an instance of ExceptionXElement.</summary>
	/// <param name="exception">The Exception to serialize.</param>
	/// <param name="omitStackTrace">
	/// Whether or not to serialize the Exception.StackTrace member
	/// if it's not null.
	/// </param>
	public ExceptionXElement(Exception exception, bool omitStackTrace)
		: base(new Func<XElement>(() =>
		{
			// Validate arguments

			if (exception == null)
			{
				throw new ArgumentNullException("exception");
			}

			// The root element is the Exception's type

			XElement root = new XElement
				(exception.GetType().ToString());

			if (exception.Message != null)
			{
				root.Add(new XElement("Message", exception.Message));
			}

			// StackTrace can be null, e.g.:
			// new ExceptionAsXml(new Exception())

			if (!omitStackTrace && exception.StackTrace != null)
			{
				root.Add
				(
					new XElement("StackTrace",
						from frame in exception.StackTrace.Split('\n')
						let prettierFrame = frame.Substring(6).Trim()
						select new XElement("Frame", prettierFrame))
				);
			}

			// Data is never null; it's empty if there is no data

			if (exception.Data.Count > 0)
			{
				root.Add
				(
					new XElement("Data",
						from entry in
							exception.Data.Cast<DictionaryEntry>()
						let key = entry.Key.ToString()
						let value = (entry.Value == null) ?
							"null" : entry.Value.ToString()
						select new XElement(key, value))
				);
			}

			// Add the InnerException if it exists

			if (exception.InnerException != null)
			{
				root.Add
				(
					new ExceptionXElement
						(exception.InnerException, omitStackTrace)
				);
			}

			return root;
		})())
	{ }
}

kick it on DotNetKicks.com

Advertisements

Comments (14)

What should be in C#

Generic Indexers

C# has generic classes, interfaces, and methods. What about generic indexers?

Class indexers are properties (has a get accessor):

public object this[string key]
{
	get { return new Object(); }
}

Just as indexers are advantageous and so are generics, having generic indexers would be also advantageous:

public T this<T>[string key]
{
	get { return default(T); }
}

Properties cannot be generic in C#, ergo no generic indexers. While I haven’t needed a generic property yet (whether to use a property or method is mostly preferential, so a generic method can always be used instead), I have had scenarios where a generic indexer would be advantageous.

For example, wouldn’t it be cleaner to do this:

var settings = new Settings();
int timeout = settings<int>["CacheInMinutes"];

Than to have to unbox or convert, as is the case without generic properties, like this:

var settings = new Settings();
int timeout = int.Parse(settings["CacheInMinutes"]);

I, for one, think so. Why aren’t there generic properties? I’m not sure. A conscious language design decision? Some logistical conflict in the compiler? Maybe someone more informed knows.

Static Local Variables

This is one example where I think VB got it right and C# got it wrong.

Let’s say I have a class with a method, and that method needs a static variable. Only that method (one method) is going to use the static variable though:

	
/// <summary>
/// Uber complex class with all sorts of instance fields,
/// properties, static methods, interface methods, etc.
/// </summary>
public class Feed
{
	private static readonly string[] adultTerms = ...;

	public bool IsAdult
	{
		get { return Feed.IsAdultCheck(this); }
	}

	public static bool IsAdultCheck(Feed feed)
	{
		foreach(var value in Feed.xmlElements)
		{
			if (Feed.adultTerms.Contains(value))
				return true;
		}
		return false;
	}
}

(Ignore the fact that an array and Array.Contains() is used for/on adultTerms instead of a Dictionary, and that “value” isn’t ToLower()ed. I’m trying to illustrate a point here, nitpicker!)

Only IsAdultCheck() needs access to the value of adultTerms, but in C#, the whole class can have access to that variable. Hiding it from the rest of the class, within the method, removes noise from the class. When examining a class you’re not familiar with for the first time, the less glutter, the better.

Not convinced? What if IsAdultCheck() is just the beginning—there are other methods that also have their own groups of static fields?

Would you rather see this, if you had to work with the Feed class:

public class Feed
{
	#region These could all be static locals!

	// Used by IsAdultCheck()

	private static readonly string[] adultTerms;

	// Used by NextPaginated()

	private static int pageCurrent;
	private static int pageStart;
	private static int pageEnd;
	private static string paginatedFeedUrl;

	// Used by Whatever()

	public static bool whateverSetting;
	public static string whateverSetting2;
	public static DateTime whateverSetting3;

	#endregion

	public static string NextPaginated(Feed feed)
	{ /* ... */ }
	
	public static string IsAdultCheck(Feed feed)
	{ /* ... */ }

	public static void Whatever()
	{ /* ... */ }
}

Or would you rather have all these private static variables encapsulated in each method in which they’re exclusively used, as local static variables?

Like I said, VB got it right when static local variables were included, starting with VB 3.0. adultTerms can be encapsulated in IsAdultCheck():

	
Public Class Feed
	Public Shared Function IsAdultCheck(ByRef feed as Feed) as Boolean
		Static adultTerms() as String = ...
		
		For Each term as String in adultTerms
			If term.Contains(term) Then Return True
			Return False
		Next
	End Function
End Class

What’s the argument against including static local variables in C#? I understand the need to differentiate the two .NET languages, but c’mon, I know a good thing when I see it! In

Update: I emailed the C# Team Project Manager, Mads Torgenson, to find out why the decision was made not to have static local variables. I’ll paraphrase his response: “There’s no need. Create a new class and encapsulate the method(s) and the static variable(s) in there.” In general I agree that this is a good idea, but then I wouldn’t have anything to complain about.

Full Support for Variable Declaration in Statements

Here’s some valid C#:

foreach(var url in urls)
	// ...
for(int i = 0; i < length; i++)
	// ...
&#91;/sourcecode&#93;

&#91;sourcecode language="csharp"&#93;
using(var buffer = new MemoryStream())
	// ...
&#91;/sourcecode&#93;

Notice how all three "for," "foreach," and "using" statements allow the declaration of a variable in them? They must be special, for some odd reason. Because here's some C# that's a no-go:

&#91;sourcecode language="csharp"&#93;
if ((var metaKeywords = AppSettings&#91;"MetaKeywords"&#93;) != null)
	Page.SetMetaTags(MetaTags.Keywords, metaKeywords);
&#91;/sourcecode&#93;

What's good about the for loops and the using statement above is that since the variables declared in the statements are only going to be used within the scope of the statements, they're inaccessible to the rest of the method in which the statements resides. This is another example of encapuslation that reduces extraneous noise, much like when I wrote of method-local static variables above. If the rest of the method doesn't need to use the variables, <i>they shouldn't be able to</i>.

Just as these snippets wouldn't be advisable if "i" and "buffer" were only used within the scope of the statements:


var i = 0;
for(; i < length; i++)
	// ...
&#91;/sourcecode&#93;

&#91;sourcecode language="csharp"&#93;
var buffer = new MemoryStream();
using(buffer)
	// ...
&#91;/sourcecode&#93;

And just as this wouldn't make sense&mdash;and isn't even possible:

&#91;sourcecode language="csharp"&#93;
string url;
foreach(url in urls)
	// ...
&#91;/sourcecode&#93;

Neither does it make sense to do the following if "metaKeywords" isn't going to be used outside the scope of the if statement (which you currently <i>have</i> to do in C#):


var metaKeywords = AppSettings["MetaKeywords"];

if (metaKeywords != null)
	Page.SetMetaTags(MetaTags.Keywords, metaKeywords);

The reason an inline variable declaration in an if statement is probably illegal, as someone pointed out to me, is that a lot of C and C++ code is terse—like a competition for fewest lines. The downside is readability and maintainability, which go hand-in-hand. Sure, it’s cool to refactor 10 lines into two, but it’s usually not so cool when a different developer goes in and tries to debug or modify that code. But if the readability of a for statement, for example, isn’t an issue in C#, then it seems rather arbitrary to me that if statements would be.

Sure, there’s room for abuse, but I don’t need C# to save me from myself. Perhaps C# needs to save itself from developers, such as those from dynamic or scripting languages, where this practice is more common, and clean code is sparser.

Non-integral Enums

Ever find yourself switch()ing an enum, and assigning different values to an object based on the case of that enum? For example, say we’ve got this enumeration:

enum AmericanHoliday
{
	NewYearDay,
	IndependenceDay,   
	Christmas
}

And depending on the AmericanHoliday selection, we have a switch that returns a corresponding date:

private DateTime GetHolidayDate(AmericanHoliday holiday)
{
	switch (holiday)
	{
		case AmericanHoliday.NewYearDay:
			return DateTime.Parse("1/1");
		case AmericanHoliday.IndependenceDay:
			return DateTime.Parse("7/4");
		case AmericanHoliday.Christmas:
			return DateTime.Parse("12/25");
		default:
			throw new NotImplementedException();
	}
}

We have this switch because AmericanHoliday enumerations are can only be byte, sbyte, short, ushort, int, uint, long, or ulong. The problem with integral-only is that a switch (or if) statement has to be used if you want that enum to represent anything other than an integral type (i.e. a number).

But what if we could use different types, that didn’t have to be constants, in an enum, like a DateTime?

enum AmericanHoliday
{
	NewYearDay = DateTime.Parse("1/1"),
	IndependenceDay = DateTime.Parse("7/4"),
	Christmas = DateTime.Parse("12/25")
}

We could eliminate the conversion from an enum to a DateTime performed in GetHolidayDate(), and simply write:

DateTime holidayDate = AmericanHoliday.NewYearDay; 

Or how about we had an enum that represented an error:

enum AccessDeniedReason
{
	UnknownUser = "The specified user is unknown.",
	InvalidPassword = "The specified password was invalid.",
	AccountDeactivated = "The account has been deactivated."
}

There are ways of achieving something similar in C#, but none of the workarounds are very good. Perhaps implementing such a thing just isn’t practical in a statically-typed language.

kick it on DotNetKicks.com

Comments (6)

Returning null from a class constructor?!

Just as an exercise, I thought I’d examine common creation patterns in C# and how operator overloading can expand these.

Typically, if the parameters passed into a class constructor are not valid and thus cannot create an instance of itself, an Exception is thrown. ArgumentNullException is most common when a parameter is null and shouldn’t be. Also common are ArgumentExceptions and FormatExceptions, especially when a constructor takes in a string that must abide by a syntax and doesn’t.

public class Expression
{
	public Expression(string pattern)
	{
		if (Expression.IsInvalidPattern(pattern))
		{
			throw new ArgumentException("pattern is invalid");
		}
		
		// Do something
	}
}

Then a try/catch block is used to catch Exceptions when instantiating the type:

Expression expression;
try
{
	expression = new Expression(pattern);
}
catch(ArgumentException)
{
	// Abort
}

An alternative would be to simply indicate that the instantiated class is in an invalid state:

public class Expression
{
	public readonly bool IsInvalid = false;
	
	public Expression(string pattern)
	{
		if (Expression.IsInvalidPattern(pattern))
		{
			this.IsInvalid = true;
			return;
		}
		// Do something
	}
}

The invalid state is readonly so that it can only be set by the constructor, and is then checked for before performing any operations with the new Expression:

Expression expression = new Expression(pattern);

if (expression.IsInvalid)
{
	// Abort
}

The Parse/TryParse pattern is also common. Instead of an Exception, readonly field, or get-only property, a static TryParse method is used that returns true or false, depending on whether the type can be created:

public class Expression
{
	public static bool TryParse(string pattern,
		out Expression expression)
	{
		if (Expression.IsInvalidPattern(pattern))
		{
			expression = null;
			return false;
		}
		
		expression = new Expression(pattern);
		
		return true;
	}
}

TryParse() is then used as in this example:

Expression expression;

if (!Expression.TryParse(pattern, out expression))
{
	// Abort
}

Another pattern which I haven’t seen used allows an invalid object to emulate a null reference. There are better solutions than this, which is why it isn’t used, but by using operator overloading, it can be accomplished nonetheless.

public abstract class Errorable
{
	protected bool isError = false;

	public static bool operator ==(Errorable left, object right)
	{
		if (right == null)
		{
			return left.isError;
		}
		
		return (left.GetHashCode() == right.GetHashCode());
	}

	public static bool operator !=(Errorable left, object right)
	{
		if (right == null)
		{
			return !left.isError;
		}
		
		return (left.GetHashCode() != right.GetHashCode());
	}
} 

public class Expression : Errorable
{
	public Expression(string pattern)
	{
		if (Expression.IsInvalidPattern(pattern))
		{
			base.isError = true;
			return;
		}
		// Do something
    }
} 

An Expression can then be evaluated to test for null, although we know that “expression” really isn’t a null referenced, which is one reason why this pattern is probably a bad idea.

Expression expression = new Expression(pattern);

if (expression == null)
{
	// Abort
}

Lastly, overloading the ! operator allows us to evaluate an object as fasly, which other languages (e.g. JavaScript) do by default:

public abstract class Errorable
{
	protected bool isError = false;

	public static bool operator !(Errorable errorable)
	{
		if (errorable == null || errorable.isError);
	}
}

public class Expression : Errorable
{
	public Expression(string pattern)
	{
		if (Expression.IsInvalidPattern(pattern))
		{
			base.isError = true;
			return;
		}
		// Do something
    }
}

To test an Expression as invalid, the ! operator is used:

Expression expression = new Expression(pattern);

if (!expression)
{
	// Abort
}

The first two patterns are common, and probably why the last two aren’t used. Generally, if there’s already a good, well-established pattern, there’s no reason to try something else. The alternative shown creation patterns were simply an exercise in operator overloading.

kick it on DotNetKicks.com

Comments (4)