Fun with Pseudocode

A long time ago, I saw this on bash.org:

A programmer started to cuss
Because getting to sleep was a fuss
As he lay there in bed
Looping ’round in his head
was: while(!asleep()) sheep++

It inspired me to name my blog foreach(bill) paywith(skill), which was intended to “pseudocodify,” if you will, the expression, “Skills to pay the bills.”

Recently I asked the audience of StackOverflow.com if they knew of any others. Here are a few of my favorites, some of them with my slight modifications.

Shakespeare:

question = 2*b || !(2*b)

Working for the man:

class Employee
{
	long hours;
	short lunch;
	byte pay;
}

A Star Wars reference to Yoda, who says, “Try not. Do… or do not. There is no try.”:

do() || !do()
// try()

A classic:

while(!(succeed = try()));

From Futurama, in BASIC:

10 SIN
20 GOTO HELL

Ah, to be a kid again.

if (happy && know(it)) hands.clap();


Recursively processing XML elements in C#

Recursion can be tricky. But when you’ve nailed it, it’s damn sexy. Below is a basic outline for recursively processing an XML document for all its child elements, and keeping track of what depth the element is at (how many ancestors, or parents, are above it):

void Process(XElement element, int depth)
{
	// For simplicity, argument validation not performed

	if (!element.HasElements)
	{
		// element is child with no descendants
	}
	else
	{
		// element is parent with children

		depth++;

		foreach (var child in element.Elements())
		{
			Process(child, depth);
		}

		depth--;
	}
}

To begin processing a document, pass in the root element and a depth of 0:

Process(XDocument.Load(@"C:\test.xml").Root, 0);

Here’s an example of recursively displaying an XML document:


void Process(XElement element, int depth)
{
	// For simplicity, argument validation not performed

	if (!element.HasElements)
	{
		Console.WriteLine
		(
			string.Format
			(
				"{0}<{1}>{2}</{1}>",
				"".PadLeft(depth, '\t'), // {0}
				element.Name.LocalName,  // {1}
				element.Value			 // {2}
			)
		);
	}
	else
	{
		Console.WriteLine
		(
			"".PadLeft(depth, '\t') + // Indent to show depth
			"<" + element.Name.LocalName + ">"
		);

		depth++;

		foreach (XElement child in element.Elements())
		{
			Process(child, depth);
		}

		depth--;

		Console.WriteLine
		(
			"".PadLeft(depth, '\t') + // Indent to show depth
			"</" + element.Name.LocalName + ">"
		);
	}
}

Processing an XML file recursively using extension methods and lambdas:

using System;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main(string[] args)
    {
        XDocument.Load(@"C:\test.xml").Root.RecursivelyProcess
        (
            // Element with no children reached

            new Action<XElement, int>((child, depth) =>
            {
                // Example of something to do with the child

                Console.WriteLine
                (
                    string.Format
                    (
                        "{0}<{1}>{2}</{1}>",
                        "".PadLeft(depth, '\t'), // {0}
                        child.Name.LocalName,    // {1}
                        child.Value			     // {2}
                    )
                );
            }),

            // Element with children reached

            new Action<XElement, int>((parent, depth) =>
            {
                // Example of something to do with the parent open

                Console.WriteLine
                (
                    "".PadLeft(depth, '\t') // Indent to show depth
                    + "<" + parent.Name.LocalName + ">"
                );
            }),

            // Finished processing element with children

            new Action<XElement, int>((parent, depth) =>
            {
                // Example of something to do with the parent close

                Console.WriteLine
                (
                    "".PadLeft(depth, '\t') // Indent to show depth
                    + "</" + parent.Name.LocalName + ">"
                );
            })
        );

        Console.Read();
    }
}

/* Alternatively, for clarity, see this implementation:
 *
class Program
{
    static void Main(string[] args)
    {
        XDocument.Load(@"C:\test.xml").Root.RecursivelyProcess
		(
			Program.ProcessChild,
			Program.ProcessParentOpen,
			Program.ProcessParentClose
		);

        Console.Read();
    }

    static void ProcessChild(XElement child, int depth)
    {
        Console.WriteLine
        (
            string.Format
            (
                "{0}<{1}>{2}</{1}>",
                "".PadLeft(depth, '\t'), // {0}
                child.Name.LocalName,    // {1}
                child.Value			     // {2}
            )
        );
    }

    static void ProcessParentOpen(XElement parent, int depth)
    {
        Console.WriteLine
        (
            "".PadLeft(depth, '\t') // Indent to show depth
            + "<" + parent.Name.LocalName + ">"
        );
    }

    static void ProcessParentClose(XElement parent, int depth)
    {
        Console.WriteLine
        (
            "".PadLeft(depth, '\t') // Indent to show depth
            + "</" + parent.Name.LocalName + ">"
        );
    }
}
*/

/// <summary>
/// Extension methods for the .NET 3.5 System.Xml.Linq namespace
/// </summary>
public static class XExtensions
{
    /// <summary>
    /// Recursively perform operations on an XML element.
    /// </summary>
    /// <param name="element"></param>
    /// <param name="childAction">
    /// What to do when an element with no children is reached.
    /// The XElement is the child element, the int is the depth.
    /// To do nothing, pass null.
    /// </param>
    /// <param name="parentOpenAction">
    /// What to do when an element with children is reached.
    /// The XElement is the parent element, the int is the depth.
    /// To do nothing, pass null.
    /// </param>
    /// <param name="parentCloseAction">
    /// What to do when finished processing element with children.
    /// The XElement is the parent element, the int is the depth.
    /// To do nothing, pass null.
    /// </param>
    public static void RecursivelyProcess
    (
        this XElement element,
        Action<XElement, int> childAction,
        Action<XElement, int> parentOpenAction,
        Action<XElement, int> parentCloseAction
    )
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        element.RecursivelyProcess
        (
            0,
            childAction,
            parentOpenAction,
            parentCloseAction
        );
    }

    private static void RecursivelyProcess
    (
        this XElement element,
        int depth,
        Action<XElement, int> childAction,
        Action<XElement, int> parentOpenAction,
        Action<XElement, int> parentCloseAction
    )
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        if (!element.HasElements)
        {
            // Reached the deepest child

            if (childAction != null)
            {
                childAction(element, depth);
            }
        }
        else
        {
            // element has children

            if (parentOpenAction != null)
            {
                parentOpenAction(element, depth);
            }

            depth++;

            foreach (XElement child in element.Elements())
            {
                child.RecursivelyProcess
                (
                    depth,
                    childAction,
                    parentOpenAction,
                    parentCloseAction
                );
            }

            depth--;

            if (parentCloseAction != null)
            {
                parentCloseAction(element, depth);
            }
        }
    }
}


Get the XPath to an XML Element (XElement)

Problem

Given an XElement, what is its XPath?

Preface

Many XPaths can be used that will select an XML element. To select one and only one element, indexes need to be used. For example, given the following XML data:

<people>
	<person>
		<name>
			<first>Chris</first>
		</name>
	</person>
	<person>
		...
	</person>
</people>

The XPath “/people/person” will select all the “person” elements that are children to “people.” To refer to the first “person” element, its index needs to be used in the XPath.

Like an array, an [index] can be used to refer to a specific node. In XPath, indexes begin at 1, not 0. To select the first “person” element, then, the XPath “/people/person[1]” must be used.
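To see the difference an index makes, here is a minimal sketch using the XPath extension methods in System.Xml.XPath. The second person ("Pat") and the helper names Count and ValueAt are made up for illustration; only the shape of the document comes from the example above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // XPathSelectElement(s) extension methods

public static class XPathIndexDemo
{
    // Hypothetical sample data, modeled on the <people> document above.
    public const string Xml =
        "<people>" +
            "<person><name><first>Chris</first></name></person>" +
            "<person><name><first>Pat</first></name></person>" +
        "</people>";

    // How many elements does the XPath expression select?
    public static int Count(string xpath)
    {
        return XDocument.Parse(Xml).XPathSelectElements(xpath).Count();
    }

    // The text value of the first element the XPath expression selects.
    public static string ValueAt(string xpath)
    {
        return XDocument.Parse(Xml).XPathSelectElement(xpath).Value;
    }

    public static void Main()
    {
        Console.WriteLine(Count("/people/person"));    // 2: every "person"
        Console.WriteLine(Count("/people/person[1]")); // 1: only the first
        Console.WriteLine(ValueAt("/people/person[1]/name[1]/first[1]")); // Chris
    }
}
```

Note that "person[1]" really is the first element; "person[0]" would select nothing.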

I think of this as an “absolute” XPath, because it points directly to one specific element, and is not relative. Therefore, when I ask, “given an element, what is its XPath?”, I am referring to its absolute XPath—an XPath expression that will always return a specific element and its children, if it exists.

Solution

The following code snippet uses extension methods to get an absolute XPath to an XElement:

using System;
using System.Linq;
using System.Xml.Linq;

/// <summary>Extension methods for the .NET 3.5 System.Xml.Linq namespace</summary>
public static class XExtensions
{
    /// <summary>
    /// Get the absolute XPath to a given XElement
    /// (e.g. "/people/person[6]/name[1]/last[1]").
    /// </summary>
    /// <param name="element">
    /// The element to get the absolute XPath of.
    /// </param>
    public static string AbsoluteXPath(this XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        Func<XElement, string> relativeXPath = e =>
        {
            int index = e.IndexPosition();
            string name = e.Name.LocalName;

            // If the element is the root, no index is required

            return (index == -1) ? "/" + name : string.Format
            (
                "/{0}[{1}]",
                name,
                index.ToString()
            );
        };

        var ancestors = from e in element.Ancestors()
                        select relativeXPath(e);

        return string.Concat(ancestors.Reverse().ToArray()) +
               relativeXPath(element);
    }

    /// <summary>
    /// Get the index of the given XElement relative to its
    /// siblings with identical names. If the given element is
    /// the root, -1 is returned.
    /// </summary>
    /// <param name="element">
    /// The element to get the index of.
    /// </param>
    public static int IndexPosition(this XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        if (element.Parent == null)
        {
            return -1;
        }

        int i = 1; // Indexes for nodes start at 1, not 0

        foreach (var sibling in element.Parent.Elements(element.Name))
        {
            if (sibling == element)
            {
                return i;
            }

            i++;
        }

        throw new InvalidOperationException
            ("element has been removed from its parent.");
    }
}


Static constructors


Recently on Stackoverflow.com there was an interesting question asking what “hidden” or not-well-known C# features people would like to share. The first thing that came to my mind was static constructors (just “ctors” henceforth). For example, this is legal C#:

public class Foo
{
	static Foo()
	{ }

	public Foo()
	{ }
}

public static class Bar
{
	static Bar()
	{ }
}

Well gee, that’s neat, but what good is it? Here’s what MSDN has to say about that:

A static constructor is used to initialize any static data, or to perform a particular action that needs to be performed once only. It is called automatically before the first instance is created or any static members are referenced.

Essentially, static ctors are best used to set static members of the class (whether the class itself is static or not). With a static class, the only operations that execute when the class is initialized are the assignment of static field values and the static ctor.
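The "once only" part is easy to check for yourself. In this small sketch (Widget and its counters are hypothetical names, not from any library), the static ctor runs a single time no matter how many instances get created:

```csharp
using System;

public class Widget
{
    public static int StaticCtorRuns;   // incremented by the static ctor
    public static int InstanceCtorRuns; // incremented by the instance ctor

    static Widget()
    {
        // Runs once, before the first instance is created
        // or any static member is referenced.
        StaticCtorRuns++;
    }

    public Widget()
    {
        // Runs every time an instance is created.
        InstanceCtorRuns++;
    }
}

public static class StaticCtorDemo
{
    public static void Main()
    {
        new Widget();
        new Widget();
        new Widget();

        Console.WriteLine(Widget.StaticCtorRuns);   // 1
        Console.WriteLine(Widget.InstanceCtorRuns); // 3
    }
}
```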

Here’s an example of how I use this C# feature to set the connection string for a DataContext wrapper named FeedBurner, which is the name of a database server:

public static class FeedBurner
{
    private static string connectionString;  

    static FeedBurner()
    {
        SettingManager.OnChange = () =>
        {
            FeedBurner.connectionString =
                SettingManager.Setting("ConnectionString");
        };
    }

    public static FeedBurnerDataContext DataContext
    {
        get
        {
            return new FeedBurnerDataContext
                (FeedBurner.connectionString);
        }
    }
}

SettingManager is a class that watches files with configuration settings and fires off OnChange() whenever these files change. In this example, if the connection string to the FeedBurner database is changed, the entire class will automatically update the connection string it uses to talk with the FeedBurner database, without having to restart the application.

Beware, though. If an Exception occurs in the static ctor, a TypeInitializationException is thrown. From MSDN:

When a class initializer fails to initialize a type, a TypeInitializationException is created and passed a reference to the exception thrown by the type’s class initializer. The InnerException property of TypeInitializationException holds the underlying exception.

And when this occurs, the Visual Studio debugger does not drill down into the static ctor for you to debug it. Instead, if you get a TypeInitializationException, you should programmatically force the debugger to break so you can examine the local variables and other run-time data.

public class Foo
{
    static Foo()
    {
#if DEBUG
        try
        {
#endif
            // Do something
#if DEBUG
        }
        catch (Exception ex)
        {
            System.Diagnostics.Debugger.Break();
        }
#endif
    }
}

Pretty ugly, huh?

Generally, static ctors should be avoided. They’re usually more trouble than they’re worth, and there’s usually a better way to accomplish a task than using a static ctor. But they’re a nifty feature! Here are the notes MSDN offers on them:

A static constructor does not take access modifiers or have parameters.

A static constructor is called automatically to initialize the class before the first instance is created or any static members are referenced.

A static constructor cannot be called directly.

The user has no control on when the static constructor is executed in the program.

A typical use of static constructors is when the class is using a log file and the constructor is used to write entries to this file.

Static constructors are also useful when creating wrapper classes for unmanaged code, when the constructor can call the LoadLibrary method.

If a static constructor throws an exception, the runtime will not invoke it a second time, and the type will remain uninitialized for the lifetime of the application domain in which your program is running.
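That last note is worth seeing in action. Below is a rough sketch (Fragile, Ping, and Touch are invented names, and the failure is simulated) in which the static ctor always throws. The first access surfaces a TypeInitializationException wrapping the real error, and so does every later access, because the runtime never retries the static ctor:

```csharp
using System;

public static class Fragile
{
    static Fragile()
    {
        // Simulated failure; a real static ctor might fail reading config.
        throw new InvalidOperationException("Could not load settings.");
    }

    // Calling any static member triggers the static ctor first.
    public static void Ping() { }
}

public static class TypeInitDemo
{
    // Touch the type and hand back whatever exception surfaces.
    public static Exception Touch()
    {
        try
        {
            Fragile.Ping();
            return null;
        }
        catch (TypeInitializationException ex)
        {
            return ex;
        }
    }

    public static void Main()
    {
        Exception first = Touch();
        Exception second = Touch();

        Console.WriteLine(first.GetType().Name);           // TypeInitializationException
        Console.WriteLine(first.InnerException.Message);   // Could not load settings.
        Console.WriteLine(second is TypeInitializationException); // True
    }
}
```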

Happy coding!


Hexadecimal value 0x is an invalid character


Ever get a System.Xml.XmlException that says:

“Hexadecimal value 0x[whatever] is an invalid character”

…when trying to load an XML document using one of the .NET XML API objects like XmlReader, XmlDocument, or XDocument? Was “0x[whatever]” by chance one of these characters?

0x00
0x01
0x02
0x03
0x04
0x05
0x06
0x07
0x08
0x0B
0x0C
0x0E
0x0F
0x10
0x11
0x12
0x13
0x14
0x15
0x16
0x17
0x18
0x19
0x1A
0x1B
0x1C
0x1D
0x1E
0x1F
0x7F

The problem that causes these XmlExceptions is that the data being read or loaded contains characters that are illegal according to the XML specifications. Almost always, these characters are in the ASCII control character range (think wacky characters like null, bell, backspace, etc). These aren’t characters that have any business being in XML data; they’re illegal characters that should be removed, usually having found their way into the data from file format conversions, like when someone tries to create an XML file from Excel data, or export their data to XML from a format that may be stored as binary.

The decimal range for ASCII control characters is 0 – 31, and 127. Or, in hex, 0x00 – 0x1F, and 0x7F. (The control character 0x7F is not disallowed, but its use is “discouraged” to avoid compatibility issues.) If the string or stream that contains the XML data contains one of the disallowed control characters, an XmlException will be thrown by whatever System.Xml or System.Xml.Linq class (e.g. XmlReader, XmlDocument, XDocument) is trying to load the XML data. In fact, if XML data contains the character ‘\a’ (bell), your motherboard will actually make the bell sound before the XmlException is thrown.

There are a few exceptions though: the formatting characters ‘\n’, ‘\r’, and ‘\t’ are not illegal in XML, per the 1.0 and 1.1 specifications, and therefore do not cause this XmlException. Thus, if you’re encountering XML data that is causing an XmlException because the data “contains invalid characters”, the feeds you’re processing need to be sanitized of illegal XML characters per the XML 1.0 specification (which is what System.Xml conforms to, not XML 1.1). The methods below will accomplish this:

/// <summary>
/// Remove illegal XML characters from a string.
/// </summary>
public string SanitizeXmlString(string xml)
{
	if (xml == null)
	{
		throw new ArgumentNullException("xml");
	}

	StringBuilder buffer = new StringBuilder(xml.Length);

	foreach (char c in xml)
	{
		if (IsLegalXmlChar(c))
		{
			buffer.Append(c);
		}
	}

	return buffer.ToString();
}

/// <summary>
/// Whether a given character is allowed by XML 1.0.
/// </summary>
public bool IsLegalXmlChar(int character)
{
	return
	(
		 character == 0x9 /* == '\t' == 9   */          ||
		 character == 0xA /* == '\n' == 10  */          ||
		 character == 0xD /* == '\r' == 13  */          ||
		(character >= 0x20    && character <= 0xD7FF  ) ||
		(character >= 0xE000  && character <= 0xFFFD  ) ||
		(character >= 0x10000 && character <= 0x10FFFF)
	);
}

Useful as these methods are, don’t go off pasting them into your code anywhere. Create a class instead. Here’s why: let’s say you use the routine to sanitize a string in one section of code. Then another section of code uses that same string that has been sanitized. How does the other section positively know that the string doesn’t contain any control characters anymore, without checking? It doesn’t. Who knows where that string has been (or whether it’s been sanitized) before it gets to a different routine, further down the processing pipeline. Program defensively and agnostically. If the sanitized string isn’t a string and is instead a different type that represents sanitized strings, you can guarantee that the string doesn’t contain illegal characters.
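Here is one way such a type might look. This is only a sketch: the name SanitizedXml and the LengthOf consumer are made up, and the filtering is the same character test shown above. The point is that a method asking for a SanitizedXml parameter can't be handed a raw, unchecked string by accident.

```csharp
using System;
using System.Text;

/// <summary>
/// Hypothetical wrapper: holding an instance proves the text was sanitized.
/// </summary>
public sealed class SanitizedXml
{
    private readonly string value;

    public SanitizedXml(string xml)
    {
        if (xml == null)
        {
            throw new ArgumentNullException("xml");
        }

        StringBuilder buffer = new StringBuilder(xml.Length);

        foreach (char c in xml)
        {
            if (IsLegalXmlChar(c))
            {
                buffer.Append(c);
            }
        }

        value = buffer.ToString();
    }

    /// <summary>Whether a given character is allowed by XML 1.0.</summary>
    public static bool IsLegalXmlChar(int character)
    {
        return
            character == 0x9 || character == 0xA || character == 0xD ||
            (character >= 0x20 && character <= 0xD7FF) ||
            (character >= 0xE000 && character <= 0xFFFD) ||
            (character >= 0x10000 && character <= 0x10FFFF);
    }

    public override string ToString()
    {
        return value;
    }
}

public static class SanitizedXmlDemo
{
    // Downstream code takes SanitizedXml, not string, so no re-check is needed.
    public static int LengthOf(SanitizedXml xml)
    {
        return xml.ToString().Length;
    }

    public static void Main()
    {
        SanitizedXml clean = new SanitizedXml("<a>bad\u0000\u0008data</a>");

        Console.WriteLine(clean);           // <a>baddata</a>
        Console.WriteLine(LengthOf(clean)); // 14
    }
}
```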

Now, if the strings that need to be sanitized are being retrieved from a Stream, via a TextReader, for example, we can create a custom StreamReader class that will skip over illegal characters. Let’s say that you’re retrieving XML like so:

string xml;

using (WebClient downloader = new WebClient())
{
	using (TextReader reader =
		new StreamReader(downloader.OpenRead(uri)))
	{
		xml = reader.ReadToEnd();
	}
}

// Do something with xml...

You could use the sanitizing methods above like this:

string xml;

using (WebClient downloader = new WebClient())
{
	using (TextReader reader =
		new StreamReader(downloader.OpenRead(uri)))
	{
		xml = reader.ReadToEnd();
	}
}

// Sanitize the XML

xml = SanitizeXmlString(xml);

// Do something with xml...

But creating a class that inherits from StreamReader and avoiding the costly string-building operation performed by SanitizeXmlString() is much more efficient. The class will have to override a couple of methods, but once it’s finished, a Stream could be consumed and sanitized like this instead:

string xml;

using (WebClient downloader = new WebClient())
{
	using (XmlSanitizingStream reader =
		new XmlSanitizingStream(downloader.OpenRead(uri)))
	{
		xml = reader.ReadToEnd();
	}
}

// xml contains no illegal characters

The declaration for this XmlSanitizingStream, with IsLegalXmlChar() that we’ll need, looks like:

public class XmlSanitizingStream : StreamReader
{
	// Pass 'true' to automatically detect encoding using BOMs.
	// BOMs: http://en.wikipedia.org/wiki/Byte-order_mark

	public XmlSanitizingStream(Stream streamToSanitize)
		: base(streamToSanitize, true)
	{ }

	/// <summary>
	/// Whether a given character is allowed by XML 1.0.
	/// </summary>
	public static bool IsLegalXmlChar(int character)
	{
		return
		(
			 character == 0x9 /* == '\t' == 9   */          ||
			 character == 0xA /* == '\n' == 10  */          ||
			 character == 0xD /* == '\r' == 13  */          ||
			(character >= 0x20    && character <= 0xD7FF  ) ||
			(character >= 0xE000  && character <= 0xFFFD  ) ||
			(character >= 0x10000 && character <= 0x10FFFF)
		);
	}

	// ...

To get this XmlSanitizingStream working correctly, we’ll first need to override two methods integral to the StreamReader: Peek(), and Read(). The Read method should only return legal XML characters, and Peek() should skip past a character if it’s not legal.

	private const int EOF = -1;

	public override int Read()
	{
		// Read each char, skipping ones XML has prohibited

		int nextCharacter;

		do
		{
			// Read a character

			if ((nextCharacter = base.Read()) == EOF)
			{
				// If the char denotes end of file, stop
				break;
			}
		}

		// Skip char if it's illegal, and try the next

		while (!XmlSanitizingStream.
		        IsLegalXmlChar(nextCharacter));

		return nextCharacter;
	}

	public override int Peek()
	{
		// Return next legal XML char w/o reading it 

		int nextCharacter;

		do
		{
			// See what the next character is
			nextCharacter = base.Peek();
		}
		while
		(
			// If it's illegal, skip over
			// and try the next.

			!XmlSanitizingStream
			.IsLegalXmlChar(nextCharacter) &&
			(nextCharacter = base.Read()) != EOF
		);

		return nextCharacter;

	}

Next, we’ll need to override the other Read* methods (Read, ReadToEnd, ReadLine, ReadBlock). These all use Peek() and Read() to derive their returns. If they are not overridden, calling them on XmlSanitizingStream will invoke them on the underlying base StreamReader. That StreamReader will then use its Peek() and Read() methods, not the XmlSanitizingStream’s, resulting in unsanitized characters making their way through.

To make life easy and avoid writing these other Read* methods from scratch, we can disassemble the TextReader class using Reflector, and copy its versions of the other Read* methods, without having to change more than a few lines of code related to ArgumentExceptions.

The complete version of XmlSanitizingStream can be downloaded here. Rename the file extension to “.cs” from “.doc” after downloading.


Serializing Exceptions to XML


Exceptions are fundamental to languages like Java and C#. They’re supposed to make error-handling easier than dealing with return codes, which are more common in earlier languages like C or C++. But many times, Exceptions will arise that were unanticipated, and will need to be reviewed by a developer to possibly make changes to the responsible code. It is therefore common practice to instrument some logging mechanism to record Exceptions where necessary.

Since XML has become so ubiquitous, it is an obvious choice for representing the data structure of an Exception. Any other format like JSON may work just fine, but more people are familiar with XML. Thus, recording an Exception as XML is a common means of capturing unanticipated errors in programs.

But what’s the easiest way to serialize an Exception into XML? If you’ve ever tried using the .NET 2.0 XML serializer—XmlSerializer class in System.Xml.Serialization—you’ll quickly find out that it’s not possible without a workaround. Any object implementing IDictionary, or any object with a member that implements IDictionary (e.g. the property Exception.Data) cannot be serialized. For example, try this in a console application:

new XmlSerializer(typeof(Exception))
    .Serialize(Console.Out, new Exception());

And you’ll see:

An unhandled exception of type ‘System.InvalidOperationException’ occurred in System.Xml.dll
Additional information: There was an error reflecting type ‘System.Exception’.

(Why isn’t IDictionary serializable? Maybe someone knows. Or, JFGI with q=”idictionary+serialize“.)

Even if XmlSerializer could serialize an Exception with its IDictionary Data property, the XML it generates isn’t necessarily what we’d want. Take, for example, this simple class (a class rather than a struct, because struct fields can’t have initializers in this version of C#):

public class Message
{
	public string Sender = "Chris";
	public DateTime Timestamp = DateTime.Now;
}

And serialize it using XmlSerializer. The resulting output is a full-fledged XML document:

<?xml version="1.0" encoding="IBM437"?>
<Message
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Sender>Chris</Sender>
  <Timestamp>2008-09-10T10:14:00.117-07:00</Timestamp>
</Message>

If the Exception is simply being serialized to be added to an existing document, or we’re just interested in the XML element that would represent the Exception, the declaration and its namespaces are extraneous. Also, what the hell is that encoding?! What, UTF-8 isn’t good enough for you, XmlSerializer? (To be fair, the serializer takes the encoding from the TextWriter it writes to, and here that’s Console.Out with the console’s code page.)

Instead of using XmlSerializer, the new System.Xml.Linq API (new as of .NET 3.5) can be used very easily. By rolling our own serialization method using the new XML API, we can also control what information gets serialized—and what doesn’t. Capture all the important information, and don’t capture any of the unimportant information. For most situations, the “important information” is a short list. It’s typically the Exception’s:

  • Type
  • .Message
  • .StackTrace
  • .InnerException
  • .Data collection

In our XML data of the serialized Exception then, we’ll only include these members as XML elements, and if one of these members is missing from an Exception, instead of an empty element (e.g. <Message />), we’ll omit the node from the XML data to keep the data small and tight.

The Exception we’re aiming for should look something like this:

<System.ArgumentException>
	<Message>URI is relative.</Message>
	<StackTrace>
		at ConsoleApp...
	</StackTrace>
</System.ArgumentException>

Using the new XML API is straightforward. Since we want the root element to be the Exception’s Type, we create an XElement with the Type as its name:

XElement root = new XElement(exception.GetType().ToString());

To add the Exception’s Message as a child:

if (exception.Message != null)
{
	root.Add(new XElement("Message", exception.Message));
}

Note that if there are any unsanitary characters (characters that need escaping: <, >, &, ', and ") in exception.Message, the new XElement automatically escapes them.
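A quick way to convince yourself of this (EscapingDemo and Serialize are hypothetical names for this sketch, and the message is made-up sample text):

```csharp
using System;
using System.Xml.Linq;

public static class EscapingDemo
{
    // Wrap a message in a <Message> element and return the markup.
    public static string Serialize(string message)
    {
        return new XElement("Message", message).ToString();
    }

    public static void Main()
    {
        string markup = Serialize("Tom & Jerry <3");

        // The markup is escaped...
        Console.WriteLine(markup); // <Message>Tom &amp; Jerry &lt;3</Message>

        // ...and parsing it back yields the original text.
        Console.WriteLine(XElement.Parse(markup).Value); // Tom & Jerry <3
    }
}
```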

Next, the StackTrace:

if (exception.StackTrace != null)
{
	root.Add(new XElement("StackTrace", exception.StackTrace));
}

One may wonder, “Why is the StackTrace checked for null?” Here’s why: Exceptions that are thrown will always have a StackTrace, but if an Exception is instantiated but not thrown, the StackTrace has no value (is null):

[Test] // Passes
public void NewExceptionHasNullStackTrace()
{
	Assert.IsNull(new Exception().StackTrace);
}

Now for that pesky Data member whose IDictionary Type is so loathed by XmlSerializer.

// Data is never null; it's empty if there is no data

if (exception.Data.Count > 0)
{
	root.Add
	(
		new XElement("Data",
			from entry in exception.Data.Cast<DictionaryEntry>()
			let key = entry.Key.ToString()
			let value = (entry.Value == null) ?
							"null" : entry.Value.ToString()
			select new XElement(key, value))
	);
}

If there are any items in the Data collection, a Data element is created, and each entry of the Data collection is added into the XML for the Exception as a child element of Data:

<System.Exception>
	...
	<Data>
		<Uri>/assets/images/logo1.jpg</Uri>
		<StatusCode>404</StatusCode>
	</Data>
	...
</System.Exception>

The last element to add to this serialized Exception is its InnerException. To do this, we’ll recursively run the InnerException through the same process that the original Exception goes through. If you take a look at the full source code at the bottom of this post, you’ll see that the small snippets throughout this post are taken from a class that encapsulates these procedures and strongly-types this resulting XML data as an ExceptionXElement, inheriting from XElement.

if (exception.InnerException != null)
{
	root.Add
	(
		new ExceptionXElement
			(exception.InnerException, omitStackTrace)
	);
}

Once we have finished populating our root XElement with subelements, the XML markup can be retrieved using the ToString() method:

Console.WriteLine(root.ToString());

And the output:

<System.ArgumentException>
	<Message>URI is relative.</Message>
	<StackTrace>
		at ConsoleApp...
	</StackTrace>
	<Data>
		<Uri>/assets/images/logo1.jpg</Uri>
	</Data>
</System.ArgumentException>

There you have it. Check out how this is all wrapped up in the full source below.

using System;
using System.Collections;
using System.Linq;
using System.Xml.Linq;

/// <summary>Represent an Exception as XML data.</summary>
public class ExceptionXElement : XElement
{
	/// <summary>Create an instance of ExceptionXElement.</summary>
	/// <param name="exception">The Exception to serialize.</param>
	public ExceptionXElement(Exception exception)
		: this(exception, false)
	{ }

	/// <summary>Create an instance of ExceptionXElement.</summary>
	/// <param name="exception">The Exception to serialize.</param>
	/// <param name="omitStackTrace">
	/// Whether or not to serialize the Exception.StackTrace member
	/// if it's not null.
	/// </param>
	public ExceptionXElement(Exception exception, bool omitStackTrace)
		: base(new Func<XElement>(() =>
		{
			// Validate arguments

			if (exception == null)
			{
				throw new ArgumentNullException("exception");
			}

			// The root element is the Exception's type

			XElement root = new XElement
				(exception.GetType().ToString());

			if (exception.Message != null)
			{
				root.Add(new XElement("Message", exception.Message));
			}

			// StackTrace can be null, e.g.:
			// new ExceptionXElement(new Exception())

			if (!omitStackTrace && exception.StackTrace != null)
			{
				root.Add
				(
					new XElement("StackTrace",
						from frame in exception.StackTrace.Split('\n')
						let prettierFrame = frame.Substring(6).Trim()
						select new XElement("Frame", prettierFrame))
				);
			}

			// Data is never null; it's empty if there is no data

			if (exception.Data.Count > 0)
			{
				root.Add
				(
					new XElement("Data",
						from entry in
							exception.Data.Cast<DictionaryEntry>()
						let key = entry.Key.ToString()
						let value = (entry.Value == null) ?
							"null" : entry.Value.ToString()
						select new XElement(key, value))
				);
			}

			// Add the InnerException if it exists

			if (exception.InnerException != null)
			{
				root.Add
				(
					new ExceptionXElement
						(exception.InnerException, omitStackTrace)
				);
			}

			return root;
		})())
	{ }
}


Die, void instance methods

Written much JavaScript? How about Ruby? Play with .NET extensions lately? Heard of “fluent interfaces?” What do these have in common? Let me give you a couple of examples that may clue you in to what I have in mind.

An illustration using System.Xml.Linq extensions in .NET:

return this
	.Settings
	.Descendants("settings")
	.Descendants("setting")
	.Where(element => element.Attribute("key").Value == setting)
	.Single()
	.Value;

A JavaScript illustration using the jQuery framework:

$("#loginStatus").addClass("errorMessage").show();

If you guessed that it’s the wicked sweet feeling you get after using method chaining, you’d be right… well, almost. In fact, it’s the construct of method chaining, or simply “chaining.”

Now that C#, as of version three, can be written more functionally and has prototypical extension methods, you’re going to see more and more of this. Not necessarily more complex examples like fluent interfaces (there’s a lot of code behind the scenes required to write fluent interfaces, more than the problem probably warrants in most cases), but chaining in general.

Regardless, a sure-fire way to impede chainability—and make babies cry—is to have void instance methods. Why have a void method when you could return the instance instead?

public class Thingamijig
{
	// Bad, like sweaty gym socks 

	public void DoSomething()
	{
		// something.Do()
	}

	// Good, like cotton candy perfume

	public Thingamijig BetterDoSomething()
	{
		// something.Do(), but better:

		return this;
	}
}

Never return void from an instance method. Always return this instead. Tell ’em I said so.
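
As a small sketch of what returning this buys you, here is a hypothetical MessageBuilder class (the name and its methods are made up for illustration, not taken from any library). Each method returns the instance, so calls can be chained:

```csharp
using System;
using System.Text;

// Hypothetical class, for illustration only
public class MessageBuilder
{
	private readonly StringBuilder text = new StringBuilder();

	// Returns the instance instead of void, so calls can be chained
	public MessageBuilder Append(string value)
	{
		this.text.Append(value);
		return this;
	}

	// Same idea: do the work, then hand the instance back
	public MessageBuilder AppendLine(string value)
	{
		this.text.Append(value).Append('\n');
		return this;
	}

	public override string ToString()
	{
		return this.text.ToString();
	}
}
```

With that in place, new MessageBuilder().Append("Hello, ").AppendLine("world").ToString() reads as one expression. System.Text.StringBuilder, used internally above, already works this way: its Append() returns the StringBuilder.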


What should be in C#

Generic Indexers

C# has generic classes, interfaces, and methods. What about generic indexers?

Class indexers are properties (they have accessors, such as get):

public object this[string key]
{
	get { return new Object(); }
}

Indexers are advantageous and so are generics, so generic indexers would be advantageous too (hypothetical syntax; this doesn’t compile):

public T this<T>[string key]
{
	get { return default(T); }
}

Properties cannot be generic in C#, ergo no generic indexers. While I haven’t needed a generic property yet (whether to use a property or method is mostly preferential, so a generic method can always be used instead), I have had scenarios where a generic indexer would be advantageous.

For example, wouldn’t it be cleaner to do this:

var settings = new Settings();
int timeout = settings<int>["CacheInMinutes"];

Than to have to unbox or convert, as is the case without generic properties, like this:

var settings = new Settings();
int timeout = int.Parse(settings["CacheInMinutes"]);

I, for one, think so. Why aren’t there generic properties? I’m not sure. A conscious language design decision? Some logistical conflict in the compiler? Maybe someone more informed knows.
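
In the meantime, the generic method mentioned above is the closest workaround. Here is a minimal sketch, assuming a hypothetical Settings class backed by a dictionary of strings; the Get<T>() name and the use of Convert.ChangeType() are my choices for illustration, not the only way to do it:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical Settings class; the dictionary stands in for whatever
// store (e.g. appSettings) a real class would read from
public class Settings
{
	private readonly Dictionary<string, string> values =
		new Dictionary<string, string>();

	// A plain, non-generic indexer still works for raw strings
	public string this[string key]
	{
		get { return this.values[key]; }
		set { this.values[key] = value; }
	}

	// The generic method that stands in for a generic indexer;
	// throws if the stored string can't be converted to T
	public T Get<T>(string key)
	{
		return (T)Convert.ChangeType(
			this.values[key], typeof(T), CultureInfo.InvariantCulture);
	}
}
```

Usage then becomes settings.Get<int>("CacheInMinutes"), which is close to the generic indexer syntax above, just with parentheses instead of brackets.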

Static Local Variables

This is one example where I think VB got it right and C# got it wrong.

Let’s say I have a class with a method, and that method needs a static variable. Only that method (one method) is going to use the static variable though:

/// <summary>
/// Uber complex class with all sorts of instance fields,
/// properties, static methods, interface methods, etc.
/// </summary>
public class Feed
{
	private static readonly string[] adultTerms = ...;

	public bool IsAdult
	{
		get { return Feed.IsAdultCheck(this); }
	}

	public static bool IsAdultCheck(Feed feed)
	{
		foreach (var value in feed.xmlElements)
		{
			if (Feed.adultTerms.Contains(value))
				return true;
		}
		return false;
	}
}

(Ignore the fact that an array and Array.Contains() is used for/on adultTerms instead of a Dictionary, and that “value” isn’t ToLower()ed. I’m trying to illustrate a point here, nitpicker!)

Only IsAdultCheck() needs access to the value of adultTerms, but in C#, the whole class can have access to that variable. Hiding it from the rest of the class, within the method, removes noise from the class. When examining a class you’re not familiar with for the first time, the less clutter, the better.

Not convinced? What if IsAdultCheck() is just the beginning—there are other methods that also have their own groups of static fields?

Would you rather see this, if you had to work with the Feed class:

public class Feed
{
	#region These could all be static locals!

	// Used by IsAdultCheck()

	private static readonly string[] adultTerms;

	// Used by NextPaginated()

	private static int pageCurrent;
	private static int pageStart;
	private static int pageEnd;
	private static string paginatedFeedUrl;

	// Used by Whatever()

	private static bool whateverSetting;
	private static string whateverSetting2;
	private static DateTime whateverSetting3;

	#endregion

	public static string NextPaginated(Feed feed)
	{ /* ... */ }

	public static bool IsAdultCheck(Feed feed)
	{ /* ... */ }

	public static void Whatever()
	{ /* ... */ }
}

Or would you rather have all these private static variables encapsulated in each method in which they’re exclusively used, as local static variables?

Like I said, VB got it right when static local variables were included, starting with VB 3.0. adultTerms can be encapsulated in IsAdultCheck():

Public Class Feed
	Public Shared Function IsAdultCheck(ByVal feed As Feed) As Boolean
		' Visible only inside this function, but initialized once
		Static adultTerms() As String = ...

		For Each value As String In feed.xmlElements
			If adultTerms.Contains(value) Then Return True
		Next

		Return False
	End Function
End Class

What’s the argument against including static local variables in C#? I understand the need to differentiate the two .NET languages, but c’mon, I know a good thing when I see it!

Update: I emailed the C# Team Project Manager, Mads Torgersen, to find out why the decision was made not to have static local variables. I’ll paraphrase his response: “There’s no need. Create a new class and encapsulate the method(s) and the static variable(s) in there.” In general I agree that this is a good idea, but then I wouldn’t have anything to complain about.
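
Here is a rough sketch of what that suggestion could look like for IsAdultCheck(), using a private nested class. The AdultCheck name, the constructor, and the placeholder terms are all assumptions made for the sake of a compilable example:

```csharp
using System;
using System.Collections.Generic;

public class Feed
{
	// Hypothetical stand-in for the feed's parsed values
	private readonly List<string> xmlElements = new List<string>();

	public Feed(IEnumerable<string> values)
	{
		this.xmlElements.AddRange(values);
	}

	public bool IsAdult
	{
		get { return AdultCheck.IsAdult(this); }
	}

	// The static data and the one method that uses it live together;
	// the rest of Feed cannot see adultTerms
	private static class AdultCheck
	{
		// Placeholder terms, for illustration only
		private static readonly string[] adultTerms = { "xxx", "nsfw" };

		public static bool IsAdult(Feed feed)
		{
			// A nested class may read the outer class's private fields
			foreach (var value in feed.xmlElements)
			{
				if (Array.IndexOf(adultTerms, value) >= 0)
					return true;
			}
			return false;
		}
	}
}
```

It does hide adultTerms from the rest of Feed, but it takes a whole extra class to get what a single Static line gives you in VB.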

Full Support for Variable Declaration in Statements

Here’s some valid C#:

foreach(var url in urls)
	// ...
for(int i = 0; i < length; i++)
	// ...
using(var buffer = new MemoryStream())
	// ...

Notice how all three “for,” “foreach,” and “using” statements allow the declaration of a variable in them? They must be special, for some odd reason. Because here’s some C# that’s a no-go:

if ((var metaKeywords = AppSettings["MetaKeywords"]) != null)
	Page.SetMetaTags(MetaTags.Keywords, metaKeywords);

What’s good about the for loops and the using statement above is that since the variables declared in the statements are only going to be used within the scope of the statements, they’re inaccessible to the rest of the method in which the statements reside. This is another example of encapsulation that reduces extraneous noise, much like when I wrote of method-local static variables above. If the rest of the method doesn’t need to use the variables, it shouldn’t be able to.

Just as these snippets wouldn’t be advisable if “i” and “buffer” were only used within the scope of the statements:

var i = 0;
for(; i < length; i++)
	// ...
var buffer = new MemoryStream();
using(buffer)
	// ...

And just as this wouldn’t make sense—and isn’t even possible:

string url;
foreach(url in urls)
	// ...

Neither does it make sense to do the following if “metaKeywords” isn’t going to be used outside the scope of the if statement (which you currently have to do in C#):

var metaKeywords = AppSettings["MetaKeywords"];

if (metaKeywords != null)
	Page.SetMetaTags(MetaTags.Keywords, metaKeywords);

The reason an inline variable declaration in an if statement is probably illegal, as someone pointed out to me, is that a lot of C and C++ code is terse—like a competition for fewest lines. The downside is readability and maintainability, which go hand-in-hand. Sure, it’s cool to refactor 10 lines into two, but it’s usually not so cool when a different developer goes in and tries to debug or modify that code. But if the readability of a for statement, for example, isn’t an issue in C#, then it seems rather arbitrary to me that if statements would be.

Sure, there’s room for abuse, but I don’t need C# to save me from myself. Perhaps C# needs to save itself from developers, such as those from dynamic or scripting languages, where this practice is more common, and clean code is sparser.
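
If scoping is the main concern, one workaround that already compiles is a bare block around the declaration and the if statement. A minimal sketch follows; the ScopeDemo class and the NameValueCollection standing in for AppSettings are assumptions for illustration:

```csharp
using System;
using System.Collections.Specialized;

public static class ScopeDemo
{
	// Hypothetical stand-in for the AppSettings lookup; like
	// appSettings, the indexer returns null for a missing key
	private static readonly NameValueCollection AppSettings =
		new NameValueCollection { { "MetaKeywords", "c#, linq" } };

	public static string DescribeKeywords()
	{
		string result = "none";

		// The bare block limits metaKeywords to these few lines
		{
			var metaKeywords = AppSettings["MetaKeywords"];

			if (metaKeywords != null)
				result = metaKeywords;
		}

		// metaKeywords is out of scope here; referencing it would
		// be a compile-time error
		return result;
	}
}
```

It works, but it costs two extra lines and a level of indentation, which is exactly the noise an inline declaration would avoid.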

Non-integral Enums

Ever find yourself switch()ing an enum, and assigning different values to an object based on the case of that enum? For example, say we’ve got this enumeration:

enum AmericanHoliday
{
	NewYearDay,
	IndependenceDay,
	Christmas
}

And depending on the AmericanHoliday selection, we have a switch that returns a corresponding date:

private DateTime GetHolidayDate(AmericanHoliday holiday)
{
	switch (holiday)
	{
		case AmericanHoliday.NewYearDay:
			return DateTime.Parse("1/1");
		case AmericanHoliday.IndependenceDay:
			return DateTime.Parse("7/4");
		case AmericanHoliday.Christmas:
			return DateTime.Parse("12/25");
		default:
			throw new NotImplementedException();
	}
}

We have this switch because enums like AmericanHoliday can only have an underlying type of byte, sbyte, short, ushort, int, uint, long, or ulong. The problem with integral-only is that a switch (or if) statement has to be used if you want that enum to represent anything other than an integral type (i.e. a number).

But what if we could use different types, that didn’t have to be constants, in an enum, like a DateTime?

enum AmericanHoliday
{
	NewYearDay = DateTime.Parse("1/1"),
	IndependenceDay = DateTime.Parse("7/4"),
	Christmas = DateTime.Parse("12/25")
}

We could eliminate the conversion from an enum to a DateTime performed in GetHolidayDate(), and simply write:

DateTime holidayDate = AmericanHoliday.NewYearDay;

Or how about we had an enum that represented an error:

enum AccessDeniedReason
{
	UnknownUser = "The specified user is unknown.",
	InvalidPassword = "The specified password was invalid.",
	AccountDeactivated = "The account has been deactivated."
}

There are ways of achieving something similar in C#, but none of the workarounds are very good. Perhaps implementing such a thing just isn’t practical in a statically-typed language.
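
For the record, here is a sketch of one such workaround: a sealed class with a private constructor and one static readonly field per value, plus an implicit conversion to DateTime. The InYear() method and the choice to convert using the current year are my assumptions, not part of any framework:

```csharp
using System;

// A class posing as an enum; the static fields are its only values
public sealed class AmericanHoliday
{
	public static readonly AmericanHoliday NewYearDay =
		new AmericanHoliday(1, 1);
	public static readonly AmericanHoliday IndependenceDay =
		new AmericanHoliday(7, 4);
	public static readonly AmericanHoliday Christmas =
		new AmericanHoliday(12, 25);

	private readonly int month;
	private readonly int day;

	// Private constructor: nobody else can create new "members"
	private AmericanHoliday(int month, int day)
	{
		this.month = month;
		this.day = day;
	}

	// The date of the holiday in a given year
	public DateTime InYear(int year)
	{
		return new DateTime(year, this.month, this.day);
	}

	// Lets a holiday be assigned directly to a DateTime,
	// using the current year
	public static implicit operator DateTime(AmericanHoliday holiday)
	{
		return holiday.InYear(DateTime.Today.Year);
	}
}
```

This makes DateTime holidayDate = AmericanHoliday.NewYearDay; compile, but it’s a lot of ceremony for three values, and the fields aren’t constants, so they can’t be used as case labels in a switch.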


Returning null from a class constructor?!

Just as an exercise, I thought I’d examine common creation patterns in C# and how operator overloading can expand these.

Typically, if the parameters passed into a class constructor are not valid and an instance thus cannot be created, an Exception is thrown. ArgumentNullException is most common when a parameter is null and shouldn’t be. Also common are ArgumentExceptions and FormatExceptions, especially when a constructor takes in a string that must abide by a syntax and doesn’t.

public class Expression
{
	public Expression(string pattern)
	{
		if (Expression.IsInvalidPattern(pattern))
		{
			throw new ArgumentException("pattern is invalid");
		}

		// Do something
	}
}

Then a try/catch block is used to catch Exceptions when instantiating the type:

Expression expression;
try
{
	expression = new Expression(pattern);
}
catch(ArgumentException)
{
	// Abort
}

An alternative would be to simply indicate that the instantiated class is in an invalid state:

public class Expression
{
	public readonly bool IsInvalid = false;

	public Expression(string pattern)
	{
		if (Expression.IsInvalidPattern(pattern))
		{
			this.IsInvalid = true;
			return;
		}
		// Do something
	}
}

The invalid state is readonly so that it can only be set by the constructor, and is then checked for before performing any operations with the new Expression:

Expression expression = new Expression(pattern);

if (expression.IsInvalid)
{
	// Abort
}

The Parse/TryParse pattern is also common. Instead of an Exception, readonly field, or get-only property, a static TryParse method is used that returns true or false, depending on whether the type can be created:

public class Expression
{
	public static bool TryParse(string pattern,
		out Expression expression)
	{
		if (Expression.IsInvalidPattern(pattern))
		{
			expression = null;
			return false;
		}

		expression = new Expression(pattern);

		return true;
	}
}

TryParse() is then used as in this example:

Expression expression;

if (!Expression.TryParse(pattern, out expression))
{
	// Abort
}

Another pattern which I haven’t seen used allows an invalid object to emulate a null reference. There are better solutions than this, which is why it isn’t used, but by using operator overloading, it can be accomplished nonetheless.

public abstract class Errorable
{
	protected bool isError = false;

	public static bool operator ==(Errorable left, object right)
	{
		// A genuinely null reference equals only null
		if (object.ReferenceEquals(left, null))
		{
			return right == null;
		}

		// An instance in an error state emulates a null reference
		if (right == null)
		{
			return left.isError;
		}

		// Otherwise fall back to reference equality
		return object.ReferenceEquals(left, right);
	}

	public static bool operator !=(Errorable left, object right)
	{
		return !(left == right);
	}
}

public class Expression : Errorable
{
	public Expression(string pattern)
	{
		if (Expression.IsInvalidPattern(pattern))
		{
			base.isError = true;
			return;
		}
		// Do something
	}
}

An Expression can then be evaluated to test for null, although we know that “expression” really isn’t a null reference, which is one reason why this pattern is probably a bad idea.

Expression expression = new Expression(pattern);

if (expression == null)
{
	// Abort
}

Lastly, overloading the ! operator allows us to evaluate an object as falsy, which other languages (e.g. JavaScript) do by default:

public abstract class Errorable
{
	protected bool isError = false;

	public static bool operator !(Errorable errorable)
	{
		// True for a null reference or an instance in an error state
		return object.ReferenceEquals(errorable, null)
			|| errorable.isError;
	}
}

public class Expression : Errorable
{
	public Expression(string pattern)
	{
		if (Expression.IsInvalidPattern(pattern))
		{
			base.isError = true;
			return;
		}
		// Do something
	}
}

To test an Expression as invalid, the ! operator is used:

Expression expression = new Expression(pattern);

if (!expression)
{
	// Abort
}

The first two patterns are common, which is probably why the last two aren’t used. Generally, if there’s already a good, well-established pattern, there’s no reason to try something else. The alternative creation patterns shown here were simply an exercise in operator overloading.


Setting Meta Tags in ASP.NET with C# 3.0 Extensions

Here’s an extension on the System.Web.UI.Page class (the class for all “.aspx” files). The idea is to have keys in the Web.config appSettings that are the web site’s meta description and keyword tags, and they are set by calling this.SetMetaTags() on Page_Load() of your Page class.

using System;
using System.Web.UI;
using System.Web.UI.HtmlControls;

public partial class _Default: Page
{
	protected void Page_Load(object sender, EventArgs e)
	{
		// Set meta header tags; SetMetaTags throws
		// ArgumentNullException
		this.SetMetaTags(
			Website.AppSettings["DefaultMetaDescription"],
			Website.AppSettings["DefaultMetaKeywords"]);

	}
}

// This is actually in App_Code/Extensions, and PageExtensions is
// not within a namespace

public static class PageExtensions
{
	/// <summary>
	/// Sets the HTML meta description and keyword tags of a
	/// <c>Page</c>.
	/// </summary>
	/// <exception cref="ArgumentNullException">
	/// If description or keywords are null </exception>
	/// <exception cref="InvalidOperationException">
	/// If page.Header is null</exception>
	public static void SetMetaTags(this Page page, string description,
		string keywords)
	{
		if (description == null)
			throw new ArgumentNullException("description");

		if (keywords == null)
			throw new ArgumentNullException("keywords");

		HtmlHead header;

		if ((header = page.Header) == null)
			throw new InvalidOperationException
			("The page's markup must have runat=server in its <head>");

		ControlCollection headerControls = header.Controls;

		// New C# 3.0 feature: object initializers.
		// Name (not HttpEquiv) renders <meta name="..." content="...">
		headerControls.Add(new HtmlMeta
		{
			Name = "description",
			Content = description
		});
		headerControls.Add(new HtmlMeta
		{
			Name = "keywords",
			Content = keywords
		});
	}
}
