Problem
Given an XElement, what is its XPath?
Preface
Many different XPath expressions can select the same XML element. To select one and only one element, indexes need to be used. For example, given the following XML data:
<people>
  <person>
    <name>
      <first>Chris</first>
    </name>
  </person>
  <person> ...
</people>
The XPath “/people/person” will select all the “person” elements that are children of “people.” To refer to only the first “person” element, its index needs to be used in the XPath.
Like an array, an [index] can be used to refer to a specific node. In XPath, indexes begin at 1, not 0. To select the first “person” element, then, the XPath “/people/person[1]” must be used.
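As a quick illustration of 1-based indexing, here is a minimal sketch that selects a “person” element by index. The class name IndexDemo, the helper FirstNameAt, and the second person (“Pat”) are invented for this example; XPathSelectElement is the extension method from the System.Xml.XPath namespace.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath; // provides the XPathSelectElement extension method

public static class IndexDemo
{
    // Hypothetical sample data; the second person is made up for illustration.
    public const string Xml =
        "<people>" +
        "<person><name><first>Chris</first></name></person>" +
        "<person><name><first>Pat</first></name></person>" +
        "</people>";

    // Returns the text of the <first> element under the person at the
    // given 1-based index, or null if no such person exists.
    public static string FirstNameAt(int index)
    {
        XDocument doc = XDocument.Parse(Xml);
        XElement first = doc.XPathSelectElement(
            string.Format("/people/person[{0}]/name/first", index));
        return first == null ? null : first.Value;
    }
}
```

With this data, index 1 refers to Chris and index 2 to Pat. Index 0 matches nothing, because XPath positions start at 1.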
I think of this as an “absolute” XPath, because it points directly to one specific element, and is not relative. Therefore, when I ask, “given an element, what is its XPath?”, I am referring to its absolute XPath—an XPath expression that will always return a specific element and its children, if it exists.
Solution
The following code snippet uses extension methods to get an absolute XPath to an XElement:
using System;
using System.Linq;
using System.Xml.Linq;

/// <summary>Extension methods for the .NET 3.5 System.Xml.Linq namespace</summary>
public static class XExtensions
{
    /// <summary>
    /// Get the absolute XPath to a given XElement
    /// (e.g. "/people/person[6]/name[1]/last[1]").
    /// </summary>
    /// <param name="element">
    /// The element to get the absolute XPath of.
    /// </param>
    public static string AbsoluteXPath(this XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        // Builds a single path step, e.g. "/person[6]"
        Func<XElement, string> relativeXPath = e =>
        {
            int index = e.IndexPosition();
            string name = e.Name.LocalName;

            // If the element is the root, no index is required
            return (index == -1) ? "/" + name : string.Format
            (
                "/{0}[{1}]",
                name,
                index.ToString()
            );
        };

        // Ancestors() runs from the parent up to the root, so the
        // steps are reversed below to produce a root-first path
        var ancestors = from e in element.Ancestors()
                        select relativeXPath(e);

        return string.Concat(ancestors.Reverse().ToArray()) +
               relativeXPath(element);
    }

    /// <summary>
    /// Get the index of the given XElement relative to its
    /// siblings with identical names. If the given element is
    /// the root, -1 is returned.
    /// </summary>
    /// <param name="element">
    /// The element to get the index of.
    /// </param>
    public static int IndexPosition(this XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        if (element.Parent == null)
        {
            return -1;
        }

        int i = 1; // Indexes for nodes start at 1, not 0

        foreach (var sibling in element.Parent.Elements(element.Name))
        {
            if (sibling == element)
            {
                return i;
            }

            i++;
        }

        throw new InvalidOperationException
            ("element has been removed from its parent.");
    }
}
Skip Sailors said
I like it as a display mechanism, but it doesn’t handle namespaces, so it’s not a drag-and-drop locator for documents with namespaces.
[TestMethod]
public void TestALotOfPaths()
{
    IEnumerable<XElement> elementList = testSchema.Descendants();
    foreach (XElement expectedElement in elementList)
    {
        string path = expectedElement.AbsoluteXPath();
        XElement actualElement = testSchema.XPathSelectElement(path);
        // If the expected element has a namespace, the actual element doesn’t match
        Assert.AreEqual(expectedElement, actualElement);
    }
}
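One possible way to handle the namespace case raised in this comment, offered only as a sketch and not as part of the original solution, is to test each step with the standard XPath functions local-name() and namespace-uri() instead of a bare element name. The class and method names below (NamespaceAwareXPath, AbsoluteXPathWithNamespaces, Step) are hypothetical. The sketch assumes namespace URIs contain no single quotes.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class NamespaceAwareXPath
{
    // Builds one path step, for example:
    // /*[local-name()='person' and namespace-uri()='urn:example'][2]
    private static string Step(XElement e)
    {
        // Assumption: neither the local name nor the namespace URI
        // contains a single quote, so no escaping is attempted here.
        string test = string.Format(
            "/*[local-name()='{0}' and namespace-uri()='{1}']",
            e.Name.LocalName,
            e.Name.NamespaceName);

        if (e.Parent == null)
        {
            return test; // root element: no index is required
        }

        // 1-based position among preceding siblings with the same
        // qualified name (local name plus namespace)
        int index = e.ElementsBeforeSelf(e.Name).Count() + 1;
        return test + "[" + index + "]";
    }

    public static string AbsoluteXPathWithNamespaces(XElement element)
    {
        if (element == null)
        {
            throw new ArgumentNullException("element");
        }

        // AncestorsAndSelf() starts at the element and walks up to the
        // root, so it is reversed to build the path root-first
        return string.Concat(
            element.AncestorsAndSelf().Reverse().Select(Step).ToArray());
    }
}
```

Because the predicate compares namespace URIs directly, the resulting path can be passed to XPathSelectElement without a namespace resolver. The trade-off is that the paths are far more verbose than the local-name form.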
jawahar said
That’s working.
Thank you, it saved a lot of time.