Pretty Printing XML Attributes
by Josh Staiger
Bill poses an interesting question on his blog.
Why is element-based XML easier for humans to read than attribute-based XML?
For example:
<person>
  <firstName>Bill</firstName>
  <lastName>Higgins</lastName>
  <emailAddress>bhiggins@us.ibm.com</emailAddress>
  <city>Durham</city>
  <state>NC</state>
</person>
is easier to read than:
<person firstName="Bill" lastName="Higgins" emailAddress="bhiggins@us.ibm.com" city="Durham" state="NC"/>
I think the answer is that we never learned good formatting conventions for long lists of XML attributes. In such cases, most people are content to let the attributes run horizontally off the side of the page (sucks).
If we did the same thing with element-based XML, it would be hard to read as well:
<person><firstName>Bill</firstName><lastName>Higgins</lastName><emailAddress>bhiggins@us.ibm.com</emailAddress><city>Durham</city><state>NC</state></person>
I use the following convention for long attribute lists, which I've kind of borrowed from Lisp (all roads lead to Lisp):
<person firstName="Josh"
        lastName="Staiger"
        emailAddress="joshstaiger@gmail.com"
        city="Durham"
        state="NC" />
Much easier on the eyes, huh?
My algorithm goes something like this:
1. Write the first attribute on the same line as the tag name.
2. Give each subsequent attribute its own line, indented to the column of the first attribute.
3. Write the closing angle bracket on the same line as the last attribute.
I even wrote a SAX-based Python script which slurps XML and pretty prints it in this manner. I can dig it up and post if anyone is interested.
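In the meantime, here is a minimal sketch of what such a script might look like. It is not the original script: the names AttributePrettyPrinter and pretty_print are made up for illustration, and it assumes simple documents where dropping whitespace-only text and re-indenting other text is acceptable. It relies only on the standard library's xml.sax module and on attributes being reported in document order (which is what the default expat-based parser does in current Python 3).

```python
import io
import xml.sax
from xml.sax.saxutils import escape, quoteattr


class AttributePrettyPrinter(xml.sax.ContentHandler):
    """Hypothetical handler: one attribute per line, aligned under the first."""

    def __init__(self, out, indent="  "):
        super().__init__()
        self.out = out
        self.indent = indent
        self.depth = 0
        # Start tag that has been built but not yet closed with ">" or " />".
        self.pending = None

    def _flush_pending(self):
        # The element turned out to have content, so close its start tag.
        if self.pending is not None:
            self.out.write(self.pending + ">\n")
            self.pending = None

    def startElement(self, name, attrs):
        self._flush_pending()
        prefix = self.indent * self.depth + "<" + name
        # Continuation lines are padded to the column of the first attribute.
        pad = "\n" + " " * (len(prefix) + 1)
        pairs = [f"{key}={quoteattr(value)}" for key, value in attrs.items()]
        self.pending = prefix + (" " + pad.join(pairs) if pairs else "")
        self.depth += 1

    def endElement(self, name):
        self.depth -= 1
        if self.pending is not None:
            # No children and no text: the bracket stays on the last attribute's line.
            self.out.write(self.pending + " />\n")
            self.pending = None
        else:
            self.out.write(self.indent * self.depth + "</" + name + ">\n")

    def characters(self, content):
        text = content.strip()
        if text:
            self._flush_pending()
            self.out.write(self.indent * self.depth + escape(text) + "\n")


def pretty_print(xml_text):
    """Parse xml_text and return it reformatted as a string."""
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), AttributePrettyPrinter(out))
    return out.getvalue()


if __name__ == "__main__":
    print(pretty_print('<person firstName="Bill" lastName="Higgins" state="NC"/>'), end="")
```

Because SAX does not say whether an element was written as an empty tag, the sketch holds each start tag back until it sees what follows, then finishes it with either ">" or " />".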
Comments
Interesting. With just that simple formatting, it becomes very easy for humans to read the attribute-based XML, indeed easier than the element-based XML, for the same reason that it's easier for the parser to read the attribute-based XML - there's more structure. So it was just a matter of putting the attribute-based XML on a level playing field with the element-based XML by providing a similar level of formatting.
I'll ask Balaji if he can throw a bone to the humans and allow us to optionally turn on pretty printing.
Perhaps there's a bit more to it?
Here's how I'd 'pretty print' it. Actually, this is frequently how I do any medium-complex XML stuff, like Ant scripts:
I think the prettiest attributed systems are still pretty dissatisfying when you have more than a simple linear structure:
versus:
Just to compare, the "Lisp" way:
If we were hardcore, we would put the person closing tag on the same line as the number attribute, but I'm not sure the world is ready for that ;)
Wow, that makes a world of difference! It very cleanly structures the relationship between element, attribute, and sub-element. The clean corners immediately direct your eyes to the boundaries between element nodes. The stairway-shaped whitespace gives you an overview of the fragment's structure before you examine any particular node. Lisp taught us that code could be data. Lo and behold, even XML can be data too ;)
Yes, it would be very interesting to get the code.