More Microsoft Fun
[Feb. 17th, 2004|02:40 pm]
Just spent quite a bit of time on this one...
You have a block of XML of the form <foo>bar</foo>, and you want an easy way to get the "bar" bit out. XmlTextReader seems to be the ideal thing - it's a really lightweight parser. You just have a "while(xml.Read())" and an "if(xml.Name=="foo" && xml.NodeType == XmlNodeType.Element)" and you're away.
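For reference, here is a minimal sketch of that kind of loop. The post does not show how the text itself was pulled out, so this is an assumption: it advances the reader only with Read() and picks the text up from the Text node that follows the opening tag. The names FooReader and ReadFooValues are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class FooReader
{
    // Hypothetical helper: collects the text of every <foo> element.
    // The reader is advanced only by Read(), one node per iteration.
    public static List<string> ReadFooValues(string xml)
    {
        var values = new List<string>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            bool insideFoo = false;
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "foo")
                {
                    // An empty <foo/> never produces an EndElement node.
                    insideFoo = !reader.IsEmptyElement;
                }
                else if (reader.NodeType == XmlNodeType.Text && insideFoo)
                {
                    values.Add(reader.Value);
                }
                else if (reader.NodeType == XmlNodeType.EndElement && reader.Name == "foo")
                {
                    insideFoo = false;
                }
            }
        }
        return values;
    }

    public static void Main()
    {
        foreach (string value in ReadFooValues("<foo>bar</foo>"))
        {
            Console.WriteLine(value);
        }
    }
}
```

Because nothing other than Read() moves the reader here, each node is visited exactly once, whatever does or does not sit between the tags.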
Then I came to add another "foo" tag to my xml. After hacking around with some code to spot when the user hit a button (without MCMS eating my events for me), I found a slight problem. It seems that XmlTextReader silently eats all subsequent tags of the same name at a given depth. No mention of this in the documentation, no talk of how to fix this. You only find it out by enabling debugging, and watching it step through.
Time to go investigate another XML parser.
All this when I was just starting to warm towards Dot Net again...
Update: It was a Microsoft bug. After another round of serious debugging, and a bit of old-fashioned debugging (read: printing stuff out to the console), I found the issue. If you have <foo>bar1</foo><foo>bar2</foo>, then the parser treats the second tag as a null tag. However, if you put a \r\n between the two tags, it works fine. Adding a third tag causes tags one and three to show up, and tag two's contents to be treated as a text block. Own up: who was the muppet who didn't think to test on XML without some form of whitespace between the tags?
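The print-it-to-the-console approach can be sketched roughly like this. It is not the original debugging code, just an illustration: NodeDump and Describe are hypothetical names, and the <root> wrapper is an assumption added so that each sample is a single well-formed document. It lists every node the reader reports for the two inputs, with and without the \r\n.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeDump
{
    // Hypothetical helper: one line per node reported by the reader,
    // advancing only with Read().
    public static List<string> Describe(string xml)
    {
        var lines = new List<string>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                // Escape line breaks so whitespace nodes are visible in the output.
                string value = reader.Value.Replace("\r", "\\r").Replace("\n", "\\n");
                lines.Add(string.Format("depth={0} type={1} name='{2}' value='{3}'",
                    reader.Depth, reader.NodeType, reader.Name, value));
            }
        }
        return lines;
    }

    public static void Main()
    {
        // <root> is an assumed wrapper, not part of the XML quoted in the post.
        string adjacent = "<root><foo>bar1</foo><foo>bar2</foo></root>";
        string separated = "<root><foo>bar1</foo>\r\n<foo>bar2</foo></root>";
        foreach (string xml in new[] { adjacent, separated })
        {
            Console.WriteLine(xml.Replace("\r\n", "\\r\\n"));
            foreach (string line in Describe(xml))
            {
                Console.WriteLine("  " + line);
            }
        }
    }
}
```

Comparing the two listings side by side shows exactly which node the \r\n adds, which should help tell a parser fault apart from a loop that moves the reader more than once per iteration.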