Parsing complex XML files

Posted: Fri Feb 13, 2009 1:41 pm
by h.a.n.d
Hi,

I recently started using irrXML with my Irrlicht projects and am having some difficulty parsing an XML file that contains more than one element with the same name:

e.g.

Code: Select all

<?xml version="1.0"?>
<config>
   <!-- This is a config file for the mesh viewer -->
    <models>
      <model file="dwarf.dea" />
      <model file="dwarf1.dea" />
      <model file="dwarf2.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>

Getting all config elements is not that difficult, but how do I get, e.g., all model elements from the first config element, or all model elements from all config elements?

I've already used TinyXML for parsing XML, and with that library you can directly access all child elements of a parent element
(e.g. give me all models of the first config).

How can I accomplish that with the irrXML library?

Thank you for your help!

Posted: Fri Feb 13, 2009 1:49 pm
by JP
Is it even legal (technically) for XML to have elements with the same name in sequence like that?

Posted: Fri Feb 13, 2009 2:26 pm
by geckoman
Yes, but a root element is missing. A well-formed XML document must have exactly one root element:

<?xml version="1.0"?>

<root>

<config>
<!-- This is a config file for the mesh viewer -->
<models>
<model file="dwarf.dea" />
<model file="dwarf1.dea" />
<model file="dwarf2.dea" />
</models>
<messageText caption="Irrlicht Engine Mesh Viewer">
Welcome to the Mesh Viewer of the "Irrlicht Engine".
</messageText>
</config>
<config>
<!-- This is a config file for the mesh viewer -->
<models>
<model file="dwarf.dea" />
</models>
<messageText caption="Irrlicht Engine Mesh Viewer">
Welcome to the Mesh Viewer of the "Irrlicht Engine".
</messageText>
</config>
<config>
<!-- This is a config file for the mesh viewer -->
<models>
<model file="dwarf.dea" />
</models>
<messageText caption="Irrlicht Engine Mesh Viewer">
Welcome to the Mesh Viewer of the "Irrlicht Engine".
</messageText>
</config>

</root>

Posted: Fri Feb 13, 2009 2:40 pm
by JP
The MeshViewer example in the Irrlicht SDK has a decent example of how to parse XML files, and I would think that irrXML provides some more complex samples too. Have you looked at any of those?

Posted: Fri Feb 13, 2009 2:58 pm
by h.a.n.d
Thanks for the fast replies:

@geckoman
You are right ...
Actually there is a root element in my live-project. For this post I used an extended version of the tutorial example from the irrXML page.
http://www.ambiera.com/irrxml/index.html

@JP
Thanks for the tip about looking into the examples; I haven't done that yet. I will reply
once I know whether this example covers what I am looking for.

Posted: Fri Feb 13, 2009 3:57 pm
by h.a.n.d
Okay, I've checked the MeshViewer example. Unfortunately, it only covers the easy case of reading XML files with a single element per tag name, like the example in the irrXML tutorial.

So my question remains ...

Posted: Fri Feb 13, 2009 4:15 pm
by aanderse
Nest "while(reader->read())" loops.

Originally I didn't like IrrXml, but that was simply because I didn't understand it. Once I read the loadScene function in CSceneManager.cpp IrrXml made perfect sense and I'm quite fond of it now. Just read that function (and the sub functions it calls) and everything will make sense.

Posted: Fri Feb 13, 2009 4:53 pm
by JP
Yep, that's a good point: loadScene should be a prime example of how to read XML!

Posted: Sat Feb 14, 2009 5:03 pm
by h.a.n.d
I took a look at the loadScene() function and it really helps to understand how the irrXML library works.

So one question has been solved and a new one arises ;-)

In all read...() functions:

Code: Select all

				// read materials from attribute list
				io::IAttributes* attr = FileSystem->createEmptyAttributes(Driver);
				attr->read(reader);
is used to get the child elements of the parent element (I'm guessing here).
Is this right, and is this the only way to get just the children of the parent node (element) rather than all elements with the same node name?

Posted: Sat Feb 14, 2009 5:35 pm
by bitplane
Those methods only load IAttributes, which are Irrlicht's way of holding values for serializable types. They don't really have much to do with the XML reader or writer, and they can't have duplicate keys.

IrrXML just reads the file sequentially: you call read() to advance one node, then getNodeType(), and then the other methods to extract the data.

In your example you'd have an EXN_ELEMENT node named "config", then an EXN_COMMENT, followed by an EXN_ELEMENT node named "models", then an EXN_ELEMENT named "model" with an attribute count of 1, with the attribute name "file" and the value "dwarf.dea". Because that element is self-closing (the "/>"), isEmptyElement() returns true for it and, as far as I know, no separate EXN_ELEMENT_END is reported for it. Then comes another EXN_ELEMENT node named "model", and so on, until the EXN_ELEMENT_END for "models". Parsing this would be as trivial as the MeshViewer example.
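
To make the sequential model concrete, here is a minimal sketch that deliberately does not use irrXML itself, so it compiles on its own. The node sequence is written out by hand, and Node, NodeType, collectModels(), modelsOf(), allModels() and sampleNodes() are all made-up names for illustration. With the real reader you would presumably take the same values from read(), getNodeType(), getNodeName() and getAttributeValue("file") instead of from a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-ins for irr::io::EXN_ELEMENT etc.; not the real irrXML enum.
enum NodeType { NODE_ELEMENT, NODE_ELEMENT_END, NODE_COMMENT, NODE_TEXT };

// One node as a reader would report it after a single read() call.
struct Node {
    NodeType type;
    std::string name;  // element name (unused for comments and text)
    std::string file;  // value of the "file" attribute, empty if absent
};

typedef std::vector<std::vector<std::string> > ModelsPerConfig;

// Single pass over the node sequence: every <config> start opens a new
// group, every <model> seen before the matching </config> joins that group.
ModelsPerConfig collectModels(const std::vector<Node>& nodes) {
    ModelsPerConfig result;
    bool insideConfig = false;
    for (std::size_t n = 0; n < nodes.size(); ++n) {
        const Node& node = nodes[n];
        if (node.type == NODE_ELEMENT && node.name == "config") {
            result.push_back(std::vector<std::string>());
            insideConfig = true;
        } else if (node.type == NODE_ELEMENT_END && node.name == "config") {
            insideConfig = false;
        } else if (node.type == NODE_ELEMENT && node.name == "model" && insideConfig) {
            result.back().push_back(node.file);
        }
        // comments, text and all other end nodes are simply skipped, so it
        // does not matter whether a self-closing element reports an end node
    }
    return result;
}

// Models of one config, e.g. index 0 for the first <config>.
const std::vector<std::string>& modelsOf(const ModelsPerConfig& all, std::size_t index) {
    assert(index < all.size());  // the caller must pass a valid config index
    return all[index];
}

// Models of all configs, flattened in document order.
std::vector<std::string> allModels(const ModelsPerConfig& all) {
    std::vector<std::string> flat;
    for (std::size_t c = 0; c < all.size(); ++c)
        flat.insert(flat.end(), all[c].begin(), all[c].end());
    return flat;
}

// Hand-written node sequence for a shortened file with two <config> elements.
std::vector<Node> sampleNodes() {
    const Node sequence[] = {
        {NODE_ELEMENT, "root", ""},
        {NODE_ELEMENT, "config", ""},
        {NODE_COMMENT, "", ""},
        {NODE_ELEMENT, "models", ""},
        {NODE_ELEMENT, "model", "dwarf.dea"},
        {NODE_ELEMENT, "model", "dwarf1.dea"},
        {NODE_ELEMENT, "model", "dwarf2.dea"},
        {NODE_ELEMENT_END, "models", ""},
        {NODE_ELEMENT, "messageText", ""},
        {NODE_TEXT, "", ""},
        {NODE_ELEMENT_END, "messageText", ""},
        {NODE_ELEMENT_END, "config", ""},
        {NODE_ELEMENT, "config", ""},
        {NODE_ELEMENT, "models", ""},
        {NODE_ELEMENT, "model", "dwarf.dea"},
        {NODE_ELEMENT_END, "models", ""},
        {NODE_ELEMENT_END, "config", ""},
        {NODE_ELEMENT_END, "root", ""}
    };
    return std::vector<Node>(sequence, sequence + sizeof(sequence) / sizeof(sequence[0]));
}
```

Calling modelsOf(collectModels(sampleNodes()), 0) gives the models of the first config, and allModels(collectModels(sampleNodes())) gives the models of every config. Grouping while reading is the point here: a forward-only reader cannot go back and ask for "the children of config 1" later.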

Posted: Sat Feb 14, 2009 7:54 pm
by h.a.n.d
Thank you for all your help!

After studying the loadScene() function for a little bit longer and knowing that irrXML parses the XML file sequentially, I finally got it to work!

@bitplane Your suggestion was the tipping point ...

Thanks guys!!!

Here is my example code (only the important stuff):

Code: Select all

<?xml version="1.0"?>
<root>
<config>
   <!-- This is a config file for the mesh viewer -->
    <models>
      <model file="dwarf.dea" />
      <model file="dwarf1.dea" />
      <model file="dwarf2.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
</root>

Code: Select all

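#include <iostream>
#include <map>
#include <irrlicht.h>

// Config is my own class (definition omitted here)
typedef std::map<int, Config*> ConfigMap;
ConfigMap configs;

// forward declarations: readXML() and readConfig() call these before they are defined
void readConfig(irr::io::IrrXMLReader* xml, int i);
void readModel(irr::io::IrrXMLReader* xml, int i);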

void readXML() {

	irr::io::IrrXMLReader* xml = irr::io::createIrrXMLReader("../data/data.xml");
	int i = 0;

	// parse the file until end reached
	while(xml && xml->read())
	{
		bool endreached = false;
		
		switch(xml->getNodeType())
		{
			case irr::io::EXN_ELEMENT_END:
				if (irr::core::stringw(L"root")==xml->getNodeName()) {
					endreached = true;
				}
				break;
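			case irr::io::EXN_ELEMENT:
				if (irr::core::stringw(L"config")==xml->getNodeName()) {
					// make sure an entry exists before readConfig() dereferences configs[i]
					// (assumes Config is default-constructible)
					if (!configs[i])
						configs[i] = new Config();
					readConfig(xml, i++);
				}
				break;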
			default:
				break;
		}
		if (endreached)
			break;
	}
	// delete the xml parser after usage
	delete xml;

}

void readConfig(irr::io::IrrXMLReader* xml, int i) {
	while(xml && xml->read())
	{
		const irr::core::stringw name = xml->getNodeName();

		switch (xml->getNodeType())
		{
		case irr::io::EXN_ELEMENT_END:
			if (irr::core::stringw(L"config")==name) {
				return;
			}
			break;
		case irr::io::EXN_ELEMENT:
			if (irr::core::stringw(L"models")==name) {
				readModel(xml, i);
			}
			else
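			if (irr::core::stringw(L"messageText")==name) {
				configs[i]->setMessageTextCaption(xml->getAttributeValue("caption"));
				// the text is not part of the element node; it arrives as the next EXN_TEXT node
				if (!xml->isEmptyElement() && xml->read() && xml->getNodeType() == irr::io::EXN_TEXT)
					configs[i]->setMessageText(xml->getNodeData());
			}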
			else
			{
				std::cout << "Found unknown element in XML file: " << irr::core::stringc(xml->getNodeName()).c_str() << std::endl;
			}
			break;
		default:
			break;
		}
	}
}
void readModel(irr::io::IrrXMLReader* xml, int i) {

	int s = 0;

	while(xml && xml->read())
	{
		switch (xml->getNodeType())
		{
		case irr::io::EXN_ELEMENT_END:
			if (irr::core::stringw(L"models")==xml->getNodeName()) {
				return;
			}
			break;
		case irr::io::EXN_ELEMENT:
			if (irr::core::stringw(L"model")==xml->getNodeName()) {
				configs[i]->setModels(xml->getAttributeValue("file"), s++);
			}
			break;
		default:
			break;
		}
	}
}
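
The code above relies on a Config class that is not shown. As a rough sketch of what such a class might look like: only the three setter names (setMessageTextCaption, setMessageText, setModels) come from the code above. The getters, model(), clearConfigs() and the internal storage are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical Config; only the setter names are taken from the parsing code.
class Config {
public:
    // getAttributeValue() can return a null pointer for a missing attribute,
    // so every setter tolerates null.
    void setMessageTextCaption(const char* caption) { caption_ = caption ? caption : ""; }
    void setMessageText(const char* text) { text_ = text ? text : ""; }
    void setModels(const char* file, int slot) { if (file) models_[slot] = file; }

    const std::string& messageTextCaption() const { return caption_; }
    const std::string& messageText() const { return text_; }
    const std::map<int, std::string>& models() const { return models_; }

    // File name stored at the given position within <models>.
    const std::string& model(int slot) const {
        std::map<int, std::string>::const_iterator it = models_.find(slot);
        assert(it != models_.end());  // slot must have been filled by setModels()
        return it->second;
    }

private:
    std::string caption_;
    std::string text_;
    std::map<int, std::string> models_;
};

// The map holds raw pointers, so they have to be freed by hand.
void clearConfigs(std::map<int, Config*>& configs) {
    for (std::map<int, Config*>::iterator it = configs.begin(); it != configs.end(); ++it)
        delete it->second;  // deleting a null pointer is harmless
    configs.clear();
}
```

With something like this, configs[0]->models() would hold all models of the first config after readXML() has run, and clearConfigs(configs) would release the objects when they are no longer needed.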