Parsing complex XML files

h.a.n.d
Posts: 15
Joined: Fri Feb 13, 2009 1:38 pm

Post by h.a.n.d »

Hi,

I recently started using irrXML with my Irrlicht projects and am having some difficulty parsing an XML file that contains more than one occurrence of the same element.

For example:

Code: Select all

<?xml version="1.0"?>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
      <model file="dwarf1.dea" />
      <model file="dwarf2.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>

Getting all config elements is not that difficult, but how do I get, e.g., all model elements from the first config element, or all model elements from all config elements?

I've already used tinyXML for parsing XML, and with that library you can directly access all child elements of a parent element (e.g. give me all models of the first config).

How can I accomplish that with the irrXML library?

Thank you for your help!
JP
Posts: 4526
Joined: Tue Sep 13, 2005 2:56 pm
Location: UK

Post by JP »

Is it even legal (technically) for XML to have the same named elements in sequence like that?
geckoman
Posts: 143
Joined: Thu Nov 27, 2008 11:05 am
Location: Germany

Post by geckoman »

Yes, but a root element is missing:

<?xml version="1.0"?>
<root>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
      <model file="dwarf1.dea" />
      <model file="dwarf2.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
</root>
JP
Posts: 4526
Joined: Tue Sep 13, 2005 2:56 pm
Location: UK

Post by JP »

The MeshViewer example in the Irrlicht SDK has an OK example of how to parse XML files, and I should think that irrXML provides some more complex samples too. Have you looked at any of those?
h.a.n.d
Posts: 15
Joined: Fri Feb 13, 2009 1:38 pm

Post by h.a.n.d »

Thanks for the fast replies:

@geckoman
You are right ...
Actually there is a root element in my live-project. For this post I used an extended version of the tutorial example from the irrXML page.
http://www.ambiera.com/irrxml/index.html

@JP
Thanks for the tip about looking into the examples. I haven't done that yet ... I will reply once I know whether this example covers what I am looking for.
h.a.n.d
Posts: 15
Joined: Fri Feb 13, 2009 1:38 pm

Post by h.a.n.d »

Okay, I've checked the MeshViewer example. Unfortunately it only covers the easy case of reading an XML file with one element per tag name, just like the example in the irrXML tutorial.

So my question remains ...
aanderse
Posts: 155
Joined: Sun Aug 10, 2008 2:02 pm
Location: Canada

Post by aanderse »

Nest "while(reader->read())" loops.

Originally I didn't like IrrXml, but that was simply because I didn't understand it. Once I read the loadScene function in CSceneManager.cpp IrrXml made perfect sense and I'm quite fond of it now. Just read that function (and the sub functions it calls) and everything will make sense.
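
The nesting idea can be sketched without the library. This is only an illustration under assumptions: `MockReader`, `Node`, `sampleNodes` and the three `read...` functions are hypothetical stand-ins written for this sketch, not part of irrXML. The mock mimics a forward-only reader whose `read()` advances one node; each nested loop keeps consuming nodes until it sees the closing tag of the element it was started for.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the node types a sequential XML reader reports.
enum NodeType { ELEMENT, ELEMENT_END };

struct Node {
    NodeType type;
    std::string name;  // element name
    std::string file;  // value of the "file" attribute, empty if absent
};

// Hypothetical forward-only reader: read() advances exactly one node.
class MockReader {
public:
    explicit MockReader(std::vector<Node> nodes) : nodes_(std::move(nodes)) {}
    bool read() {
        if (next_ >= nodes_.size()) return false;
        current_ = next_++;
        return true;
    }
    NodeType type() const { return nodes_[current_].type; }
    const std::string& name() const { return nodes_[current_].name; }
    const std::string& file() const { return nodes_[current_].file; }
private:
    std::vector<Node> nodes_;
    std::size_t next_ = 0;
    std::size_t current_ = 0;
};

// Innermost loop: runs until </models>, collecting every <model file="...">.
std::vector<std::string> readModels(MockReader& reader) {
    std::vector<std::string> files;
    while (reader.read()) {
        if (reader.type() == ELEMENT_END && reader.name() == "models") break;
        if (reader.type() == ELEMENT && reader.name() == "model")
            files.push_back(reader.file());
    }
    return files;
}

// Middle loop: runs until </config> and hands <models> to the inner loop.
std::vector<std::string> readConfig(MockReader& reader) {
    std::vector<std::string> files;
    while (reader.read()) {
        if (reader.type() == ELEMENT_END && reader.name() == "config") break;
        if (reader.type() == ELEMENT && reader.name() == "models")
            files = readModels(reader);
    }
    return files;
}

// Outer loop: one entry per <config>, in document order.
std::vector<std::vector<std::string>> readConfigs(MockReader& reader) {
    std::vector<std::vector<std::string>> configs;
    while (reader.read()) {
        if (reader.type() == ELEMENT && reader.name() == "config")
            configs.push_back(readConfig(reader));
    }
    return configs;
}

// Node stream for a two-config version of the file from the first post.
std::vector<Node> sampleNodes() {
    return {
        {ELEMENT, "root", ""},
        {ELEMENT, "config", ""},
        {ELEMENT, "models", ""},
        {ELEMENT, "model", "dwarf.dea"},
        {ELEMENT, "model", "dwarf1.dea"},
        {ELEMENT, "model", "dwarf2.dea"},
        {ELEMENT_END, "models", ""},
        {ELEMENT_END, "config", ""},
        {ELEMENT, "config", ""},
        {ELEMENT, "models", ""},
        {ELEMENT, "model", "dwarf.dea"},
        {ELEMENT_END, "models", ""},
        {ELEMENT_END, "config", ""},
        {ELEMENT_END, "root", ""},
    };
}
```

Calling `readConfigs` on a reader built from `sampleNodes()` gives one list of file names per config, so "all models of the first config" is simply the first entry of the result.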
JP
Posts: 4526
Joined: Tue Sep 13, 2005 2:56 pm
Location: UK

Post by JP »

Yep, that's a good point. loadScene should be a prime example of how to read XML!
h.a.n.d
Posts: 15
Joined: Fri Feb 13, 2009 1:38 pm

Post by h.a.n.d »

I took a look at the loadScene() function and it really helps to understand how the irrXML library works.

So one question has been solved and a new one is arising ;-)

In all read...() functions:

Code: Select all

// read materials from attribute list
io::IAttributes* attr = FileSystem->createEmptyAttributes(Driver);
attr->read(reader);

is used to get the child elements of the parent element (I'm guessing here).
Is this right, and is it the only way to get just the children of the parent node (element), rather than all elements with the same node name?
bitplane
Admin
Posts: 3204
Joined: Mon Mar 28, 2005 3:45 am
Location: England

Post by bitplane »

Those methods only load IAttributes, which are Irrlicht's way of holding values for serializable types. They don't really have much to do with the XML reader or writer, and they can't have duplicate keys.

IrrXML just reads the file sequentially: you call read() to advance one node, then getNodeType() to see what kind of node it is, then the other methods to extract the data.

In your example you'd have an EXN_ELEMENT node named "config", then an EXN_COMMENT, followed by an EXN_ELEMENT node named "models", then an EXN_ELEMENT named "model" with an attribute count of 1, with the name "file" and the data "dwarf.dea". That element is self-closing (the "/>"), so it is flagged by isEmptyElement() instead of producing a separate EXN_ELEMENT_END. Then comes another EXN_ELEMENT node named "model", and so on, until the EXN_ELEMENT_END for "models". Parsing this would be as trivial as the meshviewer example.
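
Because the nodes arrive in document order, a single loop that remembers which config it is currently inside is enough to answer the original question. The sketch below is an illustration only: `Event`, `Kind`, `collectModels` and `sampleEvents` are hypothetical names, and the event list is a hand-written stand-in for what a sequential reader would report, with each self-closing `<model />` modelled as a single element event.

```cpp
#include <string>
#include <vector>

// Stand-in for a sequential node stream; not the irrXML API itself.
enum class Kind { Element, ElementEnd, Comment };

struct Event {
    Kind kind;
    std::string name;      // element name (empty for comments)
    std::string fileAttr;  // value of the "file" attribute, if present
};

// Returns the "file" attribute of every <model> inside the config with the
// given zero-based index; pass -1 to collect the models of all configs.
std::vector<std::string> collectModels(const std::vector<Event>& events, int wanted) {
    std::vector<std::string> files;
    int configIndex = -1;   // index of the config currently being read
    bool inConfig = false;  // true between <config> and </config>
    for (const Event& e : events) {
        if (e.kind == Kind::Element && e.name == "config") {
            ++configIndex;
            inConfig = true;
        } else if (e.kind == Kind::ElementEnd && e.name == "config") {
            inConfig = false;
        } else if (inConfig && e.kind == Kind::Element && e.name == "model") {
            if (wanted < 0 || wanted == configIndex)
                files.push_back(e.fileAttr);
        }
    }
    return files;
}

// Event stream for a two-config version of the file discussed above.
std::vector<Event> sampleEvents() {
    return {
        {Kind::Element, "root", ""},
        {Kind::Element, "config", ""},
        {Kind::Comment, "", ""},
        {Kind::Element, "models", ""},
        {Kind::Element, "model", "dwarf.dea"},
        {Kind::Element, "model", "dwarf1.dea"},
        {Kind::Element, "model", "dwarf2.dea"},
        {Kind::ElementEnd, "models", ""},
        {Kind::ElementEnd, "config", ""},
        {Kind::Element, "config", ""},
        {Kind::Element, "models", ""},
        {Kind::Element, "model", "dwarf.dea"},
        {Kind::ElementEnd, "models", ""},
        {Kind::ElementEnd, "config", ""},
        {Kind::ElementEnd, "root", ""},
    };
}
```

`collectModels(sampleEvents(), 0)` yields the models of the first config only, while `collectModels(sampleEvents(), -1)` yields the models of every config. Compared with nested loops, this keeps all state in two local variables, at the cost of becoming harder to follow once more element types need handling.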
h.a.n.d
Posts: 15
Joined: Fri Feb 13, 2009 1:38 pm

Post by h.a.n.d »

Thank you for all your help!

After studying the loadScene() function for a little bit longer and knowing that irrXML parses the XML file sequentially, I finally got it to work!

@bitplane Your suggestion was the tipping point ...

Thanks guys!!!

Here is my example code (only the important stuff):

Code: Select all

<?xml version="1.0"?>
<root>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
      <model file="dwarf1.dea" />
      <model file="dwarf2.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
<config>
   <!-- This is a config file for the mesh viewer -->
   <models>
      <model file="dwarf.dea" />
   </models>
   <messageText caption="Irrlicht Engine Mesh Viewer">
     Welcome to the Mesh Viewer of the "Irrlicht Engine".
   </messageText>
</config>
</root>

Code: Select all

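#include <iostream>
#include <map>
#include <irrlicht.h> // provides irr::io::IrrXMLReader and irr::core::stringw

// Config is the application's own class (not shown here)
typedef std::map<int, Config*> ConfigMap;
ConfigMap configs;

// forward declarations: the readers below are called before they are defined
void readConfig(irr::io::IrrXMLReader* xml, int i);
void readModel(irr::io::IrrXMLReader* xml, int i);

void readXML() {

	irr::io::IrrXMLReader* xml = irr::io::createIrrXMLReader("../data/data.xml");
	int i = 0;

	// parse the file until </root> or the end of the file is reached
	while(xml && xml->read())
	{
		bool endreached = false;

		switch(xml->getNodeType())
		{
			case irr::io::EXN_ELEMENT_END:
				if (irr::core::stringw(L"root")==xml->getNodeName()) {
					endreached = true;
				}
				break;
			case irr::io::EXN_ELEMENT:
				if (irr::core::stringw(L"config")==xml->getNodeName()) {
					// configs[i] alone would only insert a null pointer, so create
					// the object first (assumes Config is default-constructible;
					// the caller is responsible for deleting it later)
					configs[i] = new Config();
					readConfig(xml, i++);
				}
				break;
			default:
				break;
		}
		if (endreached)
			break;
	}
	// delete the xml parser after usage
	delete xml;

}

// reads the children of one <config> element and returns at </config>
void readConfig(irr::io::IrrXMLReader* xml, int i) {
	while(xml && xml->read())
	{
		const irr::core::stringw name = xml->getNodeName();

		switch (xml->getNodeType())
		{
		case irr::io::EXN_ELEMENT_END:
			if (irr::core::stringw(L"config")==name) {
				return;
			}
			break;
		case irr::io::EXN_ELEMENT:
			if (irr::core::stringw(L"models")==name) {
				readModel(xml, i);
			}
			else
			if (irr::core::stringw(L"messageText")==name) {
				configs[i]->setMessageTextCaption(xml->getAttributeValue("caption"));
				// getNodeData() is only meaningful on a text node, so advance to
				// the text that follows the opening tag before reading it
				if (!xml->isEmptyElement() && xml->read() && xml->getNodeType() == irr::io::EXN_TEXT) {
					configs[i]->setMessageText(xml->getNodeData());
				}
			}
			else
			{
				std::cout << "Found unknown element in xml file: " << irr::core::stringc(xml->getNodeName()).c_str() << std::endl;
			}
			break;
		default:
			break;
		}
	}
}

// reads the <model> children of one <models> element and returns at </models>
void readModel(irr::io::IrrXMLReader* xml, int i) {

	int s = 0;

	while(xml && xml->read())
	{
		switch (xml->getNodeType())
		{
		case irr::io::EXN_ELEMENT_END:
			if (irr::core::stringw(L"models")==xml->getNodeName()) {
				return;
			}
			break;
		case irr::io::EXN_ELEMENT:
			if (irr::core::stringw(L"model")==xml->getNodeName()) {
				configs[i]->setModels(xml->getAttributeValue("file"), s++);
			}
			break;
		default:
			break;
		}
	}
}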
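
The Config class itself is not shown in the post. For readers who want to try the code, here is a minimal sketch of what it might look like. Only the three setter names are taken from the code above; the members, the getters, and the null checks are assumptions made for this sketch (the checks guard against a null string being passed in, for example when an attribute is absent).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical Config class; only the three setter names come from the post.
class Config {
public:
    void setMessageTextCaption(const char* caption) {
        caption_ = caption ? caption : "";
    }

    void setMessageText(const char* text) {
        text_ = text ? text : "";
    }

    // Stores a model file name at the given position, growing the list as needed.
    void setModels(const char* file, int index) {
        if (index < 0) return;  // ignore invalid positions
        const std::size_t pos = static_cast<std::size_t>(index);
        if (pos >= models_.size())
            models_.resize(pos + 1);
        models_[pos] = file ? file : "";
    }

    const std::string& caption() const { return caption_; }
    const std::string& text() const { return text_; }
    const std::vector<std::string>& models() const { return models_; }

private:
    std::string caption_;
    std::string text_;
    std::vector<std::string> models_;
};
```

Storing the file names in a vector indexed by position matches the `setModels(file, s++)` call pattern above; a plain `addModel(file)` that appends would work just as well if the index is not needed elsewhere.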