XML Parser

RustyNail
Posts: 168
Joined: Fri Jun 02, 2006 1:49 pm

XML Parser

Post by RustyNail »

What is wrong with me? I CANNOT read & understand other people's code.

Can anyone help me out with writing my own simple (very simple [extremely simple ] ) parser?
I would use one of the many ones that are floating around, but I just HAVE to understand their inner workings, and when I dig into the source code, it turns out to be too complex for me :evil: . :roll:
I could write a simple text format and work with that, but I've reached a point where it would be simpler to make it an XML file :? ...
Any Help?
I have recently discovered that both the Flu and my Algebra teacher have exactly the same effect on my health: it quickly degrades.
Halifax
Posts: 1424
Joined: Sun Apr 29, 2007 10:40 pm
Location: $9D95

Post by Halifax »

Well why is it necessary to know the inner workings of the parser? It really isn't anything hard if you think of parsing as a simple set of rules. Believe me, the irrXML source isn't that hard to read, and you just need to practice, practice, and practice some more.

Reading code from other people is an art, and Irrlicht/irrXML is one of the best places you could learn from. Don't expect it to happen in just a day. For some it takes anywhere from 2 months to 2 years for things to click together. This occurred with me when I used to read GCC (for fun). I didn't get it at first, but after working hard and practicing some coding, some two years later I became familiar with it, and it was very easy to read.

So my stance on this topic is: just study some more. If you really want to learn parsing, then go look up some tutorials on Google. There are easily 100 parsing tutorials out there on the internet.
TheQuestion = 2B || !2B
RustyNail
Posts: 168
Joined: Fri Jun 02, 2006 1:49 pm

Post by RustyNail »

:oops: I was afraid I'd get that answer :)
Anyway it looks like I'm sticking to a simple text file for now...
while I dig through the source code of irrXML... ><
I guess part of the problem is my lack of knowledge of the finer points of C++ programming...
(And then there's the little thing that I want all the code to be my own :twisted: )
Halifax
Posts: 1424
Joined: Sun Apr 29, 2007 10:40 pm
Location: $9D95

Post by Halifax »

Don't take this the wrong way, but that is the sign of an immature programmer. That's a bad habit that you are most definitely going to need to get rid of if you ever intend to work in a team environment. Be it the game industry or an indie team online, people hate others that want everything to be their own.

This habit can lead you into a hole in which you are wasting your time by reinventing the wheel, instead of opting for using the already made wheel to produce a bigger and better object.
RustyNail
Posts: 168
Joined: Fri Jun 02, 2006 1:49 pm

Post by RustyNail »

heh.
that I know.
But still, I'm just so used to being dependent only on myself... It's difficult to get used to... ><
monkeycracks
Posts: 1029
Joined: Thu Apr 06, 2006 12:45 am
Location: Tennessee, USA

Post by monkeycracks »

The only time I want everything to be my own is if everything I need is virally licensed D:
night_hawk
Posts: 153
Joined: Mon Mar 03, 2008 8:42 am
Location: Suceava - Romania

Post by night_hawk »

@RustyNail: You could try a better, faster solution: binary writing/reading. Google it and see :D It's easy and WAY better (in some respects) than parsing from a text file. But then again... you would need a second tool to write such binary files, but that's the cost of performance.
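To make the binary idea concrete, here is a minimal sketch of writing and reading one record as raw bytes. The Vec3Record type and both function names are made up for illustration and are not from any library. It also assumes the file is read on the same platform and compiler that wrote it, because dumping a struct this way does not handle byte order or padding differences.

```cpp
#include <fstream>
#include <string>

// Hypothetical record type, used only for illustration.
struct Vec3Record {
    float x, y, z;
};

// Write one record as raw bytes. Returns false if the file could not be written.
bool writeRecord(const std::string& path, const Vec3Record& rec) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(&rec), sizeof rec);
    return static_cast<bool>(out);
}

// Read one record back. Returns false if the file is missing or too short.
bool readRecord(const std::string& path, Vec3Record& rec) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    in.read(reinterpret_cast<char*>(&rec), sizeof rec);
    return static_cast<bool>(in);
}
```

There is no parsing step at all, which is where the speed comes from, but the resulting file cannot be edited in a text editor.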
RustyNail
Posts: 168
Joined: Fri Jun 02, 2006 1:49 pm

Post by RustyNail »

@Halifax - btw, how many mature 14-year-olds who can program better than their computer class teacher have you seen :lol: ?
@night_hawk - I could have done that, but I need to be able to edit the file by hand...
CuteAlien
Admin
Posts: 9930
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany

Post by CuteAlien »

Maybe you can check out an ini-file reader. I think you'll find some when searching the forum. That's a little easier to parse, so the code will be easier to read. Or you can try to learn about parsers. Like many topics, that's an area that ranges from simple to extremely complex. Parsing xml is still more on the simple side of things, but parsing ini-files will be easier.
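As a rough illustration of why ini-files are the easier case, a reader can work one line at a time with no nesting. The sketch below is not taken from any reader on the forum; trim and parseIni are hypothetical names, and the choices to store values under "section.key" and to treat lines starting with ';' or '#' as comments are assumptions.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Strip leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse ini-style text into a flat map keyed by "section.key".
std::map<std::string, std::string> parseIni(const std::string& text) {
    std::map<std::string, std::string> values;
    std::istringstream input(text);
    std::string line, section;
    while (std::getline(input, line)) {
        line = trim(line);
        // Skip blank lines and comments.
        if (line.empty() || line[0] == ';' || line[0] == '#') continue;
        // A [section] header changes the prefix for the keys that follow.
        if (line.front() == '[' && line.back() == ']') {
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;  // ignore malformed lines
        const std::string key = trim(line.substr(0, eq));
        values[section.empty() ? key : section + "." + key] = trim(line.substr(eq + 1));
    }
    return values;
}
```

Every line is handled on its own, so the parser never has to remember more than the current section name.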

The basic structure of all parsers is usually like this:
1. Get some input. In this case probably by reading from a file
2. The lexical analysis - Tokenize the input. That means splitting the input into small parts (tokens). Every keyword will be one token. Every operator (like: '=' or '<') will be a token. Every value belonging together will be a token (like: "somestring"). The important thing is that you can define each token exactly and that tokens within the input will never overlap (so no character will ever belong to 2 tokens).
There are a few techniques to help with that. For example, so-called "automata" are often used. A finished token will usually know its type (like: operator_assign or user_string) and also contain the string of the token (like: "=" or "hello world"). Whitespace is not a token and will be removed during this step.
3. The grammatical analysis. In this step you analyze the structure of the tokens. For example in xml a typical node might have to look like: (open_node, node_name, (attribute_type, operator_assign, user_string)*, close_node). The * here means: it does not have to be there, but it can be repeated as often as the user wants. It's a little bit more tricky in xml as nodes can contain other nodes (that's the main reason why parsing ini-files will be easier). Once again this sort of analysis can for example be done with automata.
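
Steps 2 and 3 above could be sketched roughly like this. The token type names follow the ones used in the list; Slash is an extra assumption so that closing tags can be represented, and tokenize and isOpeningTag are hypothetical helper names. The grammar check only covers a single opening tag, not nested nodes, and characters the lexer does not recognise are skipped.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class TokenType { OpenNode, CloseNode, Slash, OperatorAssign, Name, UserString };

struct Token {
    TokenType type;
    std::string text;
};

// Step 2: split the input into non-overlapping tokens and drop whitespace.
std::vector<Token> tokenize(const std::string& in) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (std::isspace(c)) { ++i; }
        else if (c == '<') { tokens.push_back({TokenType::OpenNode, "<"}); ++i; }
        else if (c == '>') { tokens.push_back({TokenType::CloseNode, ">"}); ++i; }
        else if (c == '/') { tokens.push_back({TokenType::Slash, "/"}); ++i; }
        else if (c == '=') { tokens.push_back({TokenType::OperatorAssign, "="}); ++i; }
        else if (c == '"') {
            // Everything up to the next quote is one string token.
            const std::size_t end = in.find('"', i + 1);
            const std::size_t stop = (end == std::string::npos) ? in.size() : end;
            tokens.push_back({TokenType::UserString, in.substr(i + 1, stop - i - 1)});
            i = (end == std::string::npos) ? in.size() : end + 1;
        }
        else if (std::isalnum(c) || c == '_') {
            // A run of letters, digits and underscores is one name token.
            std::size_t j = i;
            while (j < in.size() &&
                   (std::isalnum(static_cast<unsigned char>(in[j])) || in[j] == '_')) ++j;
            tokens.push_back({TokenType::Name, in.substr(i, j - i)});
            i = j;
        }
        else { ++i; }  // unknown character: skipped in this sketch
    }
    return tokens;
}

// Step 3: check the pattern
// OpenNode Name (Name OperatorAssign UserString)* CloseNode
bool isOpeningTag(const std::vector<Token>& t) {
    if (t.size() < 3) return false;
    if (t[0].type != TokenType::OpenNode || t[1].type != TokenType::Name) return false;
    std::size_t i = 2;
    while (i + 2 < t.size() && t[i].type == TokenType::Name) {
        if (t[i + 1].type != TokenType::OperatorAssign) return false;
        if (t[i + 2].type != TokenType::UserString) return false;
        i += 3;
    }
    return i + 1 == t.size() && t[i].type == TokenType::CloseNode;
}
```

Calling isOpeningTag(tokenize(line)) on a single tag shows the two steps working together: the lexer never looks at structure, and the grammar check never looks at individual characters.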

In more complex parsers (like if you want to write a C++ parser) the steps are not that easy to separate (you will have additional pre-processing for commands like #define that modify the input itself, and the tokens in C++ will depend on the grammar), but for xml those steps should suffice. Also, in simpler parsers you sometimes simply mix steps 2 and 3 together.

Parsing is a very useful technique which every programmer will need at some time, but it's not exactly the most trivial thing to do. And you can get by for a rather long time without having to do it. You might notice some similarity to working with regular expressions. If you learn to work with those first, writing parsers will be a lot easier afterwards. Certainly the web is always your friend; there are lots of examples and tutorials for parsing out there.

Unlike night_hawk I would not recommend using binary files. Working with text-files is in nearly every case the better solution. The usual exception is if you have a speed problem.
IRC: #irrlicht on irc.libera.chat
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
Halifax
Posts: 1424
Joined: Sun Apr 29, 2007 10:40 pm
Location: $9D95

Post by Halifax »

RustyNail wrote:
@Halifax - btw, how many mature 14-year-olds who can program better than their computer class teacher have you seen :lol: ?
@night_hawk - I could have done that, but I need to be able to edit the file by hand...

I have seen quite a few. And as I said, don't take it the wrong way. It isn't your age that makes you immature, it is your thought process. (Which is sometimes affected by your age. :D )

But anyways, yeah CuteAlien is leading you in the right direction if you want to go that way.
night_hawk
Posts: 153
Joined: Mon Mar 03, 2008 8:42 am
Location: Suceava - Romania

Post by night_hawk »

Well, I'm perfectly aware of issues that could appear when you want to edit such a binary file, but you could make a quick application to edit them and save them. Such an app should be quite easy to make... more or less :D

But then again, weird and lengthy solutions are always my idea as long as they're faster performance-wise. :D
RustyNail
Posts: 168
Joined: Fri Jun 02, 2006 1:49 pm

Post by RustyNail »

Thanks, guys! :D
But, at the cost of an 'F' today during history, I managed to think up a workable parser...
Then wrote it when I got home (after an 'F' in chemistry and finding out that I didn't get into a school I wanted to get into), found that it was garbage, and re-wrote it more along the lines of CuteAlien's design, but not separating bits into tokens, working a lot with delimiters instead 8) ...
I only need this as a working model, to read simple scene files, which is why I chose the XML format: for the ease of making a tree-like structure. That also happens to be the reason why I cannot really work with simple text files. It is a hell of a lot harder to make a tree structure; with the method I used for them, each node would have had to have an ID, and its parent's ID and so on... Not to mention huge amounts of nested if-else's in the parser :) .
Once I get this working, I'll play around with parsing files using the token method... Square equations, anyone? :)

EDIT: YES! It works! With attributes, and multiple nested elements! I can now make my INI File XML-based.... and actually load scenes... HUARRY! :P
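
The parser itself is not shown in the thread, so the following is only a guess at what a delimiter-driven approach with attributes and nested elements might look like, not the actual code. Node, skipSpace, readUntil and parseElement are hypothetical names. The sketch assumes a simplified XML subset: no text content, no comments, no XML declaration, double-quoted attribute values only, and no space inside closing tags.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Node {
    std::string name;
    std::map<std::string, std::string> attributes;
    std::vector<Node> children;
};

// Advance pos past any whitespace.
void skipSpace(const std::string& s, std::size_t& pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
}

// Read characters up to (not including) the first character found in delims.
std::string readUntil(const std::string& s, std::size_t& pos, const char* delims) {
    const std::size_t end = std::min(s.find_first_of(delims, pos), s.size());
    std::string out = s.substr(pos, end - pos);
    pos = end;
    return out;
}

// Parse one element starting at '<'. Returns false on malformed input.
bool parseElement(const std::string& s, std::size_t& pos, Node& node) {
    skipSpace(s, pos);
    if (pos >= s.size() || s[pos] != '<') return false;
    ++pos;
    node.name = readUntil(s, pos, " \t\r\n/>");
    if (node.name.empty()) return false;
    // Attributes: name="value" pairs until '>' or '/>'.
    for (;;) {
        skipSpace(s, pos);
        if (pos >= s.size()) return false;
        if (s[pos] == '>') { ++pos; break; }
        if (s[pos] == '/') {  // self-closing element
            if (pos + 1 >= s.size() || s[pos + 1] != '>') return false;
            pos += 2;
            return true;
        }
        const std::string key = readUntil(s, pos, "= \t\r\n/>");
        skipSpace(s, pos);
        if (key.empty() || pos >= s.size() || s[pos] != '=') return false;
        ++pos;
        skipSpace(s, pos);
        if (pos >= s.size() || s[pos] != '"') return false;
        ++pos;
        node.attributes[key] = readUntil(s, pos, "\"");
        if (pos >= s.size()) return false;  // missing closing quote
        ++pos;
    }
    // Children: recurse until the matching closing tag shows up.
    for (;;) {
        skipSpace(s, pos);
        if (s.compare(pos, 2, "</") == 0) {
            pos += 2;
            const std::string closing = readUntil(s, pos, ">");
            if (pos >= s.size() || closing != node.name) return false;
            ++pos;
            return true;
        }
        Node child;
        if (!parseElement(s, pos, child)) return false;
        node.children.push_back(child);
    }
}
```

The recursion in the second loop is what gives the tree structure for free: each nested element becomes a child of the node currently being parsed, so no IDs or parent IDs are needed.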