TSE: XML: Parser: Simplest: Can you supply a simple XML parser with tag match check?

Dec 23rd, 2004 16:04
Knud van Eeden


----------------------------------------------------------------------
--- Knud van Eeden --- 21 December 2004 - 09:27 pm -------------------


Given a very simple XML document such as the examples below
(tags only: no attributes and no text),
can you supply a parser that reads it
and also checks that the tags match?

---

This is an example with matching tags.
The macro should not give an error.

--- cut here: begin --------------------------------------------------

<a>

  <b>



  <c>


  </c>


  <d>


 </d>

   </b>


 </a>

--- cut here: end ----------------------------------------------------

---
---

This is an example with non-matching tags.
The macro should report an error
(tag 'a' does not match 'g',
because the last end tag is </g>
where </a> is expected):

--- cut here: begin --------------------------------------------------

<a>

  <b>



  <c>


  </c>


  <d>


 </d>

   </b>


 </g>

--- cut here: end ----------------------------------------------------

---
---

Each tag the parser has to read has the following structure:

Informal syntax diagram (the '/' can be bypassed):

 -(<)-+-(/)-+-[name]-(>)-
      |     |
      +-->--+

---

Description:

 First comes a '<'

 Then, optionally, a '/' (present in an end tag, absent in a begin tag)

 Then a name

 Then a '>'
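
As an illustration only (in Python, not in TSE), the tag structure
described above could be recognized with a regular expression. This is
a minimal sketch: the names TAG_PATTERN and read_tags are hypothetical,
and the set of characters allowed in a name is an assumption, because
the description does not define it.

```python
import re

# One tag: '<', an optional '/', a name, then '>'.
# Assumption: a name starts with a letter or '_' and continues with
# letters, digits, '_', '.', ':' or '-'.
TAG_PATTERN = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)>")


def read_tags(text):
    """Return a list of (name, is_end_tag) pairs in document order."""
    return [(m.group(2), m.group(1) == "/") for m in TAG_PATTERN.finditer(text)]


print(read_tags("<a> <b> </b> </a>"))
# prints [('a', False), ('b', False), ('b', True), ('a', True)]
```

Anything between the tags (here only white space) is simply skipped.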

---
---

The following (recursive) macro walks through the tags.

1. It starts with an integer depth set to a start value.

2. When it does not find a slash in the current tag (a begin tag), it
   increases the depth by one.

3. When it does find a slash in the current tag (an end tag), it
   decreases the depth by one.

4. When the depth is back at the start value it returns; otherwise
   it calls itself recursively for the next tag.
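
The four steps above can be sketched as follows (in Python, not in
TSE; this is not the original macro). The function name walk, its
parameters, and the return value of -1 are assumptions made for the
illustration.

```python
import re

# Assumption: same tag shape as described in this entry.
TAG_PATTERN = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)>")


def walk(text, position=0, depth=0, start_depth=0):
    """Return the position just past the tag that brings the depth back
    to start_depth, or -1 if the text ends before that happens."""
    match = TAG_PATTERN.search(text, position)
    if match is None:
        return -1                   # no more tags, depth never came back
    if match.group(1) == "/":
        depth -= 1                  # slash found: end tag, one level up
    else:
        depth += 1                  # no slash: begin tag, one level down
    if depth == start_depth:
        return match.end()          # back at the start value: return
    return walk(text, match.end(), depth, start_depth)


print(walk("<a><b></b></a>"))   # prints 14
print(walk("<a><b></b>"))       # prints -1
print(walk("<a></g>"))          # prints 7
```

The last call shows that counting depth alone does not notice that 'a'
and 'g' differ; that needs a check on the tag names. Note also that
Python limits recursion depth, so this sketch only suits small inputs.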

---

The macro also checks that the tag names match.
It stops as soon as it finds a mismatch.

The method used to check the tag matching is:

1. Store each begin tag found on a stack (a string is used as a
   stack here).
   The stack thus contains all begin tags that are still open, with
   the latest begin tag on top.

2. If an end tag is found, its name should equal the begin tag on
   top of the stack (that is a property of correctly nested tags).

   1. If so, it matches OK, and you just remove that begin tag from the
      stack.

   2. If not, it gives a matching error and stops.
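
The stack method above can be sketched as follows (in Python, not in
TSE; this is not the original macro). Assumptions: a Python list is
used as the stack instead of a string, the function name check_tags
and the wording of the messages are hypothetical, and the sketch reads
all tags in the text instead of stopping when the depth returns to its
start value.

```python
import re

# Assumption: same tag shape as described in this entry.
TAG_PATTERN = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)>")


def check_tags(text):
    """Return None if all tags match, otherwise an error message."""
    stack = []                          # open begin tags, latest on top
    for match in TAG_PATTERN.finditer(text):
        is_end_tag = match.group(1) == "/"
        name = match.group(2)
        if not is_end_tag:
            stack.append(name)          # begin tag: push on the stack
        elif not stack:
            return f"end tag '{name}' has no begin tag"
        elif stack[-1] != name:
            return f"tag '{stack[-1]}' does not match with '{name}'"
        else:
            stack.pop()                 # matches OK: remove the begin tag
    if stack:
        return f"tag '{stack[-1]}' is never closed"
    return None


print(check_tags("<a> <b> <c> </c> <d> </d> </b> </a>"))
# prints None
print(check_tags("<a> <b> <c> </c> <d> </d> </b> </g>"))
# prints tag 'a' does not match with 'g'
```

The two calls correspond to the matching and the non-matching example
of this entry.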

---

The macro also stops when the number of begin tags read equals
the number of end tags read, that is, when the depth is back at its
start value.

---
---

--- cut here: begin --------------------------------------------------

[file: see: checblxu.s]

--- cut here: end ----------------------------------------------------

---
---

Internet: see also:

---

TSE: XML: Parser: Link: Overview: Can you give me an overview of links?
http://www.faqts.com/knowledge_base/view.phtml/aid/32677/fid/1734

---

Datastructure: Stack: What is a stack?
http://www.faqts.com/knowledge_base/view.phtml/aid/32760/fid/1279

----------------------------------------------------------------------