TSE: XML: Parser: Simplest: Can you supply a very simple XML parser?

Dec 21st, 2004 11:38
Knud van Eeden


----------------------------------------------------------------------
--- Knud van Eeden --- 21 December 2004 - 08:26 pm -------------------


Given the following very simple XML example
(only tags: no attributes and no text),
can you supply a parser that parses it?

--- cut here: begin --------------------------------------------------

<a>

  <b>



  <c>


  </c>


  <d>


 </d>

   </b>


 </a>

--- cut here: end ----------------------------------------------------

---
---

The structure it has to parse is the following:

Informal syntax diagram (in the spirit of Backus Naur Form):

 -(<)-+-(/)-+-[name]-(>)-
      |     |
      +-->--+

---

Description:

 First comes a '<'

 Then, optionally, a '/' (present only in an end tag)

 Then comes a name

 Finally comes a '>'
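
As a rough illustration outside TSE, the four steps above can be written
as a single pattern. The sketch below is in Python, not in the macro
language; the function name parse_tag and the use of Python's re module
are assumptions made for this example only. The name pattern (a letter
followed by letters or digits) is the same one the macro further down
searches for.

```python
import re

# One tag: '<', an optional '/', a name, then '>'.
# A name is a letter followed by zero or more letters or digits.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)>")


def parse_tag(text, pos=0):
    """Return (is_end_tag, name, next_pos) for the tag at pos, or None."""
    # Skip spaces and empty lines in front of the tag.
    while pos < len(text) and text[pos].isspace():
        pos += 1
    match = TAG.match(text, pos)
    if match is None:
        return None  # no well-formed tag starts here
    return (match.group(1) == "/", match.group(2), match.end())
```

Calling parse_tag repeatedly, each time starting at the returned
next_pos, reads the tags one by one.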

---
---

The following (recursive) macro will walk this tree.

1. It starts with an integer depth set to a start value (1 in the macro
   below).

2. When it does not find a slash in the current tag (a start tag), it
   increases the depth by one.

3. When it does find a slash in the current tag (an end tag), it
   decreases the depth by one.

4. When the depth is back at the start value it returns; otherwise it
   calls itself recursively.
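
To make the four steps concrete, here is a minimal Python sketch of the
same recursion. It is an illustration only, not a translation of the
macro: the function name walk, its parameters, and the list of seen tags
it returns are all assumptions made for this example.

```python
import re

# Optional whitespace, then '<', optional '/', a name, '>'.
TAG = re.compile(r"\s*<(/?)([a-zA-Z][a-zA-Z0-9]*)>")


def walk(text, pos=0, depth=1, start=1, seen=None):
    """Consume tags recursively until depth is back at start.

    Returns (end_pos, seen), where seen lists the tags in reading order.
    """
    if seen is None:
        seen = []                                   # step 1: fresh walk
    match = TAG.match(text, pos)
    if match is None:
        raise ValueError("no tag found at position %d" % pos)
    is_end_tag = match.group(1) == "/"
    seen.append(("</" if is_end_tag else "<") + match.group(2) + ">")
    depth += -1 if is_end_tag else 1                # steps 2 and 3
    if depth <= start:                              # step 4: balanced, stop
        return match.end(), seen
    return walk(text, match.end(), depth, start, seen)  # step 4: recurse
```

For the example block, walk("<a> <b> <c> </c> <d> </d> </b> </a>") reads
all eight tags and returns the position just past the final '>'.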

---

Tag names are not checked for matching: any end tag closes any start tag.
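
If name checking were wanted, one common approach (not what the macro
does) is to keep a stack of open tag names instead of a bare depth
counter. The Python sketch below illustrates the idea; the function name
tags_match is hypothetical, and text between the tags is simply ignored.

```python
import re

TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)>")


def tags_match(text):
    """Return True if every end tag closes the most recently opened tag."""
    open_names = []                      # names of tags still open
    for slash, name in TAG.findall(text):
        if slash:
            # An end tag must match the innermost open start tag.
            if not open_names or open_names.pop() != name:
                return False
        else:
            open_names.append(name)
    return not open_names                # nothing may be left open
```

The length of the stack plays the role of the integer depth, so the
extra cost over counting is only the remembered names.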

---

The macro therefore stops as soon as the number of end tags read equals
the number of start tags read.
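
One consequence is that anything after the first balanced element is
never looked at, and mismatched names still count as balanced. The small
Python sketch below shows this stopping rule in isolation, written as a
loop rather than a recursion; the function name tags_consumed is an
assumption made for this example.

```python
import re

TAG = re.compile(r"<(/?)[a-zA-Z][a-zA-Z0-9]*>")


def tags_consumed(text):
    """Count the tags read before start and end tags first balance."""
    balance = 0                          # start tags minus end tags so far
    for count, match in enumerate(TAG.finditer(text), start=1):
        balance += -1 if match.group(1) else 1
        if balance <= 0:
            return count                 # balanced: stop reading here
    return None                          # ran out of tags with some open
```

With "<a></a><b></b>" as input, only the first two tags are read.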

---
---

--- cut here: begin --------------------------------------------------

// library: block: check: tree: xml: simplest (filenamemacro=checblxt.s) [kn, ri, tu, 21-12-2004 20:23:23]

PROC PROCBlockCheckTreeXmlSimplest( INTEGER depthI )

 STRING nameS[255] = ""

 INTEGER slashfoundB = FALSE

 //

 LFind( "[~ ]", "xl" ) // skip spaces and empty lines

 //

 IF NOT ( CurrChar() == Asc( "<" ) ) // check if found the '<' of tag

  Warn( "XML: Error: No tag '<' found in XML block" )

  RETURN()

 ENDIF

 //

 NextChar() // current character processed, so skip it and goto next character

 //

 IF ( CurrChar() == Asc( "/" ) ) // check if found the '/' of an end tag

  slashfoundB = TRUE

  NextChar() // current character processed, so skip it and goto next character

 ENDIF

 //

 IF NOT LFind( "[a-zA-Z][a-zA-Z0-9]@\c", "xl" ) // get name of tag and skip it

  Warn( "XML: Error: No tag name found in XML block" )

  RETURN()

 ENDIF

 //

 nameS = GetFoundText() // get the found tag name

 //

 IF NOT( CurrChar() == Asc( ">" ) ) // check if found the '>' of tag

  Warn( "XML: Error: No tag '>' found in XML block" )

  RETURN()

 ENDIF

 //

 NextChar() // current character processed, so skip it and goto next character

 //

 IF slashfoundB

  Warn( "<" + "/" + nameS + ">" )

  depthI = depthI - 1

 ELSE

  Warn( "depth" + " " + "=" + " " + STR( depthI ) )

  Warn( "<" + nameS + ">" )

  depthI = depthI + 1

 ENDIF

 //

 IF depthI <= 1 // depth is back at the start value passed by Main(), so stop

  RETURN()

 ELSE

  PROCBlockCheckTreeXmlSimplest( depthI ) // recursive call

 ENDIF

 //

END


PROC Main()

 IF NOT ( IsBlockInCurrFile() )

  Warn( "XML: Error: Please mark a block first" )

  RETURN()

 ENDIF

 GotoBlockBegin()

 PROCBlockCheckTreeXmlSimplest( 1 )

END

--- cut here: end ----------------------------------------------------

---
---

Internet: see also:

---

TSE: XML: Parser: Link: Overview: Can you give me an overview of links?
http://www.faqts.com/knowledge_base/view.phtml/aid/32677/fid/1734

----------------------------------------------------------------------