Entry
TSE: XML: Parser: Simplest: Can you supply a very simple XML parser?
Dec 21st, 2004 11:38
Knud van Eeden
----------------------------------------------------------------------
--- Knud van Eeden --- 21 December 2004 - 08:26 pm -------------------
---
Given the following very simple XML example
(only tags, no attributes, and no text),
can you supply a parser that handles it?
--- cut here: begin --------------------------------------------------
<a>
<b>
<c>
</c>
<d>
</d>
</b>
</a>
--- cut here: end ----------------------------------------------------
---
---
The structure it has to parse is the following:
Informal Backus Naur Form:
-(<)-+-(/)-+-[name]-(>)-
     |     |
     +-->--+
---
Description:
First comes a '<'
Then, optionally, a '/'
Then a name
Then a '>'
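---
As an illustration only, the grammar above can be written as a single
regular expression. The sketch below is in Python, not in TSE's macro
language, and the names TAG and read_tag are hypothetical. It assumes
a tag name is a letter followed by letters or digits, which is the
same pattern the macro further below searches for.

```python
import re

# One tag of the simple grammar: '<', an optional '/', a name, '>'.
# Assumption: a name is a letter followed by letters or digits.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")


def read_tag(text, pos=0):
    """Return (is_end_tag, name, next_pos) for the tag found at pos.

    Leading whitespace (spaces, line breaks) is skipped first.
    """
    while pos < len(text) and text[pos].isspace():
        pos += 1
    match = TAG.match(text, pos)
    if match is None:
        raise ValueError(f"no tag found at position {pos}")
    return match.group(1) == "/", match.group(2), match.end()


print(read_tag("<a>"))     # start tag
print(read_tag("  </b>"))  # end tag, after skipping two spaces
```

The first group of the expression captures the optional slash, so an
empty first group means a start tag.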
---
---
The following (recursive) macro walks this tree.
1. It starts from an initial integer depth.
2. When the current tag contains no slash (a start tag), it increases
the depth by one.
3. When the current tag contains a slash (an end tag), it decreases
the depth by one.
4. When the depth is back at the initial value it returns; otherwise
it calls itself recursively.
---
The macro does not check that start and end tag names match.
---
The macro therefore stops as soon as the number of start tags equals
the number of end tags.
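---
The depth counting described above can be tried outside the editor
with a small sketch. It is written in Python, not in TSE's macro
language, and the names TAG, walk, start and seen are hypothetical.
Unlike the macro, which shows each tag in a message box, this sketch
collects the tags in a list, and it records an end tag at the same
depth as its start tag.

```python
import re

# Optional whitespace, then '<', an optional '/', a name, '>'.
TAG = re.compile(r"\s*<(/?)([A-Za-z][A-Za-z0-9]*)>")


def walk(text, pos=0, depth=1, start=1, seen=None):
    """Read tags recursively until depth is back at the start value.

    Returns (seen, pos): the (depth, tag) pairs read so far and the
    position just after the last tag. Tag names are not compared.
    """
    if seen is None:
        seen = []
    match = TAG.match(text, pos)
    if match is None:
        raise ValueError(f"no tag found at position {pos}")
    is_end, name = match.group(1) == "/", match.group(2)
    if is_end:
        depth -= 1  # step 3: slash found, one level up
        seen.append((depth, "</" + name + ">"))
    else:
        seen.append((depth, "<" + name + ">"))
        depth += 1  # step 2: no slash, one level down
    if depth <= start:  # step 4: back at the start value
        return seen, match.end()
    return walk(text, match.end(), depth, start, seen)


xml = "<a>\n<b>\n<c>\n</c>\n<d>\n</d>\n</b>\n</a>\n"
for level, tag in walk(xml)[0]:
    print(" " * level + tag)
```

Because only the depth is tracked, input such as `<a></b>` is accepted
as well.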
---
---
--- cut here: begin --------------------------------------------------
// library: block: check: tree: xml: simplest
// (filenamemacro=checblxt.s) [kn, ri, tu, 21-12-2004 20:23:23]
PROC PROCBlockCheckTreeXmlSimplest( INTEGER depthI )
 STRING nameS[255] = ""
 INTEGER slashfoundB = FALSE
 //
 // skip spaces and empty lines
 LFind( "[~ ]", "xl" )
 //
 // check if found the '<' of tag
 IF NOT ( CurrChar() == Asc( "<" ) )
  Warn( "XML: Error: No tag '<' found in XML block" )
  RETURN()
 ENDIF
 //
 // current character processed, so skip it and goto next character
 NextChar()
 //
 // check if found the '/' of an end tag
 IF ( CurrChar() == Asc( "/" ) )
  slashfoundB = TRUE
  // current character processed, so skip it and goto next character
  NextChar()
 ENDIF
 //
 // get name of tag and skip it
 IF NOT LFind( "[a-zA-Z][a-zA-Z0-9]@\c", "xl" )
  Warn( "XML: Error: No tag name found in XML block" )
  RETURN()
 ENDIF
 //
 // get the found tag name
 nameS = GetFoundText()
 //
 // check if found the '>' of tag
 IF NOT ( CurrChar() == Asc( ">" ) )
  Warn( "XML: Error: No tag '>' found in XML block" )
  RETURN()
 ENDIF
 //
 // current character processed, so skip it and goto next character
 NextChar()
 //
 IF slashfoundB
  Warn( "</" + nameS + ">" )
  depthI = depthI - 1
 ELSE
  Warn( "depth = " + Str( depthI ) )
  Warn( "<" + nameS + ">" )
  depthI = depthI + 1
 ENDIF
 //
 // back at the start value (1, as passed in by Main), so stop
 IF depthI <= 1
  RETURN()
 ELSE
  // recursive call
  PROCBlockCheckTreeXmlSimplest( depthI )
 ENDIF
 //
END

PROC Main()
 IF NOT ( IsBlockInCurrFile() )
  Warn( "XML: Error: Please mark a block first" )
  RETURN()
 ENDIF
 GotoBlockBegin()
 PROCBlockCheckTreeXmlSimplest( 1 )
END
--- cut here: end ----------------------------------------------------
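---
The macro above only counts depth and does not compare tag names. One
possible way to add that check is to keep a stack of open tag names.
The sketch below shows the idea in Python, not in TSE's macro
language. The names TAG and check_names are hypothetical, and this is
not part of the original macro.

```python
import re

# Optional whitespace, then '<', an optional '/', a name, '>'.
TAG = re.compile(r"\s*<(/?)([A-Za-z][A-Za-z0-9]*)>")


def check_names(text):
    """Return True if every end tag closes the most recently opened tag."""
    open_names = []  # stack with the names of the tags still open
    pos = 0
    while True:
        match = TAG.match(text, pos)
        if match is None:
            break
        pos = match.end()
        is_end, name = match.group(1) == "/", match.group(2)
        if not is_end:
            open_names.append(name)
        elif not open_names or open_names.pop() != name:
            return False  # end tag without a matching start tag
    # A tag left open, or leftover non-tag text, counts as an error.
    return not open_names and text[pos:].strip() == ""


print(check_names("<a>\n<b>\n</b>\n</a>\n"))  # names match
print(check_names("<a>\n<b>\n</a>\n</b>\n"))  # names cross over
```

The length of the stack plays the role of the integer depth in the
macro, so the same walk can report both depth and mismatched names.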
---
---
Internet: see also:
---
TSE: XML: Parser: Link: Overview: Can you give me an overview of links?
http://www.faqts.com/knowledge_base/view.phtml/aid/32677/fid/1734
----------------------------------------------------------------------