TSE: Parser: Simple: Syntax: Structure: Parameter: How to parse '(parameter1,..., parameterlast)'?

Sep 24th, 2003 14:48
Knud van Eeden,


----------------------------------------------------------------------
--- Knud van Eeden --- 24 September 2003 - 07:09 pm ------------------

To parse the actual parameters of a procedure or function call:

For example, given:

 PROCMyProcedure( parameter1, parameter2, parameter3, parameter4 )

how could you parse
'( parameter1, parameter2, parameter3, parameter4 )'?

---

Syntax diagram:

                                +----->---------------+
                                |                     |
-->--[(]-->-+->--[parameter]-->-+->--[,]-->-+-->-+-->-+-[)]-->--
            |                                    |
            +--------------<---------------------+
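
The diagram can also be read as a small state machine. The following is a minimal sketch in Python (not TSE, purely as an illustration). The state names and the function follows_diagram are made up for this example, and a parameter is assumed to be an identifier-like word. As drawn, the diagram also has a path from ',' straight to ')', and the table follows that reading.

```python
# Transition table read off the syntax diagram (state names are hypothetical).
TRANSITIONS = {
    ("start", "("): "after_open",
    ("after_open", "parameter"): "after_parameter",
    ("after_parameter", ","): "after_comma",
    ("after_parameter", ")"): "done",                 # upper bypass
    ("after_comma", "parameter"): "after_parameter",  # lower loop back
    ("after_comma", ")"): "done",                     # path as drawn in the diagram
}


def classify(word):
    """Map a word to the token kind used in the table."""
    if word in ("(", ",", ")"):
        return word
    # Assumption: a parameter looks like an identifier.
    return "parameter" if word.isidentifier() else "unknown"


def follows_diagram(text):
    """Return True when the space-separated words trace a path to the exit."""
    state = "start"
    for word in text.split():
        state = TRANSITIONS.get((state, classify(word)))
        if state is None:  # no arrow for this word: NOT ok
            return False
    return state == "done"
```

Every (state, token) pair that is missing from the table corresponds to a "NOT ok" branch.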

---

Backus Naur Form (BNF) in words:

'parameters' = 1. not followed by 1 'open parenthesis': error

               2. followed by 1 'open parenthesis'

                 1. not followed by 1 or more 'parameters': error

                 2. followed by 1 or more 'parameters'

                    1. not followed by a 'comma'

                       1. not followed by a 'closed parenthesis': error

                       2. followed by a 'closed parenthesis'

                    2. followed by a 'comma': take it off and continue
                       with the next 'parameter'

---

Note: this structure is a plain binary tree: every branch has exactly
2 possibilities (a correct path and an incorrect path).

       NOT ok
      +-------
      |
      |
      |
      |             NOT ok
 -----+            +-------
      |            |
      |            |
      |            |
      |            |             NOT ok
      +------------+            +-------
       ok          |            |
                   |            |
                   |            |
                   |            |
                   +------------+
                    ok          |
                                |
                                |
                                |
                                +-------
                                 ok


---

If it is OK, you take off that word, and continue.

If it is NOT OK, you show an error message, and exit (=return) from 
this subroutine.
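
The "take off that word, or report and return" step can be captured in one small helper. A minimal sketch in Python (not TSE; the names take_off and is_ok are hypothetical) that returns the remaining words when OK and None when NOT ok:

```python
def take_off(words, is_ok, what):
    """Take the first word off `words` if `is_ok` accepts it.

    Returns the remaining words when OK. When NOT ok, prints an error
    message and returns None so the caller can return at once.
    """
    if not words or not is_ok(words[0]):
        print("error: no " + what + " found: ^" + " ".join(words))
        return None
    return words[1:]


words = "( parameter1 )".split()
words = take_off(words, lambda w: w == "(", "open parenthesis")
# words is now ['parameter1', ')']
```

A caller would test the result after each step and return FALSE as soon as it sees None.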

---

Pseudocode:

if not '(', error, return false
get '('
 repeat
  if not 'parameter', error, return false
  get 'parameter'
  if ',' get ',' : commaB=true, else commaB=false
 until ( not commaB ) or ')'
 if not ')', error, return false
 get ')'
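
To try this control flow outside the editor, here is a minimal sketch of the same steps in Python (not the TSE code; parse_parameters is a hypothetical name, and a parameter is assumed to be any identifier-like word). One deliberate detail: the comma flag is computed afresh on every pass of the loop, so a comma seen earlier cannot keep the loop running after a parameter that has no comma behind it.

```python
def parse_parameters(text):
    """Return True if `text` is '( p1 , p2 , ... )' with space-separated words."""
    words = text.split()

    def first():
        return words[0] if words else ""

    if first() != "(":
        print("error: no open parenthesis found: ^" + " ".join(words))
        return False
    words.pop(0)                          # get '('

    while True:                           # repeat
        if not first().isidentifier():
            print("error: no parameter found: ^" + " ".join(words))
            return False
        words.pop(0)                      # get 'parameter'
        comma = first() == ","            # flag is recomputed on each pass
        if comma:
            words.pop(0)                  # get ','
        if not comma or first() == ")":   # until ( not commaB ) or ')'
            break

    if first() != ")":
        print("error: no comma or close parenthesis found: ^" + " ".join(words))
        return False
    words.pop(0)                          # get ')'
    return True
```

With the two example strings used in this entry it should give True for the first and False for the second.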

---

So this code strictly follows the binary tree: it has 3 places where an
error can occur, a direct reflection of the 3 error branches of the
binary tree.

---

Translation of the pseudocode into a TSE program:

 // note: for simplicity, also put spaces around the commas ','

 Message( FNParserSyntaxParameterB( "( parameter1 , parameter2 , parameter3 , parameter4 , parameter5 )" ) ) // gives e.g. TRUE


--- cut here ----------------------------------------------------------

FORWARD INTEGER PROC FNParserSyntaxParameterB( STRING s1 )

FORWARD INTEGER PROC FNStringCheckEqualB( STRING s1, STRING s2 )

FORWARD INTEGER PROC FNStringCheckIsVariableB( STRING s1 )

FORWARD INTEGER PROC FNStringGetLengthI( STRING s1 )

FORWARD INTEGER PROC FNTokenCheckIsCommaB( STRING s1 )

FORWARD INTEGER PROC FNTokenCheckIsParameterB( STRING s1 )

FORWARD INTEGER PROC FNTokenCheckIsParenthesisCloseB( STRING s1 )

FORWARD INTEGER PROC FNTokenCheckIsParenthesisOpenB( STRING s1 )

FORWARD INTEGER PROC FNTokenCheckIsVariableB( STRING s1 )

FORWARD PROC Main()

FORWARD STRING PROC FNStringDeleteWordFrontS( STRING s1, STRING s2 )

FORWARD STRING PROC FNStringGetCarS( STRING s1 )

FORWARD STRING PROC FNStringGetCdrS( STRING s1 )

FORWARD STRING PROC FNStringRemoveSpaceBeginS( STRING s1 )

// --- MAIN --- //

PROC Main()

 // note: for simplicity, also put spaces around the commas ','

 Warn( FNParserSyntaxParameterB( "( parameter1 , parameter2 , parameter3 , parameter4 , parameter5 )" ) ) // gives e.g. TRUE

 Warn( FNParserSyntaxParameterB( "( parameter1 parameter2 , parameter3 , parameter4 , parameter5 )" ) ) // gives e.g. FALSE

END

<F12> Main()

// --- LIBRARY --- //

// library: parser: syntax: parameter (filenamemacro=syntpasp.s) [kn, ni, we, 24-09-2003 18:55:42]

INTEGER PROC FNParserSyntaxParameterB( STRING inS )

 // e.g. PROC Main()

 // e.g.  // note: for simplicity, also put spaces around the commas ','

 // e.g.  Warn( FNParserSyntaxParameterB( "( parameter1 , parameter2 , parameter3 , parameter4 , parameter5 )" ) ) // gives e.g. TRUE

 // e.g.  Warn( FNParserSyntaxParameterB( "( parameter1 parameter2 , parameter3 , parameter4 , parameter5 )" ) ) // gives e.g. FALSE

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 STRING s[255] = inS

 INTEGER tokencommaB = FALSE

 // if not '(', error, return false
 IF NOT FNTokenCheckIsParenthesisOpenB( s )
  Warn( 'error: no open parenthesis found:' + ' ^' + s )
  RETURN( FALSE )
 ENDIF

 // get '(': take off the first word of the remaining string
 s = FNStringGetCdrS( s )

 REPEAT // repeat

  // if not 'parameter', error, return false
  IF NOT FNTokenCheckIsParameterB( s )
   Warn( 'error: no parameter found:' + ' ^' + s )
   RETURN( FALSE )
  ENDIF

  // get 'parameter': take off the first word of the remaining string
  s = FNStringGetCdrS( s )

  // if ',' get ',' : commaB=true, else commaB=false
  IF FNTokenCheckIsCommaB( s )
   s = FNStringGetCdrS( s )
   tokencommaB = TRUE
  ELSE
   tokencommaB = FALSE // a comma seen earlier must not keep the loop running
  ENDIF

 UNTIL ( NOT tokencommaB ) OR FNTokenCheckIsParenthesisCloseB( s ) // until ( not commaB ) or ')'

 // if not ')', error, return false
 IF NOT FNTokenCheckIsParenthesisCloseB( s )
  Warn( 'error: no comma or close parenthesis found:' + ' ^' + s )
  RETURN( FALSE )
 ENDIF

 // get ')': take off the first word of the remaining string
 s = FNStringGetCdrS( s )

 RETURN( TRUE )

END

// library: token: check: is: parenthesis: open (filenamemacro=chectopo.s) [kn, ni, we, 24-09-2003 18:49:37]

INTEGER PROC FNTokenCheckIsParenthesisOpenB( STRING s )

 // e.g. PROC Main()

 // e.g.  Message( FNTokenCheckIsParenthesisOpenB( "( test )" ) ) // gives TRUE

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 STRING firstwordS[255] = ""

 firstwordS = FNStringGetCarS( s ) // get first word

 RETURN( FNStringCheckEqualB( firstwordS, "(" ) )

END

// library: string: get: word: token: get: rest: FNCdr(): Get a string, without the first word (words delimited by a space " " (=space delimited list)). E.g. Message( FNStringGetCdrS( "Knud is the best" ) ) gives "is the best" // (filenamemacro=getstgcd.s) [kn, ni, zo, 02-08-1998 15:54:17]

STRING PROC FNStringGetCdrS( STRING s )

 // e.g. PROC Main()

 // e.g.  STRING s1[255] = FNInitializeNewStringS()

 // e.g.  s1 = FNStringGetInputS( "string: get: word: token: get: rest: s = ", "this is a test" )

 // e.g.  IF FNEscapeB( s1 ) RETURN() ENDIF

 // e.g.  Message( FNStringGetCdrS( s1 ) ) // gives e.g. "is a test"

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 // Remove the leading spaces and determine the first word. Delete the first word (=FNCarS), and finally remove the leading spaces from the result

 RETURN( FNStringRemoveSpaceBeginS( FNStringDeleteWordFrontS( FNStringRemoveSpaceBeginS( s ), FNStringGetCarS( s ) ) ) )

END

// library: token: check: is: parameter (filenamemacro=chectoip.s) [kn, ni, we, 24-09-2003 18:50:18]

INTEGER PROC FNTokenCheckIsParameterB( STRING s )

 // e.g. PROC Main()

 // e.g.  Message( FNTokenCheckIsParameterB( "parameter1" ) ) // gives e.g. TRUE

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 STRING firstwordS[255] = ""

 firstwordS = FNStringGetCarS( s ) // get first word

 // to keep it simple, assume a parameter is a (usual) variable name
 RETURN( FNTokenCheckIsVariableB( firstwordS ) )

END

// library: token: check: is: comma (filenamemacro=chectoic.s) [kn, ni, we, 24-09-2003 18:50:22]

INTEGER PROC FNTokenCheckIsCommaB( STRING s )

 // e.g. PROC Main()

 // e.g.  Message( FNTokenCheckIsCommaB( ", test" ) ) // gives TRUE

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 STRING firstwordS[255] = ""

 firstwordS = FNStringGetCarS( s ) // get first word

 RETURN( FNStringCheckEqualB( firstwordS, "," ) )

END

// library: token: check: is: parenthesis: close (filenamemacro=chectopc.s) [kn, ni, we, 24-09-2003 18:50:15]

INTEGER PROC FNTokenCheckIsParenthesisCloseB( STRING s )

 // e.g. PROC Main()

 // e.g.  Message( FNTokenCheckIsParenthesisCloseB( ") test" ) ) // gives TRUE

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 STRING firstwordS[255] = ""

 firstwordS = FNStringGetCarS( s ) // get first word

 RETURN( FNStringCheckEqualB( firstwordS, ")" ) )

END

// library: string: get: word: token: get: first: FNStringGetCarS(): Get the first word of a string (words delimited by a space " " (=space delimited list)). (filenamemacro=getstgca.s) [kn, ni, zo, 02-08-1998 15:54:17]

STRING PROC FNStringGetCarS( STRING s )

 // e.g. PROC Main()

 // e.g.  STRING s1[255] = FNInitializeNewStringS()

 // e.g.  s1 = FNStringGetInputS( "string: get: word: token: get: first: s = ", "this is a test" )

 // e.g.  IF FNEscapeB( s1 ) RETURN() ENDIF

 // e.g.  Message( FNStringGetCarS( s1 ) ) // gives e.g. "this"

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 //

 // variation: RETURN( FNStringGetTokenFirstS( s, " " ) )

 RETURN( GetToken( s, " ", 1 ) ) // faster, but not central

END

// library: string: equal: are two given strings equal? (stored in 'checstcf.s') [kn, zoe, wo, 04-10-2000 18:23:27]

INTEGER PROC FNStringCheckEqualB( STRING s1, STRING s2 )

 // e.g. PROC Main()

 // e.g.  STRING s1[255] = FNInitializeNewStringS()

 // e.g.  STRING s2[255] = FNInitializeNewStringS()

 // e.g.  s1 = FNStringGetInputS( "string: check: equal: first string = ", "a" )

 // e.g.  IF FNEscapeB( s1 ) RETURN() ENDIF

 // e.g.  s2 = FNStringGetInputS( "string: check: equal: second string = ", "a" )

 // e.g.  IF FNEscapeB( s2 ) RETURN() ENDIF

 // e.g.  Message( FNStringCheckEqualB( s1, s2 ) ) // gives e.g. TRUE when string1 is equal to string2

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 //

 // // <F12> PROCMessage( FNStringCheckEqualB( "knud", "knud" ) ) // gives TRUE

 // // <F12> PROCMessage( FNStringCheckEqualB( "knud", "van" ) ) // gives FALSE

 RETURN( s1 == s2 )

END

// library: string: space: remove: begin (filenamemacro=remostsb.s) [kn, ri, th, 15-02-2001 06:06:51]

STRING PROC FNStringRemoveSpaceBeginS( STRING s )

 // e.g. PROC Main()

 // e.g.  STRING s[255] = FNInitializeNewStringS()

 // e.g.  s = FNStringGetInputS( "string: space: remove: begin: string = ", "   test    " )

 // e.g.  IF FNEscapeB( s ) RETURN() ENDIF

 // e.g.  Message( "'", FNStringRemoveSpaceBeginS( s ), "'" ) // gives e.g. "test    "

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 RETURN( LTrim( s ) )

END

// library: string: word: token: delete: first: delete 1 occurrence of a given other string in front of a given string. E.g. StringWordDeleteFront( "00001234567", "0" ) gives "0001234567". Possible application: deleting the '0' in front of a phone number (e.g. when dialing internationally). [kn, ni, zo, 02-08-1998 16:30:45]

STRING PROC FNStringDeleteWordFrontS( STRING s, STRING deleteS )

 // e.g. <F12> PROCMessage( FNStringDeleteWordFrontS( "this is", "this" ) ) // gives " is"

 // e.g. <F12> PROCMessage( FNStringDeleteWordFrontS( "the girl", "the" ) ) // gives " girl"

 // STRING PROC FNStringWordDeleteFirstS( STRING s, STRING deleteS )

 INTEGER lengthdeleteI = FNStringGetLengthI( deleteS )

 RETURN( SubStr( s, lengthdeleteI + 1, FNStringGetLengthI( s ) - lengthdeleteI ) )

END

// library: token: check: is: variable (filenamemacro=chectoiw.s) [kn, ni, we, 24-09-2003 23:29:05]

INTEGER PROC FNTokenCheckIsVariableB( STRING s )

 // e.g. PROC Main()

 // e.g.  Warn( FNTokenCheckIsVariableB( "thisisavariable1" ) ) // gives 'TRUE'

 // e.g.  Warn( FNTokenCheckIsVariableB( "thisis avariable1" ) ) // gives 'TRUE' // because the first word 'thisis' in the string separated by spaces is a variable

 // e.g.  Warn( FNTokenCheckIsVariableB( "9thisis avariable1" ) ) // gives 'FALSE' // because the first word '9thisis' starts with a digit

 // e.g.  Warn( FNTokenCheckIsVariableB( "thisis&avariable1" ) ) // gives 'FALSE' // because of '&' character

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 STRING firstwordS[255] = FNStringGetCarS( s ) // take off first word (in a string separated by spaces)

 RETURN( FNStringCheckIsVariableB( firstwordS ) )

END

// library: string: line: length: what is the length (filenamemacro=getstgle.s) [kn, ri, wo, 25-11-1998 20:20:58]

INTEGER PROC FNStringGetLengthI( STRING s )

 // e.g. PROC Main()

 // e.g.  STRING s1[255] = FNInitializeNewStringS()

 // e.g.  s1 = FNStringGetInputS( "string: line: length: string = ", "this is a test" )

 // e.g.  IF FNEscapeB( s1 ) RETURN() ENDIF

 // e.g.  Message( FNStringGetLengthI( s1 ) ) // gives e.g. 14

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 //

 // e.g. // <F12> Message( FNStringGetLengthI( "knud" ) ) // gives 4

 // e.g. // <F12> Message( FNStringGetLengthI( "the" ) ) // gives 3

 RETURN( Length( s ) )

END

// library: string: check: is: variable (filenamemacro=chectoiv.s) [kn, ni, we, 24-09-2003 18:23:35]

INTEGER PROC FNStringCheckIsVariableB( STRING s )

 // e.g. PROC Main()

 // e.g.  Warn( FNStringCheckIsVariableB( "thisisavariable1" ) ) // gives 'TRUE'

 // e.g.  Warn( FNStringCheckIsVariableB( "thisis avariable1" ) ) // gives 'FALSE' // because of space

 // e.g.  Warn( FNStringCheckIsVariableB( "9thisis avariable1" ) ) // gives 'FALSE' // because of starting with a digit

 // e.g.  Warn( FNStringCheckIsVariableB( "thisis&avariable1" ) ) // gives 'FALSE' // because of '&' character

 // e.g. END

 // e.g.

 // e.g. <F12> Main()

 // RETURN( IsAlpha( s ) ) // variation

 INTEGER I = 0

 INTEGER totalI = FNStringGetLengthI( s )

 // the first character should be a letter or an underscore
 IF NOT ( s[1] IN 'A'..'Z', 'a'..'z', '_' ) RETURN( FALSE ) ENDIF

 I = 2

 // test the loop condition first, so that a one-character name never reads past the end of the string
 WHILE I <= totalI

  // each character after the first should be a letter, an underscore or a digit
  IF NOT ( s[I] IN 'A'..'Z', 'a'..'z', '_', '0'..'9' ) RETURN( FALSE ) ENDIF

  I = I + 1

 ENDWHILE

 RETURN( TRUE )

END

--- cut here ----------------------------------------------------------
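
The two helpers that carry most of the work above are the "first word" and "rest of the string" functions. As a quick cross-check of what they should return, here is a minimal sketch in Python (the names car and cdr are hypothetical stand-ins for FNStringGetCarS and FNStringGetCdrS; it illustrates the intended behavior and is not a port of the TSE library):

```python
def car(s):
    """First space-delimited word of s, or '' if there is none."""
    return s.lstrip(" ").split(" ", 1)[0]


def cdr(s):
    """s without its first word and without leading spaces."""
    s = s.lstrip(" ")
    return s[len(car(s)):].lstrip(" ")


print(car("this is a test"))  # this
print(cdr("this is a test"))  # is a test
```

Calling cdr repeatedly walks through the string one word at a time, which is how the parser above "takes off" each token.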

---
---

Internet: see also:

TSE: Parser: Simple: Syntax: Structure: Parameter: How to parse the 
initialization of a new object?
http://www.faqts.com/knowledge_base/view.phtml/aid/24689/fid/1236

----------------------------------------------------------------------