Parsing Techniques: A Practical Guide (Monographs in Computer Science)
Format: PDF / Kindle (mobi) / ePub
This second edition of Grune and Jacobs’ brilliant work presents new developments and discoveries that have been made in the field. Parsing, also referred to as syntax analysis, has been and continues to be an essential part of computer science and linguistics. Parsing techniques have grown considerably in importance, both in computer science, ie. advanced compilers often use general CF parsers, and computational linguistics where such parsers are the only option. They are used in a variety of software products including Web browsers, interpreters in computer devices, and data compression programs; and they are used extensively in linguistics.
be used as often as needed, a strong parser saves effort in the long run. Although the notion of a “strong” parser is intuitively clear, confusion can arise when it is applied to the combination of parser and grammar. The stronger the parser is, the fewer restrictions the grammars need to obey and the “weaker” they can afford to be. Usually methods are named after the grammar and it is here where the 3.7 Representations of Parse Trees Top-down Non-directional methods Directional Methods,
4.1.2 Unger’s Method with ε-Rules So far, we only have dealt with grammars without ε-rules, and not without reason. Complications arise when the grammar contains ε-rules, as is demonstrated by the following example: consider the grammar rule S → ABC and input sentence pqr. If we want to examine whether this rule derives the input sentence, and we allow for ε-rules, many more partitions will have to be investigated, because each of the non-terminals A, B, and C may derive the empty string. In
depends on the properties of the grammar. The entries along the V and W arrows can each contain at most |VN | non-terminals, where |VN | is the number of non-terminals in the grammar, the size of the set VN from the formal deﬁnition of a grammar in Section 2.2. But again the actual number is usually much lower, since usually only a very limited subset of the non-terminals can produce a segment of the input of a given length in a given position; we will indicate the number by |VN |occ . So the
1 for all non-terminals A in the grammar for all length k from 1 to i compute Ti,A,k The cost of computing one entry Ti,A,k is O(n|Pav |), where n is the length of the input and |Pav | the average number of production rules of a non-terminal. As we saw, this computation is repeated O(n|G|n) times, where |G| is proportional to the size of the grammar. So the time requirements of this parsing algorithm is O(n3 |G||Pav |) or O(n3 ) × O(|G||Pav |). Bottom-up tabular parsing is applied in a number of
proﬁt from all simple and efﬁcient techniques known for regular grammars. The required duplications 5.2 Producing from a Regular Grammar 141 and modiﬁcations are mechanical and can be done by a program. Dewar, Bratley and Thorne  describe an early example of this approach, Blank  a more recent one. 5.1.3 Pattern Searching Many linear patterns, especially text patterns, have a structure that is easily expressed by a (quasi-)regular grammar. Notations that indicate amounts of money in