cs164: Introduction to Programming Languages and Compilers

cs164: Introduction to Programming Languages and Compilers

Lecture 10 Syntax-Directed Translation grammar disambiguation, Earley parser, syntax-directed translation Ras Bodik Shaon Barman Thibaud Hottelier Hack Your Language! CS164: Introduction to Programming Languages and Compilers, Spring 2012 UC Berkeley 1

Hidden slides This slide deck contains hidden slides that may help in studying the material. These slides show up in the exported pdf file but when you view the ppt file in Slide Show mode. 2 Today Ambiguous grammars pros and cons Grammar disambiguation select desired parse trees without rewriting the grammar Grammar rewriting remove undesirable parse trees by changing the gramamar

Syntax-directed translation its an evaluation / rewrite of the parse tree 3 Ambiguous Grammars One parse tree only! The role of the grammar distinguish between syntactically legal and illegal programs But thats not enough: it must also define a parse tree the parse tree conveys the meaning of the program associativity: left or right precedence: * before +

What if a string is parseable with multiple 5 How many parse trees are in this CYK graph? grammar: E id | E + E | E * E input: id+id*id E11 E9 * E8 E11 E6 + E10 E9 E 6 + E7 E6 id1 id1 ambiguous E10 E7 * E8 E8 id5

E7 id3 + id3 * id5 6 We can see the two trees better here E11 E9 * E8 E9 E 6 + E7 E6 id1 id1 E8 id5 E7 id3

+ id3 * id5 E10 E7 * E8 E11 E6 + E10 7 Ambiguity (Cont.) Ambiguity is bad Leaves meaning of some programs ill-defined Ambiguity is common in programming languages Arithmetic expressions

if E then if E then E else E 9 Ambiguity in expression grammars Grammar E E + E | E * E | ( E ) | int Strings int + int + int int * int + int 10 Ambiguity. Example This string has two parse trees E

E int E E + E E + E in t in E + E t

int int int + is left-associative + E 11 Ambiguity. Example This string has two parse trees E E int E

E + E E * E in t in E + E t int int int

* E * has higher precedence than + 12 Dealing with Ambiguity No general (automatic) way to handle ambiguity Impossible to convert automatically an ambiguous grammar to an unambiguous one (we must state which tree desired) Used with care, ambiguity can simplify the grammar Sometimes allows more natural definitions We need disambiguation mechanisms

There are two ways to remove ambiguity: 1) Declare to the parser which productions to prefer works on most but not all ambiguities 13 Summary G is ambiguous iff there is any s in L(G) such that s has multiple parse trees In expression grammars (such as E+E | E*E), ambiguity shows up as - precedence: does + or * have higher priority? - associaitivity: is 1-2-3 parsed as (1-2)-3 or 1-(23)? 14

Test yourself This question will come up in PA5. Work out the CYK graph for the input id+id*id+id Notice there are multiple ambiguous edges ie, edges inserted by/due to multiple productions How many edges are in the CYK graph? - polynomial? N2 or N3? Exponential? How many parse trees are represented by CYK graph? 15 Disambiguation with precedence

and associativity declarations 16 Precedence and Associativity Declarations Instead of rewriting the grammar Use the more natural (ambiguous) grammar Along with disambiguating declarations Bottom-up parsers like CYK and Earley allow declaration to disambiguate grammars you will implement those in PA5 Examples 17 Associativity Declarations Consider the grammar

E E + E | int Ambiguous: two parse trees of int + int + int E E E int + + E int E E int E +

int E int E + E int Left-associativity declaration: %left + 18 Precedence Declarations Consider the grammar E E + E | E * E | int And the string int + int * int E E E int

* + E E int E E + E int E * int

Precedence declarations: int E int %left + %left * Implementing disambiguation declarations To disambiguate, we need to answer these questions: Assume we reduced the input to E+E*E. Now do we want parse tree (E+E)*E or E+ (E*E)?

Similarly, given E+E+E, do we want parse tree (E+E)+E or E+(E+E)? 21 Example id1 + id3 * id5 22 22 Implementing the declarations in CYK/

Earley precedence declarations when multiple productions compete for being a child in the parse tree, select the one with least precedence left associativity when multiple productions compete for being a child in the parse tree, select the one with largest left subtree Where is ambiguity manifested in CYK? for i=0,N-1 do enqueue( (i,i+1,input[i]) ) -- create terminal edges while queue not empty do (j,k,B)=dequeue() for each edge (i,j,A) do -- for each edge left-adjacent to

(j,k,B) for each rule TAB do if edge (i,k,T) does not exists then add (i,k,T) to graph enqueue( (i,k,T) ) else -- Edge (i,k,T) already exists, hence potential ambiguity: -- Edges (i,j,A)(j,k,B) may be another way to reduce to (i,k,T). -- That is, they may be the desired child of (i,k,T) 26 in the parse tree. %dprec: another declaration Another disambiguating declaration (see bison) E if E then E

| if E then E | OTHER else E % dprec 1 % dprec 2 Without %dprec, wed have to rewrite the grammar: E MIF | UIF -- all then are matched -- some then are unmatched

MIF if E then MIF else MIF | OTHER UIF if E then E | if E then MIF else UIF 28 Rewriting Rewrite the grammar into a unambiguous grammar While describing the same language and eliminating undesirable prase trees Example: Rewrite E E + E | E * E | ( E ) | int

into EE+T | T T T * int | int | ( E ) Draw a few parse trees and you will see that new grammar enforces precedence of * over + enforces left-associativity of + and * 31 Parse tree with the new grammar The int * int + int has ony one parse tree now E E + T

T * int E T E in t in E + E t int int *

E int 32 Syntax-directed translation evaluate the parse (to produce a value, AST, ) 40 Example grammar in CS164 E -> E '+' T | T ; T -> T '*' F | F ; F -> /[0-9]+/

| '(' E ')' ; 41 Build a parse tree for 10+2*3, and evaluate 42 Same SDT in the notation of the cs164 parser Syntax-directed translation for evaluating an expression %% E -> %} | ;

T -> }% | ; F -> E '+' T %{ return n1.val + n3.val T %{ return n1.val %} T '*' F %{ return n1.val * n3.val F

/[0-9]+/ %{ return int(n1.val) }% 43 Build AST for a regular expression %ignore /\n+/ %% // A regular expression grammar in the 164 parser R -> 'a' | R '*' | R R n2.val) %} | R '|' R n3.val) %} | '(' R ')' ;

%{ return n1.val %} %{ return ('*', n1.val) %} %{ return ('.', n1.val, %{ return ('|', n1.val, %{ return n2.val %} 44

Recently Viewed Presentations

  • Arithmetic & Geometric Sequences

    Arithmetic & Geometric Sequences

    The primary focus will be on arithmetic and geometric sequences. - Linear and exponential functions can be constructed based off a graph, a description of a relationship and an input/output table. ... is the factor between the terms (called the...
  • Simulating Retinal Images Following Refractive Surgery

    Simulating Retinal Images Following Refractive Surgery

    1866 Research director at Zeiss Optical Works. 1868 Developed apochromatic lenses which further reduce chromatic aberration beyond what an achromat is capable of. 1871 Describes a refractometer for measure index of refraction at various wavelengths. Abbe sine condition describes requirements...
  • MANAGEMENT OF THE COMPLICATIONS OF THYROID SURGERY -

    MANAGEMENT OF THE COMPLICATIONS OF THYROID SURGERY -

    Incidence - 1-3% Occurs 2 - 5% after operation. Can be delayed for 2-3 weeks or hypocalcemia may be asymptomatic. Classic triad - Carpopedal spasm Stridor Convulsions Latent tetany Trousseau's sign Chvostek's sign Persistant - grand mal epilepsy, cataracts, psychosis,...
  • Fake News and Bad Buzz in Infection Prevention and Control

    Fake News and Bad Buzz in Infection Prevention and Control

    Ecological fallacy. Inferences without sufficient evidence. A possible way of analyzing if hypotheses are supported is to use the Koch postulates / Bradford Hill criteria (but do not always apply) Scientists. Journal/press release. Print/ online Media.
  • Conflict Ask someone else what they interpret this

    Conflict Ask someone else what they interpret this

    Learning Outcomes Process: Explore the theme of conflict through the stimuli of poetry, drama mediums of mime, movement and gesture and the drama elements of contrasts and characterisation. Poem Exploration In groups of 5/6 you will receive a copy of...
  • Campaign Strategy - University of Wisconsin-Madison

    Campaign Strategy - University of Wisconsin-Madison

    Account/Media Strategy and Planning. Campaign Strategy. Objectives - What are your goals? Targeting - Who are your best prospects? Timing - When to market the product/service? Geography - Where to concentrate your efforts?
  • CRYSTALLOGRAPHY - Missouri State University

    CRYSTALLOGRAPHY - Missouri State University

    CRYSTALLOGRAPHY INTRODUCTION crystallography is the study of crystal shapes based on symmetry atoms combine to form geometric shapes on smallest scale-- these in turn combine to form seeable crystal shapes if mineral forms in a nonrestrictive space (quartz crystal vs...
  • ATI Cleanroom facilities Contact: T.E. Sale, t.sale@surrey.ac.uk Advanced

    ATI Cleanroom facilities Contact: T.E. Sale, [email protected] Advanced

    ATI Cleanroom facilities Contact: T.E. Sale, [email protected] Advanced Technology Institute, School of Electronics and Physical Sciences, University of Surrey, Guildford, GU2 7XH ATI Cleanroom facilities Contact: T.E. Sale, [email protected] Advanced Technology Institute, School of Electronics and Physical Sciences, University of...