TEACH FASTGRAM2                                Aaron Sloman October 2011
                           Based on older teach files by several authors

              GENERATING AND ANALYSING SENTENCES (PART 2)
              ------------------------------------------

[NB THIS IS A FIRST DRAFT -- USE AT YOUR OWN RISK!]

This is a sequel to TEACH FASTGRAM1, which introduced the idea of a
formal grammar as a sort of program and showed how it could be used to
control generation of linguistic structures (phrases, clauses,
sentences).

This sequel goes into more detail than I originally intended. It may be
better to split it into smaller separate TEACH FILES. Please provide
feedback.

There will be further sequels, combining the grammar ideas with other
ideas, e.g. planning, communicating, perceiving.

Please look at the mini introduction to editing commands if you have not
previously done so or need revision:

    <ENTER> teach minived <RETURN>

CONTENTS - (Use <ENTER> g to access required sections)

 -- Motivation: why we need to be able to analyse sentences
 -- A blocks-world grammar and lexicon
 -- -- EXERCISE 1: Extending the examples given
 -- -- EXERCISE 2: Create test examples and devise a notation
 -- A very incomplete grammar and lexicon
 -- -- EXERCISE 3: extend and enrich blocks_gram blocks_lex
 -- -- EXERCISE 4: use generate with your grammar and lexicon
 -- -- EXERCISE 5: using setup to create parsers, to test sentences
 -- Using showtree
 -- EXERCISE: Make a summary of what you have learnt from this file
 -- TO BE ADDED

=======================================================================

-- Motivation: why we need to be able to analyse sentences

If someone says

    Please put the box next to the green chair on the table

that request could have two very different interpretations. Both share
these assumptions:

    There is a box
    There is a green chair
    There is a table

On one interpretation the box is next to the green chair, and the
request is to move it from there to a location on the table.
On the other interpretation there is a green chair on the table (perhaps
a model chair?) and the request is to move the box to a location next to
that chair.

How could a robot deal with such requests?

It is impossible to work out simply from grammatical structures and the
meanings of the words which interpretation is correct. But at least it
is possible to work out that there are those two interpretations and
then use some aspect of the context to decide which interpretation is
correct.

This teach file introduces ways in which a grammar can guide the
interpretation of sentences by breaking them up into meaningful parts,
in this case

    Please put the box next to the green chair on the table

where some of those parts also have parts that contribute to the
meaning, e.g. the determiner "the", the adjective "green" and the noun
"chair".

-- A blocks-world grammar and lexicon

This section could lead into a major project. For now just skim it, and
come back later (or dwell on it) if you are interested.

Using the ideas in the previous teach file we could try to construct a
grammar that can be used to give simple commands and then later extend
it to ask simple questions, and then show how the Pop11 grammar library
can parse those commands and questions, or in some cases why a more
powerful library (called TPARSE, introduced later) is needed.

Suppose we want to allow questions and commands, and we want both to be
able to refer to things with properties (e.g. size, colour, shape),
spatial relations (e.g. next to, inside, on) and actions that can be
requested or commanded (e.g. put, place, pick up). We may allow some
objects to have unique names (e.g. Box2, Block3, Table1).

Using the ideas of the previous teach file, and the above comments, we
can allow things to be referred to by different sorts of constructs
using:

    names: Box2, Block3, Table1, Fred, Mary ...
    size adjectives: big, small, tiny ...
    colour adjectives: blue, brown, yellow, ...
    material adjectives: wooden, glass, plastic, ...
    "determiner" words that can be followed by a noun or a qualified
        noun phrase: the, a, some, any, that ...
    prepositions that indicate spatial relations: next to, inside,
        above, ...
    action words: put, place, fetch, ...

and for questions

    thing query word: which, what
    person query word: who
    place query word: where

-- -- EXERCISE 1: Extending the examples given

Try extending those lists. E.g. what other query words could be required
in an intelligent domestic robot?

If you wish to work on this topic it may be a good idea to copy the
above into a new file owned by you. You can mark the range of text you
want to copy (F1 or ESC m to mark the start, and F2 or ESC M to mark the
end). Then start a new file, e.g.

    ENTER ved my_gram2.p RETURN

(NB: don't try to use spaces in file names: they cause trouble in Linux
and Ved will be confused by them. Underscores are fine.)

Then, with the editor cursor in the new file, you can invoke Ved's
"Transcribe In" command, abbreviated to 'ti':

    ENTER ti RETURN

Then save the new file, to be safe:

    ENTER w1 RETURN

-- -- EXERCISE 2: Create test examples and devise a notation

Using the lists of components in Exercise 1, create some examples of
questions, commands and assertions, and devise a notation for showing
how your sentences are constructed from the components specified.

Tip 1:
It is useful to separate the lexicon, which contains lists of words of
the various allowed types, from the grammar that specifies ways in which
larger structures can be composed from smaller structures. The previous
teach file TEACH * GRAMMAR1 illustrated that separation. (You can go
back to it by putting the Ved cursor where the above asterisk is and
typing: ESC h)

Tip 2:
Use abbreviations, similar to those used in the previous teach file.
E.g.

    S    for sentence
    QS   for question sentence
    CS   for command sentence
    SizeAdj, ColourAdj, ...
    V    for verbs
    TQ   for thing query word
    PQ   for place query word
    Det  for determiner
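
One possible notation, offered only as a rough sketch, is the nested
list format that this file uses further on for blocks_gram and
blocks_lex. The variable names my_gram and my_lex, and the category
names N, Name, Prep and VBE, are invented here for illustration; whether
LIB GRAMMAR accepts exactly these names has not been checked, and the
rules cover only a few of the sentence types you might want.

```pop11
;;; Hypothetical sketch: my_gram and my_lex are made-up names.
vars my_gram my_lex;

[
    ;;; a sentence is a command sentence or a question sentence
    [s [CS] [QS]]
    ;;; a command: verb, the thing to act on, where it should go
    [CS [V NP PP]]
    ;;; questions such as "what is on Table1", "where is the big box"
    [QS [TQ VBE PP] [PQ VBE NP]]
    ;;; a thing is named directly, or described
    [NP [Name] [Det N] [Det SizeAdj N] [Det ColourAdj N]]
    ;;; a spatial relation to some thing
    [PP [Prep NP]]
] -> my_gram;

[
    [Name Box2 Block3 Table1]
    [N box block table]
    [Det the a some any that]
    [SizeAdj big small tiny]
    [ColourAdj blue brown yellow]
    [Prep inside above on]
    [V put place fetch]
    [VBE is]
    [TQ which what]
    [PQ where]
] -> my_lex;
```

Keeping the two lists separate follows Tip 1: words can be added to
my_lex without touching the rules in my_gram.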
You probably won't have time to do a complete job of specifying such a
grammar and lexicon. But even starting the task can be very illuminating
and help us to think not only about the problems of modelling linguistic
communication, but also about the problems of giving a machine even the
ability to think about or perceive its environment.

-- A very incomplete grammar and lexicon ------------------------------

Here is a fairly simple version of a grammar and lexicon for a tiny
subset of the blocks world. You can try extending it, after playing with
it.

    vars blocks_gram blocks_lex;

    [
        ;;; A sentence is a command or a question
        ;;; LIB GRAMMAR requires the grammar to start with 's'
        [s [COM] [QUEST]]

        ;;; A question asks about
        [QUEST
            ;;; the location of an object
            [WH_LOC VBE NP]
            ;;; what is in some spatial relation
            [WH_THING VBE PP]
            ;;; which member of a category is in some relationship
            [WH_SELECT SNP VBE PP]
            ;;; whether a specified object is in some relationship
            [VBE NP PP]
        ]

        ;;; Two sorts of commands
        [COM [V NP PP] [V NP ONTO_PP]]

        ;;; Noun phrases can be
        [NP
            ;;; a pronoun
            [PN]
            ;;; determiner and simple NP
            [DET SNP]
            ;;; determiner, simple NP and prepositional phrase
            [DET SNP PP]
        ]

        ;;; Simple NP can be a noun or adjective phrase + noun
        [SNP [NOUN] [AP NOUN]]

        ;;; adjective phrase is one or more adjectives
        [AP [ADJ] [ADJ AP]]

        ;;; prepositional phrase is preposition + NP
        [PP [PREP NP]]

        ;;; Target prepositional phrase indicating a location
        [ONTO_PP [ONTO_PREP NP]]
    ] -> blocks_gram;

    ;;; Now the lexicon. Try adding some words where you think more
    ;;; could fit.
    [
        [NOUN block box table one]
        [PN it]
        ;;; we cheat by inventing compound words
        [V put move pickup putdown]
        [VBE is]
        ;;; question words location, identity, selection
        [WH_LOC where]
        [WH_THING what]
        [WH_SELECT which]
        ;;; It might be better to divide this into different kinds of
        ;;; adjectives, e.g.
        ;;; COLOUR_ADJ SIZE_ADJ
        [ADJ white red blue green big small large little]
        [PREP on above over]
        [ONTO_PREP onto]
        [DET each every the a some]
    ] -> blocks_lex;

Copy that into a new file:

    ENTER ved blocks_gram.p RETURN

And then edit it, as you wish, as suggested below.

-- -- EXERCISE 3: extend and enrich blocks_gram blocks_lex

Try extending the above grammar and lexicon, and perhaps splitting some
of the categories into sub-categories that bring out important
differences between similar types of verbal expression.

-- -- EXERCISE 4: use generate with your grammar and lexicon

Use the generate command illustrated in TEACH GRAMMAR1 to generate
sentences from the above grammar and lexicon.

First compile the grammar library (use ESC d on the command):

    uses grammar

Try a few test runs, and see if you get any surprises:

    generate(blocks_gram, blocks_lex) ==>

or mark this range (F1, F2) then compile it (CTRL-d), after increasing
the maximum recursion level from its default (10) to 15:

    15 -> maxlevel;
    repeat 5 times
        generate(blocks_gram, blocks_lex) ==>
    endrepeat;

Try to explain the difference between values 10 and 15 for maxlevel.

Examples with maxlevel set to 10

    ** [which blue table is on it]
    ** [is it on it]
    ** [which little table is above it]
    ** [putdown it onto it]
    ** [move some one above it]
    ** [move it over it]

Examples with maxlevel set to 15

    ** [put it onto a blue table above a box]
    ** [which little big table is above some big block]
    ** [put each one over every small one above it over each one]
    ** [where is every large box on the one]
    ** [is the table on some white blue one on each table]

Are there some sentences here, or in your output, which you think should
be excluded by the grammar? Could you modify the grammar so as to
exclude them?

Note: if you wish to copy examples of output from the output.p file into
your file blocks_gram.p you can edit the output file

    ENTER ved output.p RETURN

Then using F1 and F2 mark the lines of output you wish to save.
Then using ESC x (or an ENTER ved .... command) return to your previous
file. Select the location where you want the output to go. Create an
empty comment using Pop11 comment brackets, like this

    /*

    */

Put the VED cursor in the blank space. Then do this to Move the text In:

    ENTER mi RETURN

That will give you commented out examples of output, which will be
ignored if ever you compile the whole file. You can also type some extra
text in at the top of the comment saying how the examples were
generated, e.g.

    /*
    Using 'generate', defined in LIB GRAMMAR, with maxlevel set to 15,
    the blocks_gram and blocks_lex produced the following:

    ** [pickup it onto each white little block on a block]
    ...etc...
    */

-- -- EXERCISE 5: using setup to create parsers, to test sentences

Try using the parser generator in lib grammar to test sentences that you
think your grammar and lexicon can handle.

If you have not compiled your grammar and lexicon since this editing
session started, go to your file containing them, mark the whole range,
from the 'vars' declaration to the end of the second assignment, to
blocks_lex. Then compile that range (CTRL-d).

Test that you have compiled both by printing out the grammar and the
lexicon (ESC d on each line):

    blocks_gram ==>
    blocks_lex ==>

Now make sure that the grammar library is loaded (ESC d):

    uses grammar

A procedure you have not yet met is provided to create a collection of
parsers from your grammar and your lexicon. The procedure is called
setup. You can ask Pop11 what setup is (use ESC d):

    setup =>

that prints out:

    ** <procedure setup>

NB in Pop11 procedures are objects just as lists, numbers, words,
strings, and arrays are (among other things).
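
To see what "procedures are objects" means in practice, here is a small
illustrative example that has nothing to do with LIB GRAMMAR. The names
twice and my_proc are invented for this example only.

```pop11
;;; 'twice' and 'my_proc' are hypothetical names used only here
define twice(x) -> result;
    x * 2 -> result;
enddefine;

;;; the procedure itself can be printed, like any other object
twice =>
;;; ** <procedure twice>

;;; it can be assigned to another variable, and then applied
vars my_proc = twice;
my_proc(21) =>
;;; ** 42

;;; it can be stored in a list alongside numbers and words
[^twice 3 cat] ==>
;;; ** [<procedure twice> 3 cat]

;;; and the built-in recogniser isprocedure accepts it
isprocedure(my_proc) =>
;;; ** <true>
```

This is why a library procedure can build new procedures from a grammar
and a lexicon and leave them as the values of variables for you to use.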
We use setup to create parsers by applying the procedure to the grammar
and the lexicon (ESC d):

    setup(blocks_gram, blocks_lex);

That command does not print anything out but it creates a collection of
procedures for recognising items that the grammar deals with, one
procedure for each of the sentence components in the grammar (s and the
capitalised names), and one for each type of word in the lexicon.

You can ask Pop11 to print out some of the procedures

    s =>
    ** <procedure s>

    QUEST =>
    ** <procedure QUEST>

    ONTO_PP =>
    ** <procedure ONTO_PP>

    NOUN =>
    ** <procedure NOUN>

    DET =>
    ** <procedure DET>

(You can think of these procedures as compiled versions of the grammar
rules. But they are procedures for recognition, whereas the grammar
rules can also be used for generation, as we saw previously.)

Try using the recognisers, and see if they do what they should do.

The lexical recognisers can only recognise words (represented in double
quotes in Pop11, if not in a list expression).

E.g. notice that DET returns a description of what it recognised:

    DET("the") =>
    ** [DET the]

otherwise it returns false, one of Pop11's two booleans (true and false
are booleans):

    DET("blue") =>
    ** <false>

Copy and edit the DET command and try it with other words from the
lexicon.

Do the same with some of the other newly created procedures for
recognising lexical entries, e.g.

    ADJ("red") =>
    ** [ADJ red]

    ADJ("the") =>
    ** <false>

Now try some recognisers for complex constructs. Reminder: these are the
constructs (unless you changed the grammar):

    s QUEST COM NP SNP AP PP ONTO_PP

Each now is the name of a procedure, e.g. (use ESC d):

    QUEST, COM, PP =>
    ** <procedure QUEST> <procedure COM> <procedure PP>

So they can also be run. But because they come from the grammar, not the
lexicon, they must be applied to lists of words, not individual words.
E.g.
run these (using ESC d on each), and try editing them to see what
results you get:

    AP([big red]) ==>
    ** [AP [ADJ big] [AP [ADJ red]]]

The adjectival phrase is an adjectival phrase starting with an adjective
indicated by

    [ADJ big]

and followed by an adjectival phrase indicated by

    [AP [ADJ red]]

Compare these:

    AP([the red]) ==>
    ** <false>

    AP([big red box]) ==>
    ** <false>

But compare

    NP([the big red box]) ==>
    ** [NP [DET the] [SNP [AP [ADJ big] [AP [ADJ red]]] [NOUN box]]]

This is a noun phrase because it contains a determiner:

    [DET the]

followed by a simple noun phrase SNP, which is an AP followed by a NOUN

    [SNP [AP [ADJ big] [AP [ADJ red]]] [NOUN box]]

-- Using showtree -----------------------------------------------------

The library program showtree can be used to display the NP description
in a graphical format. (This will work in XVed, and also in Ved if you
are using an 'xterm' window, or using PuTTY in Windows to access Poplog
remotely.)

Compile the library (ESC d)

    uses showtree

Apply the procedure to the full NP description (using ESC d):

    showtree([NP [DET the] [SNP [AP [ADJ big] [AP [ADJ red]]] [NOUN box]]])

If it works on your terminal it will look something like this (only
prettier):

                  |NP|
              -----------
            |DET|     |SNP|
              |     ---------
              |     |       |
             the  |AP|   |NOUN|
                 -------    |
               |ADJ| |AP|  box
                 |     |
                 |     |
                big  |ADJ|
                       |
                       |
                      red

Now try testing sentences that you construct to see if the "top level"
parsing procedure "s" recognises them:

Examples:

    s([which big box is above it]) ==>
    ** [s [QUEST [WH_SELECT which]
                 [SNP [AP [ADJ big]] [NOUN box]]
                 [VBE is]
                 [PP [PREP above] [NP [PN it]]]]]

Examine that closely to work out where all the descriptions come from.

Why doesn't this one work:

    s([which large box is below it]) ==>
    ** <false>

Can you fix either the grammar or the lexicon, then recompile it, then
re-run the setup procedure so that that example is accepted?

The showtree library includes a pop11 macro "---" defined so that

    showtree([ word word word .....
              ] ) ==>

can be abbreviated like this and compiled as before.

    --- word word word .....

e.g. try

    --- is some box above it
    ** [s [QUEST [VBE is]
                 [NP [DET some] [SNP [NOUN box]]]
                 [PP [PREP above] [NP [PN it]]]]]

    --- where are all the large boxes
    ** <false>

    --- where is every large green box
    ** [s [QUEST [WH_LOC where]
                 [VBE is]
                 [NP [DET every]
                     [SNP [AP [ADJ large] [AP [ADJ green]]] [NOUN box]]]]]

Try using some of the output of the generate(<grammar>, <lexicon>)
command, and see if the s procedure recognises everything generated, and
parses it correctly into sub-structures.

-- EXERCISE: Make a summary of what you have learnt from this file ----

-- TO BE ADDED --------------------------------------------------------

Further developments of these ideas

    Designing and implementing a parser, instead of using lib grammar

    Combining parsing with other components in a complete architecture,
    e.g. for a chatbot or simple expert system, or travel adviser.

    Combine the above ideas with an image analyser to generate
    descriptions of what is in various images.

    Combine the above ideas with a program for making pictures, and make
    it draw a picture when given a sentence describing the desired
    contents.

    What other applications can you think of?

Further reading

    TEACH * ISASENT
    TEACH * PARSESENT
    TEACH * PARSING

--- $usepop/pop/teach/fastgram2
--- University of Birmingham 2011. All rights reserved. ------