1
0
mirror of https://github.com/2martens/uni.git synced 2026-05-06 11:26:25 +02:00

Prosem: Conclusion finished. Introduction finished. Abstract finished.

This commit is contained in:
Jim Martens
2013-11-23 11:53:13 +01:00
parent 53c2f20454
commit ba8e658c6b

View File

@ -170,7 +170,7 @@
% Abstract gives a brief summary of the main points of a paper: % Abstract gives a brief summary of the main points of a paper:
\section*{Abstract} \section*{Abstract}
Your text here... Syntactic parsing and semantic analysis are two important methods for understanding natural language. Each of them has their individual strengths and weaknesses. But both of them have major issues with ambiguity once a restricted environment is left. Understanding unrestricted natural language is therefore far from being reached.
% Lists: % Lists:
\setcounter{tocdepth}{2} % depth of the table of contents (for Seminars 2 is recommented) \setcounter{tocdepth}{2} % depth of the table of contents (for Seminars 2 is recommented)
@ -188,47 +188,49 @@
\section{Introduction} \section{Introduction}
\label{sec:introduction} \label{sec:introduction}
\begin{itemize} It's the dream of many Science-Fiction fans: A fully sentient AI. Let's ignore for a moment all the odds that are against it (morality, physics, etc.) and concentrate on one aspect that is mandatory for even much less ambitious dreams. Imagine a computer game in which you can talk natural language to the NPC counterparts so that they react appropriately to it. Well maybe that is still too ambitious. What about writing what you want to say? In that case the computer needs to understand what you are writing so that it can react to it.
\item two kinds of natural language: spoken language and written language
\item will concentrate on written language This process of understanding natural language contains multiple methods. The first one is the syntactic parsing, the second one the semantic analysis. Syntactic parsing relies on a grammar that describes the set of possible input, also called syntax. The syntax specifies what are allowed sentence structures and how these are built.
\item definition of syntax, semantics and pragmatics The semantic analysis relies on the semantics of a given input. That means what the given input means. An example: ``You run around the bush''. The semantic meaning of this sentence is that you are running around a bush. The pragmatics define what is the intended meaning of an input. In this example it's not that you run around the bush but actually that you take a long time to get to the point in a discussion. It's a so called idiom. This difference between semantic meaning, where just the sentence as it is written is considered, and pragmatic meaning, where the intended meaning is considered, generates ambiguity that is easy for humans to resolve but difficult for computers. But even the pragmatics in this example are ambigious, because it depends on the context what it actually means. If two persons are walking around in a forest and one starts running around the bush, the sentence of this example, would have the semantic meaning as it's pragmatic meaning.
\item two important methods: syntactic parsing and semantic analysis
\end{itemize} On top of that the semantic meaning itself isn't always clear either. Sometimes words have multiple meanings, so that even the semantic meaning can have different possible interpretations.
In this paper both syntactic parsing and semantic analysis are presented. After the presentation of the methods, they are critically discussed to finally come to a conclusion.
\section{Evaluation of methods} \section{Evaluation of methods}
\label{sec:evalMethods} \label{sec:evalMethods}
\subsection{Syntactic Parsing} \subsection{Syntactic Parsing}
\label{subSec:syntacticParsing} \label{subSec:syntacticParsing}
Syntactic Parsing is used to create parse trees. These can be used for grammar checks in a text editor: ``A sentence that cannot be parsed may have grammatical errors''\footnote{\cite{Jurafsky2009b}, page 461}. But they more likely ``serve as an important intermediate stage of representation for semantic analysis''\footnote{\cite{Jurafsky2009b}, page 461}. There are different algorithms available to create such trees. The CYK\footnote{named after inventors John Cocke, Daniel Younger and Tadeo Kasami} algorithm will be explained further. But before the CYK algorithm is explained, the reason for its existance is presented. Syntactic Parsing is used to create parse trees. These can be used for grammar checks in a text editor: ``A sentence that cannot be parsed may have grammatical errors''\cite[p.~461]{Jurafsky2009b}. But they more likely ``serve as an important intermediate stage of representation for semantic analysis''\cite[p.~461]{Jurafsky2009b}. There are different algorithms available to create such trees. The CYK\footnote{named after inventors John Cocke, Daniel Younger and Tadeo Kasami} algorithm will be explained further. But before the CYK algorithm is explained, the reason for its existance is presented.
There are two classical ways of parsing a sentence. The one is bottom-up and the other one is top-down. Both approaches have their own advantages and disadvantages. In addition the ambiguity creates problems. To implement bottom-up and top-down search algorithms in the face of ambiguity, ``an agenda-based backtracking strategy''\footnote{\cite{Jurafsky2009b}, page 468} is used. The problem here is that every time the parser recognizes that the current parse tree is wrong, it has to backtrack and explore other parts of the sentence. This creates a huge amount of work duplication and is therefore inefficient. There are two classical ways of parsing a sentence. The one is bottom-up and the other one is top-down. Both approaches have their own advantages and disadvantages. In addition the ambiguity creates problems. To implement bottom-up and top-down search algorithms in the face of ambiguity, ``an agenda-based backtracking strategy''\cite[p.~468]{Jurafsky2009b} is used. The problem here is that every time the parser recognizes that the current parse tree is wrong, it has to backtrack and explore other parts of the sentence. This creates a huge amount of work duplication and is therefore inefficient.
A solution to these problems is offered by ``dynamic programming parsing methods''\footnote{\cite{Jurafsky2009b}, page 469}. The CYK algorithm is one of multiple algorithms based on dynamic programming. A solution to these problems is offered by ``dynamic programming parsing methods''\cite[p.~469]{Jurafsky2009b}. The CYK algorithm is one of multiple algorithms based on dynamic programming.
The CYK does only work with grammars in the Chomsky Normal Form (CNF). Every context-free grammar can be converted to CNF without loss in expressiveness. Therefore this restriction does no harm but simplifies the parsing. For information on how context-free grammars can be converted to CNF, refer to Jurafsky\cite{Jurafsky2009b}. The CYK does only work with grammars in the Chomsky Normal Form (CNF). Every context-free grammar can be converted to CNF without loss in expressiveness. Therefore this restriction does no harm but simplifies the parsing. For information on how context-free grammars can be converted to CNF, refer to Jurafsky\cite{Jurafsky2009b}.
CYK requires $\mathcal{O}(n^{2}m)$ space for the $P$ table (a table with probabilities), where ``$m$ is the number of nonterminal symbols in the grammar''\footnote{\cite{Russel2010}, page 893}, and uses $\mathcal{O}(n^{3}m)$ time. ``$m$ is constant for a particular grammar, [so it] is commonly described as $\mathcal{O}(n^{3})$''\footnote{\cite{Russel2010}, page 893}. There is no algorithm that is better than CYK for general context-free grammars\cite{Russel2010}. CYK requires $\mathcal{O}(n^{2}m)$ space for the $P$ table (a table with probabilities), where ``$m$ is the number of nonterminal symbols in the grammar''\cite[p.~893]{Russel2010}, and uses $\mathcal{O}(n^{3}m)$ time. ``$m$ is constant for a particular grammar, [so it] is commonly described as $\mathcal{O}(n^{3})$''\cite[p.~893]{Russel2010}. There is no algorithm that is better than CYK for general context-free grammars\cite{Russel2010}.
But how does CYK work? CYK doesn't examine all parse trees. It just examines the most probable one and computes the probability of that tree. All the other parse trees are present in the $P$ table and could be enumerated with a little work (in exponential time). But the strength and beauty of CYK is, that they don't have to be enumerated. CYK defines ``the complete state space defined by the `apply grammar rule' operator''\footnote{\cite{Russel2010}, page 894}. You can search just a part of this space with $A^{*}$ search.\cite{Russel2010} ``With the $A^{*}$ algorithm [...] the first parse found will be the most probable''\footnote{\cite{Russel2010}, page 895}. But how does CYK work? CYK doesn't examine all parse trees. It just examines the most probable one and computes the probability of that tree. All the other parse trees are present in the $P$ table and could be enumerated with a little work (in exponential time). But the strength and beauty of CYK is, that they don't have to be enumerated. CYK defines ``the complete state space defined by the `apply grammar rule' operator''\cite[p.~894]{Russel2010}. You can search just a part of this space with $A^{*}$ search.\cite{Russel2010} ``With the $A^{*}$ algorithm [...] the first parse found will be the most probable''\cite[p.~895]{Russel2010}.
But these probabilities need to be learned from somewhere. This somewhere is usually a ``treebank''\footnote{\cite{Russel2010}, page 895}, which contains a corpus of correctly parsed sentences. The best known is the Penn Treebank\cite{Russel2010}, which ``consists of 3 million words which have been annotated with part of speech and parse-tree structure, using human labor assisted by some automated tools''\footnote{\cite{Russel2010}, page 895}. The probabilities are then computed by counting and smoothing in the given data.\cite{Russel2010} There are other ways to learn the probabilities that are more difficult. For more information refer to Russel\cite{Russel2010}. But these probabilities need to be learned from somewhere. This somewhere is usually a ``treebank''\cite[p.~895]{Russel2010}, which contains a corpus of correctly parsed sentences. The best known is the Penn Treebank\cite{Russel2010}, which ``consists of 3 million words which have been annotated with part of speech and parse-tree structure, using human labor assisted by some automated tools''\cite[p.~895]{Russel2010}. The probabilities are then computed by counting and smoothing in the given data.\cite{Russel2010} There are other ways to learn the probabilities that are more difficult. For more information refer to Russel\cite{Russel2010}.
\subsection{Semantic Analysis} \subsection{Semantic Analysis}
\label{subSec:semanticAnalysis} \label{subSec:semanticAnalysis}
Semantic analysis provides multiple approaches. In this paper the approach of ``syntax-driven semantic analysis''\footnote{\cite{Jurafsky2009}, page 617} is explained further. In this approach the output of a parser, the syntactic analysis, ``is passed as input to a semantic analyzer to produce a meaning representation''\footnote{\cite{Jurafsky2009}, page 618}. Semantic analysis provides multiple approaches. In this paper the approach of ``syntax-driven semantic analysis''\cite[p.~617]{Jurafsky2009} is explained further. In this approach the output of a parser, the syntactic analysis, ``is passed as input to a semantic analyzer to produce a meaning representation''\cite[p.~618]{Jurafsky2009}.
Therefore context-free grammar rules are augmented with ``semantic attachments''\footnote{\cite{Jurafsky2009}, page 618}. Every word and syntactic structure in a sentence gets such a semantic attachment. The tree with syntactic components is now traversed in a bottom-up manner. On the way the semantic attachments are combined to finally produce ``First-Order Logic''\footnote{\cite{Jurafsky2009a}, page 589} that can be interpreted in a meaningful way. This procedure has some prerequisites that will be explained first. Therefore context-free grammar rules are augmented with ``semantic attachments''\cite[p.~618]{Jurafsky2009}. Every word and syntactic structure in a sentence gets such a semantic attachment. The tree with syntactic components is now traversed in a bottom-up manner. On the way the semantic attachments are combined to finally produce ``First-Order Logic''\cite[p.~589]{Jurafsky2009a} that can be interpreted in a meaningful way. This procedure has some prerequisites that will be explained first.
The mentioned \textit{First-Order Logic} can be represented by a context-free grammar specification. It is beyond this paper to describe this specification completely. Jurafsky\cite{Jurafsky2009a} provides a detailed picture of the specification with all elements in figure 17.3. The most important aspects of this specification are explained here. The logic provides terms which can be functions, constants and variables. Functions have a term as argument. Syntactically they are the same as single-argument predicates. But functions represent one unique object. The mentioned \textit{First-Order Logic} can be represented by a context-free grammar specification. It is beyond this paper to describe this specification completely. Jurafsky\cite{Jurafsky2009a} provides a detailed picture of the specification with all elements in figure 17.3. The most important aspects of this specification are explained here. The logic provides terms which can be functions, constants and variables. Functions have a term as argument. Syntactically they are the same as single-argument predicates. But functions represent one unique object.
Predicates can have multiple terms as arguments. In addition the logic provides quantifiers ($\forall, \exists$) and connectives ($\wedge, \vee, \Rightarrow$). Predicates can have multiple terms as arguments. In addition the logic provides quantifiers ($\forall, \exists$) and connectives ($\wedge, \vee, \Rightarrow$).
Another prerequisite is the ``lambda notation''\footnote{\cite{Jurafsky2009a}, page 593}. A simple example of this notation is an expression of the following form\footnote{examples taken from Jurafsky\cite{Jurafsky2009a}, pp. 593-594}: Another prerequisite is the ``lambda notation''\cite[p.~593]{Jurafsky2009a}. A simple example of this notation is an expression of the following form\footnote{examples taken from Jurafsky\cite[pp.~593-594]{Jurafsky2009a}}:
\[ \[
\lambda x.P(x) \lambda x.P(x)
\] \]
The $\lambda$ can be reduced in a so called ``$\lambda$-reduction''\footnote{\cite{Jurafsky2009a}, page 593}. The expression above could be reduced in the following way: The $\lambda$ can be reduced in a so called ``$\lambda$-reduction''\cite[p.~593]{Jurafsky2009a}. The expression above could be reduced in the following way:
\begin{alignat*}{2} \begin{alignat*}{2}
\lambda x.&P(x)&(A) \\ \lambda x.&P(x)&(A) \\
@ -247,16 +249,16 @@
&Near(Bacaro, Centro) &Near(Bacaro, Centro)
\end{alignat*} \end{alignat*}
This technique is called ``currying''\footnote{\cite{Jurafsky2009a}, page 594} and is used to convert ``a predicate with multiple arguments into a sequence of single-argument predicates''\footnote{\cite{Jurafsky2009a}, page 594}. This technique is called ``currying''\cite[p.~594]{Jurafsky2009a} and is used to convert ``a predicate with multiple arguments into a sequence of single-argument predicates''\cite[p.~594]{Jurafsky2009a}.
After the prerequisites are now explained, it is time to start with the actual syntax-driven semantic analysis. It will be shown with an example provided by Jurafsky. Assume the sentence \textit{Every restaurant closed}. ``The target representation for this example should be the following''\footnote{\cite{Jurafsky2009}, page 621}. After the prerequisites are now explained, it is time to start with the actual syntax-driven semantic analysis. It will be shown with an example provided by Jurafsky. Assume the sentence \textit{Every restaurant closed}. ``The target representation for this example should be the following''\cite[p.~621]{Jurafsky2009}.
\begin{equation} \begin{equation}
\label{eq:tarRep} \label{eq:tarRep}
\forall x \,Restaurant(x) \Rightarrow \exists e \,Closed(e) \wedge ClosedThing(e,x) \forall x \,Restaurant(x) \Rightarrow \exists e \,Closed(e) \wedge ClosedThing(e,x)
\end{equation} \end{equation}
The first step is to determine what the meaning representation of \textit{Every restaurant} should be. \textit{Every} is responsible for the $\forall$ quantifier and \textit{restaurant} specifies the category over which is quantified. This is called the ``restriction''\footnote{\cite{Jurafsky2009}, page 622} of the noun phrase. The meaning representation could be $\forall x\,Restaurant(x)$. It is a valid logical formula but it doesn't make much sense. ``It says that everything is a restaurant.''\footnote{\cite{Jurafsky2009}, page 622} ``Noun phrases like [this] are [usually] embedded in expressions that [say] something about the universally quantified variable. That is, we're probably trying to \textit{say something} about all restaurants. This notion is traditionally referred to as the \textit{NP}'s nuclear scope''\footnote{\cite{Jurafsky2009}, page 622}. In the given example, the nuclear scope is \textit{closed}. To represent this notion in the target representation, a dummy predicate $Q$ is added, which results in this expression: The first step is to determine what the meaning representation of \textit{Every restaurant} should be. \textit{Every} is responsible for the $\forall$ quantifier and \textit{restaurant} specifies the category over which is quantified. This is called the ``restriction''\cite[p.~622]{Jurafsky2009} of the noun phrase. The meaning representation could be $\forall x\,Restaurant(x)$. It is a valid logical formula but it doesn't make much sense. ``It says that everything is a restaurant.''\cite[p.~622]{Jurafsky2009} ``Noun phrases like [this] are [usually] embedded in expressions that [say] something about the universally quantified variable. That is, we're probably trying to \textit{say something} about all restaurants. This notion is traditionally referred to as the \textit{NP}'s nuclear scope''\cite[p.~622]{Jurafsky2009}. In the given example, the nuclear scope is \textit{closed}. To represent this notion in the target representation, a dummy predicate $Q$ is added, which results in this expression:
\[ \[
\forall x\,Restaurant(x) \Rightarrow Q(x) \forall x\,Restaurant(x) \Rightarrow Q(x)
\] \]
@ -278,7 +280,7 @@
\section{Critical discussion} \section{Critical discussion}
\label{sec:critDiscussion} \label{sec:critDiscussion}
Now that both methods have been presented with one selected approach each, it is time to discuss them critically. The CYK algorithm solves many problems like ambiguity; at least to a certain degree. But it also is problematic, because of the restriction to CNF. While in theory every context-free grammar can be converted to CNF, in practice it poses ``some non-trivial problems''\footnote{\cite{Jurafsky2009b}, page 475}. One of this problems can be explored in conjunction with the second presented method (semantic analysis). ``[T]he conversion to CNF will complicate any syntax-driven approach to semantic analysis''\footnote{\cite{Jurafsky2009b}, page 475}. A solution to this problem is some kind of post-processing in which the trees are converted back to the original grammar.\cite{Jurafsky2009b} Another option is to use a more complex dynamic programming algorithm that accepts any kind of context-free grammar. Such an algorithm is the ``Earley Algorithm''\footnote{\cite{Jurafsky2009b}, page 477}. Now that both methods have been presented with one selected approach each, it is time to discuss them critically. The CYK algorithm solves many problems like ambiguity; at least to a certain degree. But it also is problematic, because of the restriction to CNF. While in theory every context-free grammar can be converted to CNF, in practice it poses ``some non-trivial problems''\cite[p.~475]{Jurafsky2009b}. One of this problems can be explored in conjunction with the second presented method (semantic analysis). ``[T]he conversion to CNF will complicate any syntax-driven approach to semantic analysis''\cite[p.~475]{Jurafsky2009b}. A solution to this problem is some kind of post-processing in which the trees are converted back to the original grammar.\cite{Jurafsky2009b} Another option is to use a more complex dynamic programming algorithm that accepts any kind of context-free grammar. Such an algorithm is the ``Earley Algorithm''\cite[p.~477]{Jurafsky2009b}.
The syntax-driven semantic analysis, as it has been presented, is a powerful method that is easy to understand. But it has one essential problem. It relies upon an existing set of grammar rules with semantic attachments to them. In a real world example such a table would contain thousands of grammar rules. While it is relatively easy to compute the final meaning representation with such a given table, it is very hard to create the table in the first place. The difficulty to create this table is split into two main issues. The first one being that you must find a grammar specification that fits all your use cases. This problem applies for the syntactic parsing as well. The second issue is that one has to find out the semantic attachments to the grammar rules. The syntax-driven semantic analysis, as it has been presented, is a powerful method that is easy to understand. But it has one essential problem. It relies upon an existing set of grammar rules with semantic attachments to them. In a real world example such a table would contain thousands of grammar rules. While it is relatively easy to compute the final meaning representation with such a given table, it is very hard to create the table in the first place. The difficulty to create this table is split into two main issues. The first one being that you must find a grammar specification that fits all your use cases. This problem applies for the syntactic parsing as well. The second issue is that one has to find out the semantic attachments to the grammar rules.
@ -290,13 +292,23 @@
Both methods require a unique work for one specific usage. This unique workload is the grammar creation for the parsing and the extension of the grammar with semantic attachments for the semantic analysis. The less restricted the usage environment, the more complex the initial workload becomes. The same is true for the recurring workload inside one specific usage. Both methods require a unique work for one specific usage. This unique workload is the grammar creation for the parsing and the extension of the grammar with semantic attachments for the semantic analysis. The less restricted the usage environment, the more complex the initial workload becomes. The same is true for the recurring workload inside one specific usage.
Judging by the state-of-the-art of computer technology, parsing does still pose a significant challenge once the restricted field of programming languages is left. The semantic analysis as the second method in the chain has therefore even more problems to date. As the presented syntax-driven approach does only work with parse trees, a semantic analysis can only be undertaken once the syntactic parsing succeeds. The dream of a sentient AI capable of communicating with humans in natural language is therefore far from being reached, judging by the previously described problems alone (leaving out all other problems related with this dream). Judging by the state-of-the-art of computer technology, parsing does still pose a significant challenge once the restricted field of programming languages is left. The semantic analysis as the second method in the chain has therefore even more problems to date. As the presented syntax-driven approach does only work with parse trees, a semantic analysis can only be undertaken once the syntactic parsing succeeds.
The ambiguity remains one of the bigges issues for both methods. Especially the syntax-driven semantic analysis does only consider the semantic meaning alone. It's not it's fault as the analysis doesn't know the context. The presented approach looks at each sentence in a sandbox. The generated meaning representations are therefore only of limited use for a less restricted grammar.
\section{Conclusion} \section{Conclusion}
\label{sec:concl} \label{sec:concl}
Syntactic parsing is an important method on the way to understand natural language. The usage of dynamic programming algorithms circumvents many of the issues that classical top-down or bottom-up parsing algorithms face. Ambiguity is the most prominent of those issues. The best algorithm for context-free grammars is the CYK algorithm, which is a dynamic programming algorithm. But in practice it is very restricted, because it only works with grammars in CNF. But there are more complex dynamic programming algorithms that allow any kind of context-free grammar.
Semantic analysis is the second method in the chain to understand natural language and therefore important as well. There are different approaches to the analysis. One of them is the syntax-driven approach that depends on parse trees. This dependency creates a delay effect: As long as a certain peace of text cannot be parsed, it definitely can't be analyzed for it's semantic meaning either. This is not an issue for restricted environments like programming languages or a very restricted subset of a natural language's grammar. But it is a major issue for real natural language, because there already the parsing does pose significant challenges.
Looking into the future both methods require substantial improvements on the algorithm side to reach a point where understanding non-restricted natural languages becomes possible. But as it is right now it is not possible to create dialog systems that interact fully natural with humans. To make any kind of language interaction, the set of possible words and sentence structures must be restricted. But even if that is given (like in a flight check-in automaton), the computer has only a finite set of possible cases. The programmer can add tons of if-clauses or comparable statements to check for different cases but in the end it's all finite so that many of the user inputs must lead to the same output or no output at all. This fact has led to the current situation in which the most interaction with a computer happens via a restricted interface in which the user can only choose from a limited set of options (by clicking on a button, selecting an item of a list, etc.).
In addition the ambiguity of natural language is a major issue. Going back to the example in the introduction, the syntax-driven semantic analysis does only work properly if the semantic meaning of the input has no ambiguity. But even than the generated meaning representation does not represent the pragmatic meaning. A dialog system is therefore far from being reached, because every input of a human can have dozens of different meanings. The intended meaning can sometimes depend on a thought that this human had while typing the input. As the computer doesn't have the ability to read thoughts, it would be impossible for the computer to determine the intended meaning of the input.
In a mission critical environment this ambiguity could lead to catastrophic results, because the computer, simply put, ``didn't get it''. This risk limits the usability of natural language communication with a computer for propably a long time to a very restricted set of use cases.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% hier werden - zum Ende des Textes - die bibliographischen Referenzen % hier werden - zum Ende des Textes - die bibliographischen Referenzen
% eingebunden % eingebunden