MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C09524.16304F80"
In-Reply-To:  <14984.9874.886663.642855@istrati.zdv.uni-mainz.de>
References: <v03110702b6ad97205449@[195.100.226.145]>            <v03110700b6aa13b0f08b@[195.100.226.132]>            <v03110702b6ad97205449@[195.100.226.145]>
Content-class: urn:content-classes:message
Subject:      Re: Why markup?
Date: Mon, 12 Feb 2001 19:44:31 +0100
Message-ID:  <v03110700b6addd26c875@[195.100.226.129]>
From: "Hans Aberg" <haberg@MATEMATIK.SU.SE>
Sender: "Mailing list for the LaTeX3 project" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
To: "Multiple recipients of list LATEX-L" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
Reply-To: "Mailing list for the LaTeX3 project" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
Status: R

This is a multi-part message in MIME format.

------_=_NextPart_001_01C09524.16304F80
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

At 19:08 +0100 2001/02/12, Frank Mittelbach wrote:
> > -- Again, if there was a better parser at hand, one would not need
> > to have any markup at all, because it would be able to see that `20'
> > is the object
>
>give up, unless you want to wait for a parser which "understands" human
>language (or write one but don't hand wave it)

Writing a parser for English is clearly a research project; see for
example
  http://lands.let.kun.nl/TSpublic/tosca/

One does not get very far with LALR(1) on this problem (parsing
English), and even less far when trying to work with a grammar that
depends on semantic information. Computer languages such as C and C++
are not strictly LALR(1), but can be made to parse in such chunks.
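
A well-known instance of a grammar depending on semantic information is
the C statement "a * b;", which is a declaration if `a' names a type and
an expression otherwise. The following is only a minimal sketch of that
dependency, written in Python for illustration; the function name
classify_statement and the typedef_names set are hypothetical and stand
in for a real parser's symbol table.

```python
# Illustrative sketch only: classify the C statement "a * b;" with and
# without knowledge of typedef names. All names here are hypothetical.

def classify_statement(tokens, typedef_names):
    """Return 'declaration' or 'expression' for tokens shaped like X * Y ;"""
    if len(tokens) == 4 and tokens[1] == "*" and tokens[3] == ";":
        # The token sequence alone is ambiguous; only the symbol table
        # (semantic information) says whether X names a type.
        if tokens[0] in typedef_names:
            return "declaration"  # b is declared as a pointer to type a
        return "expression"       # a is multiplied by b
    raise ValueError("unsupported statement shape")


tokens = ["a", "*", "b", ";"]
print(classify_statement(tokens, typedef_names={"a"}))
print(classify_statement(tokens, typedef_names=set()))
```

The same four tokens give two different parses, which is why a purely
syntactic LALR(1) grammar needs help from the symbol table here.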

>how do you do without markup in this case:
>
>  The $a$ in the formula is a variable

The usual remark on this: Can you parse it? :-) -- If you can parse it,
it must be possible. Right?

-- The general picture, though, is that the more general the grammars
the parser can handle, the less markup will be needed.
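
To make the trade-off concrete for the quoted sentence, here is a
minimal sketch, assuming a hypothetical helper split_math and Python as
the illustration language. With the $...$ markup a trivial splitter can
tell the math `a' from the article `a'; without it the same splitter
has nothing to go on, and a much more capable parser would be needed.

```python
import re

# Illustrative sketch only: split_math is a hypothetical helper, not part
# of TeX or of any existing library.

def split_math(text):
    """Split text into (kind, content) pairs, treating $...$ as math."""
    parts = []
    # Splitting on "$" puts math content at the odd positions.
    for i, chunk in enumerate(re.split(r"\$", text)):
        if chunk:
            parts.append(("math" if i % 2 else "text", chunk))
    return parts


marked = "The $a$ in the formula is a variable"
unmarked = "The a in the formula is a variable"

print(split_math(marked))    # the first "a" is tagged as math
print(split_math(unmarked))  # one text chunk; both "a"s look alike
```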

>while contrived, I came across that particular problem in math when I
>tried to understand an article in Hungarian (I think) about number
>theory and mistook an "a" being text as part of a longer inline formula
>because it was incorrectly coded (by you perhaps?), i.e. not easily
>identifiable as math, not text.

Sorry, I don't know Hungarian.

  Hans Aberg

------_=_NextPart_001_01C09524.16304F80--