MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C0DC83.40102400"
In-Reply-To:  Lars Hellström's message of "Mon, 14 May 2001 15:05:35 +0200"
Lines: 33
References: <l03130300b72583874ed7@[130.239.20.144]>
Content-class: urn:content-classes:message
Subject:      Re: Multilingual Encodings Summary 2.2
Date: Mon, 14 May 2001 15:36:16 +0100
Message-ID:  <r0sni8ngkf.fsf@sun06.ams.org>
From: "Michael John Downes" <mjd@AMS.ORG>
Sender: "Mailing list for the LaTeX3 project" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
To: "Multiple recipients of list LATEX-L" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
Reply-To: "Mailing list for the LaTeX3 project" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
Status: R

This is a multi-part message in MIME format.

------_=_NextPart_001_01C0DC83.40102400
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

> Javier Bezos wrote:
> >> >For example, removing the fi ligature in Turkish. Or using an alternate
> >> >ortography in languages with contextual analysis.
...
> >Actually, they aren't, but for some reason Knuth
> >very likely understands, this information is included
> >in the tfm files (text font *metrics*).

It is because the pattern-matching required to identify ligature points
is almost identical (in the standard English cases) to the
pattern-matching required for kerning.

For kerns between pairs of characters, it is natural to store the
information in the .tfm file because it is highly dependent on the glyph
shapes. Then one needs to run some sort of pattern-matching process to
catch pairs of characters in the typesetting sequence in order to insert
kerns where applicable. For ligatures a very similar pattern-matching
process is needed, only somewhat more generalized. (For more complicated
requirements one needs something even more general, like the Omega
OCPs.)

The idea of running two such similar processes as separate steps,
instead of combining them as far as possible into a single subroutine,
would doubtless have seemed horrifyingly wasteful to Knuth. This
processing falls within the fabled "inner loop" that he mentions so
often in tex.web as an area of special concern.

And then it is natural to store the ligature pattern data in the same
place (the .tfm file) to make using it as simple as possible.
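
To make the "single subroutine" point concrete, here is a minimal
sketch in Python. It is a toy illustration only, not the code in
tex.web and not the real .tfm layout: the table PAIR_TABLE, the
function ligkern, the glyph names and the kern amounts are all invented
for the example. What it shows is that one lookup per adjacent pair
serves both purposes.

```python
# Toy pair table; every name and value here is hypothetical.
# One table holds both kinds of entry, keyed by (left, right).
PAIR_TABLE = {
    ("f", "i"): ("lig", "fi"),      # replace the pair by one glyph
    ("f", "f"): ("lig", "ff"),
    ("ff", "i"): ("lig", "ffi"),    # ligatures can chain
    ("A", "V"): ("kern", -0.08),    # adjust space between the pair
    ("V", "A"): ("kern", -0.08),
}


def ligkern(chars, table=PAIR_TABLE):
    """Single left-to-right pass doing ligatures and kerns together."""
    items = list(chars)
    out = []
    i = 0
    while i < len(items):
        cur = items[i]
        nxt = items[i + 1] if i + 1 < len(items) else None
        entry = table.get((cur, nxt))
        if entry and entry[0] == "lig":
            # Merge the pair, then look at the merged glyph again so
            # that f + f + i can become ff and then ffi.
            items[i:i + 2] = [entry[1]]
            continue
        out.append(("glyph", cur))
        if entry and entry[0] == "kern":
            out.append(("kern", entry[1]))
        i += 1
    return out
```

Calling ligkern("office") yields the glyphs o, ffi, c, e, while
ligkern("AVA") yields A, V, A with a kern item between each pair. The
same table lookup drives both results, which is the economy described
above.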

But as a consequence, if one wants to typeset some material with
ligatures turned off, one has to call a different 'no-ligatures' font,
which tends to be a real hindrance in practice.
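
A rough sketch of why this is a hindrance, again in Python with
invented names (FONTS, without_ligatures, "roman", "roman-nolig") and
a toy table format that is not the real .tfm layout: when the ligature
entries live in the font's own table, the only available switch is the
font itself, so a second, stripped copy has to exist and be selected.

```python
def without_ligatures(table):
    """Return a copy of a pair table keeping only the kern entries."""
    return {
        pair: entry
        for pair, entry in table.items()
        if entry[0] == "kern"
    }


# Hypothetical font registry: each font carries its own pair table.
FONTS = {
    "roman": {
        ("f", "i"): ("lig", "fi"),
        ("A", "V"): ("kern", -0.08),
    },
}

# Turning ligatures off means creating and selecting another font.
FONTS["roman-nolig"] = without_ligatures(FONTS["roman"])
```

Every font for which ligatures may need to be suppressed would need
such a twin, and the document has to switch to it by name.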

------_=_NextPart_001_01C0DC83.40102400--