From: "Michael John Downes"
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"
Reply-To: "Mailing list for the LaTeX3 project"
Subject: Re: Multilingual Encodings Summary 2.2
Date: Mon, 14 May 2001 15:36:16 +0100
In-Reply-To: Lars Hellström's message of "Mon, 14 May 2001 15:05:35 +0200"

> Javier Bezos wrote:
> >> >For example, removing the fi ligature in Turkish. Or using an alternate
> >> >orthography in languages with contextual analysis.
...
> >Actually, they aren't, but for some reason Knuth
> >very likely understands, this information is included
> >in the tfm files (text font *metrics*).

It is because the pattern-matching required to identify ligature points
is almost identical (in the standard English cases) to the
pattern-matching required for kerning.

For kerns between pairs of characters, it is natural to store the
information in the .tfm file because it is highly dependent on the glyph
shapes. Then one needs to run some sort of pattern-matching process to
catch pairs of characters in the typesetting sequence in order to insert
kerns where applicable. For ligatures a very similar pattern-matching
process is needed, only somewhat more generalized. (For more complicated
requirements one needs something even more general, like the Omega
OCPs.)
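
To make the similarity concrete, here is a rough Python sketch of a single
scan over adjacent character pairs that handles both kerns and ligatures.
It is only a simplified model under my own assumptions: the table layout,
the name apply_lig_kern, and the kern amounts are all hypothetical, and
this is not the actual .tfm lig/kern program format or the code in tex.web.

```python
# Hypothetical per-font table keyed by (left, right) character pairs.
# Kern amounts are made-up values in arbitrary units.
LIG_KERN = {
    ("f", "i"): ("lig", "\ufb01"),       # f + i   -> fi ligature
    ("f", "f"): ("lig", "\ufb00"),       # f + f   -> ff ligature
    ("\ufb00", "i"): ("lig", "\ufb03"),  # ff + i  -> ffi ligature
    ("A", "V"): ("kern", -0.08),
    ("V", "A"): ("kern", -0.08),
}


def apply_lig_kern(text, table):
    """One left-to-right pass that looks up each adjacent pair once.

    Returns a list of glyph strings with kern amounts (floats)
    inserted between them.
    """
    chars = list(text)
    items = []
    i = 0
    while i < len(chars):
        left = chars[i]
        rule = None
        if i + 1 < len(chars):
            rule = table.get((left, chars[i + 1]))
        if rule is not None and rule[0] == "lig":
            # Replace the pair by the ligature and re-examine it against
            # the next character (this is how f, f, i can become ffi).
            chars[i:i + 2] = [rule[1]]
            continue
        items.append(left)
        if rule is not None and rule[0] == "kern":
            items.append(rule[1])
        i += 1
    return items


print(apply_lig_kern("AVfi", LIG_KERN))
print(apply_lig_kern("fi", {}))  # an empty table leaves the pair alone
```

The same pair lookup serves both cases; only the action differs. In this
model, suppressing ligatures means handing the scan a different table.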

The idea of repeating such similar processing in two separate steps
instead of combining them as much as possible into a single subroutine
would doubtless have seemed horrifyingly wasteful to Knuth. This would
fall within the fabled "inner loop" that he mentions so often in
tex.web as an area of special concern.

And then it is natural to store the ligature pattern data in the same
place (the .tfm file) to make using it as simple as possible.

But consequently if one wants to typeset some material with ligatures
turned off, the need to call a different 'no-ligatures' font tends to be
a real hindrance in practice.
