Subject: Re: accents and inputenc
From: Heiko Oberdiek
Sender: Mailing list for the LaTeX3 project
Date: Mon, 5 Jul 2004 14:30:06 +0100

On Mon, Jul 05, 2004 at 07:31:34AM +0200, Werner LEMBERG wrote:

> [LaTeX 2e 2003/12/01]
>
> Is the following a known limitation or a bug?  And if it is a
> limitation, where is it documented?
>
>   \documentclass{article}
>
>   \usepackage[latin3]{inputenc}
>
>   \begin{document}
>   \tableofcontents
>   \section{\'^^b9}
>   \end{document}
>
> ^^b9 is the dotless i in latin 3 -- in the TOC, the accent is
> formatted incorrectly.  BTW, it doesn't matter whether OT1 or T1 is
> used.

Package inputenc translates the input characters that it controls
into TeX code: ^^b9 becomes:
  \show^^b9
  ->\IeC {\i }
That is actually four tokens instead of the single ^^b9 token.

This goes into the .aux and .toc files:
  \contentsline {section}{\numberline {1}\'\IeC {\i }}{1}
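
When that line is read back, \' takes only the next token as its
argument. A minimal sketch of the effect, where \showarg is a
hypothetical helper (not part of LaTeX or inputenc) that takes one
undelimited argument, just as \' does, and writes it to the log.
\detokenize assumes an e-TeX based engine:
  \documentclass{article}
  \begin{document}
  % \showarg is a hypothetical helper for illustration only
  \newcommand\showarg[1]{\typeout{argument: \detokenize{#1}}}
  \showarg\IeC{\i}     % the argument is \IeC alone; {\i} stays in the input
  \showarg{\IeC{\i}}   % the argument is the whole group, all four tokens
  \end{document}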

The function of \IeC is that spaces after the character
are detected correctly:
  ^^b9 foobar     --> space between
  \i foobar       --> no space
  \IeC{\i} foobar --> space between
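
The underlying TeX rule can be tried without inputenc. A small
sketch, assuming nothing beyond the article class:
  \documentclass{article}
  \begin{document}
  \i foobar    % the space after the control word \i is skipped
  \i{} foobar  % the empty group ends the control word, the space survives
  {\i} foobar  % a group around \i has the same effect
  \end{document}
The braces that \IeC carries along play the role of the group in the
last line once the text has been written to a file and is read again.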

Because of the four tokens you need braces around such characters:
  \section{\'{^^b9}}
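
Applied to the example from the original question, the complete test
file would look like the sketch below. The second \section is an
alternative that I would expect to give the same result, because it
names the dotless i directly and does not depend on the input
encoding at all:
  \documentclass{article}
  \usepackage[latin3]{inputenc}
  \begin{document}
  \tableofcontents
  \section{\'{^^b9}}  % braces make the four tokens one argument of \'
  \section{\'{\i}}    % alternative: no 8-bit character involved
  \end{document}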

Of course it is possible to change the behaviour of inputenc:
The translation into TeX code could be deferred in protecting
environments, so that the 8-bit character itself goes into the .aux
and .toc files:
  \contentsline {section}{\numberline {1}\'^^b9}{1}

The disadvantage of this approach is that the \section command
and \tableofcontents are processed at different times, perhaps with
different input encodings. Then the wrong input encoding can
apply to the section title in the table of contents. Changes
of the input encoding would therefore have to be recorded in the
.toc file, too.
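
A rough sketch of what such a record could look like, assuming a
hypothetically modified inputenc that writes raw 8-bit characters.
The current package does not need this. \inputencoding and
\addtocontents are the existing commands; pairing them like this is
only my assumption of how it might be done:
  % hypothetical: only needed if 8-bit characters reached the .toc
  \inputencoding{latin3}
  \addtocontents{toc}{\protect\inputencoding{latin3}}
  \section{\'{^^b9}}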

Yours sincerely
  Heiko <oberdiek@uni-freiburg.de>
