From: Frank Mittelbach
To: Multiple recipients of list LATEX-L
Subject: Re: default inputenc/fontenc tight to language
Date: Thu, 1 Feb 2001 10:48:37 +0100

Denis,

 > Well, when I brought up that issue, it was not pure theory.

didn't expect it otherwise.

 > (inputenc2 is a variant of inputenc where you can switch the input encoding
 > within a paragraph; it is possible that there is a standard package
 > achieving this now)

think it is called inputenc these days

 > Yes, I had such a problem with deferred layouts like the table of contents.
 > For instance, say you have a French, then a Russian section. If you write
 >
 >   switch encoding to French
 >
 >   \tableofcontents
 >
 >   \section{French}
 >
 >   switch encoding to Russian
 >
 >   \section{Russian}
 >
 > you'll end up with `French' appearing in Cyrillic, because
 > the state at the end of the \tableofcontents is not restored.
 > You have to add an explicit change of encoding, for instance after
 > \tableofcontents, or at the end of your document.

basically what you are saying is that moving text needs to keep information
about its state with it, right? it fortunately doesn't need to keep
information about its input encoding since that got all normalised into the
internal representation but unfortunately you need to keep information about
the encoding used (or rather the encoding intended)

a bit inconsistent that, isn't it? but would it help if the language has a tie
to the encoding?

i think current babel would handle the toc example right (if the output
encoding is set by the language) but i guess this would not be true for mark
entries.
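
the toc case could be tried out with something like the following. it is only a minimal sketch, assuming babel with the russian and french options and the T2A and T1 font encodings are installed; the section titles are made up. babel records language switches in the .toc file, so each entry should be set under its own language; the explicit \selectlanguage after \tableofcontents is the manual reset suggested in the quoted mail and is kept here as a precaution.

```latex
\documentclass{article}
\usepackage[T2A,T1]{fontenc}        % T1 (listed last) is the document default
\usepackage[russian,french]{babel}  % french (listed last) is the main language
\begin{document}
\tableofcontents
\selectlanguage{french}  % explicit reset after the toc, as suggested in the quoted mail

\section{Section fran\c{c}aise}

\selectlanguage{russian}
% "Razdel" (section), typed as encoding-specific commands instead of 8-bit input
\section{\CYRR\cyra\cyrz\cyrd\cyre\cyrl}
\end{document}
```

the russian title is written in the internal representation (ascii plus encoding-specific commands), so the example does not depend on any input encoding.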


concerning the output encoding: assuming there is something like
\languagefontencoding which is either unset or set (per language) and if set
results in a change to the font encoding whenever a language switch happens.
(if unset for a language that should probably mean revert to the document
default encoding).
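
one way such a declaration might look is sketched below. \languagefontencoding is hypothetical here: it is not part of babel or the kernel, and the two-argument syntax is an assumption. the sketch hooks into babel's \extras<language> and \noextras<language> via \addto; german tied to T1 and OT1 as document default are arbitrary choices for illustration.

```latex
\documentclass{article}
\usepackage[T1,OT1]{fontenc}          % OT1 (listed last) is the document default
\usepackage[ngerman,english]{babel}   % english (listed last) is the main language

% hypothetical helper, not part of babel or the LaTeX kernel:
% \languagefontencoding{<language>}{<encoding>}
\newcommand*\languagefontencoding[2]{%
  % on entering <language>: select the encoding tied to it
  \expandafter\addto\csname extras#1\endcsname{\fontencoding{#2}\selectfont}%
  % on leaving <language>: revert to the document default encoding
  \expandafter\addto\csname noextras#1\endcsname
    {\fontencoding{\encodingdefault}\selectfont}}

\languagefontencoding{ngerman}{T1}    % german is tied to T1; english stays unset

\begin{document}
english text in the document default encoding.
\selectlanguage{ngerman}
deutscher Text, f\"ur den T1 gew\"ahlt wird.
\selectlanguage{english}
english again, back in the default encoding.
\end{document}
```

a language without a declaration is simply left alone in this sketch, which amounts to "revert to the document default" only because every declared language resets the encoding when it is left.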

for languages like Russian things are relatively clear (though not really
either) since due to the limitation of TeX we are forced to select an
encoding, so there the language might as well provide a default like T2A, but
for German this is not the case, ie one can type in OT1, T1, or even OT4.

a scheme like the above would then do well enough but the user
still has to make some font encoding decisions (like what is the default font
encoding, or do i want T1 with english ...)

however it would be far nicer if TeX (or LaTeX) would be able to take a bit of
text written in the internal LaTeX representation (plus language tags), ie
ascii + font encoding specific commands, and automagically figure out behind
the scenes how to typeset the lot. only i can't see how to make this happen
more automatically with the above scheme, which requires a) language tags and
b) potentially customisation in the preamble.

a lot of people find (and i agree) that

 \usepackage[T1]{fontenc}

is already too much to expect (and difficult or impossible to explain) --- and
why should a user be concerned with it?

but then, the same people have diametrically opposed ideas about what the right
font encoding is for a certain language: just look at the different opinions on
this list concerning French and OT1 or T1.

so we have to offer a choice, question is, is there a better way to present
it?


frank
