MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C08C34.60682580"
In-Reply-To:  <200101312200.XAA09346@bar.loria.fr>
References: <14968.34118.306909.315983@istrati.zdv.uni-mainz.de>            <200101312200.XAA09346@bar.loria.fr>
Content-class: urn:content-classes:message
Subject:      Re: default inputenc/fontenc tight to language
Date: Thu, 1 Feb 2001 10:48:37 +0100
Message-ID:  <14969.12533.759505.917813@istrati.zdv.uni-mainz.de>
From: "Frank Mittelbach" <frank.mittelbach@LATEX-PROJECT.ORG>
Sender: "Mailing list for the LaTeX3 project" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
To: "Multiple recipients of list LATEX-L" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
Reply-To: "Mailing list for the LaTeX3 project" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
Status: R

This is a multi-part message in MIME format.

------_=_NextPart_001_01C08C34.60682580
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Denis,

 > Well, when I brought up that issue, it was not pure theory.

didn't expect it otherwise.

 > (inputenc2 is a variant of inputenc where you can switch the input
 > encoding within a paragraph; it is possible that there is a standard
 > package achieving this now)

think it is called inputenc these days

 > Yes, I had such a problem with deferred layouts like the table of
 > contents. For instance, say you have a French, then a Russian section.
 > If you write
 >
 >   switch encoding to French
 >
 >   \tableofcontents
 >
 >   \section{French}
 >
 >   switch encoding to Russian
 >
 >   \section{Russian}
 >
 > you'll end up with `French' appearing in Cyrillic, because
 > the state at the end of the \tableofcontents is not restored.
 > You have to add an explicit change of encoding, for instance after
 > \tableofcontents, or at the end of your document.

basically what you are saying is that moving text needs to carry
information about its state with it, right? fortunately it doesn't need
to carry information about its input encoding, since that has all been
normalised into the internal representation, but unfortunately it does
need to carry information about the output encoding used (or rather the
encoding intended).

a bit inconsistent that, isn't it? but would it help if the language had
a tie to the encoding?

i think current babel would handle the toc example right (if the output
encoding is set by the language) but i guess this would not be true for
mark entries.
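
for concreteness, here is a minimal sketch of the toc case. the package
options and section titles are made up for illustration, and it assumes
the babel russian support and T2A fonts are installed. the Russian title
is written in the internal representation so that the file stays ascii.
whether the entry picks up T2A through a language switch recorded in the
.toc file depends on the babel version, so it is worth looking at the
generated .toc file rather than taking this on trust.

```latex
\documentclass{article}
\usepackage[T2A,T1]{fontenc}        % T1 is the document default
\usepackage[russian,french]{babel}  % french is the main language

\begin{document}
\tableofcontents

\section{Section fran\c{c}aise}

\selectlanguage{russian}
% Cyrillic title in the internal representation
\section{\CYRR\cyra\cyrz\cyrd\cyre\cyrl}

\end{document}
```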


concerning the output encoding: assume there is something like
\languagefontencoding which is either unset or set (per language) and
which, if set, results in a change of the font encoding whenever a
language switch happens. (if unset for a language, that should probably
mean reverting to the document default encoding.)
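
a rough sketch of how such a declaration might look, not an existing
babel interface: \languagefontencoding is the hypothetical name from
above, given a two-argument syntax (language, encoding) purely for
illustration, and built on babel's \addto together with the
\extras<language> and \noextras<language> hooks. the choice of german
with T1 over an OT1 default is only an example.

```latex
\documentclass{article}
\usepackage[T1,OT1]{fontenc}        % T1 loaded, OT1 stays the default
\usepackage[german,english]{babel}  % english is the main language

% hypothetical helper: \languagefontencoding{<language>}{<encoding>}
\newcommand*\languagefontencoding[2]{%
  % on entering the language, switch to its encoding
  \expandafter\addto\csname extras#1\endcsname
    {\fontencoding{#2}\selectfont}%
  % on leaving it, revert to the document default
  \expandafter\addto\csname noextras#1\endcsname
    {\fontencoding{\encodingdefault}\selectfont}}

\languagefontencoding{german}{T1}

\begin{document}
English text in the document default encoding.

\selectlanguage{german}
Deutscher Text, gesetzt in T1.

\selectlanguage{english}
Back to the document default.
\end{document}
```

a language without such a declaration is simply left alone here, which
matches the "unset means document default" reading only as long as
every language that does set an encoding also resets it on exit.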

for languages like Russian things are relatively clear (though not
really either): due to the limitations of TeX we are forced to select an
encoding, so there the language might as well provide a default like
T2A. but for German this is not the case, ie one can typeset in OT1, T1,
or even OT4.

a scheme like the above would then do well enough, but the user still
has to make some font encoding decisions (like what is the default font
encoding, or do i want T1 with english ...)

however it would be far nicer if TeX (or LaTeX) were able to take a bit
of text written in the internal LaTeX representation (plus language
tags), ie ascii + font encoding specific commands, and automagically
figure out behind the scenes how to typeset the lot. only i can't see
how to make this happen more automatically with the above scheme, which
requires a) language tags and b) potentially customisation in the
preamble.

a lot of people find (and i agree) that

 \usepackage[T1]{fontenc}

is already too much to expect (and difficult or impossible to explain),
and why should a user be concerned with it?

but then, the same people have diametrically opposed ideas about what
the right font encoding for a given language is: just look at the
different opinions on this list concerning French and OT1 or T1.

so we have to offer a choice. the question is: is there a better way to
present it?


frank

------_=_NextPart_001_01C08C34.60682580--