From: "Frank Mittelbach"
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"
Subject: Re: default inputenc/fontenc tight to language
Date: Wed, 7 Feb 2001 21:41:53 +0100
Message-ID: <14977.45841.640881.805735@istrati.zdv.uni-mainz.de>
References: <14975.56331.365469.731085@istrati.zdv.uni-mainz.de>

 > > who will? the user groups? for many languages there isn't a user group
 >
 > There are many interested experts around for those languages without a
 > user group. One of the gathering places is the Omega mailing list.

i know that, but that doesn't mean that any of those groups, or the user
groups, are necessarily qualified to decide a standard.

just pick a random example from my varioref package: i have language support
in there contributed by users, and this gets changed every now and then
because i get claims that such and such is not the right phrasing. how should
I decide if people from a single country claim their wording doesn't sound
correct?

and changing the default midway (as i did in the case of varioref several
times) is really bad since it makes old documents invalid. but i had to change
because it turned out that one or the other phrasing was indeed incorrect.

you can argue that a standard defined by those people interested is better
than none. but it is also true that if at all possible you should stick with a
default once decided. so the problem is to find out when you are likely to
have enough data to make a decision.

so to come back to inputencs (which the above really was about):

 - right now LaTeX by default lets 8bit chars pass if inputenc is not
   loaded. this is an unfortunate fact of life: no package, only a kernel
   modification, could change that, and within 2e there will be no such
   kernel modification, so we have to live with that for the moment.

 - but i consider this really problematic because the upper part of 8bit is
   unknown territory, and i do not subscribe to Thierry's approach of using
   straight 8bit plus a T1 encoded font and hoping all works out well. it is
   true that for certain languages (including Thierry's and my own) it does
   work if i'm on the right kind of computer, but for others it does not, and
   it certainly wouldn't work if the font encoding mechanisms were extended
   to allow switching encodings according to font availability, as suggested.

 - one can summarize the current situation as follows: LaTeX defines a default
   which is "pass whatever is coming straight to the font encoding", and that
   requires the input encoding used and the font encoding to be the same, and
   it limits the use of fonts very drastically. it is a straight
   extension of what Don did with 7bit, with the slight difference that for
   7bit most keyboard encodings are identical.
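
to make the current default concrete, here is a minimal sketch (an illustration only; it assumes the file is saved in ISO-8859-1 and that inputenc is deliberately not loaded):

```latex
% no inputenc: every byte above 127 is handed to the font unchanged
\documentclass{article}
\usepackage[T1]{fontenc} % the font encoding now doubles as the input encoding
\begin{document}
Éric % correct only if the byte for É sits in the slot T1 uses for É
\end{document}
```

with a 7bit font encoding such as the default OT1 the same byte typically points at no glyph at all, so the letter is dropped with nothing more than a note in the log file.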

I would propose that a follow-up kernel (call it ltx3 or whatever, eg a
consolidated version emerging one day from the currently developed x...
packages) would by default make the upper half an error if no input encoding
is specified. Sorry Thierry :-) but you shouldn't feel that bad about it: a)
i'm known to change my mind and b) processors are so fast these days that you
can really work without problems with something like inputenc; you will not
notice it.

in that case you get access to 8bit characters only by specifying an
input/keyboard encoding, but at the same time you are assured that the
document contains all the necessary information to actually process it
correctly elsewhere, and you do not have the potential problem, reported by
Éric, that users do not notice that half their letters (ie those with accents)
vanished. they wouldn't vanish, they would produce error messages.
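
as a sketch of what such a self-describing document looks like with today's packages (latin1 is only an assumption for this example; the option has to name the encoding the file was really saved in):

```latex
\documentclass{article}
\usepackage[latin1]{inputenc} % states how the bytes above 127 in this file are to be read
\usepackage[T1]{fontenc}      % output encoding, now free to differ from the input encoding
\begin{document}
Éric
\end{document}
```

because inputenc translates the 8bit character into LaTeX's internal accent notation, the same source should also work with the fontenc line removed, ie with OT1 fonts.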

now to provide default input encodings depending on language would help a
certain number of people to leave out *one* line in the preamble of
the document (and if you are lucky with your choice, the larger part of the
LaTeX users), but at the same time it would mean that people who naively just
use any key on their keyboard, but have a keyboard incompatible with the
default, would run into exactly the same problem Éric reported: they would now
get wrong output without noticing. So then, perhaps not Éric but somebody else
would rightly moan about such stupid defaults which make it likely that people
get incorrect documents. so in my opinion there should be no default for input
encodings other than the one which is currently called "ascii" in inputenc and
which makes any 8bit an error.
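
the behaviour proposed here as the default can already be requested explicitly; a minimal sketch using the existing "ascii" option of inputenc:

```latex
\documentclass{article}
\usepackage[ascii]{inputenc} % 7bit only: any character above 127 is reported as an error
\begin{document}
\'Eric % accents entered in 7bit notation remain available
\end{document}
```

typing the É directly in this file would stop the run with an inputenc error instead of silently producing wrong output.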

the above is only about input encodings; as I said earlier, the situation for
output encodings is different: there are already defaults in current Babel,
and in the implementation i'm working on they will get more generalised,
trying to take into account the problems discussed concerning the use or
non-use of certain encodings for certain fonts.

the main problem i see with defaults for output encodings is that for languages
like French or German there isn't really a good default because you will
always have a large user group which is dead against one or the other, eg T1
vs OT1. for other languages it is simpler. however, this is more a political
than a technical question, ie who doesn't like THEM the day they make X for
language Y the default ... :-)

frank
