From: "Frank Mittelbach"
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"
Reply-To: "Mailing list for the LaTeX3 project"
Subject: Re: inputenc text (and/or math)
Date: Sun, 11 Feb 2001 22:20:07 +0100
Message-ID: <14983.519.363031.751628@istrati.zdv.uni-mainz.de>
In-Reply-To: <14980.23750.628032.305093@gargle.gargle.HOWL>
References: <200102091445.JAA00482@plmsc.psu.edu> <200102091643.RAA23818@mozart.ujf-grenoble.Fr> <14980.23750.628032.305093@gargle.gargle.HOWL>

Back to Marcel's original mail:

 > - I believe the only reasonable default _input encoding_ is UTF8.
 >   Being a superset of ASCII while covering all of unicode, it seems
 >   the ideal long-time solution to all input-encoding related problems.
 >   UTF8 is also becoming rather well supported by editors and other
 >   applications.

i can agree to seeing it (perhaps) becoming a default input encoding one day;
it certainly has some appealing features.

 > - In particular, defaulting to any other 8-bit input encoding in
 >   LaTeX2e should be avoided at all cost because it would really mess
 >   up the upgrade path to UTF8 later.  (As far as I understand, the

no problem for me as i would like to keep the default input encoding "ASCII"
for the moment

 >   proposed default of \usepackage[T1]{fontenc} does not default any
 >   8bit input encoding.  Is this correct?)

T1 is a font encoding and irrelevant with respect to input encodings, unless
you are Thierry and misuse the fact that some of the font positions fit with
some 8bit input encoding positions in your code page on your machine so that
you can pass them right through :-)
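
(A minimal sketch of the distinction, using the standard inputenc and fontenc packages. The latin1 option is only an illustration of "some 8-bit input encoding"; the thread itself keeps ASCII as the default.)

```latex
\documentclass{article}
% Input encoding: how bytes in the source file are mapped to LaTeX's
% internal character representation, e.g. byte 0xE4 -> \"a under latin1.
\usepackage[latin1]{inputenc}
% Font encoding: which glyph sits in which slot of the font that is
% used for typesetting. T1 is the Cork encoding.
\usepackage[T1]{fontenc}
\begin{document}
% Pure ASCII input like this works regardless of the inputenc setting:
Stra\ss e, na\"{\i}ve, G\"odel
\end{document}
```

The two declarations are independent: changing the inputenc option changes how the file is read, while changing the fontenc option changes which font slots are used for output.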


 > - A regular user should never have to specify the _font encoding_.

that would be the ideal, but unfortunately we are currently far from it.
which is what we really should talk about (how to get there)


 >   There should only be language environments (as provided by babel)
 >   and font packages (e.g. times, palatino).  This means:
 >
 >   * Babel (or something providing equivalent functionality---I
 >     strongly believe that it should become part of core LaTeX3) must be

not Babel (though Babel is core of 2e as it is supported by a core team
member) since babel is trying to support all kinds of legacy things, which
is what it should do in the current situation but which makes it unsuitable
to make it unchanged into the kernel of a 2e successor.

otherwise yes.

 >     endowed with a default set of fonts for all languages it supports.

with some sort of multiple encoding concept like the one outlined in my
handwaving implementation, this might in fact be possible

 >     Some language environment defaults could be marked experimental,
 >     meaning that associated fonts and TFMs may change once better
 >     quality free fonts become available, but all languages must work
 >     "out of the box".

do i hear a volunteer crying out "I want to help"?


 >     On the other hand, languages like german for
 >     which the EC fonts are well accepted (?), could be frozen straight
 >     away.

Not really. but anyway. if you want to hear a provocative statement (to get
some more mail tonight):

  are we sure we want to stay with T1?

I mean, don't we know by now that a number of ideas we had when we came up
with T1 (i.e. the Cork encoding) are full of flaws? (yes I was one of the guys
being part of the group of people deciding on it, so yes i can blame myself
:-)

should it perhaps be LY1 suitably renamed (or something similar)?


 >   * The language environment chooses the default font encoding unless
 >     a font package is explicitly loaded.  There may be more than one

what is that supposed to mean? (the "unless" part)

 >     language environment per language if different typographical
 >     esthetics need to be satisfied.

perhaps, as to the second part. i guess i would think this should be modelled
differently, but perhaps we are thinking of the same thing and expressing it
differently

 >   * Babel must hook into the currently active font package.  If a
 >     language environment is selected, the font package must be called
 >     to set itself up.  In other words, every font package must make a
 >     decision about encoding as a function of the language selected.

i guess that is impractical.

 a) it would mean that each such font package would need to know about any
language ever being added to the language support of LaTeX or worse about
every language environment as you called it above. otherwise you would end up
with a very complex set of defaults which are 99% of the time used anyway and
probably in 50% not suitable.

 b) it would be next to impossible to change the behaviour of the
system as everything is happening at different levels.

 c) finally i think it is putting the saddle on from the back (in case that is
a phrase that translates :-). i mean a font package is first of all a font
package and as such should provide fonts. it is up to the language support
/language environment to decide which of those fonts it is interested in
using and in what way, not the other way around.



 >     If the language is unknown to the font package, a warning or an
 >     error must be issued.  (I am sure the set of supported
 >     language-font pairs will grow quickly if a good mechanism for
 >     soliciting contributions is implemented.)

probably as fast as the support for varioref strings over the years.

 >   * Maybe one can introduce commands like
 >       \uselanguage{spanish}
 >       \usefont{times}
 >     and autoload the necessary packages, to make clear that these
 >     attributes function orthogonally to each other and to "ordinary"
 >     packages.

ahh, do they? then why all the discussion above and before?

they are orthogonal right now and that is something some people think should
at least be customizable.
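
(A minimal sketch of how the proposed commands could be tried out today as thin preamble wrappers around existing packages. Both wrappers are hypothetical. Note that \usefont already exists in the LaTeX2e kernel with a different, four-argument meaning, \usefont{encoding}{family}{series}{shape}, so the font wrapper is given the made-up name \usetypeface here.)

```latex
\documentclass{article}
% Hypothetical wrappers, usable in the preamble only.
% \uselanguage{<language>} loads babel with that language option.
\newcommand*{\uselanguage}[1]{\usepackage[#1]{babel}}
% \usetypeface{<package>} loads a font package such as times or palatino.
% (Named \usetypeface because the kernel already defines \usefont.)
\newcommand*{\usetypeface}[1]{\usepackage{#1}}

\uselanguage{spanish}
\usetypeface{times}
\begin{document}
Texto de ejemplo.
\end{document}
```

Written this way the two choices are fully orthogonal: neither wrapper knows about the other, which is exactly the current situation described above.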


 > - Is there really a need for breaking the distinction of math mode
 >   vs. non-math mode?

guess there are at least two people on the list who should have some say
here (Vladimir and Apostolos) as both of them have written support for
removing the distinction and would like to see it go officially.


 >   As far as Greek letters go, the most common one
 >   is $\mu$ in units.

in German or English texts you mean? :-)

perhaps that is slightly different in Greek texts? perhaps? :-)

 >   This raises the question if one should not
 >   provide standard markup for units anyway (some journal packages are
 >   doing it---there are also spacing issues involved that warrant
 >   special treatment), for example as a "tools" package in the standard
 >   LaTeX distribution.

that is something which we all thought would be useful but never got around
to doing. but just providing standard names would be beneficial already.
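
(A minimal sketch of what "standard names" for units might look like. The command names \quantity and \micro are hypothetical and not part of any standard LaTeX bundle; the thin space \, stands in for the spacing issue mentioned in the quote.)

```latex
\documentclass{article}
% Hypothetical names, for illustration only.
% \quantity{<number>}{<unit>} sets the number, a thin space, and the
% unit in upright type; \ensuremath makes it usable in text and math.
\newcommand*{\quantity}[2]{\ensuremath{#1\,\mathrm{#2}}}
% \micro is just \mu here; \mathrm does not change it, so it is still
% taken from the math italic font (an upright mu would need another font).
\newcommand*{\micro}{\mu}
\begin{document}
A spot size of \quantity{5}{\micro m} at \quantity{300}{K}.
\end{document}
```

With names like these the author never types $\mu$ directly for a unit, so the markup stays the same whatever the final decision on math versus text mode turns out to be.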


 > - For special needs, such as easy typing of cyrillic math in 7bit
 >   ASCII one could provide special input encodings.

yes, that would be my line too, except for:

 >   In full unicode
 >   this shouldn't be a problem, should it?

since i don't see what it has to do with the line before.


 > I am aware that some of these demands cannot really be met within
 > Knuthian TeX, but it seems LaTeX3 is prepared to eventually go beyond
 > TeX.  So it may be useful to define a minimal set of required
 > extensions/changes, as this issue could be a major roadblock to
 > enlarging the developer base.  For example, is there much motivation
 > for anybody to clean up the hyphenation mess before a clean long-term
 > solution (not just a work-around) is agreed on?

i think i started from that end at some point in the afternoon (so better not
say anything contradicting myself :-)

 > Just some ideas,

quite important ones i would say

good night
frank
