From: "Dominique Unruh"
Reply-To: "Mailing list for the LaTeX3 project"
Subject: Re: latex/3480: Support for UTF-8 missing in inputenc.sty
Date: Fri, 17 Jan 2003 23:33:46 +0100

> [aside, where is that file, the one that i have here is very short
> and doesn't contain \euro but neither looks like a proper encoding
> file either]

Yes. I'm sorry, I seem to have mixed things up. It was the eurofont package
that used the \euro command. But it shows the problem nevertheless.

> > [about incompatible double definitions of Unicode chars]
> being pragmatic i believe that these get weed out after a while, the reason
> for suggesting a .dfu file approach is that this allows easy extensions for
> locally developed encodings.

In this case there should be at least some checking of redefinition,
like

\def\DeclareUnicodeCharacter...{%
...
\if@alreadydefined \ifx\olddefinition\newdefinition\else
  \output@fierce@warning@or@error\fi\fi
...}
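
A minimal sketch of such a check, under stated assumptions: the wrapper name \CheckedDeclareUnicodeCharacter, the storage macros \uc@map@<hex code>, and the package name "ucscheck" in the warning are all invented for illustration and are not how either utf8.def actually stores its mappings.

```latex
\makeatletter
% Hypothetical wrapper: \CheckedDeclareUnicodeCharacter{<hex code>}{<LICR>}
% records the mapping in \uc@map@<hex code> (a name made up for this sketch).
\newcommand\CheckedDeclareUnicodeCharacter[2]{%
  \def\uc@new{#2}%                 candidate definition, kept unexpanded
  \@ifundefined{uc@map@#1}{%
    % First declaration for this code point: just record it.
    \expandafter\def\csname uc@map@#1\endcsname{#2}%
  }{%
    % Already declared: compare old and new replacement texts with \ifx.
    \expandafter\ifx\csname uc@map@#1\endcsname\uc@new
      % Identical redefinition: harmless, stay silent.
    \else
      \PackageWarning{ucscheck}{Conflicting mapping for U+#1}%
      % Assumption: the later declaration wins after the warning.
      \expandafter\def\csname uc@map@#1\endcsname{#2}%
    \fi
  }%
}
\makeatother
```

With this in place, declaring {00E4}{\"a} twice stays silent, while a later {00E4}{\"A} triggers the warning. Whether a conflict should be a warning or an error, and which definition should win, is a policy choice left open here.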

> > [about generating .dfu files by a script]
> a ucs.map file that contains the mappings Unicode->LICR in the form directly
> usable in .dfu files, simply as a template for making a .dfu if really
> necessary.
>
> perhaps using docstrip to generate the standard dfu files from that
> file

Yes. I like that approach. In fact, it amounts to implementing the
"script" in LaTeX itself.
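
As a rough sketch of how that could look, assuming a single ucs.map source whose lines carry docstrip guards naming the font encodings they apply to (the guard names t1 and ot1 and the file name ucsmap.ins are assumptions, and the two mappings are only examples):

```latex
% Excerpt from a hypothetical ucs.map: each line is guarded by the
% encodings for which the LICR on the right-hand side is available.
%<t1,ot1>\DeclareUnicodeCharacter{00E4}{\"a}
%<t1>\DeclareUnicodeCharacter{00F0}{\dh}
```

```latex
% ucsmap.ins: hypothetical docstrip batch file that extracts one
% .dfu file per encoding from the guarded lines in ucs.map.
\input docstrip
\keepsilent
\askforoverwritefalse
\nopreamble
\nopostamble
\generate{%
  \file{t1enc.dfu}{\from{ucs.map}{t1}}%
  \file{ot1enc.dfu}{\from{ucs.map}{ot1}}%
}
\endbatchfile
```

Running tex on ucsmap.ins would then write t1enc.dfu with both lines and ot1enc.dfu with only the first one. A locally developed encoding would need just one more guard name and one more \file line.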

>  > There are already extensive lists of character mappings available at:
>  > http://www.unruh.de/DniQ/latex/unicode/content/config/
> so there is, worth stealing from

Perhaps I should add that some of the macros rely on font encodings
I wrote myself
(http://www.unruh.de/DniQ/latex/unicode/content/contrib), and some need
extended font encodings
(http://www.unruh.de/DniQ/latex/unicode/content/ucsencs.def contains
the additional macros) where the existing font encodings are not complete
enough (e.g. LGR).

> would it be possible for you to give use a ten line bullet list of
> comparsion?

I will do so in the next few days.

> perhaps the best is simply to forget about what we did on lazy afternoons
> during the Xmas holidays?

I don't think so. My package has one big disadvantage: since it tries
to support anything and everything, it is huge and slow (and probably full
of bugs). I don't think it is suited for inclusion into the LaTeX
kernel itself.

I see it more as an alternative for cases where you need advanced
features.

>  > - \DeclareUnicodeCharacter: This command is named identically in my
>  > system. I would appreciate if another name could be chosen at this
>  > early stadium to evade chaos.
>
> what are your arguments?

Two scenarios:

1. Someone uses that command in some package or document. Then using the wrong
utf8.def will lead to unintelligible errors instead of the more
helpful "undefined command".

2. Someone tries to use both inputencs in a single document (perhaps
because he wants to typeset most of the document with the fast
in-kernel implementation, and a few strings containing combining
chars with my implementation).

> no we map to LICR those are characters not glyphs!

On second thought: true.

>  > \DeclareUnicodeCommand (analogous to \DeclareTextCommand)
> no again Command in that context has already some semantics

Which?

>  \DeclareUnicodeLaTeXMapping

Or:
\DeclareUnicodeMapping % the LaTeX may be guessed
\DeclareUnicodeInput % like in inputenc
\DeclareUnicodeInputText % like in inputenc

DniQ.
