MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C2BE79.53708B00"
In-Reply-To:  <15912.23044.419984.897093@istrati.mittelbach-online.de>
References: <200212031601.gB3G11cQ009558@sun.dante.de>            <15899.14827.804209.458595@istrati.mittelbach-online.de>            <20030116114637.GA9844@g113.hadiko.de>            <15912.23044.419984.897093@istrati.mittelbach-online.de>
User-Agent: Mutt/1.3.28i
Content-class: urn:content-classes:message
Subject:      Re: latex/3480: Support for UTF-8 missing in inputenc.sty
Date: Fri, 17 Jan 2003 23:33:46 +0100
Message-ID: <20030117223345.GA14828@g113.hadiko.de>
Thread-Topic:      Re: latex/3480: Support for UTF-8 missing in inputenc.sty
Thread-Index: AcK+eVQLXTeCZzgnQmWEObkKXzA6TQ==
From: "Dominique Unruh" <dominique@UNRUH.DE>
To: <LATEX-L@listserv.uni-heidelberg.de>
Reply-To: "Mailing list for the LaTeX3 project" <LATEX-L@listserv.uni-heidelberg.de>
Status: R

This is a multi-part message in MIME format.

------_=_NextPart_001_01C2BE79.53708B00
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

> [aside, where is that file, the one that i have here is very short
> and doesn't contain \euro but neither looks like a proper encoding
> file either]

Yes. I'm sorry, I seem to have mixed things up. It was the eurofont package
that used the \euro command. But it illustrates the problem nevertheless.

> > [about incompatible double definitions of Unicode chars]
> being pragmatic i believe that these get weed out after a while, the reason
> for suggesting a .dfu file approach is that this allows easy extensions for
> locally developed encodings.

In this case there should at least be some check for redefinitions,
such as

\def\DeclareUnicodeCharacter...{%
...
\if@alreadydefined \ifx\olddefinition\newdefinition\else
  \output@fierce@warning@or@error\fi\fi
...}
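
To make the idea above more concrete, here is a minimal sketch of such a
check. It is only an illustration under assumptions, not the code of
either implementation: the command name \SketchDeclareUnicodeCharacter,
the storage scheme \ucs@map@<hex>, and the package name "sketch" in the
warning are all made up for this example.

```latex
\makeatletter
% Assumption: the mapping for code point U+XXXX is stored in the
% control sequence \ucs@map@XXXX (a hypothetical naming scheme).
\newcommand*\SketchDeclareUnicodeCharacter[2]{%
  \@ifundefined{ucs@map@#1}{%
    % First declaration for this code point: just store it.
    \@namedef{ucs@map@#1}{#2}%
  }{%
    % Already declared: compare the old and the new definition.
    \def\reserved@a{#2}%
    \expandafter\ifx\csname ucs@map@#1\endcsname\reserved@a
      % Identical redefinition: accept it silently.
    \else
      \PackageWarning{sketch}{Conflicting definitions for U+#1}%
    \fi
  }%
}
\makeatother

\SketchDeclareUnicodeCharacter{00E4}{\"a}   % stored
\SketchDeclareUnicodeCharacter{00E4}{\"a}   % identical, accepted
\SketchDeclareUnicodeCharacter{00E4}{\"{a}} % different tokens, warns
```

Note that \ifx compares the two definitions token by token, so two
declarations that typeset the same thing but are written differently
(as in the last line) still trigger the warning. Whether that should be
a warning or an error is a design decision.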

> > [about generating .dfu files by a script]
> a ucs.map file that contains the mappings Unicode->LICR  in the form directly
> usable in .dfu files, simply as a template for making a .dfu if really
> necessary.
>
> perhaps using docstrip to generate the standard dfu files from that
> file

Yes. I like that approach. In fact, it amounts to implementing the
"script" in LaTeX itself.

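As a rough sketch of how the docstrip approach could look (the guard
names, the generated file names, and the two sample mappings are
assumptions for illustration; only ucs.map is taken from the
discussion above), the master file would carry one guarded line per
mapping, and a small driver file would extract one .dfu file per font
encoding:

```latex
% Excerpt from ucs.map (master file; the guard before each line lists
% the font encodings for which the mapping is valid):
%<t1|ot1>\DeclareUnicodeCharacter{00E4}{\"a}
%<lgr>\DeclareUnicodeCharacter{03B1}{\textalpha}
```

```latex
% ucsmap.ins (hypothetical driver file, run with: latex ucsmap.ins)
\input docstrip.tex
\keepsilent
\askforoverwritefalse
\generate{%
  \file{t1enc.dfu}{\from{ucs.map}{t1}}%
  \file{lgrenc.dfu}{\from{ucs.map}{lgr}}%
}
\endbatchfile
```

With this layout a locally developed encoding only needs its own guard
name in ucs.map and one more \file line in the driver.
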
>  > There are already extensive lists of character mappings available at:
>  > http://www.unruh.de/DniQ/latex/unicode/content/config/
> so there is, worth stealing from

Perhaps I should add that some of the macros rely on font encodings
I wrote myself
(http://www.unruh.de/DniQ/latex/unicode/content/contrib), and some need
extended font encodings
(http://www.unruh.de/DniQ/latex/unicode/content/ucsencs.def contains
the additional macros) where the existing font encodings are not
complete enough (e.g. LGR).

> would it be possible for you to give use a ten line bullet list of
> comparsion?

I will do so in the next few days.

> perhaps the best is simply to forget about what we did on lazy afternoons
> during the Xmas holidays?

I don't think so. My package has one big disadvantage: since it tries
to support anything and everything, it is huge and slow (and probably
full of bugs). I don't think it is suited for inclusion in the LaTeX
kernel itself.

I see it more as an alternative for cases where you need advanced
features.

>  > - \DeclareUnicodeCharacter: This command is named identically in my
>  > system. I would appreciate if another name could be chosen at this
>  > early stadium to evade chaos.
>
> what are your arguments?

Two scenarios:

1. Someone uses that command in some package or document. Then using the wrong
utf8.def will lead to unintelligible errors instead of the more
helpful "undefined command".

2. Someone tries to use both input encodings in a single document (perhaps
because they want to typeset most of the document with the fast
in-kernel implementation, and a few strings containing combining
characters with my implementation).

> no we map to LICR those are characters not glyphs!

On second thought: true.

>  > \DeclareUnicodeCommand (analogous to \DeclareTextCommand)
> no again Command in that context has already some semantics

Which?

>  \DeclareUnicodeLaTeXMapping

Or:
\DeclareUnicodeMapping % the "LaTeX" part can be inferred
\DeclareUnicodeInput % like in inputenc
\DeclareUnicodeInputText % like in inputenc

DniQ.

------_=_NextPart_001_01C2BE79.53708B00--