MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C32B97.BB5D0F00"
In-Reply-To: <Pine.GSO.4.33.0306052120520.3275-100000@obelix.ee.duth.gr>
References: <16095.32737.201598.296665@istrati.mittelbach-online.de><Pine.GSO.4.33.0306052120520.3275-100000@obelix.ee.duth.gr>
Content-class: urn:content-classes:message
Subject: Re: announce: inputenc support for utf8
Date: Thu, 5 Jun 2003 20:18:45 +0100
Message-ID: <16095.38805.105960.823796@istrati.mittelbach-online.de>
Thread-Topic: announce: inputenc support for utf8
Thread-Index: AcMrl7vmjh4Gzan8QHuXEyKxxISKhg==
From: "Frank Mittelbach" <frank.mittelbach@latex-project.org>
To: "Apostolos Syropoulos" <apostolo@obelix.ee.duth.gr>
Cc: <latex-team@latex-project.org>,
	<vvv@vsu.ru>
Reply-To: "Frank Mittelbach" <frank.mittelbach@latex-project.org>
Status: R

This is a multi-part message in MIME format.

------_=_NextPart_001_01C32B97.BB5D0F00
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Apostolos,

 > >  > ... and I would be very happy to work for the support of the Greek
 > >  > language!
 > >
 > > that's fine, but before that is possible there would need to be a greek
 > > font encoding that conforms to LaTeX specifications. we looked at what is
 > > there right now, but all we found have been encodings that replace ascii
 > > chars by something else.
 >
 > That is true. However, I wrote such encoding files for the Greek language
 > support of ConTeXt and so I will create the necessary files for LaTeX.

well, it is not just creating the files, one should consider a number of
design issues, eg

 - try to make an encoding that is also suitable for "non-TeX-world" fonts, ie
   avoid the problem we now have with T1, that no PostScript font, say,
   implements all glyphs (fortunately the number of missing glyphs is small
   there, but nevertheless)

 - try to get those glyphs together that provide the best benefit, eg what
   needs to be there to allow proper hyphenation (for different
   dialects/languages) --- symbols that are not necessary for this process can
   go in a companion symbol encoding

 - other stuff that i may have forgotten

this is also the reason why i cc'd Vladimir, since he may help you with some
insight into how we eventually arrived at the cyrillic encodings

 > > see what happened to cyrillic: there is T2(A-C) which are official LaTeX
 > > encodings as well as X2 which is an extended encoding where you are on
 > > your own. If there will be a greek encoding (or more) that fit those
 > > restrictions needed for multi-lingual processing then adding utf8 support
 > > will be possible without much fuss.
 >
 > Okay, but we need a new name for the Greek encoding: LGR is not a proper
 > name. Currently there are two basic Greek encodings ISO-8859-7 and
 > Windows-1253 (their only difference is the slot reserved for Capital
 > Alpha with tonos) and so I believe one file can be used for both
 > encodings. So who will ``allocate'' the new encoding name?

those are input encodings, not font encodings, right? though there is no problem
in taking an input encoding as a font encoding, this is not necessarily the
best solution (given the criteria above)
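
to make the distinction concrete: an input encoding only says which byte in
the source file stands for which character, independent of any font layout.
a minimal sketch, assuming Python's standard codecs iso8859_7 and cp1253 as
stand-ins for the two encodings named above (the helper names slot and
differing_slots are made up for this illustration):

```python
# Capital Alpha with tonos, the character mentioned in the quoted text.
ALPHA_TONOS = "\u0386"  # GREEK CAPITAL LETTER ALPHA WITH TONOS


def slot(char, codec):
    """Return the single byte value that `codec` assigns to `char`."""
    encoded = char.encode(codec)
    if len(encoded) != 1:
        raise ValueError(f"{codec} does not map {char!r} to a single byte")
    return encoded[0]


def differing_slots(codec_a, codec_b):
    """List byte values in 0x80-0xFF that the two codecs decode differently.

    Bytes that a codec leaves undefined decode to U+FFFD here, so a slot
    that is undefined in both codecs is not reported as a difference.
    """
    diffs = []
    for byte in range(0x80, 0x100):
        raw = bytes([byte])
        a = raw.decode(codec_a, errors="replace")
        b = raw.decode(codec_b, errors="replace")
        if a != b:
            diffs.append(byte)
    return diffs


if __name__ == "__main__":
    for codec in ("iso8859_7", "cp1253"):
        print(codec, hex(slot(ALPHA_TONOS, codec)))
    print([hex(b) for b in differing_slots("iso8859_7", "cp1253")])
```

running it prints the slot each codec uses for Capital Alpha with tonos and
every upper-half byte where the two tables disagree, which is the kind of
check worth doing before deciding whether one definition file can serve both.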

as for the name, this would be finally allocated by us, if there is an
agreement within the Greek TeX community that this is the right "font"
encoding(s) to go for, and that there will be some effort to actually produce,
say, virtual fonts in that encoding

the name itself is most likely going to be T7 (or T7A, T7B, ... if there is
more than one encoding, and X7 for an extended encoding that may be necessary
for "traditional greek" where you need many more than 128 glyphs for proper
hyphenation, if i remember correctly), but again i would like to stress that
this is only the name in the future; before i would be willing to put into the
documentation that such and such is an official encoding, the above process
should have happened, as it would freeze that encoding.

so while the encoding is still being developed, discussed and further
modified, etc, it should either be run under some L* name or as E7 for
experimental, in the same fashion as it was done for other encodings while
they were still under development.

i don't know if you are coming to Brest. in case you do, we might find the
time to talk about it a bit further.

right now other commitments will not allow me to participate at all in any
such activities for a good while

best
frank

------_=_NextPart_001_01C32B97.BB5D0F00--