Subject: Re: LaTeX's internal char prepresentation (UTF8 or Unicode?)
Date: Sun, 11 Feb 2001 18:32:12 +0100
From: "Marcel Oliver"
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"
Reply-To: "Mailing list for the LaTeX3 project"
Message-ID: <14982.52380.897443.588837@gargle.gargle.HOWL>
In-Reply-To: <14982.51989.349221.285820@istrati.zdv.uni-mainz.de>
References: <14982.45082.150652.74719@istrati.zdv.uni-mainz.de> <14982.51989.349221.285820@istrati.zdv.uni-mainz.de>

Frank Mittelbach writes:
 > Roozbeh,
 >
 >  > > I have yet to
 >  > > see that UTF8 text (without taking precaution and externally
 >  > > announcing that a file is in UTF8) is really properly handled
 >  > > by any OS platform. Is it?
 >  >
 >  > Windows 2000 autodetects them. I can't define the proper
 >  > handling in Linux well; you mean in a text editor?
 >
 > no i mean at the system level. what do you mean by windows2000
 > autodetects them? my understanding of what UTF8 means as a format
 > is that you can't autodetect it. As best you can detect that
 > something is not UTF8, but how do you want to detect it as being in
 > that format and not in, say, a file written with an 8bit
 > inputencoding which happens to just contain an 8bit stream which is
 > by chance also conforming to the UTF8 spec?

MS applications prepend a "signature" to UTF8 files.  It's not really
in the specs, just a MS thing.
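
To make both points concrete, here is a minimal Python sketch (not from
the thread; the helper name `classify` and its return labels are
hypothetical). It assumes the "signature" is the three bytes EF BB BF,
i.e. U+FEFF encoded as UTF-8, which Python exposes as `codecs.BOM_UTF8`.
Without the signature, the most a program can do is rule UTF8 out:

```python
import codecs

# Assumption: the "signature" is U+FEFF encoded as UTF-8 (bytes EF BB BF).
UTF8_SIGNATURE = codecs.BOM_UTF8


def classify(data: bytes) -> str:
    """Hypothetical helper: rough guess at whether a byte string is UTF-8."""
    if data.startswith(UTF8_SIGNATURE):
        # An explicit marker is present, so the file announces itself.
        return "utf8-with-signature"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Malformed sequences prove the data is NOT UTF-8.
        return "not-utf8"
    # Well-formed, but an 8-bit file could conform to UTF-8 by chance.
    return "valid-utf8-or-coincidence"


# A lone 0xE9 (Latin-1 "e acute") is malformed UTF-8 and is rejected.
print(classify(b"\xe9"))
# C3 A9 is well-formed UTF-8, yet also a legal two-character Latin-1 string.
print(classify(b"\xc3\xa9"))
# With the signature prepended there is no ambiguity left.
print(classify(codecs.BOM_UTF8 + b"abc"))
```

The middle case is the one Frank describes: the bytes decode cleanly as
UTF8, but nothing in the stream itself says they were not written with
an 8bit inputencoding.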

--M.
