From: Frank Mittelbach
To: LATEX-L@LISTSERV.UNI-HEIDELBERG.DE
Subject: Re: inputenc for XeTeX and LuaTeX
Date: Tue, 17 Mar 2009 23:49:31 +0100
Message-ID: <18880.10491.482905.22434@morse.mittelbach-online.de>
In-Reply-To: <49C01E4B.4020604@elzevir.fr>
Reply-To: Mailing list for the LaTeX3 project

Manuel Pégourié-Gonnard writes:
> James Cloos wrote:
> > As for utf-8 or other, it may be useful to default to the character set
> > specified for the current $LOCALE. Maybe. :-/
>
> Please don't make anything in the compilation of the document depend on the
> locale! It would completely ruin portability of the source files.

Perhaps. It might be a straight path into long-term disaster. On the other
hand, the whole area is a disaster in the first place. When we started out
with inputenc in 2e, I also thought that it was really good to keep the
encoding with the file (which you do by stating \usepackage[latin1]{inputenc}
and the like), and that worked fairly well for a while. But then OSes started
to convert on the fly, so with cut-n-paste, sometimes even on the same
machine, an old latin1 file got translated into something else (except for
the string specifying the encoding inside) ... so ... not really easy.

> A file must be assumed to be either utf-8 (auxiliary file written by
> XeTeX/LuaTeX) or in the encoding declared as the option of inputenc. Exactly
> what xetex-inputenc and luatex-inputenc do.
>
> The difficult problem is to guess when a file is an auxiliary file.
> I suppose the heuristics for doing so will improve when the solution gets
> tested.

How much guessing is really needed? Are you targeting an existing 2e
environment unchanged, or are you intending to design an interface that is
robust if used? Or something in between?

A couple of thoughts off the top of my head:

 - new solution, i.e. not for 2e as such: design a proper interface for
   handling internal auxiliary file reading and writing. That would then have
   hooks to maintain the encoding. We certainly have to do something along
   those lines for expl3.

 - partial 2e solution: use \@input as the proposed way to read internal
   files back in (as suggested by Will) and handle those correctly. Booh at
   those packages that don't use \@input but \input for their internal files
   (which is already wrong in 2e proper); ask them to change, or ignore them.

 - possible 2e solution: steal \openout to always write
   \InternallyWrittenFileHookToHandleWhatWeNeedToHandle to the top of each
   such file; fix the cases where this is not appropriate in 2e, such as the
   filecontents environment ... and wait for the packages to blow up and fix
   those (probably only a few, if any).

cheers
frank

ps interestingly enough, in 2e on top of a normal TeX engine that problem was
properly solved, as we ensured that internally written files were always
written in LICR, which is unicode in 7bit, so it always came back properly.
That was at the cost of translating everything into LICR on input (with
active chars), but that was necessary anyway because of the different 8bit
encodings around.
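
To make the "partial 2e solution" from the list above a bit more concrete, here is a rough sketch of what wrapping \@input might look like under XeTeX. It is only an illustration under assumptions, not what xetex-inputenc or the kernel actually does: \DeclaredInputEncoding and \SavedAtInput are hypothetical names invented here, and the sketch assumes the document encoding is Latin-1 and that every file read through \@input was written by the engine itself in UTF-8.

```latex
% Sketch only (XeTeX assumed). \DeclaredInputEncoding and \SavedAtInput
% are hypothetical names, not part of LaTeX2e or of any package.
\makeatletter
% Encoding declared for the user's source files (assumed Latin-1 here).
\newcommand*\DeclaredInputEncoding{iso-8859-1}
% Keep the kernel definition of \@input so it can still be called.
\let\SavedAtInput\@input
\renewcommand*\@input[1]{%
  % Files opened from now on are read as UTF-8; the \relax ends the
  % scanning of the encoding name before the next macro is expanded.
  \XeTeXdefaultencoding "utf8"\relax
  \SavedAtInput{#1}%
  % Once the internal file has been read, go back to the declared
  % encoding for any file opened later.
  \XeTeXdefaultencoding "\DeclaredInputEncoding"\relax
}
\makeatother
```

Files that packages read back with \input instead of \@input would bypass this wrapper entirely, which is exactly the case complained about above.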