From: Will Robertson
Date: Wed, 18 Mar 2009 13:37:04 +1030
Subject: Re: inputenc for XeTeX and LuaTeX
To: LATEX-L@LISTSERV.UNI-HEIDELBERG.DE
Reply-To: Mailing list for the LaTeX3 project
Message-ID: <370B052A-7BFB-4653-B538-A6830B77C960@gmail.com>
In-Reply-To: <18880.10491.482905.22434@morse.mittelbach-online.de>

On 18/03/2009, at 9:19 AM, Frank Mittelbach wrote:

> Manuel Pégourié-Gonnard writes:
>> James Cloos wrote:
>>> As for utf-8 or other, it may be useful to default to the character set
>>> specified for the current $LOCALE. Maybe. :-/
>>>
>> Please don't make anything in the compilation of the document depend on the
>> locale! It would completely ruin portability of the source files.
>
> perhaps. it might be a straight path into long-term disaster. On the other
> hand the whole area is a disaster in the first place.
> When we started out with
> inputenc in 2e I also thought that it is really good to keep the encoding with
> the file (which you do by stating \usepackage[latin1]{inputenc} and the like)
> and that worked for a while fairly well. But then OSes started to convert on
> the fly so by cut-n-paste sometimes even on the same machine an old latin1 got
> translated into something else (except for the string specifying the encoding
> inside)... so ... not easy really

Yep, agreed that dealing with encodings is annoying :)

>> A file must be assumed to be either utf-8 (auxiliary file written by
>> XeTeX/LuaTeX) or in the encoding declared as the option of inputenc. Exactly
>> what xetex-inputenc and luatex-inputenc do.
>>
>> The difficult problem is to guess when a file is an auxiliary file. I suppose
>> the heuristics for doing so will improve when the solution gets tested.
>
> how much guessing is really needed? Are you targeting an existing 2e env
> unchanged or are you intending to design an interface that is robust if used?
> Or something in between?

Almost entirely the first.

Neither package needs to guess anything; the problem is that there's
just no way to know if \input refers to a generated file or a user file.

The XeTeX solution simply patches \@input. The LuaTeX solution does
something similar and allows customisation so that certain files or
file extensions can be treated as if they were \@input rather than
\input.

> - new solution, ie not for 2e as such: design a proper interface for handling
>   internal auxiliary file reading and writing. That would then have hooks to
>   maintain encoding. We certainly have to do something along those lines for expl3

Yep.

> - partial 2e solution: use \@input as a proposed way to read internal files
>   back in (as suggested by Will) and handle those correctly.
>   booh at those
>   packages that don't use \@input but \input for their internal files (which
>   is already wrong in 2e proper) and ask them to change or ignore them.

Yep. I hadn't thought of it before, but we could add a note to the
documentation explicitly discussing this behaviour. Using \@input for
internally-generated files is implicit in what it does but there's
nowhere (that I know of) that states it plainly.

Note that even the kernel uses \input on the .aux file somewhere :)

> - possible 2e solution: steal \openout to always write
>     \InternallyWrittenFileHookToHandleWhatWeNeedToHandle
>   to the top of each such file; fix the cases where this is not appropriate
>   in 2e, such as filecontents env ... and wait for the packages to blow up
>   and fix those (probably only a few if any)

Nice idea, probably will work; but the return on investment is too low
(for me at least). I expect non-UTF8 input in Xe(La)TeX documents to
be hardly ever used. And we can always foist off the responsibility on
the packages that don't work because of \input v. \@input.

* * *

So, assuming we want to do something about the whole situation (I hope
so), how open are you to the idea of adding branching to inputenc to
load packages that aren't under the LaTeX team's control? I'm more
than happy printing a big warning telling users what's going on.
Thanks for the comments,
Will
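
For illustration only, the idea of patching \@input so that generated
files are read as UTF-8 while user files keep their declared encoding
might look roughly like the sketch below. It assumes the XeTeX engine
and user files in latin1. It is not the actual xetex-inputenc code, and
\saved@input is a hypothetical name introduced here.

```latex
% Sketch only: assumes XeTeX and user files declared as latin1.
% \saved@input is a hypothetical name, not taken from any package.
\makeatletter
% Files opened from now on (e.g. via \input) default to latin1.
\XeTeXdefaultencoding "iso-8859-1" % trailing space ends the name
\let\saved@input\@input
\renewcommand*\@input[1]{%
  \XeTeXdefaultencoding "utf8" % generated files are read as UTF-8
  \saved@input{#1}%
  \XeTeXdefaultencoding "iso-8859-1" % restore for later user files
}
\makeatother
```

The encoding is fixed when a file is opened, so the restore at the end
only affects files opened afterwards. A package that reads its own
generated files with \input rather than \@input would bypass this patch
entirely, which is the limitation discussed in the thread.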