From: David Carlisle
Organization: NAG
Date: Wed, 30 Apr 2014 09:56:18 +0100
To: LATEX-L@LISTSERV.UNI-HEIDELBERG.DE
Subject: Re: preview of latex2e 2014 release

On 30/04/2014 05:15, Heiko Oberdiek wrote:
> On 29.04.2014 16:35, David Carlisle wrote:
>
> | inputenc package updates
> |
> | The inputenc package allows different input encodings for LaTeX
> | documents to be specified, importantly the utf8 option to specify
> | the Unicode UTF-8 encoding. A common mistake has been to use this
> | option with Unicode-based TeX engines LuaTeX and XeTeX, however
> | inputenc does not work (and is not needed) with those systems. The
> | package has been modified so that if used with LuaTeX or XeTeX,
> | then it just issues a warning if utf8 is specified, and stops with
> | an error for any other encoding requested.
>
> Encodings ascii.def and x-ascii.def
> -----------------------------------
> A use case for these encodings is that a document should not contain
> problematic non-ASCII characters. Thus inputenc cries, if a 8-bit
> character or control characters is present. Since a valid document
> only contains 7-bit characters, this works very well with LuaTeX and
> XeTeX. IMHO, inputenc must not throw an error. Instead:
> * It should disable the 7-bit control characters of package inputenc
>   as without LuaTeX/XeTeX.
> * The other characters in the Unicode range with codes > 127 are way
>   too many for inputenc to handle. (Each character would have been
>   made active and defined to throw an error.)
>   IMHO the best approach is to ignore these characters.
> * Instead of a warning, an info message is enough, at most a warning,
>   which says character with character codes > 127 are not handled by
>   inputenc for LuaTeX/XeTeX.
>
> Encodings utf8.def and utf8x.def
> --------------------------------
> The current version knows `utf8.def` and ignores it with a warning.
> However, there are documents that are using utf8x.def of package ucs.
> The same arguments are valid here:
> * The document source is in UTF-8, more or less needed for
>   XeTeX/LuaTeX.
> * Thus an error would be wrong and a warning is enough.
> That should also be the case for utf8x.def
>
> If someone uses a document for the old engines:
>
>   \documentclass{article}
>   \usepackage[utf8]{inputenc}
>   \usepackage[T1]{fontenc} % or without
>   \begin{document}
>
> and gets a warning, that utf8 is not required for LuaTeX/XeTeX, then
> he run into trouble (both with/without \usepackage[utf8]{inputenc}):
> The characters with codes >127 are not mapped to the proper LICRs,
> but are using the slots of the font encoding OT1 or T1 (for example)
> with the consequences that characters are missing or wrong. Thus a
> hint in the warning of inputenc should be added, that package
> fontspec might be useful.
>
> Yours sincerely
>   Heiko Oberdiek
>

Heiko thanks,

This is a personal response not checked with the rest of the team, and
I haven't checked any changes into the sources yet:-)

I'd wondered about mentioning fontspec originally but didn't, as
fontspec only works on these engines, and if you are doing

  \ifxetex
    \usepackage{fontspec}
    \setmainfont...
  \else
    ...
  \fi

then you could load the inputenc package in the \else branch, and the
problem this change was trying to help with (inputenc not working with
xetex/luatex) would not be an issue. But perhaps you are right and
mentioning fontspec would help.
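
For concreteness, a minimal sketch of such an engine-switching
preamble. It assumes the ifxetex and ifluatex packages are installed;
\ifunicodeengine is a name made up for this example, and the font
setup is left at the fontspec default:

  \documentclass{article}
  \usepackage{ifxetex,ifluatex}
  % Hypothetical switch: true on XeTeX or LuaTeX, false otherwise.
  \newif\ifunicodeengine
  \ifxetex  \unicodeenginetrue \fi
  \ifluatex \unicodeenginetrue \fi
  \ifunicodeengine
    % Native UTF-8 input; fonts are selected through fontspec.
    \usepackage{fontspec}
  \else
    % 8-bit engine: decode UTF-8 at the macro layer.
    \usepackage[utf8]{inputenc}
    \usepackage[T1]{fontenc}
  \fi
  \begin{document}
  Grüße
  \end{document}

A document written this way never loads inputenc on xetex/luatex, so
it would not see the new warning or error at all.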

On ascii, I think it hadn't occurred to me that anyone was using
inputenc with ascii:) Unlike on pdftex, it's not feasible to make all
the non-ascii chars invalid, but on the grounds of not breaking
existing documents I think you are right that we should let ascii,
x-ascii and utf8x through. It would be possible to do as you suggest
above and disable the control characters below 32, but I think it's
more important on unicode engines to move away from using active
characters for encoding support, so I'd rather just treat all these as
aliases for the native engine utf8 support.

Which would mean something like

\else
%    \end{macrocode}
% Unicode based TeX engines do not require this package at all, and always use UTF-8
% input. Allow the package to be used if |[utf8]| or |[ascii]| options are used to simplify
% switching between TeX engines.
%    \begin{macrocode}
\def\inputencoding#1{%
  \edef\reserved@a{#1}%
  \@tempswafalse
  \@tfor\reserved@b:={utf8}{utf8x}{ascii}{x-ascii}\do{%
    \ifx\reserved@a\reserved@b\@tempswatrue\fi}%
  \if@tempswa
    \PackageWarningNoLine
      {inputenc}%
      {inputenc not required for xetex or luatex.\MessageBreak
       utf8 assumed.\MessageBreak
       The fontspec package may be required to\MessageBreak
       access suitable fonts}%
  \else
    \PackageError
      {inputenc}%
      {inputenc not required for xetex or luatex.\MessageBreak
       only UTF-8 supported}%
      {For xelatex or lualatex do not load inputenc or use [utf8] option.}%
  \fi}
\fi
%    \end{macrocode}
% \end{macro}

which makes

! Package inputenc Error: inputenc not required for xetex or luatex.
(inputenc)                only UTF-8 supported.

See the inputenc package documentation for explanation.
Type  H <return>  for immediate help.
 ...

l.156 \endinput

? h
For xelatex or lualatex do not load inputenc or use [utf8] option.

?

for an unsupported encoding, or for utf8(x) or (x-)ascii it does

Package inputenc Warning: inputenc not required for xetex or luatex.
(inputenc)                utf8 assumed.
(inputenc)                The fontspec package may be required to
(inputenc)                access suitable fonts.
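
To try the name matching in isolation, here is a small standalone
sketch, not taken from the package sources. \checkencoding is a
hypothetical helper that uses the same \@tfor/\ifx comparison but only
reports to the terminal and log which branch would be taken:

  \documentclass{article}
  \makeatletter
  % Hypothetical helper, for illustration only (not part of inputenc).
  \newcommand*\checkencoding[1]{%
    \edef\reserved@a{#1}%
    \@tempswafalse
    % Compare the requested name against each accepted name in turn.
    \@tfor\reserved@b:={utf8}{utf8x}{ascii}{x-ascii}\do{%
      \ifx\reserved@a\reserved@b\@tempswatrue\fi}%
    \if@tempswa
      \typeout{#1: accepted, treated as native UTF-8}%
    \else
      \typeout{#1: rejected}%
    \fi}
  \makeatother
  \checkencoding{utf8}
  \checkencoding{x-ascii}
  \checkencoding{latin1}
  \begin{document}
  x
  \end{document}

With the list above, utf8, utf8x, ascii and x-ascii take the first
branch and any other name, such as latin1, takes the second. The \ifx
test works because \edef and the \def inside \@tfor both produce
parameterless macros with the same replacement text when the names
match.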

It might be argued that the message should be different in the utf8
and ascii cases, but I think the real message is that there is no
macro-layer encoding switch happening on these engines. The purpose of
the change was to flag a top-level message or error, rather than have
the internal catcode-changing loops generate spurious errors, as that
code wasn't written for a multibyte base encoding.

So as far as possible I think the xetex behaviour of inputenc should
be to give a message or error and do nothing, rather than really
"support" lots of different named encodings as it does with pdflatex.

David