From: Joseph Wright
Subject: Re: Case changing operations
To: LATEX-L@LISTSERV.UNI-HEIDELBERG.DE
Reply-To: Mailing list for the LaTeX3 project
Date: Mon, 30 Jun 2014 17:59:07 +0100
Message-ID: <53B1975B.3020509@morningstar2.co.uk>
References: <53B11E4F.5010603@morningstar2.co.uk>

On 30/06/2014 14:05, Joel C. Salomon wrote:
> There’s an important use-case that seems not to have been addressed,
> but perhaps this is better handled in a different layer:
> mixed-language strings.
>
> For example, consider a document with the title, “The Interesting Life
> of Ragıp Hulûsi Özdem” (to choose the first Turkish name I could find
> with both dotted i and dotless ı). Somehow, within the \title{}
> declaration, the change of language must be indicated so that (e.g.)
> at the top of the page this will be transformed to “THE INTERESTING
> LIFE OF RAGIP HULÛSİ ÖZDEM” and not “THE INTERESTİNG LİFE …” nor “…
> LIFE OF RAGIP HULÛSI …”.
>
> A similar situation arises in German where within geographical names
> ‘ß’ should capitalize to the recently-defined ‘ẞ’, not ‘SS’.
> (According to , this rule was
> adopted in 2010.)
>
> As I said, this is probably best handled in a separate layer: Code
> that capitalizes user-provided text would need to defer to the LaTeX3
> equivalent of Babel, which would scan the text for user-level
> language-change commands, and (among other things) call
> \tl_upper_case:nn with the appropriate language argument. But I think
> it’s important that the interface to the casing functions being
> defined now be aware of the way they will likely be used.

As you say, this looks much more like a 'high level' requirement: it's
tricky to see how nesting can work and at the same time not be tied to
design otherwise.

For example, we can mark up a language in the input easily enough

    The Interesting Life of \SomeLangCommand{tr}{Ragıp Hulûsi Özdem}

but the problem is then making sure that the command does case changing
at point of use. One might imagine that such a command might have a
flexible definition:

    \TitleCase#1 =>
      \cs_set_eq:NN \SomeLangCommand \text_title_case:nn
      ... % Other similar stuff
      \text_title_case:Vn \l_language_current_tl {#1}

That might still leave a question about x- versus f-type expansion: if
the outcome is meant to be 'just text' then you need to expand
\SomeLangCommand, which at the moment is deliberately avoided.

Of course, such an issue might be avoided by doing a pre-parse, as you
suggest:

    \TitleCase#1 =>
      \cs_set_eq:NN \SomeLangCommand \text_title_case_and_brace:nn
      ... % Other similar stuff
      \tl_set:Nx \l_some_tmpa_tl {#1} %
      % Now "The Interesting Life of {RAGIP HULÛSİ ÖZDEM}"
      \tl_set:Nx \l_some_tmpa_tl
        { \text_title_case:VV \l_language_current_tl \l_some_tmpa_tl }

On the German business, I'd already wondered how best to add the
'capital Eszett' business for the simple case. I'm not quite sure how
one is meant to do that (not de-DE or whatever, but de-!). I'm also not
sure whether people who do use it will use it for all Eszetts (might
otherwise lead to some odd decisions in upper casing). As you say, this
could of course occur in input where such a decision applies only to
some cases.

What is worth noting is that while the commands I've added take general
text as input, we are seeing them as building blocks for e.g. a
hypothetical \text_title_case:nn. That and related operations need to do
things like worry about 'words', and in that context splitting up the
input is needed anyway. (Will has an expandable approach to do that. We
might imagine seeking to add to that some form of 'recursion' for nested
languages.)

BTW, nice Turkish name: that one is going into the test suite for this
area!
--
Joseph Wright