Date: Tue, 2 Apr 1996 15:50:38 BST
From: David Carlisle <carlisle@CS.MAN.AC.UK>
Reply-To: Mailing list for the LaTeX3 project
To: Multiple recipients of list LATEX-L
Subject: Re: Changing \hyphenchar
In-Reply-To: <199604021335.PAA03506@murnau.idris.fr> (message from
        Bernard GAULLE on Tue, 2 Apr 1996 15:35:14 +0200)
Message-ID: <9604021450.AA25874@r8h.cs.man.ac.uk>

Bernard writes

> I'd like (and all French speaking people with me) a general solution
> be found for LaTeX.

Well this message will not provide a general solution (sorry!) but
anyway some thoughts...

> With the dc, the printed glyph is font slot '177 / ^^7f
> (except with olddc package)

The fd files generated by `olddc.ins' use the dc 1.1 release font names
and the ones generated by `newdc.ins' use the new release 1.2 names,
but in both cases the hyphenchar of the fonts is not set (except for tt
fonts, which set it to -1). (The fd files distributed with the dc fonts
do set it, but that is another issue.)

As the hyphenchar is not set when the fonts are loaded, the
\defaulthyphenchar is used. No standard LaTeX package changes this from
its default plain value of 45 (-), and so LaTeX, whether using cm fonts
or new or old dc, currently gets French hyphenation right according to
your criteria (and presumably therefore gets it wrong for other
languages).

If someone wants to use the alternative hyphenation character with the
fd files distributed by LaTeX, then more or less they just need to set
\defaulthyphenchar to 127 before loading the fonts. Unfortunately this
has the side effect of disabling all hyphenation in old 7bit fonts, as
the fd files for them do not set a \hyphenchar, on the assumption that
\defaulthyphenchar would be 45. So probably all the `OT1' fd files
should be modified to explicitly set \hyphenchar for each OT1 font to
be 45.

As the hyphenchar of T1 fonts is not set in the fd files, that leaves
the user free to set \defaulthyphenchar to choose whether 45 or 127 is
used. Unfortunately this needs to be switched before any fonts are
loaded into the format, but a suitable cfg file declaration could be
constructed...

However, major open questions relate to multilingual documents. TeX
makes it very inconvenient to have the same font loaded with different
values for font parameters like hyphenchar, so if you are setting two
different languages, one which `should' use 45 and one which `should'
use 127, then probably there needs to be a compromise somewhere...
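
To make the above concrete, here is a rough sketch of the declarations
involved. It is an illustration only: I am assuming that the third
argument of \DeclareFontFamily (the same hook the tt families use to
set -1) is the place to put the setting, and the OT1/cmr line is written
for this example rather than copied from any distributed fd file.

```latex
% (1) Choosing the alternative hyphen for fonts whose fd files do not
%     set one.  This only affects fonts loaded AFTER this point, so it
%     has to come before the fonts concerned are loaded (in a cfg file
%     at format-building time if they are preloaded).
\defaulthyphenchar=127

% (2) Sketch of what an OT1 fd file could declare so that 7-bit fonts
%     keep slot 45 whatever \defaulthyphenchar is.  The third argument
%     is executed each time a font of this family is loaded.
\DeclareFontFamily{OT1}{cmr}{\hyphenchar\font=45 }

% (3) Why multilingual documents are awkward: \hyphenchar belongs to
%     the font and its assignment is always global, so a group does
%     not restore the old value.
{\hyphenchar\font=127 }% the current font still has hyphenchar 127 here
```

Part (3) is the reason a per-language switch cannot simply be wrapped in
a group: both languages see whichever value was assigned last to the
shared font.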

One possible compromise would be to have hyphenation tables that could
`cope' with either setting of hyphenchar. It is almost certainly not
necessary to load extra patterns to prohibit _all_ possible letter
combinations near -, as that makes the hyphenation tables very large;
hopefully some smaller set would work `in practice'. This is especially
so if mono-lingual documents were anyway set with the old 45 position,
so that the extra hyphenation patterns were only needed in special
multi-lingual contexts. (Languages that normally used the 127 slot
might want to load a more extensive range of patterns for -, as the
extra memory taken would presumably be justified in those cases.)

> Ok, the solution is: type in ^^7f in place of "-" and the problem will
> be solved :-) Or, worst, activate "-"...

The first solution doesn't work. If you type

    \documentclass{article}
    \usepackage[T1]{fontenc}
    \catcode`\^^7f=12
    \begin{document}
    aaa-bbb

    aaa^^7fbbb
    \end{document}

then you see that ^^7F is totally unsuitable for use as a compound word
character, as it has its side bearings adjusted so that it is a hanging
punctuation.

Of course this highlights one of the problems: the current mess is
caused by a typical TeX `compromise' that mixes two features that ought
to be distinct,

* The hyphenation behaviour around an explicit hyphen
* The choice of a different glyph for end-of-line hyphens as opposed
  to compound word marks.

While staying with TeX (as opposed to etex or omega or something) these
two features cannot be separated; anything you do to affect one is
likely to have an effect on the other. Hopefully LaTeX can have some
kind of interface that hides this as much as possible, but currently,
as I say, it just ducks the issue, does what it always did, and uses
slot 45 with no `good' interface for changing this.

David