From: "Hans Aberg"
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"
Reply-To: "Mailing list for the LaTeX3 project"
Subject: Re: Side remarks about TeX input sequence
Date: Thu, 15 Feb 2001 21:39:50 +0100
In-Reply-To: <200102151903.NAA56125@dcdrjh.fnal.gov>

At 13:03 -0600 2001/02/15, Randolph J. Herber wrote:
> The Java language specification defines all three character
> sequences as line terminators and '\032' (also known as Control-Z)
> if it is the last character of the file as a file terminator
> (in MSDOS, Control-Z does mark the logical end of a text file).
>
> TeX and LaTeX could take a similar approach:
>
>     If the underlying operating system file is
>     record structured and therefore is not a
>     character stream file, then suffix each record
>     with one of the above line terminator sequences.
>
>     Then, in TeX's mouth, any of the line terminator
>     sequences could be recognized as being a line
>     terminator and, if Java's example is followed,
>     then '\032' or end-of-file would mark the logical
>     end of the input file.
>
> With such processing, it would not matter that one had
> transferred a text file between systems with incompatible text
> file structures as binary files (e.g., the scientists at
> CDF do that frequently then ask me to repair the problem).

Right. This is what I was hinting at: as a matter of practice, one ends
up with a flood of UNIX, MacOS, and MSDOS files via the Internet, and it
is difficult to keep track of which files should be translated and which
should not.

Therefore, at least for computer-related software such as compilers and
their text editors, it is now common on MacOS to use that Java
convention or something similar. Thus I no longer translate the text
files I download, but merely give them the attribute 'TEXT', even though
they have UNIX newlines in them.

Experimenting with Hugs, though (that is, the sources I ported to MacOS), I
found it tricky to write UNIX newlines, because sometimes one writes to a
console, and it does not accept \n as a newline. So the safe thing is to
write files as "text" but read them with the Java convention, either by
tweaking the library routines or by opening the files as binary and
doing the parsing oneself. With the latter approach, however, one will
have to check what happens under VMS and other platforms that have yet
other conventions.
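
As a rough illustration of the "open as binary and parse" route, here is a
minimal Python sketch. It assumes the three terminators are CR LF, CR, and
LF, and that a Control-Z counts as end of input only when it is the very
last byte. The names split_lines and read_lines are hypothetical and not
taken from Hugs, TeX, or any Java library.

```python
CR = 0x0D      # carriage return, \r
LF = 0x0A      # line feed, \n
CTRL_Z = 0x1A  # '\032' in octal, the MSDOS end-of-text marker


def split_lines(data: bytes) -> list[bytes]:
    """Split raw bytes into lines, accepting CR LF, CR, or LF as terminator."""
    # Assumption: Control-Z ends the input only if it is the final byte.
    if data.endswith(bytes([CTRL_Z])):
        data = data[:-1]
    lines = []
    start = 0
    i = 0
    n = len(data)
    while i < n:
        byte = data[i]
        if byte == CR:
            lines.append(data[start:i])
            # Treat CR LF as a single terminator, not two.
            if i + 1 < n and data[i + 1] == LF:
                i += 1
            start = i + 1
        elif byte == LF:
            lines.append(data[start:i])
            start = i + 1
        i += 1
    if start < n:
        # Final line that has no terminator of its own.
        lines.append(data[start:])
    return lines


def read_lines(path: str) -> list[bytes]:
    """Read a file in binary mode so the platform does no newline translation."""
    with open(path, "rb") as handle:
        return split_lines(handle.read())
```

Writing would still go through the platform's ordinary text mode, so that
consoles and native tools see the newline they expect. Record-structured
files, as on VMS, are not covered by this sketch.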

Another approach would be to make a TeX version that pretends to be 32-bit
internally, and require the \r, \n, and \r\n newlines to be translated into
the Unicode line separator.
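
A minimal Python sketch of that translation step follows. It assumes that
"the Unicode line separator" means U+2028 LINE SEPARATOR and that the input
has already been decoded into a string of code points. The function name
to_line_separator is hypothetical.

```python
LINE_SEPARATOR = "\u2028"  # assumed target: Unicode LINE SEPARATOR


def to_line_separator(text: str) -> str:
    """Map \\r\\n, \\r, and \\n to a single U+2028 each."""
    # Replace the two-character CR LF first, so the pair yields one
    # separator rather than two.
    return (
        text.replace("\r\n", LINE_SEPARATOR)
        .replace("\r", LINE_SEPARATOR)
        .replace("\n", LINE_SEPARATOR)
    )
```

With such a step in front, the rest of the input processing would need to
recognise only one line-ending character, whatever platform the file came
from.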

  Hans Aberg
