Subject: XML, UTF-8 and TeX engines Was (Re: OT: ANT)
From: "William F Hammond"
Date: Mon, 14 Jul 2003 22:50:35 +0100
Reply-To: "Mailing list for the LaTeX3 project"

Joachim Schrod <jschrod@ACM.ORG> writes:

> But not necessarily the most interesting. There is also the
> possibility of experimenting with new innovative approaches to style
> sheets, given by modular XML processors like PXP and modular
> typesetting engines like ant.

James Clark once wrote somewhere that style sheet processing is a
limited form of sgml processing, and I've never had reason to doubt it
for author-side processing.  Don't underestimate the power of less
restrained frameworks like David Megginson's perl module SGMLS.pm (and
its friendly interface sgmlspl.pl) for formatting XML to LaTeX.

I have recently found it to work with UTF-8 encoded XML documents
under Perl 5.6+.  With it at some point you'll want to say "use
utf8;"; for example, say it in an sgmlspl.pl script if that is your
setup.  The utf8 pragma is said to be headed for redundancy with
eventual versions of Perl (5.8 is current, I think), but is said to be
harmless in those versions.
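
[Illustration: sgmlspl works by attaching handlers to element-start,
element-end, and character-data events.  The sketch below is only a rough
analogue of that idea, written in Python with the standard library's xml.sax
instead of SGMLS.pm.  The element names (doc, para, emph), the mapping
tables, and the function name xml_to_latex are all invented for this example
and are not taken from gellmu or from sgmlspl.]

```python
import xml.sax

# Hypothetical element-to-LaTeX mapping (assumption: not from gellmu).
START = {"doc": "\\begin{document}\n", "para": "", "emph": "\\emph{"}
END = {"doc": "\\end{document}\n", "para": "\n\n", "emph": "}"}

# Characters that are special to LaTeX and need escaping in running text.
ESCAPES = {
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


class LatexHandler(xml.sax.ContentHandler):
    """Collect LaTeX output from start, end, and character-data events."""

    def __init__(self):
        super().__init__()
        self.out = []

    def startElement(self, name, attrs):
        # Unknown elements contribute nothing; their text still passes through.
        self.out.append(START.get(name, ""))

    def endElement(self, name):
        self.out.append(END.get(name, ""))

    def characters(self, content):
        # Escape character by character, so chunked delivery is harmless.
        self.out.append("".join(ESCAPES.get(ch, ch) for ch in content))


def xml_to_latex(xml_bytes):
    """Parse UTF-8 encoded XML bytes and return a LaTeX string."""
    handler = LatexHandler()
    xml.sax.parseString(xml_bytes, handler)
    return "".join(handler.out)
```

[Non-ASCII characters pass through unchanged here; whether LaTeX can then
typeset them depends on the input-encoding support available on the LaTeX
side.]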

So, in fact, I've rolled provision for this into the current gellmu
tarball (still not ready for CTAN, but available on my web site).  An
example document is in that tarball: examples/intlchars.  This example
can be built with the bin/linux/umkg interface if emacs 21, opensp,
perl, and, of course, latex and pdflatex, augmented by the new
utf8ienc, are all in place.  For latex output I am, of course,
constrained by what characters are provided in utf8ienc, and for
viewing HTML I am likewise constrained by what characters the Gecko
layout engine (common to netscape, mozilla, galeon, and konqueror)
handles on my platform.  I don't yet really have the means to test
characters beyond unicode plane 0.

The example document is also at
http://www.albany.edu/~hammond/gellmu/utf8/

                                    -- Bill
