Subject: LaTeX, BibTeX, databases, \bibitem, front matter
Date: Tue, 9 Oct 1990 11:39:45 +0100
Sender: "LaTeX-L Mailing list"
To: "Rainer Schoepf"
Reply-To: "LaTeX-L Mailing list"

Frank forwarded some mail from Nico about BibTeX etc.


BIBLIOGRAPHIC DATABASE?

I agree with Nico's comment about BibTeX not being a database management
system.  I have the impression that some research workers really want a
system that will not only contain author, date, etc. for a paper/book,
but also a copy of the abstract and perhaps some reminders to themselves.
Thus, they want something that will both
(1) help (by supplying bibliographic details) when they are in the process
    of writing a paper
and
(2) help them to search through all the papers they've ever read
    until they come to something that is about a particular subject.
    E.g. "give me all the papers about X", "what was that paper about Y?".

It's unfortunate that Lamport/Patashnik used the term "bibliographic
database" (although I can't think of anything better offhand).  It leads
to people thinking that BibTeX will do the things that they associate
with "database systems" these days.

However, I think that to turn BibTeX into something that did more
than (1) would be too ambitious.  We're going to have enough problems
finding someone to make BibTeX do (1) better.

I think it would be better to treat (2) as a separate project, and to ask
"is (2) best done by adding database management features to BibTeX,
or would it be better done by adding BibTeX-like features to a
database management system?"  I don't know what the answer is.

For the time being, I think users will have to be left in an unsatisfactory
situation.  If the "raw data" is kept in a "real database", the .bib file
is just yet another intermediate file.  I seem to remember that
Sebastian Rahtz has set INGRES up so that INGRES can hold bibliographic
information and write it out in the form of a .bib file.  It sounds as
though Nico is doing something similar with a different database
system.

It seems sensible to use a "real database system" for what "real database
systems" are good at.  Although it is clearly unsatisfactory to have a .bib
file that merely "shadows" a "real database", I can't think of anything better
that could be done quickly.  [Perhaps a database expert might have some bright
ideas.  Could such an expert write software that took an .aux file, generated
instructions in a "query language" to select the \cite-ed references, and
then produced a .bbl file (or equivalent) without there ever being a .bib file?
I don't know:  I'm not a database expert.  Even if they could, it
would probably still be useful to have a standalone program like BibTeX
that did task (1) in a database-system-independent way.]
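
For orientation, here is a minimal sketch of what such software would
have to read.  These are the three .aux commands that BibTeX looks for;
the key, style name and file name below are made up for illustration.

```latex
% Fragment of a hypothetical paper.aux (all names are illustrative only).
\citation{knuth-84}   % written for each \cite{knuth-84} in the document
\bibstyle{plain}      % written by \bibliographystyle{plain}
\bibdata{mybib}       % written by \bibliography{mybib}; names the .bib file
```

A database-backed replacement would presumably turn each \citation key
into a query and have no use for the \bibdata line.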

A salesman has sent me a leaflet about a piece of software called
EndNote, which apparently aims to do both (1) and (2) for Mac and PC
word-processor users.  (It's not public domain, but then neither is INGRES.)
I see that it can export information for troff's "refer".  He's coming to
see me at the beginning of November, so I'll ask if EndNote might be tailored
to read .aux files and produce .bbl files.  I expect the answer will be
"no" or "only after a lot of work", but it does no harm to ask!
Perhaps one could have a public-domain BibTeX for task (1) with tailored
proprietary software for people who want (1)+(2).

Conclusion:  I can't think of an easy way of improving the "3 times
through LaTeX and once through BibTeX" business at the moment.  I doubt
whether it's worth making BibTeX into a proper database system, although
it might be worth making "a proper database system" do what BibTeX does.


CODING THE LOGICAL STRUCTURE OF EACH \BIBITEM?

I can see the attraction of this.  The list of references could be
held in the root file or \input or \include-ed from a .tex file
like everything else.  There wouldn't need to be any special treatment
of .bbl files.

However, in terms of project-management, it seems very convenient to regard
determination of the logical structure for bibliographic references
as a separate task which can be delegated to whoever volunteers
(if we can find them) to do another iteration on BibTeX.
It would be up to this person to specify the logical elements
for a reference, e.g. to agonise about whether ADDRESS is
the fundamental concept or whether it should be PLACEOFPUBLICATION.

All that needs agreeing between the LaTeX 3.0 people and the
BibTeX worker (if we find one) is the form of interface, i.e.
what BibTeX passes back to (a) go where the \cite was and (b) go
in the list of references.  There might need to be separate
interfaces for "reference by number", author-date and "short title"
[I'll have a go at suggesting what these might be in a future message]
but beyond that, the LaTeX 3.0 people need not be concerned
about the distinction between e.g. different author-date styles.
Thus, the BibTeX worker's considerations of matters such as
ADDRESS versus PLACEOFPUBLICATION can proceed in parallel with
the LaTeX 3.0 people's work on other matters.  It wouldn't matter
much whether both projects were complete at the same time.
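
To make the (a)/(b) split concrete, here is a minimal sketch using only
the optional label argument that \bibitem already has.  It is not a
proposal for the actual LaTeX 3.0 interface.  The label format
"Knuth 1984" is an assumption chosen for illustration, and \documentclass
is used so that the example runs on a current LaTeX.

```latex
% Sketch only: the optional argument of \bibitem carries (a), the text
% that goes where the \cite was; the text after it is (b), the entry in
% the list of references, which LaTeX treats as opaque.
\documentclass{article}
\begin{document}
See \cite{knuth-84}.   % after a second run this prints [Knuth 1984]

\begin{thebibliography}{Knuth 1984}   % widest label, used for indentation
\bibitem[Knuth 1984]{knuth-84}
  Donald E. Knuth. \emph{The \TeX book}. Addison-Wesley, 1984.
\end{thebibliography}
\end{document}
```

Everything after the \bibitem line could be rearranged by a .bst file
(ADDRESS or PLACEOFPUBLICATION included) without LaTeX needing to know.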

If the LaTeX 3.0 people attempted to code the logical structure of
each \bibitem, this would mean that LaTeX 3.0 could not be finished
until the LaTeX 3.0 people had satisfied themselves about "What are
the fundamental types of publication (\bibitem{knuth-84}{book} or
\bibitem{knuth-84}{monograph}?), and what are the fundamental
items of bibliographic information about them?  Did Lamport/Patashnik
get it right?  E.g. ADDRESS versus PLACEOFPUBLICATION."  The LaTeX 3.0
people would presumably have to provide TeX code that sorted bibliographic
details out into the order required for a particular style e.g. "reference
by number in ACM style" (and perhaps provide a few style-options to
show how the details could be changed for a different convention,
e.g. author-date in APA style).  I think that the LaTeX 3.0 work
is ambitious enough without taking this analysis on too.  Attempting
to get the subdivision of \bibitem right could hold the rest of the
project up.

Continuing to delegate the work on the logical structure of each \bibitem
to BibTeX might not be as elegant from the user's point-of-view as
getting LaTeX to do all the work (using structure information from
subdivisions of \bibitem and bibliography-style information from
\documentstyle) but I think the result would be available sooner and that
the user might prefer to have something better than LaTeX 2.09 soon, rather
than to have perfection not so soon.

The two approaches (1) "put all the BibTeX work into LaTeX" (to take
account of the logical structure of each \bibitem), and (2) "make
BibTeX into a proper database management system" seem to be pulling
in different directions.  I don't think one can do both (otherwise
you'd end up with LaTeX as a bibliographic database management system),
although one could do neither.

For LaTeX 3.0, I'd be inclined to leave the contents of each \bibitem
(or the successor to \bibitem) as a "black box", to be filled in by the
user or by BibTeX.  If someone does the analysis for BibTeX 2.0 (say),
the question could be considered again if there is ever a LaTeX 4.0 (!)

[This all assumes that the SGML-ers have not analysed the structure
of a list of references and hence that someone has to do the analysis.
If the SGML-ers have done the analysis (for a DTD, perhaps), could they
publish it?]

Conclusion:  I'd specify an interface between LaTeX 3.0 and BibTeX
that would support the "reference by number", author-date and (if
possible) "short title" schemes, but delegate the task of supplying
\bibitems (or whatever) that conform to that specification to whoever
updates BibTeX and its .bst files.


FRONT MATTER INFORMATION IN A .BIB RECORD?

The gurus of "how to do a list of references" seem to agree that
bibliographic details should be as they appear on the title page of
the article, book, etc.  But there are many caveats:
*  Several books by the same author in one bibliography should
   follow the same style (Chicago Manual of Style, p. 441).
*  There are potential problems with names like Tchaikovsky,
   which may appear in different forms (Chaikovsky) on different
   title pages, even though the works are all by the same person.
   [British Standard BS 1629, p. 5]
*  The part of the name not on the title page may be enclosed
   in square brackets (Chicago, p. 441).
*  If the name on the title page is a pseudonym, the author's
   real name may be given in the bibliography in square brackets
   (Chicago, p. 442).
*  Capitalization, punctuation, etc. of a title may differ in a
   bibliography from the conventions on the title page (Chicago, p. 447).
   Similarly, compulsory line-breaks may be wanted on the title-page
   but not in the list of references (\\, Lamport's book, p. 164).
   For another example, consider "LaTeX: A Document Preparation System":
   that's not how it appears on the title page.
*  It may be necessary to use discretion about whether to regard
   a subtitle as part of a title or to abbreviate a long title
   (ISO standard 690, p. 5).
*  A bibliography may give "place of publication" in a form that is
   different to that on the title page, using discretion about:
   - whether to list all places where the publisher has offices
     or just one place
   - whether to give further information (if the place of publication
     is not widely known or could be ambiguous).
   (Chicago, p. 456).
*  A bibliography might give a publisher's name in a form that differs
   slightly from that shown on the title page (Chicago, p. 458).

I think that having a BibTeX that can produce several bibliographies at
once would be "a good thing".  For example:
*  for conference proceedings where each contribution may have its own
   list of references
*  for books that may have e.g. "References" and "Further reading"
*  things like the SPSS manual (and other things produced by software
   houses), which seem to give the software-house's related publications
   in a preface, but put "academic references" at the end.

However, I don't necessarily think that the same mechanism should
be used to "derive a publication's title-page from its .bib entry".
Traditionally, bibliography entries have been derived from title-pages
(with some human discretion), rather than the other way round, so it's
probably safer to have software that imitates the tradition.  One might
think of having things like
\begin{titlepage}
   \author[bibliography-version]{titlepage-version}
   \title[bibliography-version]{titlepage-version}
   \place[bibliography-version]{titlepage-version}
   \publisher[bibliography-version]{titlepage-version}
\end{titlepage}
\begin{copyrightpage}
   \copyrightholder{...}
   \isbn{...}
\end{copyrightpage}
in the .tex file (where the optional arguments allow humans to exercise
the discretion recommended by the gurus) and having LaTeX produce
perhaps a .bibitem file that the user can append to a suitable .bib file
(maybe after exercising a bit more discretion).  LaTeX could put
information that might conceivably be used by a bibliographer in the
.bibitem file, but refrain from putting information that no bibliographer
would ever want in that file.  TUGboat would get a .bib entry
with each article (but it would be derived automatically from the
article-heading, rather than used to automatically produce the
article-heading).
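
The idea above might be sketched as follows.  This is only an
illustration under stated assumptions, not a worked-out design: the names
\fmtitle, \fmauthor and \fm@field are hypothetical (chosen so as not to
clash with the standard \title and \author), the field lines are written
without any surrounding entry, and \unexpanded assumes an e-TeX-based
LaTeX.

```latex
\documentclass{article}
\makeatletter
% Output stream for the hypothetical \jobname.bibitem file.
\newwrite\fm@bibout
\immediate\openout\fm@bibout=\jobname.bibitem
\AtEndDocument{\immediate\closeout\fm@bibout}
% \fm@field{FIELD}[bibliography-version]{titlepage-version}
% writes the bibliography version to the file, typesets the other one.
\newcommand{\fm@field}[1]{%
  \@ifnextchar[{\fm@field@opt{#1}}{\fm@field@noopt{#1}}}
\def\fm@field@opt#1[#2]#3{%
  \immediate\write\fm@bibout{\space\space #1 = "\unexpanded{#2}",}%
  #3}
% Without an optional argument both versions are the same.
\def\fm@field@noopt#1#2{\fm@field@opt{#1}[{#2}]{#2}}
\newcommand{\fmtitle}{\fm@field{TITLE}}     % hypothetical name
\newcommand{\fmauthor}{\fm@field{AUTHOR}}   % hypothetical name
\makeatother

\begin{document}
\begin{titlepage}
  \centering
  {\Large \fmtitle[LaTeX: A Document Preparation System]%
                  {LaTeX\\A Document Preparation System}\par}
  \fmauthor{Leslie Lamport}\par
\end{titlepage}
\end{document}
```

The resulting .bibitem file should hold one field line per command, with
the bibliography version of the title rather than the line-broken
title-page version; a human would still have to wrap the lines in an
entry and exercise whatever further discretion is needed.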

[Conversely, if the Cork suggestion was adopted, and title-page
information was produced by some future BibTeX from a .bib file, there
would have to be some mechanism to allow for minor variations, e.g.
      TITLE = "LaTeX: A Document Preparation System",
      TITLEPAGETITLE = "LaTeX\\A Document Preparation System"]

The problems that I mentioned in the context of "logical structure
of \bibitem" arise here too.  To write .sty files for (say) book, report,
conference-proceedings and article, you only need to be clear about those
categories (as well as being a TeX wizard and having a lot of time,
perseverance and patience).  To write .bst files with entry-types of book,
report, conference-proceedings, article, you need to be clear whether
they are distinct entry-types or not.  [ISO 690 could be interpreted
as lumping books, reports and conference-proceedings all together
as "monographs".]   You also need to be clear about the fields:
e.g. PLACE or ADDRESS.  So the LaTeX 3.0 code for title-pages would
get held up (and hence LaTeX 3.0 as a whole would get held up)
while someone analysed "the structure of a \bibitem".
On the other hand, if LaTeX 3.0 wrote out a .bibitem file that wasn't
quite what some new BibTeX expected, it wouldn't matter very much,
and could be corrected once it was clear what was required.

Conclusion:  I'd like BibTeX to support multiple lists-of-references,
but think that "LaTeX producing .bib info from titlepage info"
might be better than "BibTeX producing titlepage info from .bib info".

                    --------------

                                        David
