Re: Gutenberg DTD's

by "Frank Boumphrey" <bckman(at)ix.netcom.com>

 Date:  Sun, 12 Mar 2000 14:12:12 -0500
 To:  "Murray Altheim" <altheim(at)eng.sun.com>
 Cc:  <hwg-gutenberg-dtds(at)hwg.org>
Murray wrote:
>[and I thought we were at a point where we discuss the requirements]

Yes we are, and I think this discussion is very useful in that it informs
our requirements list!

How about:

"The DTD should be reasonably simple to author"

> I would prefer if we tackled no more than about four, just concentrating
> on the most common book structures used in the Gutenberg Project

I agree that this is a reasonable first goal. It will also give us some idea
as to how realistic the other goals are.

> I'd rather us be successful in taking on something we can accomplish in a
> reasonable amount of time using volunteers, rather than turning this
> into another project on the scale of TEI.

Agreed.

> You should consider adding to your requirements list the approach you'd
> want to take to naming. There's definitely pros and cons, for example,
> <bibliography> is much clearer than <bib>,

I think this is something which we can discuss when we have collected a list
of components, but I think we could add a requirement:

'tag names as far as possible should be descriptive'

> We also need to deal with delimiters, case, etc.:

My own preference is for

<command-synopsis-fragment-reference>

but I know that this is a matter of religion :>) I am prepared to compromise.

> I'm just worried we could kill this project by taking on too big a task.
> I think we have similar visions for it, but I think we should take this
> one step at a time.

Very sound advice.

> There's a lot of prior work, and perhaps a requirement
> should be a survey of existing DTDs similar to each one the Gutenberg
> Project would consider.

I think that individually we need to do this, but personally I feel that if
we merely alter existing work we will build into the new work many of the
faults of the old.

For example, I am trying to track down the original requirements or design
goals document for TEI.  I am also making a parameterized version of
TEIxlite so that I can get a feel for what is possible with TEI.

My own impression is that TEI is mainly useful for annotating works that
require lexical analysis. This is rather a different design than what we
need from the Gutenberg DTD's, where I feel the prime goal is to define the
structure of the document.

A secondary goal is to have a DTD that can be used for lexicographical
analysis, but IMO this capability can be provided with add-on modules
designed for that purpose.

> I think we have similar visions for it, but I think we should take this
> one step at a time

Yes, I think we do; in fact I think most of those on the project have a
similar vision.

Frank

----- Original Message -----
From: Murray Altheim <altheim(at)eng.sun.com>
To: Frank Boumphrey <bckman(at)ix.netcom.com>
Cc: <hwg-gutenberg-dtds(at)hwg.org>
Sent: Sunday, March 12, 2000 1:21 PM
Subject: Re: Gutenberg DTD's


> Frank Boumphrey wrote:
> >
> > > Perhaps an XML version of TEI would suffice for many of these document
> > > types?
> >
> > There has been a thread on this very topic on XML Dev, and we do indeed have
> > a volunteer working on a subset of TEI, and its documentation.
> >
> > However IMO as presently written TEI does not meet the two most important
> > requirements, that it be both simple to use and intuitive.
>
> Part of the complexity of a DTD is the number of element types and attributes.
> Here at Sun we're going through a complicated and lengthy process of subsetting
> DocBook. The process involves weekly meetings and has taken over two years,
> first the reduction in unnecessary attributes, now we're tackling elements.
> ISO 12083 is only four types (four DTDs: book, article, math and serial) and
> is a substantial project in itself. Your list has fourteen (!) and includes
> some very difficult and complex structures. My point is that two of your
> requirements are antithetical to each other.
>
> [and I thought we were at a point where we discuss the requirements]
>
> I would prefer if we tackled no more than about four, just concentrating
> on the most common book structures used in the Gutenberg Project. I'd
> rather us be successful in taking on something we can accomplish in a
> reasonable amount of time using volunteers, rather than turning this
> into another project on the scale of TEI.
>
> > > >   2. At a minimum the DTD's should be suitable for:
> >
> > This is an ambitious list, and we may not complete it all. However we
> > certainly don't need the granularity of TEI.
> >
> > My vision is for a relatively agranular basic DTD written on the DocBook
> > model which can have granularity added to it via modules. If anyone wishes,
> > I do have on file an XML version of TEILite that I can forward on request.
> > There is unfortunately no documentation with it. Now if someone would like
> > to volunteer to write documentation it _may_ possibly be useful. I would
> > also like to transform many of the tag names to something more intuitive.
>
> You should consider adding to your requirements list the approach you'd
> want to take to naming. There's definitely pros and cons, for example,
> <bibliography> is much clearer than <bib>, but if you have to type a
> thousand of 'em you begin to appreciate the latter, especially when
> encountering multi-word <CommandSynopsisFragmentReference>. We could
> set a character limit that would force such a beast to become
> <CmdSynFragRef>, but then we'd have the problem you point out in TEI. We
> also need to deal with delimiters, case, etc.:
>
>     <command-synopsis-fragment-reference>
>     <command_synopsis_fragment_reference>
>     <CommandSynopsisFragmentReference>
>     <COMMAND-SYNOPSIS-FRAGMENT-REFERENCE>
>     etc.
>
> > The other thing I do not like about it is that it is monolithic and
> > unparameterized, which IMO also makes it difficult to use.
>
> Well, it is the 'lite' version. TEI P3 has about 50 modules and is quite
> parameterized. It's a combination of the parameterization of DocBook and
> TEI P3 you see in XHTML. And I believe it does dictionaries. Actually,
> I was in correspondence with David Megginson and some people in Toronto
> last year about a bilingual dictionary project I wanted to modify to
> make suitable for a Tibetan-English dictionary I was volunteering to
> work on. It kinda fell flat due to lack of time and the difficulty of
> maintaining a contact in Dharamsala/McLeod Ganj (although the Tibetan
> Government-in-Exile does have a very nice web site). I might be willing
> to try to reopen that project, but I'd emphasize that it's a large
> effort on its own. One DTD.
>
> I'm just worried we could kill this project by taking on too big a task.
> I think we have similar visions for it, but I think we should take this
> one step at a time. There's a lot of prior work, and perhaps a requirement
> should be a survey of existing DTDs similar to each one the Gutenberg
> Project would consider.
>
> Murray
>
>
> ...........................................................................
> Murray Altheim <mailto:altheim(at)eng.sun.com>
> XML Technology Center
> Sun Microsystems, Inc., MS MPK17-102, 1601 Willow Rd., Menlo Park, CA 94025
>
>    the honey bee is sad and cross and wicked as a weasel
>    and when she perches on you boss she leaves a little measle -- archy

HWG: hwg-gutenberg-dtds mailing list archives, maintained by Webmasters @ IWA