Re: basics

by "Donna M Smillie" <dms(at)zetnet.co.uk>

 Date:  Tue, 29 Feb 2000 02:03:51 -0000
 To:  <hwg-gutenberg(at)hwg.org>,
"Lewis Overton" <lewy_o(at)yahoo.com>
 References:  yahoo
Hi Lewis

Here's my understanding of what we are doing, and what our focus should
be.  If I'm wrong in any of this, I know Frank and others will put me
right.  :-)

The primary focus of the markup we're doing should be the structure of
the document.  To a large extent, we can ignore presentation, except
insofar as we need to create appropriate style sheets to display these
documents in a logical and understandable fashion.  The aim, though, in
applying the correct markup, should be to represent the structure of the
document as accurately as possible.

The way I approach this is to ignore what the document looks like for
most of the process.  I start by analysing the structure and working out
the appropriate markup to represent that structure in the most
straightforward and logical way.  Only when I've done that, and inserted
all of the XML markup, do I then pick (or create) a stylesheet that will
display the document in a reasonable way - not necessarily the way it
might have been in the original, mind, just in a reasonable way, simply
so that it can be viewed comfortably.
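
To make the "structure first, display later" idea concrete, here is a
minimal sketch in Python.  It assumes a tiny fragment using the <verse>
and <line> element names from Lewis's message; the two render functions
are hypothetical stand-ins for two different style sheets, not anything
the project actually provides.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; the element names are borrowed from the
# quoted message and are not checked against the project DTD.
SOURCE = (
    "<verse>"
    "<line>This is a verse.</line>"
    "<line>I know for a fact.</line>"
    "</verse>"
)


def render_plain(verse):
    """One output line per <line> element, no decoration."""
    return "\n".join(line.text or "" for line in verse.iter("line"))


def render_indented(verse, indent="    "):
    """Same structure, different look: indent every second line."""
    out = []
    for i, line in enumerate(verse.iter("line")):
        prefix = indent if i % 2 else ""
        out.append(prefix + (line.text or ""))
    return "\n".join(out)


verse = ET.fromstring(SOURCE)
print(render_plain(verse))
print(render_indented(verse))
```

The markup is written once; which of the two displays gets used is
decided afterwards, and neither choice touches the XML.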

I got somewhat hung up on white space in poetry when I was marking up
the Second Jungle Book, and even added a load of classes to the
anne11.css style sheet in an attempt to mark up the *look* of the lines
and verses.  Frank rightly pointed out that that simply isn't valid XML,
so now I simply mark up the structure, and leave presentation to one
side.

Poetry is something of a special case, of course, since poets often lay
out their verses and lines in visual patterns which are intended to
complement the vocal patterns of reading them.  But XML is geared to the
content and structure, not the presentation, so I simply try to slot the
markup around the ASCII text, leaving the visual pattern untouched, so
that a future marker can add that information to the markup when it
becomes appropriate to do so.
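
As a rough check that slotting the markup around the ASCII text really
does leave the pattern in the file, the sketch below parses a fragment
with Python's standard xml.etree.ElementTree and counts the leading
blanks on each line.  The fragment and its element names are
assumptions taken from Lewis's example, not from the DTD.  A parser
hands the character data to the application as it stands; whether a
browser then collapses the blanks on screen is a separate, style sheet
matter.

```python
import xml.etree.ElementTree as ET

# The second line keeps its two leading blanks exactly as typed.
SOURCE = """<verse>
<line>This is a verse.</line>
<line>  I know for a fact.</line>
</verse>"""

verse = ET.fromstring(SOURCE)

# Text content of each <line>, exactly as the parser delivers it.
texts = [line.text for line in verse.findall("line")]

# Number of leading blanks on each line of verse.
indents = [len(t) - len(t.lstrip(" ")) for t in texts]
print(indents)
```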

You mention things like bold text - I approach bold and italics in terms
of "Is it simply a visual thing?  Or does it in fact indicate
emphasis?".  If I feel that the use of bold / italics in the original
was actually designed to convey emphasis, then I use <emph> ... </emph>.
Otherwise I ignore it.

> Mark up for presentation or to replicate the ASCII limitations?

Something rather different from either of these, I think.  It's not for
presentation, but XML markup isn't the same as ASCII either, and doesn't
have the same limitations.  I think of marking up in XML as being a lot
closer to creating a database record with a structured set of fields,
which could, with appropriate software, be interrogated and manipulated.
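
A small sketch of what "interrogated" might look like in practice,
using Python's standard xml.etree.ElementTree.  The <book>, <chapter>,
<title> and <p> element names are hypothetical and chosen only for
illustration; <verse>, <line> and <emph> are the names used elsewhere
in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; not a real project file.
SOURCE = """<book>
<chapter><title>First</title>
<verse><line>This is a verse.</line><line>I know for a fact.</line></verse>
</chapter>
<chapter><title>Second</title>
<p>Plain prose with <emph>stress</emph> in it.</p>
</chapter>
</book>"""

book = ET.fromstring(SOURCE)

# "Which chapters contain verse?"
chapters_with_verse = [
    ch.findtext("title")
    for ch in book.findall("chapter")
    if ch.find(".//verse") is not None
]

# "How many lines of verse are there in the whole book?"
line_count = len(book.findall(".//verse/line"))

# "Which words did the author emphasise?"
emphasised = [e.text for e in book.iter("emph")]

print(chapters_with_verse, line_count, emphasised)
```

None of these questions could be asked of the plain ASCII text without
guessing at its layout, and none of them depends on how the document
is displayed.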

I hope that made some sort of sense, and went some way to answering your
question.  I'd be interested to hear how others view this.

Regards,
Donna
--
dms(at)zetnet.co.uk
Different Worlds:  http://www.users.zetnet.co.uk/dms/
Pictures of the Past, The Leslie Smith Family,
An Introduction to HTML, Copyright Considerations
Online Bookshop

----- Original Message -----
From: Lewis Overton <lewy_o(at)yahoo.com>

> I have a question about the basics of this project. (Forgive me, please, if
> I am a bit slow to catch on. I may have missed obvious answers to these
> questions. I'm just learning about XML.)
>
> Suppose one is presented with a text containing a few patterns like:
>
> This was originally set in BOLD letters, where all caps means that the
> printed original used bold type. To mark up this text, shouldn't one
> actually format the text in bold rather than presenting it as found in the
> ASCII only version?
>
> Some poetry has structure such as:
>
> This is a verse.
>    I know for a fact.
> It could  be worse.
>    If I mess up my act.
>
> If this is marked as verse + lines, both HTML and XML lose the two blanks.
> One could mark it as:
>
> <verse>
> <line.a>This is a verse</line>
> <line.b>I know for a fact.</line>
> <line.a>It could be worse,</line>
> <line.b>If I mess up my act.</line>
> </verse>
>
> ... but defining a and b violates the DTD.
>
> Some verses are indented or doubly indented. The ascii markup used leading
> blanks to indicate both indents and italics. Again, the HTML blockquote
> does this very well. It isn't in the DTD, however.
>
> And if you really want some fun, mark up the gutblurb so that it can be
> rendered visible by changing the attribute display:none; to
> display:block; . IE5 presents all this fine. The XML validator throws up.
>
> So, what is the purpose? Mark up for presentation or to replicate the ASCII
> limitations?
>
> If the choice is presentation, where are the tools to make this possible?

HWG: hwg-gutenberg mailing list archives, maintained by Webmasters @ IWA