Re: XHTML queries

by Christopher Higgs <c.higgs(at)landfood.unimelb.edu.au>

 Date:  Mon, 12 Jun 2000 13:45:29 +1000
 To:  Toller <toller(at)ntlworld.com>,
hwg-xml(at)hwg.org
 In-Reply-To:  jtaug99
G'Day Jenny,

At 00:09 12/06/00 +0100, Toller wrote:
>1. Am I right in thinking that XHTML files are saved as .html rather than
>.xhtml? I can't find anything on this one.

XHTML can be published as either .html (or any other extension used for
HTML pages) or .xml (for XML files).  The difference here is in how
scripting languages interact with the page (i.e. do they use the HTML
Document Object Model, or the XML object model).  Stick with .html for now
unless you are working on an intranet doing "special" things :)

>2. What xml declaration should I use? I am terribly confused about the whole
>area of character sets and encoding. Does anyone know of any URL where I can
>go and read about it in words of one syllable?

The prolog (XML declaration + doctype declaration) is mandatory, but it
can also be empty!

I know this sounds confusing, but basically leave them out.  Current
HTML browsers will not understand the XML declaration and (like all good
browsers) will display any and all processing instructions to the user.
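
To see that an empty prolog costs an XML parser nothing, here is a rough
sketch using Python's standard library (Python and the one-paragraph page
are my own assumptions for illustration, nothing XHTML requires). It parses
the same page with and without a prolog and compares the element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-paragraph page, once with a full prolog and once with none.
WITH_PROLOG = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    b'  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Hi</p></body></html>'
)
WITHOUT_PROLOG = (
    b'<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Hi</p></body></html>'
)


def element_names(data):
    """Parse the bytes and list the element names in document order."""
    return [el.tag for el in ET.fromstring(data).iter()]


# The parser used here does not fetch the DTD, so both inputs yield the same tree.
same_tree = element_names(WITH_PROLOG) == element_names(WITHOUT_PROLOG)
print("same elements with and without prolog:", same_tree)
```

This only shows what one non-validating parser does; a validating tool
would still want the doctype.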

>  I note that without an xml declaration, the  encoding will use the default
>UTF-8 or UTF-16. What are these and which one is likely to be my default? If
>I don't want these, what else can I use?

UTF-8 is the default; UTF-16 is recognised only when the file starts with
a byte order mark.  Both are encodings of Unicode rather than font sets,
and both cope with languages such as Japanese and Arabic which require
extra letters.  Basically only a problem if you want to work in a
non-English language.
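
To make the difference concrete, a small sketch (Python is my choice here,
and the sample words are made up) that counts how many bytes the same text
takes in UTF-8 and in UTF-16. The characters are the same either way; only
the bytes on disk differ.

```python
# Hypothetical sample words; the \u escapes keep this file plain ASCII.
samples = {
    "English": "Hello",
    "French": "caf\u00e9",
    "Japanese": "\u65e5\u672c\u8a9e",
    "Arabic": "\u0639\u0631\u0628\u064a",
}


def byte_lengths(text):
    """Return (UTF-8 length, UTF-16 length without a byte order mark) in bytes."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-be"))


for name, text in samples.items():
    utf8_len, utf16_len = byte_lengths(text)
    # Each sample survives a round trip through both encodings.
    assert text.encode("utf-8").decode("utf-8") == text
    assert text.encode("utf-16").decode("utf-16") == text
    print(f"{name:9} UTF-8: {utf8_len:2} bytes   UTF-16: {utf16_len:2} bytes")
```

Plain English text is half the size in UTF-8, which is one practical
reason to stay with the default.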

>Xml declaration - are these the main choices, then?:
><?xml version="1.0" encoding="UTF-8"?>
><?xml version="1.0" encoding="UTF-16"?>
><?xml version="1.0" encoding="ISO-8859-1"?>
><?xml version="1.0" encoding="EN"?>
>
>Which one would people recommend I use for my sites?

None.

>Also, what are entity sets and do I need to declare them? If so, how? Are
>they the same as the character sets and encoding?

Only if you want to validate the page (good for data stores, etc., but not
necessary for web pages, where "well-formedness" is sufficient).
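
As a rough illustration of "well formed" versus "valid", here is a sketch
of a well-formedness check with Python's standard library (the function
name is_well_formed is my own invention). The parser never reads the DTD,
so it checks only tag nesting and syntax. That also shows where entity sets
come in: a name like &nbsp; is declared in the XHTML DTD's entity sets, so
a parser that skips the DTD rejects it, while the numeric form &#160;
always works.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """True if a non-validating XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<p>Hello<br/>world</p>"))  # every tag is closed
print(is_well_formed("<p>Hello<br>world</p>"))   # HTML-style <br> is never closed
print(is_well_formed("<p>a&#160;b</p>"))         # numeric character reference
print(is_well_formed("<p>a&nbsp;b</p>"))         # named entity, DTD not read
```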

>3. The specs seem to say that comments are removed entirely - is this really
>true or am I misunderstanding it? The tutorials I read did not mention this
>at all. Does this mean that anyone reading your source code will not see the
>comments? If so, what can one use instead?

An XML browser (in contrast to our HTML browsers) is permitted to strip
out comments, contents and all.  This has some severe ramifications for
the future: scripts hidden inside comments can vanish, and our current
HTML browsers will not cope with the CDATA sections necessary for true
XML scripting.
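
For what it is worth, here is what one XML parser actually does with the
old "hide the script in a comment" habit. This is a sketch using Python's
ElementTree with its default settings (my choice of tool; a future XML
browser may behave differently), and the alert call is just a placeholder
script.

```python
import xml.etree.ElementTree as ET

# Old HTML habit: hide the script body inside a comment.
HIDDEN_IN_COMMENT = "<script><!-- alert('hi'); //--></script>"
# The same script written as ordinary element content.
PLAIN = "<script>alert('hi');</script>"


def visible_script(markup):
    """Return the text an ElementTree-based consumer sees inside <script>."""
    return "".join(ET.fromstring(markup).itertext())


print(repr(visible_script(HIDDEN_IN_COMMENT)))
print(repr(visible_script(PLAIN)))
```

With the default parser the commented script comes back empty, which is
exactly the situation the XHTML spec warns about.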

>4. Similarly, the tutorials I saw made no mention of <script> or <style>.
>The specs seem to say that the contents of these should be wrapped in CDATA
>as follows:
><script>
>  <![CDATA[
>  ... unescaped script content ...
>  ]]>
>  </script>
>Do I need to add this CDATA thing to my JavaScript? If so, does anyone know
>of a URL model I could refer to so that I can avoid errors? I've just
>noticed that W3C do NOT use CDATA in their specs source code around their
>style sheet (presumably they don't need to for some reason) - so I daresay I
>have misunderstood this as well. The specs seem fairly clear on this point -
>C.4 - but not quite clear enough for me, I'm afraid...

Current practice at this stage is to persist with HTML scripting practices,
possibly using namespaces to identify treated script portions.  I expect
that by the time XML browsers become a viable market, the serving
technologies will have advanced and server-side XSLT based upon Profiles
will be widespread, thus negating the problem.
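
If you want to see why the spec asks for CDATA at all, this sketch (Python
standard library again; the script body with a, b and go() is made up)
feeds an XML parser the same script with and without the wrapper. The
unescaped < and && are the troublemakers.

```python
import xml.etree.ElementTree as ET

# Script wrapped the way section C.4 of the XHTML spec shows it.
CDATA_SCRIPT = """<script>
<![CDATA[
if (a < b && b > 0) { go(); }
]]>
</script>"""

# The same script with no wrapper, as HTML authors write it today.
BARE_SCRIPT = "<script>if (a < b && b > 0) { go(); }</script>"


def script_text(markup):
    """Return the script body as the parser hands it over, or None if not well formed."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None


print(repr(script_text(CDATA_SCRIPT)))
print(repr(script_text(BARE_SCRIPT)))
```

The wrapped version comes through untouched, while the bare version is
rejected as not well formed. None of this changes what today's HTML
browsers do with the markup.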

Of course, that's JMHO :-)


Chris Higgs <c.higgs(at)landfood.unimelb.edu.au>
Institute of Land and Food Resources
University of Melbourne http://www.landfood.unimelb.edu.au

HWG hwg-xml mailing list archives, maintained by Webmasters