Re: Acrobat to HTML

by Brian V Bonini <b-bonini(at)cox.net>

 Date:  27 Jun 2003 09:39:26 -0400
 To:  Janet Zagoria-Honeywebster <honeywebster(at)attbi.com>
 Cc:  HWG Graphics <hwg-graphics(at)hwg.org>
 References:  attbi
On Thu, 2003-06-26 at 23:54, Janet Zagoria-Honeywebster wrote:
> My question is about producing HTML from Acrobat 6.
> 
> The cleaner code comes from exporting it with the HTML 3.2 DOCTYPE.
> 
> But for some reason it doesn't close many of the [b] or [font] tags, or it
> puts them in an odd place (like after a following [p] tag). Has anyone had
> experience with this, or does anyone know how to direct Acrobat to do
> something different?
> 
> The original file was in InDesign with styles. The HTML produced from
> InDesign is scary. The Acrobat to HTML choices are 4.01 with CSS 1.0 and
> 3.2. The 4.01 is also scary. For my purpose (eBook to be read in MS eBook
> Reader), the code needs to be clean.
> 
> Even though the 3.2 that Acrobat produces has other problems than the above,
> it is by far the cleanest. Any suggestions?
> 
> Thank you.


It would be nice if apps like this would start outputting XML already. I
never trust those markup generators and always end up spending more time
code sweeping them than it would take to do it from scratch...

A couple things I can think of are:

a.) Use Tidy to clean up the crap, and then use any editor with
extended search-and-replace functions to remap some of it to XHTML.
b.) Output the PDF to text and mark it up from scratch.
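
For the search-and-replace half of option (a), something like the rough
Python sketch below might do instead of an editor. It assumes Tidy has
already closed and nested the tags properly. The tag mapping ([b] to
[strong], [i] to [em], [font] dropped) and the names TAG_MAP and remap_tags
are my own illustration, not anything Acrobat or Tidy provides, so adjust
them to whatever your output actually contains.

```python
import re

# Hypothetical mapping from presentational tags to XHTML-friendly ones.
TAG_MAP = {"b": "strong", "i": "em"}


def remap_tags(html):
    """Drop <font> tags and rename the tags listed in TAG_MAP.

    Assumes the markup is already well-formed (e.g. cleaned by Tidy).
    Attributes on renamed tags are discarded.
    """
    # Remove opening and closing <font> tags but keep the text they wrap.
    html = re.sub(r"</?font\b[^>]*>", "", html, flags=re.IGNORECASE)

    def rename(match):
        slash = match.group(1)          # "/" for a closing tag, else ""
        name = match.group(2).lower()   # tag name, normalised to lowercase
        return "<" + slash + TAG_MAP[name] + ">"

    # The \b keeps <b> from matching <br> or <body>, and <i> from <img>.
    pattern = r"<(/?)(" + "|".join(TAG_MAP) + r")\b[^>]*>"
    return re.sub(pattern, rename, html, flags=re.IGNORECASE)


if __name__ == "__main__":
    sample = '<P><FONT SIZE="2"><B>Title</B></FONT></P>'
    print(remap_tags(sample))
```

This only handles the simple cases. Anything that carries meaning in the
[font] attributes (sizes used as headings, for instance) would still need
a manual pass.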

HWG: hwg-graphics mailing list archives, maintained by Webmasters @ IWA