[TriLUG] OT: Maintaining public/private versions of a document

Fri Jul 26 20:36:35 EDT 2019

On Fri, 26 Jul 2019 16:15:07 -0400
Brian via TriLUG <trilug at trilug.org> wrote:

> Hi Gang,
> 
> I'm wondering if anyone has any tips on good software to use for 
> maintaining a document in two forms, a "private" version that is 
> complete, and a "public" version that excludes certain portions of
> the private version.  I'd love to be able to only modify the private 
> version, somehow mark which sections are private, and have the public 
> version stay in sync.  I'm not super-concerned with what format the 
> source is in, but publishing to annotated PDF (that is, PDF including 
> working intradocument links) would be a hard requirement.
> 
> Offhand I bet this is something LaTeX could handle, but I don't know 
> anything about actually using LaTeX, and would prefer a WYSIWYG
> editing experience.

LaTeX can be a real bear if you have demanding formatting wishes.
Also, with LaTeX, heaven help you if someday they want you to switch
from PDF to a more HTML-like destination: HTML, XML, XHTML, EPUB, etc.

> 
> The particular use case is API documentation for a project I'm on. 
> There're parts of the API we don't intend to publish to others, 

Be careful. Microsoft got in trouble for secret API pieces.

> but 
> would like to maintain documentation on the entire interface.
> 
> Thoughts?  Suggestions?  FOSS is good but not a requirement.

If I were assigned this task, the first thing I'd explore is
Asciidoctor. With Asciidoctor source converted to XHTML, you can
declare multiple classes (Asciidoctor calls them roles) for each
paragraph, link, heading, etc. You could add a class called "zapped"
to each element you don't want shown in public. For private use,
.zapped wouldn't be defined in the CSS. But for public, the CSS would
look like the following:

.zapped{display: none;}

So now you have private and public XHTML doing what you want. The
question is, will it work with PDF? I don't know. A Pandoc conversion
would look something like the following, which produces EPUB 3; for
PDF, change the output file to myfile.pdf and drop -t epub3:

pandoc -o myfile.epub3.epub -f html -t epub3 myfile.html

In the preceding, myfile.html would really contain XHTML. I'm not sure
Pandoc recognizes the .xhtml extension, but with -f html given
explicitly the extension shouldn't matter. Be sure myfile.html
has all the CSS it needs to render. I'm hoping that Pandoc will see the
"zapped" classes on elements and not print them. However, if that can't
be done, an awk program could be written to simply delete all elements
with the "zapped" class. This would work as long as all such elements
consisted of whole lines, because awk is line-oriented.
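
As a rough sketch of that line-oriented filter, here it is in Python
rather than awk. It assumes every "zapped" element sits entirely on one
line and that class attributes use double quotes. The function name,
the sample markup and reset_cache() are made up for illustration.

```python
import re

# Matches a class="..." attribute in which "zapped" is a whole class name,
# so class="api zapped" matches but class="unzapped" does not.
ZAPPED = re.compile(r'class="(?:[^"]*\s)?zapped(?:\s[^"]*)?"')

def drop_zapped_lines(text):
    """Return text without any line that carries the 'zapped' class."""
    kept = [line for line in text.splitlines() if not ZAPPED.search(line)]
    return "\n".join(kept) + "\n"

# Hypothetical input: one element per line, as the approach requires.
sample = (
    '<p class="intro">Public overview.</p>\n'
    '<p class="api zapped">Internal call: reset_cache()</p>\n'
    '<p class="unzapped">Still public.</p>\n'
)
public = drop_zapped_lines(sample)
print(public, end="")
```

The weak spot is the same one awk would have: an element that wraps
across lines would only lose the line holding its opening tag.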

A slightly harder possibility would be to take myfile.html, run it
through a DOM-style XML parser, remove every element (and its children)
that has class "zapped", and write the output to myfile_public.html.
Then Pandoc would obviously do the right thing, because it would never
see the secret stuff on the public side.
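
A minimal sketch of that parser approach, using Python's standard
xml.etree.ElementTree. It assumes the input is well-formed XML with no
named entities such as &nbsp; that the parser doesn't know about. The
function name and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML as the default namespace instead of an ns0: prefix.
ET.register_namespace("", XHTML_NS)

def strip_zapped(xhtml, cls="zapped"):
    """Return xhtml with every element of class cls, and its children, removed."""
    root = ET.fromstring(xhtml)
    # list() snapshots each sequence so removing nodes is safe while looping.
    for parent in list(root.iter()):
        for child in list(parent):
            if cls in child.get("class", "").split():
                # remove() also discards the child's tail text, which for
                # block elements is normally just whitespace.
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")

# Hypothetical input standing in for myfile.html.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p>Public overview.</p>'
    '<div class="sect1 zapped"><h2>Internal</h2><p>Hidden detail.</p></div>'
    '<p class="unzapped">Still public.</p>'
    '</body></html>'
)
public = strip_zapped(sample)
print(public)
```

For real files, you would read myfile.html, pass its contents through
strip_zapped(), and write the result to myfile_public.html before
handing that file to Pandoc.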

Asciidoctor isn't a quick learn. But what I've found is that it can do
pretty much anything you want done, with not too many compromises. I'm
currently writing two books in Asciidoctor. One source file, outputs to
both PDF and ePub. Life is good.

I've used LyX, a front end to LaTeX, for 17 years. And because LyX
styles are written in LaTeX, I've become pretty proficient in LaTeX.
There's no way on earth I'd attempt your assignment using LaTeX. I'd
write my own native format before doing that. I *was* writing my own
native format when I discovered Asciidoctor could do 90% of what my
format could do, and could do a whole bunch of stuff my native format
couldn't do. Today, the only real benefit of LaTeX is beautiful
typesetting and easy typesetting of math equations. I can get
typesetting reasonably good from Asciidoctor, and I just don't do that
much math in my books.

I'd stay away from LaTeX.

SteveT

Steve Litt 
July 2019 featured book: Troubleshooting Techniques
     of the Successful Technologist
http://www.troubleshooters.com/techniques