get the facts: Ship XML or HTML? That is the question.
Sean Wheller
sean at inwords.co.za
Sun Jun 19 18:39:27 UTC 2005
Recent discussions around help viewers have resulted in much confusion,
miscommunication and division. The disarray and division are both internal to
the Ubuntu documentation team and external with a few influential individuals
in the development team. The root cause for this disarray begins and ends
with myself. I made what I now see is a controversial decision to ship Ubuntu
Documents as HTML instead of XML. When I made this decision, it was not
without due diligence or consultation with the community, but I now see that
people either have not understood my previous communications or have
forgotten the discussion we have had in the past. For whatever reason, not
understanding the reasoning behind this decision, people have jumped to all
sorts of conclusions and mixed a number of issues. The outcome of all this
is that some members of the Ubuntu Documentation team want to hold
a technical board meeting in order to decide whether Ubuntu documents
should be shipped as XML or HTML. Recent conversation has failed to change
the position of these members.
I am not convinced that we need a technical board meeting to decide this
matter; I would much rather this decision be made internal to the team. I do
not see the need to invoke structures such as the technical board without
good cause. However, people want to go ahead with a technical board, which
leaves me with little choice but to go with the flow on this one. So here is
a document that, I hope, will convey not only my logic for not wanting a
technical board meeting but also my reasoning for wanting to ship Ubuntu
documents in HTML as opposed to XML.
Before starting I feel that some historical context is required. I would
therefore like readers to please read the following artifacts (message items)
and their resulting threads.
1.On 11/01/2005 Sean Wheller posted a Request for Comment (RFC), “Online Help
Systems.” [http://lists.ubuntu.com/archives/ubuntu-doc/2005-January/000944.html]
2.On 19/01/2005 Nick Loeve posted, “format of distributed
docs” [http://lists.ubuntu.com/archives/ubuntu-doc/2005-January/001020.html]
3.On 02/03/2005 Jeff Schering posted, “yelp doesn't do xref or
trademarks” [http://lists.ubuntu.com/archives/ubuntu-doc/2005-March/001291.html]
4.On 04/03/2005 Sean Wheller posted, “To yelp or not to Yelp? that is the
question” [http://lists.ubuntu.com/archives/ubuntu-doc/2005-March/001322.html]
5.On 13/04/2005 Sean Wheller posted, “[wanted] thoughts on repos
structure” [http://lists.ubuntu.com/archives/ubuntu-doc/2005-March/001528.html]
6.On 01/06/2005 Mathew East posted, “Possible documentation
viewer” [http://lists.ubuntu.com/archives/ubuntu-doc/2005-June/002457.html]
Hopefully you have read these artifacts, for they serve to show that this is
not a new discussion but one that has been on the burner since January 2005.
In addition to these artifacts, numerous discussions were held on #ubuntu-doc
on or around the dates listed above. For the purpose of brevity I will not list
them here. In addition, a number of communications have taken place off-list.
These messages, for privacy reasons, are not admissible here. However, one
artifact definitely worth noting is the Documentation Team Meeting held on
12/03/2005. The summary of this meeting can be reviewed here
[https://wiki.ubuntu.com/DocumentationTeamMeetingSummary3]. Within this
summary I would like to draw your attention to the following topics:
* Open bugs needing to be solved for Hoary
* Desktop-neutral documentation format
The first topic indicates some of the problems encountered at the last minute
in Hoary. The second highlights the agreement of the Ubuntu Documentation Team
to make “documentation desktop-neutral, at least in the format of the
sources.”
With these artifacts and the final agreement of the team in the meeting of
12/03/2005, the decision to create desktop-neutral documentation sources was
accepted. Based on these proceedings I made the decision to ship HTML, the
justification for which I shall now provide.
So what is the problem? Why ship HTML instead of XML?
To answer these questions we need to understand some technical fundamentals
and build from this understanding:
1.How do Yelp and Docbook XML work?
2.What are the advantages and disadvantages of this approach?
Yelp is a Help Viewer for the GNOME Desktop Environment. As with most
open-source projects the GNOME Documentation Team uses Docbook XML as the
standard, presentation neutral format in which to store documents that are
the user manuals to applications. Please note I have said “presentation
neutral format”; this is intentional, since XML was not designed as a format
for viewing. Instead it was designed to separate the concerns of data and
presentation layers. Unlike HTML, XML does not define how data will be
presented. Many may be confused at this point, for what they see in Yelp is
nicely formatted text complete with fonts, styles and colors. To understand,
let's look at what happens for Yelp to create this presentation.
First, the Yelp team have developed a set of XSLT files. These files are not
the same as the XSLT files developed by Norman Walsh that form a part of
the Docbook project at SourceForge. The Yelp XSLs (Yelp stylesheets) are
compiled into Yelp. Their job is to transform the Docbook XML files created
by the GNOME Documentation Project into a presentable layout and formatting
under Yelp. So when a user makes a request for a User Manual on a GNOME
application, Yelp reads the XML file and transforms it using the Yelp
Stylesheets. The advantage of this approach is that XML is dynamically
transformed, at time of viewing, into a presentational format that users can
read. It saves GNOME developers and authors from having to first transform to
a presentational format such as HTML before packaging and shipping the User
Manuals for GNOME. Other than this advantage, there is really no other
benefit to this approach.
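
To make the difference between the two approaches concrete, here is a rough
sketch in Python. It is not the Yelp stylesheets nor the Docbook XSL
stylesheets; the sample document, the function names and the three-element
mapping are all my own assumptions for illustration. The point is only that
the same transform can run either each time a user opens a document (the Yelp
way) or once, before packaging (the ship-HTML way).

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, tiny Docbook sample; real manuals use far more elements.
DOCBOOK = """<article id="foo-app">
  <title>Foo Manual</title>
  <sect1 id="intro">
    <title>Introduction</title>
    <para>Foo edits text.</para>
  </sect1>
</article>"""


def to_html(node, depth=1):
    """Map title, para and sect* children to HTML; ignore everything else."""
    parts = []
    for child in node:
        if child.tag == "title":
            parts.append(f"<h{depth}>{escape(child.text or '')}</h{depth}>")
        elif child.tag == "para":
            parts.append(f"<p>{escape(child.text or '')}</p>")
        elif child.tag.startswith("sect"):
            # Nested sections get a deeper heading level.
            parts.append(to_html(child, depth + 1))
    return "\n".join(parts)


def transform(xml_text):
    """Parse the Docbook source and return an HTML fragment."""
    return to_html(ET.fromstring(xml_text))


# Ship-HTML approach: run this once at build time and package the result.
# View-time approach: the viewer would call transform() on every request.
html_page = transform(DOCBOOK)
print(html_page)
```

Whichever side runs transform(), the output is the same; what changes is who
pays for the work and when.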
While the Yelp approach of dynamically transforming XML into a usable
presentational format is cool, even the way to go for the long term, it does
have some drawbacks. Please note I do say, “the way to go.” I therefore agree
that from a technological perspective the direction shown by Yelp is a good
one. So what's the problem? I seem to be agreeing with Yelp and supporting
the case to ship XML.
There is no single problem, but rather a collection of problems.
The first problem is a simple one, but nevertheless a complaint raised by
users. The problem is that in order for Yelp to make a document presentable
it must transform the document. This means that as a user moves between
documents Yelp must transform and then render the document before the target
document can be read. This results in slow performance that many users on
slow computers find very annoying. The result is a less than favorable user
experience and an eventual reluctance to use the help system.
Next problem. The Yelp stylesheets do not support all Docbook elements and
their organization as defined by the Docbook Document Type Definition (DTD).
For the record, the DTD is a standard managed by the OASIS Docbook Technical
Committee. The problem here is that authors cannot use Docbook as it is
defined in the DTD; as a result, many features are unavailable to
authors.
Why is it this way? Well, the fact is that “Yelp is a Help Viewer for the
GNOME Desktop Environment.” Docbook is large and very powerful, and the Yelp
Developers and the GNOME Documentation Team have neither needed nor wanted to
use all the features of Docbook. So they have focused their attention on
developing support for those features and functions required by GNOME. This
approach is totally understandable, as it would take an enormous effort for
anyone to develop full Docbook compatibility. The Yelp developers are
focused on GNOME and do not have the capacity to facilitate the needs and
wants of every project.
While I understand and agree with the reasoning of the GNOME Documentation
Team and the Yelp Developers not to support all of the Docbook standard, it
does not help solve the problems and wants of people at ubuntu-doc. I have not
listed the unsupported things here as they are discussed in the artifacts
listed above, in particular the artifact of 01/06/2005 by Mathew East,
“Possible documentation viewer.”
Next problem. In addition to not implementing full support for Docbook, the
GNOME implementation also uses three methods that are proprietary to Yelp.
The first is a processing instruction that determines the level to which the
table of contents will be expanded or collapsed in the tree view pane of the
Yelp workspace. The instruction looks like this.
<?yelp:chunk-depth 3?>
The second method proprietary to Yelp is its implementation of inter-document
cross references. To create external cross references GNOME have used the
xref element, which looks like this.
<xref linkend=""/>
This is a standard and valid Docbook element. However, in order to create an
inter-document reference Yelp uses the following syntax in the value of the
linkend attribute.
<xref linkend="ghelp:foo-app"/>
This means that the xref will only work in the runtime environment that is
Yelp.
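
To illustrate why such a link only works inside Yelp, here is a small sketch
in Python. In the Docbook DTD the linkend attribute is an IDREF, so a standard
toolchain expects its value to match an id in the same document. The sample
document and the function name below are hypothetical, made up for this
illustration; the check simply lists every xref whose target cannot be found.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one ordinary xref and one Yelp-style ghelp xref.
SAMPLE = """<article id="foo-app">
  <sect1 id="intro"><title>Intro</title>
    <para>See <xref linkend="usage"/> and <xref linkend="ghelp:bar-app"/>.</para>
  </sect1>
  <sect1 id="usage"><title>Usage</title><para>Text.</para></sect1>
</article>"""


def unresolved_xrefs(xml_text):
    """Return linkend values that match no id attribute in the document."""
    root = ET.fromstring(xml_text)
    # Collect every id declared anywhere in the document.
    ids = {el.get("id") for el in root.iter() if el.get("id")}
    # Keep only the xrefs whose target is not among those ids.
    return [
        x.get("linkend")
        for x in root.iter("xref")
        if x.get("linkend") not in ids
    ]


broken = unresolved_xrefs(SAMPLE)
print(broken)
```

The ordinary xref resolves; the ghelp one is reported, because outside Yelp
nothing knows what to do with it.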
The third method used by Yelp is not so much proprietary as just forcing
authors to do something in order to comply with Yelp requirements for
generating a toc. It means that every chapter or sect* node in the document
must declare an id attribute in order for it to be displayed in the toc. When
an id attribute is missing the node is just not displayed. On the face of
it this is not a big problem. But it does force authors to add id attributes
when they are not required other than by Yelp.
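
An author who wants to know in advance which nodes would drop out of the toc
could run a check along these lines. This is only a sketch in Python under my
own assumptions: the sample book, the function name and the choice to report
titles are hypothetical, and it covers only chapter and sect1 to sect5.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: two of the four section-like nodes lack an id.
SAMPLE = """<book id="guide">
  <chapter id="install"><title>Installing</title>
    <sect1><title>Requirements</title><para>Text.</para></sect1>
    <sect1 id="steps"><title>Steps</title><para>Text.</para></sect1>
  </chapter>
  <chapter><title>Troubleshooting</title><para>Text.</para></chapter>
</book>"""

# Matches chapter and sect1 through sect5 only.
SECTION_TAG = re.compile(r"^(chapter|sect[1-5])$")


def sections_without_id(xml_text):
    """Return the titles of chapter/sect* nodes that declare no id."""
    root = ET.fromstring(xml_text)
    missing = []
    for el in root.iter():  # document order
        if SECTION_TAG.match(el.tag) and not el.get("id"):
            missing.append(el.findtext("title"))
    return missing


missing = sections_without_id(SAMPLE)
print(missing)
```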
The overall effect of these problems is that they create incompatibility.
First, the XML is Yelp-specific, and second, if you were to try to transform
to another format the xrefs would not work. The processing instruction also
needs to be commented out in order to validate the document. The use of the
ghelp feature also defines a namespace that is not part of the Docbook XML
standard. In order for it to work, document declarations must be expanded to
include the new namespace.
Next problem. Working solely in a GNOME environment, the issues discussed so
far are not a problem, but what happens when you have more than one desktop
environment to cater for? I remind you that Yelp is a Help Viewer for the
GNOME Desktop Environment. Yelp-compliant XML is not supported by KDE or any
other desktop environment. So how does KDE do it? Well, while they also store
documents in Docbook XML, they ship HTML. This means all of the KDE help is
available in any application capable of rendering HTML. Oh, did I mention
that Yelp 2.10 can render HTML? I think it was mentioned in one of the
artifacts presented earlier. In KDE it is worth noting that the Help Viewer
is the KHelpCenter. It does not have the capability to dynamically transform
Docbook XML into a presentational format. Interestingly enough, in KDE an
application's help can be accessed in more ways than just KHelpCenter. For
example, with Konqueror, the KDE file manager, you can use kioslaves to call a
user manual using the syntax help:foo-app.
In addition to the fact that Docbook is a standard for doing documentation in
open-source projects, there are a number of reasons why it is good to use
Docbook. I will not list them all here; I only refer to features we can use
at ubuntu-doc and explain in short the benefits to us.
The first is content reuse. Within Docbook as an XML format we can easily
reuse or re-purpose content. So for example, content that is the same between
Ubuntu and Kubuntu can be easily shared between documents. Sometimes the node
being reused may contain a cross-reference between documents. If the ghelp
method was used, we would not be able to use the content in this way. Another
example is where documents are so much the same that it is worthwhile
managing both GNOME and KDE versions in the same XML-instance. In this case
we use profiling. Profiling entails marking XML nodes in a way that at time
of processing more than one variant of the document can be output depending
on the profile selected. Yelp has no support for this feature; as a result we
would have to duplicate content and maintain duplicates over time. Updating
information is therefore not just a case of changing something in one place,
but seeking each instance of that something and updating that. Over time this
overhead can be very time consuming.
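
For those who have not seen profiling, the idea can be sketched in a few lines
of Python. This is not the Docbook XSL profiling stylesheets, only an
illustration of the principle. The sample section, the function names and the
choice of the condition attribute with a semicolon-separated value are my
assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical single source holding both the GNOME and the KDE variant.
SAMPLE = """<sect1 id="browse">
  <title>Browsing files</title>
  <para>Open your home folder.</para>
  <para condition="gnome">Use Nautilus.</para>
  <para condition="kde">Use Konqueror.</para>
</sect1>"""


def profile(xml_text, wanted):
    """Return the tree with nodes for other profiles removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the node lists first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            cond = child.get("condition")
            # Unmarked nodes are shared; marked nodes must name the profile.
            if cond is not None and wanted not in cond.split(";"):
                parent.remove(child)
    return root


def paragraphs(root):
    """Collect the text of every remaining para, in document order."""
    return [p.text for p in root.iter("para")]


gnome = paragraphs(profile(SAMPLE, "gnome"))
kde = paragraphs(profile(SAMPLE, "kde"))
print(gnome)
print(kde)
```

The shared paragraph is written once and appears in both outputs, which is
exactly the duplication we avoid by profiling before shipping.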
Moving on. We need to ask the question, “What are the advantages and
disadvantages of shipping HTML?” Again, to do this we need to look at the
technical issues.
1.The disadvantage of shipping HTML is that you must transform to HTML before
creating a package. This is the only disadvantage I can see.
2.There are many advantages to shipping HTML.
a)Rendering is fast as there is no need for transformation from XML to HTML.
b)Ability to customize formatting using CSS.
c)Ability to customize layout and interactivity.
d)Ability to be viewed with multiple user agents, including Yelp.
From a payload perspective, the difference in size of an XML package in
comparison to an HTML package is negligible.
Having said all this, let me now put technical issues aside and address why
I feel that a technical board should not be invoked in making the choice to
ship HTML. Although I do ask that one not forget the technical explanations I
have provided and that the artifacts presented also be kept in mind.
My primary problem with invoking a technical board is that while I have
provided convincing technically founded arguments to justify my reasoning for
creating desktop-neutral source formats and shipping HTML in order to ensure
maximum interoperability between desktop environments and user agents, I have
yet to receive an opposing argument of the same nature. As this is the case,
I feel that no evidence has been presented for me to reconsider my position
prior to invoking a technical board hearing. Since an opposing argument has
not been presented, I feel that the technical board has nothing to compare
and therefore no basis on which to make an educated decision between XML and
HTML.
Further to this, I am concerned that a technical board would be composed of
Ubuntu members. It is no secret that Ubuntu and its community
members are extremely pro GNOME. My fear is therefore that religious passion
and a tendency to protect GNOME interests, combined with a lack of technical
understanding and depth of perspective on the subject, will impair a
technical board's ability to make an informed and fair judgment on this matter.
In closing I would like to make the following points:
1.My proposal is not without history or consultation with the community. I
have therefore not made a unilateral decision.
2.My proposal is for ubuntu-docs, not gnome or kde docs. For the purpose of
clarity these terms are used as follows:
* ubuntu documents (ubuntu docs) – documents that are specific to ubuntu or
kubuntu. These documents are not developed nor maintained upstream and have
no place moving upstream.
* gnome documents (gnome docs) – documents that are specific to a gnome
application. These documents may be developed at ubuntu-doc but will move
upstream when the gnome application does.
* kde documents (kde docs, kdocs) – documents that are specific to a KDE
application. These documents may be developed at ubuntu-doc but will move
upstream when the KDE application does.
3.While there has been discussion regarding various help viewer applications
my proposal is aimed toward “any browser” compatibility. Since Yelp can
render HTML I am not advocating the elimination of Yelp as a tool for viewing
ubuntu-docs.
4.I have made mention of creating a web-based application. This has been
incorrectly interpreted as a proposal to build a new help viewer agent
application. For the record, the term web-based application was
used to describe a help system based on HTML, employing CSS for formatting
and possibly JavaScript in order to implement interactive functionality
within pages.
* web-based application – an application that is delivered over the Internet.
Typically browser-based. http://en.wikipedia.org/wiki/Browser-based
Understanding of these terms and points is important and I believe will go a
long way to dispelling part of the current confusion.
In light of the current situation I have announced in private to core members
of the Ubuntu Documentation team that I am reconsidering my position within
the team and the project as a whole. I feel that there have been several
instances where I have embarked on initiatives in service of the project and
they have been a total waste of time when people external to the team have
chosen to
completely disregard the communication and collaboration that had taken place
in order to start such initiatives. My feeling is that such people have not
taken the time to understand nor ask about the history and reasoning for such
initiatives. Instead they have taken the influence of their position or
standing in the community as a warrant to oppose.
By the same token it concerns me that the Ubuntu Documentation team has not
rallied to support such initiatives and has, I think, based on the standing or
influence of such individuals, decided to side with such persons. Such
abandonment of historical team decisions is concerning but, more than that, it
is disruptive to the vision and direction of the team.
In such circumstances I find myself, once again, feeling like I have been full
gas in neutral for the past few months. I do not wish to be in this position
now or in the future and have therefore decided to take a break from
activities until such time as resolution is found to the current situation.
The direction in which the decision to ship XML or HTML is made will help me
decide whether or not I continue to be an active member of the Ubuntu
Documentation Team. I wish to stress that this is not blackmail. The
community has the right to decide democratically on the technical direction
and vision of the documentation project. However, should the project decide to
ship XML, a decision which demands Yelp compatibility, then my work of the
past half year would have been a waste of my time. I personally do not wish to
undo this work. I also do not wish to be put in a position where I feel that
I am having to compromise on a lesser solution than I know is possible.
I hope this document has explained my actions and position on the topic. I
welcome discussion and questions. If the community still feel the need to
invoke a technical board, I shall respect that decision and, in the event of
a motion in favor of shipping XML, I trust that the community will respect my
choice to abstain from the project.
Sincerely,
--
Sean Wheller