Writing XML with Alo
Version 1.3
Introduction
Alo is designed to make writing XML from ANSI C programs easy. A primary design goal is that you should not have to write more lines of code than lines of XML data. For instance, writing an element with its data should take only one line of code. With Alo, outputting XML data from your app should be no harder or more confusing than writing any other data format. After reading this tutorial you will be able to write code that outputs XML data.
Writing A Simple Document
Writing an XML document with Alo is really easy. It is very
much like writing a simple text file using stdio commands such
as fopen and fprintf. First, you need to pick a name for the
document. You probably want to end the name with ".xml", but it
is not required. Let's make an example for hello.xml:
alo_doc_info * hello;
alo_open(&hello, "hello.xml", ALO_OPTION_OUTPUT_XML_DECLARATION, "");
alo_close(hello);
This produces a file named "hello.xml" containing
<?xml version="1.0" ?>
The alo_doc_info is like a stdio.h FILE reference. It contains
information about the document being written, such as which element is
currently being written. After the file name in alo_open comes the option
argument. The default option is to create a new file using the specified
name. We passed an option to output an XML declaration. The XML 1.0
specification recommends, but does not require, that a document begin with
a declaration, so you may have a case where you do not want one. Data can
be put in the XML declaration. For simplicity, we have none, and so pass "".
A version number is output into the XML declaration even though none was
specified, because the declaration syntax requires one.
Now this is not a very useful XML document, because it
contains no data. Let's add the greeting "Hello, world!". To
write the greeting element we use the alo_out call. alo_out is
used to write all markup and content, and it needs some
information to do this. First, we specify which document to
write to. Next is the element we are writing to. Then comes a
format string that controls what is written. The "^e" indicates
that an element is being written. Because of this, alo_out
expects a namespace argument followed by an element name
argument. Namespaces are not supported yet, so pass 0 or NULL.
The element name is a NUL-terminated C string. After the "^e"
comes a "%s" in the format string. This printf-like format
says to output the next argument, "Hello, world!", as a string
within the element. Alo leaves the greeting element open until
either it or the hello document is closed. The alo_close call
at the end causes the greeting element to be closed with an end
tag before hello.xml is closed.
alo_doc_info * hello;
int16_t root;
root = alo_open(&hello, "hello.xml", ALO_OPTION_OUTPUT_XML_DECLARATION, "");
alo_out(hello, root, "^e%s", 0, "greeting", "Hello, world!");
alo_close(hello);
This results in a new hello.xml containing
<?xml version="1.0" ?>
<greeting>Hello, world!</greeting>
The greeting element we added is the root, also called the document element. All XML documents have one and only one root element. All other elements, attributes, and text content must be within that root element.
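For comparison, here is a sketch of producing the same two-line file by hand with stdio alone. It uses no Alo calls, and the function name write_hello_by_hand is made up for this illustration. Note that the element name has to be repeated for the end tag and that nothing escapes special characters in the content, which is the kind of bookkeeping the alo_out and alo_close calls above take off your hands.

```c
#include <assert.h>
#include <stdio.h>

/* Writes the two-line hello.xml by hand using only stdio.
   Returns 0 on success, -1 on failure. No Alo calls are used. */
int write_hello_by_hand(const char *path)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;

    ok = fprintf(f, "<?xml version=\"1.0\" ?>\n") >= 0;
    /* Start tag, content, and end tag are all spelled out by the caller. */
    ok = ok && fprintf(f, "<%s>%s</%s>\n",
                       "greeting", "Hello, world!", "greeting") >= 0;

    /* fclose can report a delayed write error, so check it as well. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```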
While our hello.xml is complete and well formed, many XML documents
identify their character encoding in their XML declaration for clarity. Alo
can be used to write UTF-8, ISO-8859-1, US-ASCII and probably any other
encoding whose strings end with a NUL character. To name the encoding, add
the encoding information to the alo_open call. alo_open adds string pairs
like "encoding", "UTF-8" to the declaration until it finds an empty string.
Lastly, there is an option to convert from ISO-8859-1 to UTF-8, which is
a great way to increase the portability of data from legacy systems
without UTF-8 support.
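The ISO-8859-1 to UTF-8 conversion mentioned above is simple enough to sketch in a few lines. The function below is an illustration only, not Alo's code, and the name latin1_to_utf8 and its buffer interface are assumptions. Every ISO-8859-1 byte below 0x80 is already valid UTF-8, and every byte from 0x80 to 0xFF becomes a two-byte sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Converts a NUL-terminated ISO-8859-1 string to UTF-8.
   Returns the number of bytes written, not counting the NUL,
   or (size_t)-1 if out is too small (out is then left empty). */
size_t latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    const unsigned char *src = (const unsigned char *)in;
    size_t used = 0;

    assert(in != NULL && out != NULL && out_size > 0);
    for (; *src != 0; src++) {
        size_t need = (*src < 0x80) ? 1 : 2;

        if (used + need + 1 > out_size) {       /* +1 keeps room for the NUL */
            out[0] = '\0';
            return (size_t)-1;
        }
        if (*src < 0x80) {
            out[used++] = (char)*src;                   /* ASCII is unchanged */
        } else {
            out[used++] = (char)(0xC0 | (*src >> 6));   /* lead byte 110000xx */
            out[used++] = (char)(0x80 | (*src & 0x3F)); /* continuation 10xxxxxx */
        }
    }
    out[used] = '\0';
    return used;
}
```

For example, the four-byte ISO-8859-1 string "caf\xE9" becomes the five-byte UTF-8 string "caf\xC3\xA9".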
Here is the new code with the encoding pair added to alo_open:
root = alo_open(&hello, "hello.xml", ALO_OPTION_OUTPUT_XML_DECLARATION, "encoding", "UTF-8", "");
alo_out(hello, root, "^e%s", 0, "greeting", "Hello, world!");
alo_close(hello);
which produces a new hello.xml containing
<?xml version="1.0" encoding="UTF-8" ?>
<greeting>Hello, world!</greeting>
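One way to picture the "pairs until an empty string" convention is a variadic C function that keeps reading name and value arguments until it sees "". The sketch below is a hypothetical stand-in, not Alo's implementation: build_xml_decl, its helper append, and the buffer parameters are invented here, and it always writes version="1.0" itself to mirror the behavior described earlier.

```c
#include <assert.h>
#include <stdarg.h>
#include <string.h>

/* Appends s to buf at offset *len; returns 0 on success, -1 if it does not fit. */
static int append(char *buf, size_t size, size_t *len, const char *s)
{
    size_t n = strlen(s);

    if (*len + n + 1 > size)
        return -1;
    memcpy(buf + *len, s, n + 1);   /* copies the terminating NUL too */
    *len += n;
    return 0;
}

/* Hypothetical helper: builds an XML declaration in buf from name/value
   string pairs. The argument list must end with an empty string "".
   Returns the length written, or -1 if buf is too small. */
int build_xml_decl(char *buf, size_t size, ...)
{
    va_list ap;
    size_t len = 0;
    int failed;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    failed = append(buf, size, &len, "<?xml version=\"1.0\"");

    va_start(ap, size);
    while (!failed) {
        const char *name = va_arg(ap, const char *);
        const char *value;

        if (name == NULL || strcmp(name, "") == 0)
            break;                  /* the empty string ends the list */
        value = va_arg(ap, const char *);
        failed = append(buf, size, &len, " ")
              || append(buf, size, &len, name)
              || append(buf, size, &len, "=\"")
              || append(buf, size, &len, value)
              || append(buf, size, &len, "\"");
    }
    va_end(ap);

    if (failed || append(buf, size, &len, " ?>") != 0)
        return -1;
    return (int)len;
}
```

The empty-string sentinel plays the same role as the trailing "" in the alo_open call: the callee has no other way to know how many pairs were passed.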
That is it! Three lines of Alo code are all you need to write an XML document. The next section shows how to write other XML markup.
Writing Other Markup
The XML specification allows other types of markup to be written. Attributes can be added to elements to add to their meaning. Comments can be written to aid human readers of the raw XML without becoming part of the document's data. And there are others. But before we can write them we need to learn how to write markup to an element. Remember that only one root element can be in an XML document. We must write this other markup to that element.
Writing Attributes to Elements
When an attribute is written, it is always written within an element's start tag. An example is this:
<news id="Alo1.0release">Alo 1.0 is released!</news>
The id attribute is a common attribute used to allow
reference to just that element within the whole document.
An attribute must be written to a clearly identified
element. The alo_out command returns a reference to the
element when the element is written. So we can write the
element and attribute in three lines of code:
news = alo_out(doc, root, "^e", 0, "news");
alo_out(doc, news, "^a%s", 0, "id", "Alo1.0release");
alo_out(doc, news, "%s", "Alo 1.0 is released!");
Notice that the news reference returned by alo_out when
writing the "news" element is passed when the attribute is
written. Notice also that a "^a" is used instead of a "^e" to
indicate that "id" is an attribute of news and not an element
inside news. You can add as many attributes to news
as you want by adding more alo_out calls.
We can also shorten the C code to a single line if we want.
news = alo_out(doc, root, "^e^a%s^%s", 0, "news", 0, "id", "Alo1.0release", "Alo 1.0 is released!");
The trick here is to separate the "%s" for the attribute from the following "%s" for the element content using a "^". Without it, both "%s" values would be written into the attribute. The third line in the prior example has an implied "^" since it starts a new format string. Since we usually write one line of code per XML element, this is what is wanted.
It is important that the attribute be written before the
element's content (%s) because writing the content closes the
element's start tag, which is where the attribute must be
written. Long format strings can write lots of XML with one
alo_out call, but this can make the code difficult to match to
the XML. Generally, writing one element or attribute per line
of code helps to match the code and XML to each other and eases
reading. Let's add some more information:
news = alo_out(doc, root, "^e", 0, "news");
alo_out(doc, news, "^a%s", 0, "id", "Alo1.0release");
alo_out(doc, news, "^a%s", 0, "priority", "important");
alo_out(doc, news, "%s", "Alo 1.0 is released!");
alo_out(doc, news, "^e%s", 0, "location", "https://alo.sourceforge.net");
This added another two lines of code to add another
attribute and another element to the news element. We can
continue to add more attributes or elements as needed. We can
add more to sub-elements as long as we remember and pass the
element returned from alo_out. There is not much
more to say about attributes or elements, so let's move on to
writing other markup.
Writing Comments
Comments are easy to write and they can really aid reading
the raw XML, so use them wherever they help.
Here's how we add a comment to the "Hello, world!" greeting. We
write the comment inside the greeting element, passing the
reference that alo_out returned when the greeting
element was written.
alo_out(doc, root, "^C%s", "This is our first XML document written with Alo!");
to write
<!--This is our first XML document written with Alo!-->
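XML does not allow the two-character sequence "--" inside a comment, and a comment's text may not end with "-". This tutorial does not cover whether Alo checks for that, so it may be worth validating comment text before passing it to alo_out. A minimal sketch, with the helper name comment_text_is_valid invented for illustration:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if text may appear between <!-- and --> in well-formed XML,
   otherwise 0. This is a stand-alone check and does not call Alo. */
int comment_text_is_valid(const char *text)
{
    size_t len;

    assert(text != NULL);
    if (strstr(text, "--") != NULL)
        return 0;                   /* "--" is forbidden inside a comment */
    len = strlen(text);
    if (len > 0 && text[len - 1] == '-')
        return 0;                   /* would produce "--->" at the end */
    return 1;
}
```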
Writing Processing Instructions
Processing Instructions are used in XML documents to aid or change the processing. A common example is xml-stylesheet to indicate what CSS to use to nicely present the XML. Let's specify that our XML document should be presented using the mystyle.css style sheet:
alo_out(doc, root, "^P%s", "xml-stylesheet", "href=\"mystyle.css\" type=\"text/css\"");
to write
<?xml-stylesheet href="mystyle.css" type="text/css"?>
Writing CDATA
Sometimes a lot of data needs to be written without any of its markup-like characters being treated as markup. A CDATA section can simplify reading or writing such data, but it can also make it harder to add to or extend the data. Since Alo properly handles special XML characters, CDATA is less useful here. But let's see how it works. To write our earlier greeting inside a CDATA section, so that none of its markup characters need escaping, as in
<![CDATA[<greeting>Hello, world!</greeting>]]>
just write
alo_out(doc, root, "^D%s", "<greeting>Hello, world!</greeting>");
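For contrast, outside a CDATA section the markup characters in that string have to be replaced with entity references. Alo is described above as handling this for you; the sketch below only illustrates the idea with a stand-alone function whose name, xml_escape, and buffer interface are assumptions rather than part of Alo.

```c
#include <assert.h>
#include <string.h>

/* Copies text into out, replacing characters that XML treats as markup
   with entity references. Returns the escaped length, or (size_t)-1 if
   out is too small. */
size_t xml_escape(const char *text, char *out, size_t out_size)
{
    size_t used = 0;

    assert(text != NULL && out != NULL && out_size > 0);
    out[0] = '\0';
    for (; *text != '\0'; text++) {
        char one[2];
        const char *rep;
        size_t len;

        switch (*text) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        default:
            one[0] = *text;         /* ordinary character: copy as is */
            one[1] = '\0';
            rep = one;
            break;
        }
        len = strlen(rep);
        if (used + len + 1 > out_size)
            return (size_t)-1;
        memcpy(out + used, rep, len + 1);   /* keeps out NUL-terminated */
        used += len;
    }
    return used;
}
```

Given the greeting markup from the CDATA example, this produces &lt;greeting&gt;Hello, world!&lt;/greeting&gt;, which is longer and harder to read than the CDATA form but can be mixed freely with other content.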
Formatting Output
Formatting data is easy with Alo. The printf formatting
commands are supported, so nothing new or unusual needs to be
learned. Here are some examples of writing data:
data = alo_out(doc, root, "^e", 0, "data");
alo_out(doc, data, "^a%d", 0, "id", 100);
alo_out(doc, data, "^e%d", 0, "number", 1);
alo_out(doc, data, "^e0x%x", 0, "hex_number", 0xba11);
alo_out(doc, data, "^e0x%08lX", 0, "hex_number", 0x00ab1234UL);
alo_out(doc, data, "^e%g", 0, "float", 1.5);
alo_out(doc, data, "^e%c", 0, "character", '@');
alo_out(doc, data, "^e%s", 0, "text", "Hello");
alo_out(doc, data, "^e%s", 0, "boolean", (1 > 0) ? "true" : "false");
alo_out(doc, data, "^e%04d-%02d-%02d", 0, "date", 2004, 7, 4);
And this is the resulting XML:
<data id="100">
<number>1</number>
<hex_number>0xba11</hex_number>
<hex_number>0x00AB1234</hex_number>
<float>1.5</float>
<character>@</character>
<text>Hello</text>
<boolean>true</boolean>
<date>2004-07-04</date>
</data>
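Because the conversions follow printf, you can preview what a format will produce by trying it with snprintf first. The sketch below assumes a C99 library for snprintf, and the helper names format_date and format_hex32 are made up for this example. It also shows why the zero flag matters: a width of 8 alone pads with spaces, while 08 pads with zeros.

```c
#include <assert.h>
#include <stdio.h>

/* Formats a date as YYYY-MM-DD. Returns 0 on success, -1 if buf is too small. */
int format_date(char *buf, size_t size, int year, int month, int day)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "%04d-%02d-%02d", year, month, day);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Formats value as "0x" plus eight zero-padded uppercase hex digits.
   The l length modifier requires an unsigned long argument, so pass
   constants with a UL suffix. Returns 0 on success, -1 if buf is too small. */
int format_hex32(char *buf, size_t size, unsigned long value)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "0x%08lX", value);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

Calling format_date(buf, sizeof buf, 2004, 7, 4) fills buf with 2004-07-04, and format_hex32(buf, sizeof buf, 0x00ab1234UL) fills it with 0x00AB1234.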
What's Next
By combining the examples above you should be able to output most XML documents from your C code. There are more examples in the provided test code. Remember to open your XML file in a browser often to check that it is well formed!