Writing XML with Alo

Version 1.3

Introduction

Alo is designed to make writing XML easy from ANSI C programs. A primary design goal is to avoid writing more lines of code than lines of XML data. For instance, writing an element with its data should take only one line of code. With Alo, outputting XML data from your app should be no harder or more confusing than writing any other data format. After reading this tutorial you will be able to write code that outputs XML data.

Writing A Simple Document

Writing an XML document with Alo is very much like writing a simple text file using stdio functions such as fopen and fprintf. First, you need to pick a name for the document. You probably want to end the name with ".xml", but it is not required. Let's make an example for hello.xml:

alo_doc_info * hello;
alo_open(&hello, "hello.xml", ALO_OPTION_OUTPUT_XML_DECLARATION, "");
alo_close(hello);

This produces a file named "hello.xml" containing

<?xml version="1.0" ?>

The alo_doc_info is like a stdio.h FILE pointer. It holds information about the document being written, such as which element is currently being written. After the file name in alo_open comes the option argument. The default option is to create a new file using the specified name. We passed an option to output an XML declaration. The XML 1.0 specification recommends, but does not require, that a document begin with a declaration, so you may have a case where you do not want one. Additional data can be put in the XML declaration. For simplicity we have none, and so pass "". A version number is output into the XML declaration even though none was specified, because the version is required whenever a declaration is present.

Now this is not a very useful XML document, because it contains no data. Let's add the greeting "Hello, world!". To write the greeting element we use the alo_out call. alo_out is used to write all markup and content, and it needs some information to do this. The first argument specifies which document to write to. The second is the element we are writing into. Next comes a format string that controls what is written. The "^e" indicates that an element is being written, so alo_out expects a namespace argument followed by an element name argument. Namespaces are not supported yet, so pass 0 or NULL. The element name is a null-terminated C string. After the "^e" comes a "%s" in the format string. This printf-like conversion outputs the next argument, "Hello, world!", as a string within the element. Alo will leave the greeting element open until it or the hello document is closed. The alo_close on the last line causes the greeting element to be closed with an end tag before hello.xml is closed.

alo_doc_info * hello;
int16_t root;
root = alo_open(&hello, "hello.xml", ALO_OPTION_OUTPUT_XML_DECLARATION, "");
alo_out(hello, root, "^e%s", 0, "greeting", "Hello, world!");
alo_close(hello);

This results in a new hello.xml containing

<?xml version="1.0" ?>
<greeting>Hello, world!</greeting>
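To see why the greeting element gets its end tag when the document is closed, it can help to picture a writer that remembers which elements are still open. The sketch below is not Alo's implementation. The mini_writer type and every mini_ function are hypothetical names invented for illustration, and the fixed-size buffer, the missing character escaping and the missing attribute support are simplifying assumptions.

```c
#include <stddef.h>
#include <string.h>

#define MINI_MAX_DEPTH 16

/* Hypothetical miniature writer, not Alo's alo_doc_info. */
typedef struct {
    char        text[256];             /* output accumulated so far */
    size_t      used;                  /* bytes of text in use */
    const char *open[MINI_MAX_DEPTH];  /* names of elements still open */
    int         depth;                 /* how many elements are open */
    int         failed;                /* set once the buffer overflows */
} mini_writer;

void mini_init(mini_writer *w)
{
    w->text[0] = '\0';
    w->used = 0;
    w->depth = 0;
    w->failed = 0;
}

/* Append s to the buffer, or mark the writer as failed if it does not fit. */
static void mini_append(mini_writer *w, const char *s)
{
    size_t len = strlen(s);
    if (w->failed || w->used + len >= sizeof w->text) {
        w->failed = 1;
        return;
    }
    memcpy(w->text + w->used, s, len + 1);
    w->used += len;
}

/* Write a start tag and remember that the element is open.
   Returns the element's index, or -1 if too many elements are open. */
int mini_open_element(mini_writer *w, const char *name)
{
    if (w->depth >= MINI_MAX_DEPTH)
        return -1;
    mini_append(w, "<");
    mini_append(w, name);
    mini_append(w, ">");
    w->open[w->depth] = name;
    return w->depth++;
}

/* Write character content into the innermost open element (no escaping). */
void mini_content(mini_writer *w, const char *text)
{
    mini_append(w, text);
}

/* Close every element that is still open, innermost first. */
void mini_close(mini_writer *w)
{
    while (w->depth > 0) {
        mini_append(w, "</");
        mini_append(w, w->open[--w->depth]);
        mini_append(w, ">");
    }
}
```

Calling mini_open_element with "greeting", then mini_content with "Hello, world!", then mini_close leaves the greeting element, content and end tag in the buffer without the caller ever writing the end tag itself.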

The greeting element we added is the root, also called the document element. All XML documents have one and only one root element. All other elements, attributes and character content must be within that root element.

While our hello.xml is complete and well formed, many XML documents identify their character encoding in the XML declaration for clarity. Alo can be used to write UTF-8, ISO-8859-1, US-ASCII and probably any other encoding whose strings end with a nul character. To declare the encoding, add it to the alo_open call. alo_open reads string pairs like "encoding", "UTF-8" until it finds an empty string. There is also an option to convert from ISO-8859-1 to UTF-8, which is a great way to increase the portability of data from legacy systems without UTF-8 support.

Here is the new code with the encoding addition to alo_open:

root = alo_open(&hello, "hello.xml", ALO_OPTION_OUTPUT_XML_DECLARATION, "encoding", "UTF-8", "");
alo_out(hello, root, "^e%s", 0, "greeting", "Hello, world!");
alo_close(hello);

which produces a new hello.xml containing

<?xml version="1.0" encoding="UTF-8" ?>
<greeting>Hello, world!</greeting>
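The "string pairs until an empty string" convention is an ordinary C variadic pattern. The sketch below shows one way such an argument list can be consumed. It is an illustration only and makes no claim about how alo_open works internally; format_xml_declaration is a hypothetical helper that formats into a caller-supplied buffer instead of a file.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper, not part of Alo. Formats an XML declaration into buf.
   The variadic arguments are name/value string pairs ended by an empty
   string. Returns the length written, or -1 if buf is too small or a
   value is missing. */
int format_xml_declaration(char *buf, size_t size, ...)
{
    va_list args;
    int used;
    int n;

    /* The version is always written because a declaration requires it. */
    used = snprintf(buf, size, "<?xml version=\"1.0\"");
    if (used < 0 || (size_t)used >= size)
        return -1;

    va_start(args, size);
    for (;;) {
        const char *name = va_arg(args, const char *);
        const char *value;

        if (name == NULL || name[0] == '\0')
            break;                      /* empty string ends the list */
        value = va_arg(args, const char *);
        if (value == NULL) {
            va_end(args);
            return -1;
        }
        n = snprintf(buf + used, size - (size_t)used,
                     " %s=\"%s\"", name, value);
        if (n < 0 || (size_t)n >= size - (size_t)used) {
            va_end(args);
            return -1;
        }
        used += n;
    }
    va_end(args);

    n = snprintf(buf + used, size - (size_t)used, " ?>");
    if (n < 0 || (size_t)n >= size - (size_t)used)
        return -1;
    return used + n;
}
```

With this sketch, passing "encoding", "UTF-8", "" yields the declaration shown above, and passing only "" yields the declaration with just the version. The empty string matters: without it a variadic function has no way to know where the argument list ends.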

That is it! Three lines of Alo code are all you need to write to make an XML document. The next section shows how to write other XML markup.

Writing Other Markup

The XML specification allows other types of markup to be written. Attributes can be added to elements to add to their meaning. Comments can be written to aid human readers of the raw XML; they are not part of the document's data. And there are others. But before we can write them we need to learn how to write markup to an element. Remember that an XML document can have only one root element, so we must write this other markup into that element.

Writing Attributes to Elements

An attribute is always written within an element's start tag. Here is an example:

<news id="Alo1.0release">Alo 1.0 is released!</news>

The id attribute is a common attribute used to allow reference to just that element within the whole document. An attribute must be written to a clearly identified element. The alo_out command returns a reference to the element when the element is written. So we can write the element and attribute in three lines of code:

news = alo_out(doc, root, "^e", 0, "news");
alo_out(doc, news, "^a%s", 0, "id", "Alo1.0release");
alo_out(doc, news, "%s", "Alo 1.0 is released!");

Notice the news returned by alo_out when writing the "news" element is passed when the attribute is written. Notice that a "^a" is used instead of a "^e" to indicate that "id" is an attribute of news and not an element inside news. Obviously you can add as many attributes to news as you want by adding more alo_out calls.

We can also shorten the C code to a single line if we want.

news = alo_out(doc, root, "^e^a%s^%s", 0, "news", 0, "id", "Alo1.0release", "Alo 1.0 is released!");

The trick here is to separate the "%s" for the attribute from the following "%s" for the element content using a "^". Without it, both "%s" conversions would be written into the attribute value. The third line in the prior example has an implied "^" since it starts a new format string. Since we usually write one line of code per XML element, this is what is wanted.
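One way to picture the role of "^" is to split a format string into segments by hand. The sketch below does that using only the rules described in this tutorial: a "^" starts a new segment, an optional letter after it names the kind of markup, and everything up to the next "^" is the printf-style text for that segment. The function describe_format and its output layout are hypothetical, and Alo's real parser may treat edge cases differently, for example content text that happens to start with one of the marker letters.

```c
#include <stdio.h>
#include <string.h>

/* Map a marker letter to a readable name; NULL means "no marker letter". */
static const char *segment_kind(char code)
{
    switch (code) {
    case 'e': return "element";
    case 'a': return "attribute";
    case 'C': return "comment";
    case 'P': return "processing-instruction";
    case 'D': return "cdata";
    default:  return NULL;
    }
}

/* Hypothetical helper, not part of Alo. Writes one line per segment of fmt
   into out, in the form kind[printf text]. Returns the length written,
   or -1 if out is too small. */
int describe_format(const char *fmt, char *out, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return -1;
    out[0] = '\0';

    while (*fmt != '\0') {
        const char *kind = "content";   /* a bare or implied "^" */
        size_t len;
        int n;

        if (*fmt == '^') {
            const char *named;
            fmt++;
            named = segment_kind(*fmt);
            if (named != NULL) {
                kind = named;
                fmt++;
            }
        }
        len = strcspn(fmt, "^");        /* text up to the next "^" */
        n = snprintf(out + used, size - used, "%s[%.*s]\n",
                     kind, (int)len, fmt);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
        fmt += len;
    }
    return (int)used;
}
```

Under these assumptions "^e^a%s^%s" splits into an element, an attribute taking one "%s", and content taking one "%s", while "^e^a%s%s" leaves both conversions inside the attribute segment.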

It is important that the attribute be written before the element's content (%s), because writing the content closes the element's start tag, which is where the attribute must be written. Long format strings can write lots of XML with one alo_out call, but this can make the code difficult to match to the XML. Generally, writing one element or attribute per line of code helps to match the code and XML to each other and eases reading. Let's add some more information:

news = alo_out(doc, root, "^e", 0, "news");
alo_out(doc, news, "^a%s", 0, "id", "Alo1.0release");
alo_out(doc, news, "^a%s", 0, "priority", "important");
alo_out(doc, news, "%s", "Alo 1.0 is released!");
alo_out(doc, news, "^e%s", 0, "location", "https://alo.sourceforge.net");

This added another two lines of code to add another attribute and another element to the news element. We can continue to add more attributes or elements as needed. We can add more to sub elements as long as we remember and pass the element returned from alo_out. There is not much more to say about attributes or elements so let's move on to writing other markup.
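The rule that attributes must come before content follows from how a start tag is built: the closing ">" cannot be written until the writer knows no more attributes are coming. A minimal sketch of that state follows. It is not Alo's code. The tag_writer type and the tag_ functions are hypothetical, handle a single element only, and do no escaping.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical single-element writer, not part of Alo. */
typedef struct {
    char   text[256];        /* output accumulated so far */
    size_t used;             /* bytes of text in use */
    int    start_tag_open;   /* 1 while the start tag still lacks its '>' */
} tag_writer;

/* Append s, returning 0 on success or -1 if it does not fit. */
static int tag_append(tag_writer *w, const char *s)
{
    size_t len = strlen(s);
    if (w->used + len >= sizeof w->text)
        return -1;
    memcpy(w->text + w->used, s, len + 1);
    w->used += len;
    return 0;
}

/* Start an element: write "<name" but hold back the '>'. */
int tag_begin(tag_writer *w, const char *name)
{
    w->text[0] = '\0';
    w->used = 0;
    w->start_tag_open = 0;
    if (tag_append(w, "<") != 0 || tag_append(w, name) != 0)
        return -1;
    w->start_tag_open = 1;
    return 0;
}

/* Add an attribute; fails once content has closed the start tag. */
int tag_attribute(tag_writer *w, const char *name, const char *value)
{
    if (!w->start_tag_open)
        return -1;
    if (tag_append(w, " ") != 0 || tag_append(w, name) != 0 ||
        tag_append(w, "=\"") != 0 || tag_append(w, value) != 0 ||
        tag_append(w, "\"") != 0)
        return -1;
    return 0;
}

/* Write content; the first call closes the start tag with '>'. */
int tag_content(tag_writer *w, const char *text)
{
    if (w->start_tag_open) {
        if (tag_append(w, ">") != 0)
            return -1;
        w->start_tag_open = 0;
    }
    return tag_append(w, text);
}
```

In this sketch, calling tag_attribute after tag_content returns -1, which mirrors the ordering rule described above.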

Writing Comments

Comments are easy to write and can really aid reading the raw XML, so use them wherever they help. Here's how we add a comment to the "Hello, world!" greeting. We write the comment inside the greeting element, so we pass the element reference that alo_out returned when the greeting element was written, saved here in a variable named greeting:

alo_out(doc, greeting, "^C%s", "This is our first XML document written with Alo!");

to write

<!--This is our first XML document written with Alo!-->

Writing Processing Instructions

Processing Instructions are used in XML documents to aid or change the processing. A common example is xml-stylesheet to indicate what CSS to use to nicely present the XML. Let's specify that our XML document should be presented using the mystyle.css style sheet:

alo_out(doc, root, "^P%s", "xml-stylesheet", "href=\"mystyle.css\" type=\"text/css\"");

to write

<?xml-stylesheet href="mystyle.css" type="text/css"?>

Writing CDATA

Sometimes a lot of data needs to be written in which markup-like characters should not be treated as markup. A CDATA section can simplify reading or writing such data, but it can also make the data harder to extend. Since Alo properly handles special XML characters, CDATA is less useful with Alo than it would otherwise be. But let's see how it works. We can write our earlier greeting inside a CDATA section to avoid escaping all the markup characters. To produce

<![CDATA[<greeting>Hello, world!</greeting>]]>

just write

alo_out(doc, root, "^D%s", "<greeting>Hello, world!</greeting>");
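For comparison, here is roughly what the alternative to CDATA looks like when done by hand: replacing each special character with its predefined XML entity reference. This is a sketch of the general technique, not Alo's escaping code, and xml_escape is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Alo. Copies src into dst, replacing the
   five characters that have predefined XML entities. Returns the length
   written, or -1 if dst is too small (dst contents are then unspecified). */
int xml_escape(char *dst, size_t size, const char *src)
{
    size_t used = 0;

    if (size == 0)
        return -1;

    for (; *src != '\0'; src++) {
        char single[2];
        const char *rep;
        size_t len;

        single[0] = *src;
        single[1] = '\0';

        switch (*src) {
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '&':  rep = "&amp;";  break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&apos;"; break;
        default:   rep = single;   break;   /* ordinary character */
        }

        len = strlen(rep);
        if (used + len >= size)             /* keep room for the nul */
            return -1;
        memcpy(dst + used, rep, len);
        used += len;
    }
    dst[used] = '\0';
    return (int)used;
}
```

Escaped this way, the greeting markup becomes plain character data that an XML parser reads back as the original text, which is the same effect the CDATA section achieves without changing the characters.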

Formatting Output

Formatting data is easy with Alo. The printf formatting commands are supported so nothing new or unusual needs to be learned. Here are some examples writing data:

data = alo_out(doc, root, "^e", 0, "data");
alo_out(doc, data, "^a%d", 0, "id", 100);
alo_out(doc, data, "^e%d", 0, "number", 1);
alo_out(doc, data, "^e0x%x", 0, "hex_number", 0xba11);
alo_out(doc, data, "^e0x%08lX", 0, "hex_number", 0x00ab1234UL);
alo_out(doc, data, "^e%g", 0, "float", 1.5);
alo_out(doc, data, "^e%c", 0, "character", '@');
alo_out(doc, data, "^e%s", 0, "text", "Hello");
alo_out(doc, data, "^e%s", 0, "boolean",  (1 > 0) ? "true" : "false");
alo_out(doc, data, "^e%04d-%02d-%02d", 0, "date", 2004, 7, 4);

And this is the resulting XML:

<data id="100">
    <number>1</number>
    <hex_number>0xba11</hex_number>
    <hex_number>0x00AB1234</hex_number>
    <float>1.5</float>
    <character>@</character>
    <text>Hello</text>
    <boolean>true</boolean>
    <date>2004-07-04</date>
</data>
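The padded hexadecimal and date values above depend on the printf zero flag and field width. The short sketch below uses only snprintf from the standard C library to show the same conversions outside Alo. It assumes Alo passes these conversions through with standard printf behavior, which this tutorial implies but does not spell out. The function names are hypothetical.

```c
#include <stdio.h>

/* "0x" followed by eight uppercase hex digits, zero padded.
   The l length modifier requires an unsigned long argument. */
int format_hex32(char *buf, size_t size, unsigned long value)
{
    return snprintf(buf, size, "0x%08lX", value);
}

/* A year-month-day date with zero-padded fields. */
int format_date(char *buf, size_t size, int year, int month, int day)
{
    return snprintf(buf, size, "%04d-%02d-%02d", year, month, day);
}
```

Note the difference the zero flag makes: "%08lX" pads 0xAB1234 with leading zeros to eight digits, while "%8lX" pads with spaces instead. The argument type matters too, since a conversion with the l modifier must be matched by a long or unsigned long argument in a variadic call.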

What's Next

By combining the examples above you should be able to output most XML documents from your C code. There are more examples in the provided test code. Remember to open your XML file in a browser often to check that it is well formed!