As I mentioned in my last post, I decided to try to create a simple little OpenXML library for Silverlight. The goal was to see how it could work and whether it was possible to turn it into something useful. And to be honest, it is very useful. Even though the library that is available for download at the end of the post is small, it aims to be as open and flexible as possible. A small intro on how to use it might be in order…
I have decided not to use an XML serializer to generate the XML. Why? Well, there are some XML scenarios that are somewhat fiddly, and I felt that creating good base classes should be enough to make it simple to build object graphs with custom serialization.
As I mentioned in the previous post, OpenXML relies on a zipped package format. In my little library, this is supported by a class called Package. It is an abstract base class to be used when creating a new document type. In the downloadable source, one document type is already available, but more about that later. The Package class’ responsibility is to handle the writing of the necessary package parts, the [Content_Types].xml and .rels files, as well as making sure that all the other package parts are saved. It takes care of saving the document using a virtual method called Save(). So if you want to change the way it does it you can, but in most situations this shouldn’t be necessary. The Package class also contains 2 protected properties: a list containing the different parts of the package and another one containing the package’s relationships.
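To make the [Content_Types].xml part a bit more concrete, here is a minimal sketch of how such a file could be written with an XmlWriter. The class and method names (ContentTypesSketch, Write) are made up for this illustration and are not taken from the library; the namespace and the content type for .rels files are the standard package values. It writes to a string only to keep the example self-contained.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of the downloadable library.
public static class ContentTypesSketch
{
    // Standard namespace of the [Content_Types].xml part.
    const string Ns = "http://schemas.openxmlformats.org/package/2006/content-types";

    // "overrides" maps a part name such as "/word/document.xml" to its content type.
    public static string Write(IDictionary<string, string> overrides)
    {
        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb))
        {
            writer.WriteStartDocument(true);
            writer.WriteStartElement("Types", Ns);

            // Every package needs a default content type for its .rels files.
            writer.WriteStartElement("Default", Ns);
            writer.WriteAttributeString("Extension", "rels");
            writer.WriteAttributeString("ContentType", "application/vnd.openxmlformats-package.relationships+xml");
            writer.WriteEndElement();

            // One Override element per part that has its own content type.
            foreach (KeyValuePair<string, string> part in overrides)
            {
                writer.WriteStartElement("Override", Ns);
                writer.WriteAttributeString("PartName", part.Key);
                writer.WriteAttributeString("ContentType", part.Value);
                writer.WriteEndElement();
            }

            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
        return sb.ToString();
    }
}
```

In the real Package class the writer would be wrapped around a stream from the storage rather than a StringBuilder, and the list of overrides would come from the Parts collection.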
The Package class’ Save method takes an IStreamStorage as a parameter. The reason for not just taking a Stream is that it needs to save the different parts in different files. And the Package class does not automatically zip the content. During development it might be nice to get the files in a file structure on disk instead of having them zipped up. In Silverlight 3 this isn’t possible, but in Silverlight 4, the application can request access to the user’s “My *” folders and thus support that scenario. So I decided to go for an interface instead. The interface is very simple and looks like this
public interface IStreamStorage : IDisposable
{
Stream GetStream(string path);
void Close();
}
As you can see, it basically gives us access to new Streams based on paths. It also has a Close method, making it possible to close the completed file. And yeah…it also extends IDisposable. The Package class wraps all its writes in using statements like this
protected void CreatePartFile(IStreamStorage storage, string filename, Action<XmlWriter> callback)
{
using (Stream str = storage.GetStream(filename))
using (XmlWriter writer = XmlWriter.Create(str))
{
writer.WriteStartDocument(true);
callback(writer);
}
}
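Since the Package only ever talks to the interface, swapping in a different storage is cheap. As a rough sketch, and not something that is in the download, a hypothetical in-memory storage (here called MemoryStreamStorage) could be handy for inspecting the generated parts in a test. The interface is repeated so that the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Repeated from the post so that the sketch is self-contained.
public interface IStreamStorage : IDisposable
{
    Stream GetStream(string path);
    void Close();
}

// Hypothetical storage that keeps every part in memory, keyed by its path.
public class MemoryStreamStorage : IStreamStorage
{
    private readonly Dictionary<string, MemoryStream> _parts =
        new Dictionary<string, MemoryStream>();

    public IEnumerable<string> Paths
    {
        get { return _parts.Keys; }
    }

    public Stream GetStream(string path)
    {
        var stream = new MemoryStream();
        _parts[path] = stream;
        return stream;
    }

    // MemoryStream.ToArray() still works after the stream has been closed,
    // so the bytes stay readable even though the caller disposes each stream.
    public byte[] GetBytes(string path)
    {
        return _parts[path].ToArray();
    }

    // Nothing to flush or finalize for an in-memory storage.
    public void Close() { }

    public void Dispose()
    {
        Close();
    }
}
```

After saving a document into such a storage, GetBytes("/word/document.xml") would hand back the raw XML for that part, assuming the part was written under that path.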
If you have not implemented the IDisposable interface before, it is recommended to follow a specific pattern. Implement IDisposable by having the Dispose() method call a protected virtual Dispose(bool) method, passing in true, and then call GC.SuppressFinalize(this) so that the finalizer does not run needlessly. Also implement a finalizer that calls Dispose(bool), passing in false. That way, the same code is used by both Dispose() and the finalizer, and that code knows where it has been called from by looking at the parameter. If the parameter is true, the class is being disposed explicitly, and it can close and dispose of any objects it holds. If it is false, we are running in the finalizer and should not touch other managed objects, as they might already have been finalized; the most we should do is set our references to null. The Dispose(bool) overload is made protected virtual so that any inheriting class can override it to handle the disposing of its own members. The inheriting class must just remember to call the base implementation as well… And yeah…make sure you have an instance variable that keeps track of whether the class has been disposed. After Dispose() has been called, the object should be considered “dead”, and calling any method on it should cause an ObjectDisposedException to be thrown. So it should look something like this
public void Dispose()
{
Dispose(true);
// Cleanup is already done, so the finalizer does not need to run.
GC.SuppressFinalize(this);
}
protected virtual void Dispose(bool disposing)
{
if (_disposed)
return;
if (disposing && _stream != null)
{
_stream.Close();
}
_stream = null;
_disposed = true;
}
~ZipStreamStorage()
{
Dispose(false);
}
And any method in the class should look something like this
public System.IO.Stream GetStream(string path)
{
VerifyNotDisposed();
// Implementation
}
void VerifyNotDisposed()
{
if (_disposed)
throw new ObjectDisposedException(GetType().Name);
}
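The part about inheriting classes is easiest to see in code. The following is only a sketch with made-up class names (StorageBase and LoggingStorage are not in the library); it shows a derived class cleaning up its own member and then calling the base implementation.

```csharp
using System;
using System.IO;

// Hypothetical base class following the pattern described above.
public class StorageBase : IDisposable
{
    private bool _disposed;

    public bool IsDisposed
    {
        get { return _disposed; }
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        _disposed = true;
    }

    ~StorageBase()
    {
        Dispose(false);
    }
}

// Hypothetical derived class that owns one extra managed resource.
public class LoggingStorage : StorageBase
{
    private TextWriter _log = new StringWriter();

    protected override void Dispose(bool disposing)
    {
        // Managed members are only safe to touch when called from Dispose().
        if (disposing && _log != null)
        {
            _log.Dispose();
        }
        _log = null;

        // Let the base class do its part of the cleanup as well.
        base.Dispose(disposing);
    }
}
```

Calling Dispose() twice on a LoggingStorage is harmless here, because the null check skips the second cleanup.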
The library of course includes an implementation of IStreamStorage as well. It is called ZipStreamStorage and supports the basic zip based OpenXML package. It takes a Stream as a constructor parameter and uses the ZipOutputStream class to get new Streams in the GetStream() method. It also makes sure that the ZipOutputStream is closed both if the user calls Close and if the IDisposable interface’s Dispose is called.
And after that gigantic sidestep, it is time to get back to the implementation at hand. The Package’s two protected list properties that I mentioned before include everything needed to build an OpenXML package. The first list is called Relationships and contains objects of type Relationship. The Relationship class is a simple class that represents a relationship between 2 points. The second list contains PackagePart objects and is of course called Parts. It contains the different parts that make up the package.
But instead of looking at all the abstract stuff and base classes and things, I suggest having a look at the pre-made document that is in the download and touching on the needed base classes as we go along. The pre-made package is called WordDocument. All MS Word documents contain at least 5 parts. Remember, that is all MS Word documents. Microsoft adds a couple of things to their documents that go beyond the requirements of OpenXML. Let’s just ignore 4 of them for now and focus on the one that really matters.
The WordDocument constructor creates the “mandatory” parts and adds them to the Package’s Parts list. It also makes sure that they are added to the Package’s Relationships list so they are correctly referenced by the Package.
public class WordDocument : Package
{
...
public WordDocument()
{
Document = new DocumentPart(this, "/word/document.xml",true);
Parts.Add(Document);
_appPart = new AppPart();
Parts.Add(_appPart);
_corePart = new CorePart();
Parts.Add(_corePart);
FontTable = new FontTablePart("/word/fontTable.xml");
Parts.Add(FontTable);
Document.Relationships.Add(new Relationship(FontTable));
Styles = new StylesPart("/word/style.xml");
Parts.Add(Styles);
Relationships.Add(new Relationship(Document));
Relationships.Add(new Relationship(_appPart));
Relationships.Add(new Relationship(_corePart));
}
...
}
The extremely important DocumentPart is a pretty typical PackagePart. PackagePart is, as I said before, the base class for any object that needs to be handled as a part of an OpenXML document. It has 2 virtual save methods. The basic Save() method takes an IStreamStorage instance as a parameter and is responsible for saving the part from scratch. But as most PackageParts are XML based, there is another save method called SaveContent() that is called by the default implementation of Save(). The Save() method defaults to creating 2 files, one that stores the part’s relationships and one that contains the part’s content. The second one is wrapped in an XmlWriter and passed to the SaveContent() method, which is responsible for writing the XML.
The PackagePart class also contains 3 abstract properties, Name, ContentType and RelationshipType. These are used when creating and referencing the part. The Name is the path to the file inside the package, the ContentType is a string representing the content type of the part and the RelationshipType is a string representing the type of relationship used when referencing the part. In the case of DocumentPart, the Name is passed in through the constructor, the RelationshipType is a hardcoded string and the ContentType depends on a bool passed into the constructor.
Beyond those abstract and virtual members, it contains a RelationshipCollection called Relationships and a NamespaceCollection called Namespaces. They are pretty self-explanatory. The CreateElement<T> method might not be as obvious. As I mentioned, most package parts contain XML. That means the corresponding PackagePart classes will contain some form of object(s) that represent XML elements. Because all these classes have some things in common, I created a base class called OXMLElement. Since some OXMLElements need to access the PackagePart’s namespace and relationship collections, they need to have a reference to the PackagePart that they belong to. This is where the CreateElement<T> method comes into play. It creates a new instance of a class inheriting from OXMLElement and makes sure that it has a reference to the correct PackagePart. So any time you need a class that inherits from OXMLElement, you should use the “owning” part’s CreateElement<T>() when creating the instance…
OXMLElement has two save methods just like PackagePart as well as 2 abstract properties, NodeName and NodeNamespace. The abstract properties are used by the “raw” save method to create the base element before calling SaveContent.
protected internal virtual void Save(XmlWriter writer)
{
writer.WritePrefixedStartElement(NodeName, NodeNamespace);
SaveContent(writer);
writer.WriteEndElement();
}
protected internal virtual void SaveContent(XmlWriter writer)
{
}
BTW, XmlWriter does not have a WritePrefixedStartElement…I have added an extension method for that…
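For the curious, such an extension method can be tiny. The version below is a guess at its shape, not the code from the download: it assumes that the namespace is passed as a prefix/URI pair, which is what the KeyValuePair<string, string> returned from NodeNamespace further down suggests, and it simply forwards to the three-argument WriteStartElement overload that XmlWriter already has.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class XmlWriterExtensions
{
    // Assumption: ns.Key holds the prefix (e.g. "w") and ns.Value the namespace URI.
    public static void WritePrefixedStartElement(
        this XmlWriter writer, string localName, KeyValuePair<string, string> ns)
    {
        // XmlWriter declares the prefix itself if it is not already in scope.
        writer.WriteStartElement(ns.Key, localName, ns.Value);
    }
}
```

Called with "p" and the pair ("w", the WordprocessingML main namespace), this starts a w:p element.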
OXMLElement also has a sibling class called OXMLContainerElement<T>, that works as a base class for objects that will contain other OXMLElements. It supports this by containing a protected List<T> that can contain those elements for you…
The DocumentPart is one of those PackagePart objects that need to be stored as XML, and a document contains one or more sections. Because of this, I have created a Section class that inherits from OXMLContainerElement<T>, and exposed a SectionCollection on the DocumentPart to hold those sections. The Section class is a pretty typical OXMLContainerElement<T>. It represents an Xml element named “sectPr” declared in a namespace called http://schemas.openxmlformats.org/wordprocessingml/2006/main, so the overrides for NodeName and NodeNamespace are simply implemented by hard coding these values. It also adds properties for the different settings for the element. It then overrides the Save(XmlWriter) method with an implementation that writes the correct Xml. Finally, it exposes a list of Paragraph objects. Paragraph in turn does more or less the same thing and exposes a list of Run objects. The hierarchy of objects can be however long or short needed to fulfill the requirements.
That’s it. It isn’t harder than that…But I do have one last thing. What happens if an object can contain more than one type of object? The parent obviously dictates what elements are allowed. Well…Run happens to be such an object. A Run can contain lots of different types of elements. The implementation in this case has to rely on either base classes or interfaces. In the case of Run, there is a base class called RunContent that any object that should be a child of Run needs to inherit from.
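Stripped of namespaces and the PackagePart plumbing, the idea boils down to something like the sketch below. The classes are simplified stand-ins with a "Sketch" suffix to make it clear that they are not the library’s real Run, Text and Break classes.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Xml;

// Common base class: anything that may live inside a run derives from this.
public abstract class RunContentSketch
{
    public abstract void Save(XmlWriter writer);
}

public class TextSketch : RunContentSketch
{
    public string Content { get; set; }

    public override void Save(XmlWriter writer)
    {
        writer.WriteElementString("t", Content);
    }
}

public class BreakSketch : RunContentSketch
{
    public override void Save(XmlWriter writer)
    {
        writer.WriteStartElement("br");
        writer.WriteEndElement();
    }
}

public class RunSketch
{
    private readonly List<RunContentSketch> _content = new List<RunContentSketch>();

    // The list is typed on the base class, so it accepts any kind of child.
    public IList<RunContentSketch> Content
    {
        get { return _content; }
    }

    public string ToXml()
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("r");
            foreach (RunContentSketch child in _content)
            {
                child.Save(writer); // each child knows how to write itself
            }
            writer.WriteEndElement();
        }
        return sb.ToString();
    }
}
```

The parent never needs to know which concrete children it holds; adding a new child type only means adding a new class that inherits from the base class.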
So…let’s look at how to create a simple Word document in Silverlight… I have created a Silverlight application that looks like this

As you can see, it supports adding a title, some text as well as selecting an image. Let’s start by looking at the text parts and leave the image out of it. Why? Well, because there is no image support by default. I will have to build that…
The only interesting part of the application right now is the Save method. It gets hold of a file stream by using the SaveFileDialog.
private void Save(object sender, RoutedEventArgs e)
{
SaveFileDialog dlg = new SaveFileDialog();
dlg.Filter = "Word Document (.docx)|*.docx|Zip Files (.zip)|*.zip";
dlg.DefaultExt = ".docx";
if (dlg.ShowDialog() == true)
{
...
}
}
I decided to give myself the option to save as a zip as well. That way I didn’t have to rename the file to see the content…
Next it creates a new WordDocument and sets some metadata properties on it
WordDocument doc = new WordDocument();
doc.ApplicationName = "SilverWord";
doc.Creator = "Chris Klug";
doc.Company = "Intergen";
Next, I decided to add a couple of fonts to the document. My text will use 2 different fonts, Comic Sans and Aharoni…and yes I know that you get shot in certain countries when using Comic Sans, but this is about technology and not design at the moment and everyone has Comic Sans…
A font is defined by using a FontDefinition object that needs to be added to the document’s FontTablePart. When a FontDefinition is added to a FontTablePart, a FontReference is returned. This can in turn be used in places where a font is needed. FontDefinition contains a bunch of odd little properties needed by the document, such as Panose1, CharSet and Pitch. Just ignore them for now. If you need them, you can create a document in Word, save it, unpack it and find the values in there…
FontDefinition fontDefinition = doc.FontTable.CreateElement<FontDefinition>();
fontDefinition.Name = "Comic Sans MS";
fontDefinition.Panose1 = "030F0702030302020204";
fontDefinition.CharSet = "00";
fontDefinition.Family = FontFamilyEnumeration.Script;
fontDefinition.Pitch = FontPitchEnumeration.Variable;
fontDefinition.Signature.UnicodeSignature0 = "00000287";
fontDefinition.Signature.CodePageSignature0 = "0000009F";
FontReference comicSans = doc.FontTable.AddFont(fontDefinition);
fontDefinition = doc.FontTable.CreateElement<FontDefinition>();
fontDefinition.Name = "Aharoni";
fontDefinition.Panose1 = "02010803020104030203";
fontDefinition.CharSet = "B1";
fontDefinition.Family = FontFamilyEnumeration.Auto;
fontDefinition.Pitch = FontPitchEnumeration.Variable;
fontDefinition.Signature.UnicodeSignature0 = "00000801";
fontDefinition.Signature.CodePageSignature0 = "00000020";
FontReference aharoni = doc.FontTable.AddFont(fontDefinition);
Next, I decided to create a re-usable style. OpenXML documents can store style definitions in a separate part. This part is called StylesPart in this implementation. It sort of works like a CSS file. It contains named styles that can be used together with certain elements. They come in a couple of different versions. The one I will use is CharacterStyle. It defines styles to be used by text. The other implemented type is ParagraphStyle, which contains style information for paragraphs.
CharacterStyle style = doc.Styles.CreateElement<CharacterStyle>();
style.Id = "TitleStyle";
style.Name = "Title Style";
style.RunProperties.FontSize = 30;
style.RunProperties.IsBold = true;
style.RunProperties.Font.ComplexScript = aharoni;
style.RunProperties.Font.HighAnsi = aharoni;
style.RunProperties.Font.ASCII = aharoni;
style.RunProperties.Font.EastAsia = aharoni;
doc.Styles.AddStyle(style);
As you can see, there are several different font properties defined. They make it possible to use different fonts depending on the situation…
Next it is time to start adding some actual content to the document. This is done using a Run object. A Run is created as well as a Text object, both using the document’s CreateElement<T>() method. The content of the Text is set to the title before it is added to the Run, and the Run’s style is set before it is added to the first paragraph of the document’s first section.
DarksideCookie.OXML.Word.Elements.Run run =
doc.Document.CreateElement<DarksideCookie.OXML.Word.Elements.Run>();
Text t = doc.Document.CreateElement<Text>();
t.Content = Title.Text;
run.Content.Add(t);
run.Properties.Style = style;
doc.Document.Sections[0].Paragraphs[0].Runs.Add(run);
Unfortunately, I need to fully qualify the Run class as there is a Run class in System.Windows.Documents as well…and using an alias just looked confusing…
The next part of the code takes the text that was entered and works some magic on it. First off, it splits it wherever there are 2 consecutive carriage return characters. A double Enter in this simple example means a paragraph break, while a single Enter works as a soft line break or whatever it is called…
Next it loops through the string array and creates and fills a paragraph for each one before adding it to the document. It fills it by splitting the paragraph content on carriage returns and adding a Break object after each line…
Paragraph p;
string[] paragraphs = Text.Text.Split(new string[] { "\r\r" }, StringSplitOptions.RemoveEmptyEntries);
foreach (var paragraph in paragraphs)
{
string[] lines = paragraph.Split(new string[] { "\r" }, StringSplitOptions.RemoveEmptyEntries);
p = doc.Document.CreateElement<Paragraph>();
run = doc.Document.CreateElement<DarksideCookie.OXML.Word.Elements.Run>();
run.Properties.Font.HighAnsi = comicSans;
run.Properties.Font.ComplexScript = comicSans;
run.Properties.Font.ASCII = comicSans;
run.Properties.Font.EastAsia = comicSans;
for (int i = 0; i < lines.Length; i++)
{
t = doc.Document.CreateElement<Text>();
t.Content = lines[i];
run.Content.Add(t);
if (i < lines.Length - 1)
{
run.Content.Add(doc.Document.CreateElement<Break>());
}
}
p.Runs.Add(run);
doc.Document.Sections[0].Paragraphs.Add(p);
}
It is kind of a rough implementation, but it works for this demo…
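If the text could come from somewhere other than a Silverlight TextBox, the line endings might not be plain carriage returns. A slightly more forgiving version of the splitting step could look like the sketch below. The TextSplitter class is hypothetical and the normalization is an addition that the demo code does not have.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: splits raw text into paragraphs, each a list of lines.
public static class TextSplitter
{
    public static List<string[]> SplitIntoParagraphs(string text)
    {
        // Normalize Windows ("\r\n") and Unix ("\n") line endings to "\r",
        // which is the separator the demo code splits on.
        string normalized = text.Replace("\r\n", "\r").Replace("\n", "\r");

        var result = new List<string[]>();

        // Two line breaks in a row mark a paragraph break.
        string[] paragraphs = normalized.Split(
            new[] { "\r\r" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string paragraph in paragraphs)
        {
            // A single line break inside a paragraph is a soft line break.
            result.Add(paragraph.Split(
                new[] { "\r" }, StringSplitOptions.RemoveEmptyEntries));
        }

        return result;
    }
}
```

Each string array in the result would then become one Paragraph, with a Break added between the lines exactly as in the loop above.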
Last but not least, it saves the document and empties the form…
using (IStreamStorage storage = new ZipStreamStorage(dlg.OpenFile()))
{
doc.Save(storage);
}
Title.Text = "";
Text.Text = "";
That’s it… After a lot of talk about the “library” and so on, creating the document is really not that complicated or hard…which is exactly what I wanted. But how about extending it? Is that as easy? Well, almost…
An image is a completely new PackagePart, so let’s create one. I create a simple class called ImagePart and inherit from PackagePart. The constructor takes 2 arguments, a name and a Stream that represents the image. Both of these are stored, and the name is exposed through the overridden Name property. The ContentType and RelationshipType are also returned as needed. I try to stay away from strings in my code, so they are added to a static class that contains the needed strings. That way I don’t get typos in my namespaces and things…
public class ImagePart : PackagePart, IDisposable
{
string _name;
Stream _img;
public ImagePart(string name, Stream image)
{
_name = name;
_img = image;
}
...
protected override string Name
{
get { return _name; }
}
protected override string ContentType
{
get
{
return ContentTypes.Jpg;
}
}
protected override string RelationshipType
{
get { return RelationshipTypes.Image; }
}
}
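The static string classes are not shown in this post. As an illustration only, they could be as simple as the following; the layout is a guess, but the two values are the standard JPEG content type and the standard OpenXML image relationship type.

```csharp
// Sketch of the string holders; the real classes in the download may be organized differently.
public static class ContentTypes
{
    // MIME type used for JPEG image parts.
    public const string Jpg = "image/jpeg";
}

public static class RelationshipTypes
{
    // Relationship type that a document part uses to reference an image part.
    public const string Image =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image";
}
```

Keeping these as constants means a typo becomes a compile error instead of a document that Word refuses to open.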
The Save method is then overridden. In this case, it is not an XML based part, so I override the Save(IStreamStorage) method with an implementation that gets a new stream and copies across the image bytes.
protected override void Save(IStreamStorage storage)
{
using (Stream str = storage.GetStream(_name))
{
// Stream.Read may return fewer bytes than requested, so copy in a loop.
byte[] buffer = new byte[4096];
int read;
while ((read = _img.Read(buffer, 0, buffer.Length)) > 0)
{
str.Write(buffer, 0, read);
}
}
}
The last detail is that I have added support for IDisposable so that I don’t hog loads of memory unnecessarily…
That’s the part. Next I need an element to display it and I have chosen a VERY simple version. OpenXML supports a lot more complicated ways of showing graphics, but for now, this will do.
I create a class called Picture and inherit from RunContent as it will be placed inside a Run. I start by overriding the abstract properties NodeName and NodeNamespace, but quickly realize that the Picture class will actually need some extra namespaces. Not just the one that the <pict> element has been defined in. So, in the constructor, I use my reference to the PackagePart and add the extra namespaces that I need.
public class Picture : RunContent
{
string _id;
Relationship _rel;
Size _size;
public Picture(PackagePart part) : base(part)
{
_id = Guid.NewGuid().ToString("N");
Part.Namespaces.Add(Namespaces.Vml);
Part.Namespaces.Add(Namespaces.Relationships);
}
...
protected override string NodeName
{
get { return "pict"; }
}
protected override KeyValuePair<string, string> NodeNamespace
{
get { return Namespaces.WordprocessingML.Main; }
}
public Relationship Image
{
get { return _rel; }
set
{
if (value == null)
{
throw new ArgumentNullException("value", "Image cannot be empty");
}
Part.Relationships.Add(value);
_rel = value;
}
}
public Size Size
{
get { return _size; }
set
{
if (value == null)
{
throw new ArgumentNullException("value", "Size cannot be empty");
}
_size = value;
}
}
}
The image to use is defined by a Relationship, and the size by a Size object. Since we are using the CreateElement<T>() method, we unfortunately cannot pass the values to the constructor, so we need to make sure that they are set. This is done in the property setters as well as in the Save() method.
Speaking of save, implementing it for the Picture class is not that complicated. But since we have that pesky constructor limitation, I override the Save() method and verify that all the necessary props have been set. If not, I throw some exceptions… If everything is ok, I call the base class’ Save() method, which will in turn write the element start for me and then pass control back to me in the SaveContent() method where I write the XML needed…
protected override void Save(System.Xml.XmlWriter writer)
{
if (Image == null)
throw new InvalidOperationException("Image cannot be empty");
if (Size == null)
throw new InvalidOperationException("Size cannot be empty");
base.Save(writer);
}
protected override void SaveContent(System.Xml.XmlWriter writer)
{
writer.WritePrefixedStartElement("shape", Namespaces.Vml);
writer.WriteAttributeString("style", string.Format("width:{0}pt;height:{1}pt", _size.Width, _size.Height));
writer.WritePrefixedStartElement("imageData", Namespaces.Vml);
writer.WriteAttributeString(Namespaces.Relationships.Key, "id", Namespaces.Relationships.Value, _rel.ID);
writer.WriteEndElement();
writer.WriteEndElement();
}
That’s it. As you can see, as long as you know your way around the OpenXML spec it isn’t that hard to build new functionality. And if you don’t know the spec (like me) use the OpenXML Developer site or use Word to do what you want and see how it saves it…
And the final stretch home…adding the image to the document… After having written the paragraphs to the document I check to see if there is an image available. If there is, I create a new ImagePart and add it to the document. The WordDocument class has a little helper method that can help us get the right path for the file, so I use that…
if (_image != null)
{
ImagePart img = new ImagePart(doc.GetMediaPartName("image1.jpg"), _image);
Relationship rel = doc.AddPart(img);
...
}
Then it is time to add it to the actual document. This is done by creating a new Paragraph object, a Run object and a Picture object. Then setting the Picture’s properties before adding it to the Run, which is added to the Paragraph, which is inserted as the first paragraph of the document…
p = doc.Document.CreateElement<Paragraph>();
run = doc.Document.CreateElement<DarksideCookie.OXML.Word.Elements.Run>();
Picture pict = doc.Document.CreateElement<Picture>();
pict.Image = rel;
pict.Size = new Size(400, 240);
run.Content.Add(pict);
p.Runs.Add(run);
doc.Document.Sections[0].Paragraphs.Insert(0,p);
And then I can re-run the application and save an image with the document. There is another tiny trick with the image in the code that I haven’t shown. It has to do with the fact that I both want to use the image in the application as well as add it to the document. When the OpenFileDialog returns, I first use the OpenRead() method to get a Stream to the file and store this stream…but you can see that in the download if you want to…
That’s all from me this time. Hope you will have use for this in the future, and don’t hesitate to ask me if there is something you are wondering about… And yeah…of course… don’t forget to download the project…HERE…
Cheers!
[UPDATE]
I found a little bug in the code. The Break class inherited from the wrong base class. So I updated the project and the downloadable source…