12 April 2018

XmlDictionary bug

I think this is a bug in the .NET Framework.

When you serialize an object you convert its properties and values into a big string of XML, or into a blob of binary data that takes up less space. The data will always contain the name of the class, the XML namespaces and the names of the properties. We could make the data shorter if we substituted tokens for all these strings.

This code...

 writer = XmlDictionaryWriter.CreateBinaryWriter(stream, dic)

...will create an XmlDictionaryWriter that uses the XmlDictionary provided to perform this substitution. The problem is that when I tried it, the binary data was the same length as when I did it without the dictionary: the dictionary did nothing.
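
For comparison, here is a minimal sketch of the substitution when the names are written directly as XmlDictionaryString objects that belong to the dictionary given to the writer. The element name and namespace are made up for illustration. The first length should come out smaller than the second, but the exact byte counts are not stated here because they have not been measured.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module DictionaryLengthSketch

  ' Writes one element to a binary writer and returns the byte count.
  ' dic may be Nothing, in which case the writer has no dictionary.
  Function WriteOnce(dic As IXmlDictionary, name As XmlDictionaryString, ns As XmlDictionaryString) As Long
    Dim stream As New MemoryStream()
    Dim writer As XmlDictionaryWriter = XmlDictionaryWriter.CreateBinaryWriter(stream, dic)
    writer.WriteStartElement(name, ns)
    writer.WriteString("42")
    writer.WriteEndElement()
    writer.Flush()
    Return stream.Length
  End Function

  Sub Main()
    Dim dic As New XmlDictionary()
    ' Hypothetical names, chosen only to be long enough to matter.
    Dim name As XmlDictionaryString = dic.Add("CustomerOrderLineItem")
    Dim ns As XmlDictionaryString = dic.Add("http://example.com/hypothetical/namespace")

    Console.WriteLine("With dictionary:    " & WriteOnce(dic, name, ns))
    Console.WriteLine("Without dictionary: " & WriteOnce(Nothing, name, ns))
  End Sub

End Module
```

Here the strings and the writer share one dictionary, which is the case that did not hold during serialization.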

Digging into the reference source I discovered that the serializer will call TryLookup on the XmlDictionary to get the data required for the substitution:

    public virtual bool TryLookup(XmlDictionaryString value, out XmlDictionaryString result)
    {
        if (value == null)
            throw System.Runtime.Serialization.DiagnosticUtility.ExceptionUtility.ThrowHelperError(new ArgumentNullException("value"));
        if (value.Dictionary != this)
        {
            result = null;
            return false;
        }
        result = value;
        return true;
    }

The problem is that this condition was always true:

    value.Dictionary != this

Nearly every instance of XmlDictionaryString had its own dictionary at the time of serialization. If you test the XmlDictionaryString objects just after creating them, they have the expected dictionary, the one that you set. It must get changed, or the strings replaced, during serialization.
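
The effect of that check can be reproduced without any serializer. This is a minimal sketch using two separate XmlDictionary instances. The string "Name" and the variable names are only illustrative.

```vbnet
Imports System
Imports System.Xml

Module LookupCheckSketch
  Sub Main()
    Dim mine As New XmlDictionary()
    Dim other As New XmlDictionary()

    Dim fromMine As XmlDictionaryString = mine.Add("Name")
    Dim fromOther As XmlDictionaryString = other.Add("Name")
    Dim result As XmlDictionaryString = Nothing

    ' Same owning dictionary: the check passes.
    Console.WriteLine(mine.TryLookup(fromMine, result))   ' True
    ' Same text, different owning dictionary: the check fails.
    Console.WriteLine(mine.TryLookup(fromOther, result))  ' False
    ' Looking up by plain string does not look at ownership.
    Console.WriteLine(mine.TryLookup("Name", result))     ' True
  End Sub
End Module
```

So a string with the right text but the wrong owner is treated as unknown, which matches the behaviour described above.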

I'm using VB.NET, so I simply re-implemented that class and skipped that check; see below.


The dictionary is also useful for finding out exactly which strings can be tokenized: just dump out the Value of each XmlDictionaryString passed to TryLookup to get the parts of the XML that the serializer wants to shorten.

Imports System
Imports System.Collections.Generic
Imports System.Xml

Public Class MyDIck
  Implements IXmlDictionary

  ' Strings indexed by value, strings indexed by key, and the next free key.
  Private lookup As Dictionary(Of String, XmlDictionaryString)
  Private strings() As XmlDictionaryString
  Private nextId As Integer

  Sub New()
    lookup = New Dictionary(Of String, XmlDictionaryString)
  End Sub

  Public Sub New(capacity As Integer)
    lookup = New Dictionary(Of String, XmlDictionaryString)(capacity)
    strings = New XmlDictionaryString(capacity - 1) {}
  End Sub

  Public Function Add(value As String) As XmlDictionaryString
    Dim str As XmlDictionaryString = Nothing
    If lookup.TryGetValue(value, str) = False Then
      If strings Is Nothing Then
        strings = New XmlDictionaryString(3) {}
      ElseIf nextId = strings.Length Then
        Dim newSize = nextId * 2
        If newSize = 0 Then newSize = 4
        Array.Resize(strings, newSize)
      End If
      str = New XmlDictionaryString(Me, value, nextId)
      strings(nextId) = str
      lookup.Add(value, str)
      nextId += 1
    End If
    Return str
  End Function

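  Public Function TryLookup(xds As XmlDictionaryString, ByRef result As XmlDictionaryString) As Boolean Implements IXmlDictionary.TryLookup
    If xds Is Nothing Then Throw New System.ArgumentNullException("xds")
    ' Match on the string value only; unlike XmlDictionary, ignore which dictionary owns xds.
    Return lookup.TryGetValue(xds.Value, result)
  End Function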

  Public Function TryLookup(key As Integer, ByRef result As XmlDictionaryString) As Boolean Implements IXmlDictionary.TryLookup
    If key < 0 OrElse key >= nextId Then
      result = Nothing
      Return False
    End If
    result = strings(key)
    Return True
  End Function

  Public Function TryLookup(value As String, ByRef result As XmlDictionaryString) As Boolean Implements IXmlDictionary.TryLookup
    Return lookup.TryGetValue(value, result)
  End Function

End Class
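
To dump the strings the serializer offers for tokenizing, as mentioned earlier, something like the following sketch could be used. LoggingDictionary and Person are hypothetical names that are not part of the class above. The sketch also assumes the binary writer passes each candidate string to TryLookup, which is what the reference source suggested.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Runtime.Serialization
Imports System.Xml

' Hypothetical type, used only to have something to serialize.
<DataContract>
Public Class Person
  <DataMember> Public Property Name As String
  <DataMember> Public Property Age As Integer
End Class

' Never substitutes anything; it only records what it is asked about.
Public Class LoggingDictionary
  Implements IXmlDictionary

  Public ReadOnly Seen As New List(Of String)

  Public Function TryLookup(xds As XmlDictionaryString, ByRef result As XmlDictionaryString) As Boolean Implements IXmlDictionary.TryLookup
    If xds IsNot Nothing AndAlso Not Seen.Contains(xds.Value) Then Seen.Add(xds.Value)
    result = Nothing
    Return False
  End Function

  Public Function TryLookup(key As Integer, ByRef result As XmlDictionaryString) As Boolean Implements IXmlDictionary.TryLookup
    result = Nothing
    Return False
  End Function

  Public Function TryLookup(value As String, ByRef result As XmlDictionaryString) As Boolean Implements IXmlDictionary.TryLookup
    result = Nothing
    Return False
  End Function
End Class

Module DumpSketch
  Sub Main()
    Dim log As New LoggingDictionary()
    Dim serializer As New DataContractSerializer(GetType(Person))

    Using stream As New MemoryStream()
      Dim writer As XmlDictionaryWriter = XmlDictionaryWriter.CreateBinaryWriter(stream, log)
      serializer.WriteObject(writer, New Person With {.Name = "Ann", .Age = 30})
      writer.Flush()
    End Using

    ' Each line is a string the writer asked the dictionary about.
    For Each s In log.Seen
      Console.WriteLine(s)
    Next
  End Sub
End Module
```

The strings printed are the candidates to pre-load with Add before serializing for real.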