Serialisation

Lesson 1: Serializing Objects

What is it?

Implemented by System.Runtime.Serialization namespace.

Process of serializing / de-serializing objects so they can be stored or transferred and then re-created.

Serializing = process of converting object into linear byte sequence.

De-serializing = process of converting sequence of bytes into object.

Windows relies on serialization for many tasks, e.g. Web services, remoting, copying items to clipboard, etc.

If the data is simple text then use standard file handling mechanisms. Serialization comes into its own when dealing with complex objects, e.g. the current date / time.

How to serialize

  • Create stream object to hold serialized output
FileStream fs = new FileStream("Serialization.Data", FileMode.Create);
  • Create BinaryFormatter object

BinaryFormatter bf = new BinaryFormatter();

  • Call BinaryFormatter.Serialize() to serialize object and output result to stream
bf.Serialize(fs, System.DateTime.Now);

How to De-serialize

  • Create a stream object to read the serialized output
FileStream fs = new FileStream("Serialization.Data", FileMode.Open);
  • Create a BinaryFormatter object
BinaryFormatter bf = new BinaryFormatter();
  • Create a new object to store the de-serialized data
DateTime previousTime = new DateTime();
  • Call BinaryFormatter.Deserialize() to de-serialize data and cast to correct type
previousTime = (DateTime) bf.Deserialize(fs);
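
The serialize and de-serialize steps above can be combined into one round trip. The following is a minimal sketch, assuming the .NET Framework (BinaryFormatter is obsolete and disabled or removed in recent .NET releases). The RoundTripDemo class name and the use of a MemoryStream instead of a file are illustrative choices, not part of the original notes.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

static class RoundTripDemo
{
    // Serializes a DateTime into an in-memory stream and reads it back.
    public static DateTime RoundTrip(DateTime value)
    {
        BinaryFormatter bf = new BinaryFormatter();
        using (MemoryStream ms = new MemoryStream())
        {
            bf.Serialize(ms, value);

            // Rewind so that Deserialize starts at the first byte written.
            ms.Position = 0;
            return (DateTime)bf.Deserialize(ms);
        }
    }

    static void Main()
    {
        DateTime now = DateTime.Now;
        DateTime restored = RoundTrip(now);

        // Compares the original and restored values.
        Console.WriteLine(restored == now);
    }
}
```

The using block disposes the stream even if an exception is thrown, which the step-by-step snippets above leave out for brevity.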

The runtime proceeds through the de-serialization process sequentially. This can be complicated if the object being de-serialized refers to another object. If an object reference is encountered the Formatter queries the ObjectManager to determine whether the referenced object has already been de-serialized. If it has (a backward reference) then the Formatter completes the reference. If it has not (a forward reference) then the Formatter registers a fixup with the ObjectManager, which will complete the reference when the referenced object is de-serialized.
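
The fixup mechanism itself is internal to the formatter, but its observable effect can be sketched: two fields that pointed at the same object before serialization still point at a single object afterwards. This sketch assumes the .NET Framework (BinaryFormatter availability); the Customer, Order and GraphDemo types are hypothetical.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class Customer
{
    public string name;
}

[Serializable]
class Order
{
    // Both fields may refer to the same Customer instance.
    public Customer buyer;
    public Customer payer;
}

static class GraphDemo
{
    // Serializes the whole object graph and rebuilds it from the stream.
    public static Order RoundTrip(Order order)
    {
        BinaryFormatter bf = new BinaryFormatter();
        using (MemoryStream ms = new MemoryStream())
        {
            bf.Serialize(ms, order);
            ms.Position = 0;
            return (Order)bf.Deserialize(ms);
        }
    }

    static void Main()
    {
        Customer ann = new Customer { name = "Ann" };
        Order original = new Order { buyer = ann, payer = ann };

        Order copy = RoundTrip(original);

        // Checks whether the shared reference survived the round trip.
        Console.WriteLine(object.ReferenceEquals(copy.buyer, copy.payer));
    }
}
```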

Creating classes that can be serialized

Serialize / de-serialize support is added to a custom class via the Serializable attribute. Always add it, even if serialization support is not immediately required when developing the class. If default serialization handling is OK then no further code is required - the runtime will serialize all members (including private). The serialization process can be controlled to improve efficiency, meet custom requirements, etc.

Serialization can allow other code to see / modify object instance data that would normally be inaccessible. Code performing serialization therefore needs a SecurityPermission attribute with the SerializationFormatter flag specified. By default this permission is not given to internet / intranet downloaded code, only code on the local computer is granted the permission.

Disable serialization of specific members

Some class members (e.g. temporary or calculated values) do not need to be stored. Disable serialization of a member by adding the NonSerialized attribute to it, e.g.

[NonSerialized] public decimal total;

To allow class to automatically initialise a non-serialized member implement the IDeserializationCallback.OnDeserialization interface member, e.g.

[Serializable]
class ShoppingCart : IDeserializationCallback
{
  public int productId;

  public decimal price;

  public int quantity;

  [NonSerialized] public decimal total;

  ...

  void IDeserializationCallback.OnDeserialization(Object sender)
  {
    // After deserialization calculate the total

    total = price * quantity;
  }
}
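
To see the callback at work, the class can be pushed through a round trip. This is a sketch only, assuming the .NET Framework (BinaryFormatter availability); CartLine and CallbackDemo are hypothetical names and the class is a cut-down version of the ShoppingCart above.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class CartLine : IDeserializationCallback
{
    public decimal price;
    public int quantity;

    // Not written to the stream; rebuilt after de-serialization.
    [NonSerialized] public decimal total;

    void IDeserializationCallback.OnDeserialization(object sender)
    {
        // Runs once the object has been de-serialized.
        total = price * quantity;
    }
}

static class CallbackDemo
{
    public static CartLine RoundTrip(CartLine item)
    {
        BinaryFormatter bf = new BinaryFormatter();
        using (MemoryStream ms = new MemoryStream())
        {
            bf.Serialize(ms, item);
            ms.Position = 0;
            return (CartLine)bf.Deserialize(ms);
        }
    }

    static void Main()
    {
        // total is deliberately left at its default of 0 here.
        CartLine item = new CartLine { price = 10.25m, quantity = 2 };
        CartLine copy = RoundTrip(item);

        // total on the copy was computed by the callback, not read from the stream.
        Console.WriteLine(copy.total);
    }
}
```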

Version Compatibility

If classes evolve and gain new members then the new class will not be able to deserialize objects created by previous versions. Two solutions:

  1. Implement custom serialization
  2. Apply OptionalField attribute to new members, e.g.
[Serializable]
class ShoppingCart : IDeserializationCallback
{
  public int productId;

  public decimal price;

  public int quantity;

  [NonSerialized] public decimal total;

  [OptionalField] public bool taxable;

  // IDeserializationCallback.OnDeserialization implemented as in the previous example
}

To populate optional fields either implement the IDeserializationCallback.OnDeserialization interface member or respond to serialization events (described later).

Note, .NET 2 can de-serialize objects with unused members (i.e. those that have been removed from a class). Previous versions would throw an exception if additional members were encountered when attempting to de-serialize an object.

Best Practices for Version Compatibility
  • Never remove a serialized field
  • Never apply NonSerializedAttribute to field if attribute not applied in previous versions
  • Never change name or type of serialized field
  • When adding new serialized fields apply the OptionalFieldAttribute
  • When removing a NonSerializedAttribute from a field that was not serializable in previous version apply the OptionalFieldAttribute
  • For optional fields set meaningful defaults using serialization callbacks, unless 0 or null defaults are acceptable

Serialization Format

.NET provides two serialization formats

  • BinaryFormatter is the most efficient way to serialize objects that will be read only by .NET applications. Compatible serialization between different versions of the .NET framework.
  • SoapFormatter is an XML based formatter. Most reliable way to send objects across network links or to serialize objects that are to be read by non .NET applications. More likely to traverse firewalls. Can be 3 to 4 times the size of BinaryFormatter generated streams. Does not support serialization compatibility between versions of the .NET framework.

Control SOAP Serialization

SOAP serialization is intended to be read by a variety of platforms so configuration is required (rarely need to change defaults for BinaryFormatter). Control formatting using the following attributes:

Attribute | Applies To | Specifies
SoapAttribute | Public field, property, parameter, return value | The class member will be serialized as an XML attribute
SoapElement | Public field, property, parameter, return value | The class will be serialized as an XML element
SoapEnum | Public field that is an enumeration identifier | The element name of an enumeration member
SoapIgnore | Public properties and fields | The property or field is ignored when the class is serialized
SoapInclude | Public derived class declarations and public methods for Web Services Description Language (WSDL) documents | The type should be included when generating schemas (to be recognised when serialized)

 

Serialization guidelines

  • When in doubt mark class as Serializable
  • Mark calculated or temporary members as NonSerialized
  • Use SoapFormatter when portability required.

Lesson 2: XML Serialization

Use XML Serialization when exchanging data with application that may not be .NET based and there is no intention to serialize private members.

Benefits over standard serialization:

  • Interoperability - XML is text based standard that all modern development environments support.
  • Administrator friendly - serialized objects can be viewed / edited in any text editor. Good for customisation, troubleshooting and developing new applications incorporating existing ones
  • Forward compatibility - XML is self describing and easily processed. When new application developed it is easy to process existing serialized objects.

But has following limitations:

  • Can serialize only public data
  • Cannot serialize object graphs; use XML serialization only on objects themselves
  • Class must have parameterless constructor available

Serialization

  • Create a stream
FileStream fs = new FileStream("SerializedData.xml", FileMode.Create);
  • Create XmlSerializer object - pass in type of object to be serialized
XmlSerializer xs = new XmlSerializer(typeof(DateTime));
  • Call XmlSerializer.Serialize to send object to stream
xs.Serialize(fs, System.DateTime.Now);
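
The notes list only the serialization steps; de-serialization mirrors them through XmlSerializer.Deserialize. A minimal sketch of both directions follows. The XmlRoundTripDemo class and its method names are hypothetical, and a StringWriter / StringReader pair stands in for the file stream so the example leaves nothing on disk.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

static class XmlRoundTripDemo
{
    // Serializes a DateTime to an XML string.
    public static string ToXml(DateTime value)
    {
        XmlSerializer xs = new XmlSerializer(typeof(DateTime));
        using (StringWriter writer = new StringWriter())
        {
            xs.Serialize(writer, value);
            return writer.ToString();
        }
    }

    // Rebuilds a DateTime from XML produced by ToXml.
    public static DateTime FromXml(string xml)
    {
        XmlSerializer xs = new XmlSerializer(typeof(DateTime));
        using (StringReader reader = new StringReader(xml))
        {
            // Deserialize returns object, so cast to the expected type.
            return (DateTime)xs.Deserialize(reader);
        }
    }

    static void Main()
    {
        DateTime original = new DateTime(2020, 1, 2, 3, 4, 5);
        string xml = ToXml(original);

        Console.WriteLine(xml);
        Console.WriteLine(FromXml(xml) == original);
    }
}
```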

Serialization Control

If you serialize a class that meets the requirements for XML serialization, but has no XML serialization attributes applied, then default settings will be used.

Following class:

public class ShoppingCartItem 
{  
  public Int32 productId;

  public decimal price;

  public Int32 quantity;

  public decimal total;  
}

generates...


<?xml version="1.0"?>

<ShoppingCartItem>

    <productId>10</productId>

    <price>10.25</price>

    <quantity>2</quantity>

    <total>20.50</total>

</ShoppingCartItem>

applying the following attributes...

[XmlRoot("CartItem")]
public class ShoppingCartItem
{
  [XmlAttribute] public Int32 productId;

  public decimal price;

  public Int32 quantity;

  [XmlIgnore] public decimal total;
}

generates...

<?xml version="1.0"?>

<CartItem productId="10">

    <price>10.25</price>

    <quantity>2</quantity>

</CartItem>
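
The effect of the attributes can be checked by serializing an instance to a string. The sketch below reuses the attributed class; AttributeDemo and ToXml are hypothetical names. Note that the real output typically also carries an encoding in the XML declaration and xmlns declarations on the root element, which the simplified listing above omits.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("CartItem")]
public class ShoppingCartItem
{
    // Written as an attribute on the root element.
    [XmlAttribute] public Int32 productId;

    public decimal price;

    public Int32 quantity;

    // Left out of the XML entirely.
    [XmlIgnore] public decimal total;
}

static class AttributeDemo
{
    public static string ToXml(ShoppingCartItem item)
    {
        XmlSerializer xs = new XmlSerializer(typeof(ShoppingCartItem));
        using (StringWriter writer = new StringWriter())
        {
            xs.Serialize(writer, item);
            return writer.ToString();
        }
    }

    static void Main()
    {
        ShoppingCartItem item = new ShoppingCartItem();
        item.productId = 10;
        item.price = 10.25m;
        item.quantity = 2;
        item.total = 20.50m;

        Console.WriteLine(ToXml(item));
    }
}
```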

Attributes let you meet most XML serialization requirements. For complete control implement the IXmlSerializable interface, e.g. to separate data into bytes instead of buffering large data sets.

Schema Conformance

XML schema defines structure of XML document.

Many schemas already exist - where possible leverage an existing one.

From an XML schema you can use the XML Schema Definition tool (xsd.exe) to produce a set of classes that are strongly typed to the schema and annotated with appropriate attributes. When an instance of such a class is serialized the generated XML adheres to the schema.

This approach is simpler than using other classes in framework, e.g. XmlReader and XmlWriter to parse and write XML stream.

Lesson 3: Custom Serialization

In some circumstances may need complete control over serialization process.

Override .NET serialization for a class by implementing ISerializable interface and applying Serializable attribute.

Useful in classes where a member is invalid after de-serialization but a value needs to be provided to reconstruct the full state of the object.

Classes that have declarative or imperative security at class level (or on their constructors) must implement the ISerializable interface.

To implement the ISerializable interface write the GetObjectData method and a special constructor used during de-serialization. The compiler will complain if the GetObjectData method is not implemented, but not if the constructor is missing (the failure only shows up as an exception at de-serialization time) - be warned!

When GetObjectData method is called your code must populate the SerializationInfo object provided. Call its AddValue method to store the name / value pairs to be stored - internally this creates SerializationEntry structures to store the information. Any text can be used as the name. There is complete freedom to choose which member variables are added to the SerializationInfo object, but there must be enough to allow de-serialization to take place.

When the runtime calls the constructor it provides the SerializationInfo object previously populated. Retrieve values from this object to populate member variables.

[Serializable]
class ShoppingCart : ISerializable
{
  public int productId;

  public decimal price;

  public int quantity;

  [NonSerialized] public decimal total;

  // Standard constructor
  public ShoppingCart() {...}

  // De-serialization constructor
  public ShoppingCart(SerializationInfo info, StreamingContext context)
  {
    // Names must match those passed to AddValue in GetObjectData
    productId = info.GetInt32("Product ID");

    price = info.GetDecimal("Price");

    quantity = info.GetInt32("Quantity");

    total = price * quantity;
  }

  [SecurityPermissionAttribute(SecurityAction.Demand, SerializationFormatter = true)]
  public virtual void GetObjectData(SerializationInfo info, StreamingContext context)
  {
    info.AddValue("Product ID", productId);

    info.AddValue("Price", price);

    info.AddValue("Quantity", quantity);
  }
}

Must validate data in de-serialization constructor and throw SerializationException if invalid data provided. This is to minimise the risk of an attacker providing fake serialization information - always assume calls to the serialization constructor are made by attackers.
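
A sketch of such validation is shown below. The ValidatedCartItem class, the TryRestore helper and the "no negative values" rule are all assumptions made for illustration. To keep the example independent of any formatter, the helper fills a SerializationInfo by hand and calls the de-serialization constructor directly, which is roughly what a formatter does on the class's behalf.

```csharp
using System;
using System.Runtime.Serialization;

[Serializable]
class ValidatedCartItem : ISerializable
{
    public int productId;
    public decimal price;
    public int quantity;

    public ValidatedCartItem() { }

    // De-serialization constructor: treat every value as untrusted input.
    public ValidatedCartItem(SerializationInfo info, StreamingContext context)
    {
        productId = info.GetInt32("Product ID");
        price = info.GetDecimal("Price");
        quantity = info.GetInt32("Quantity");

        // Hypothetical business rule, chosen only for illustration.
        if (price < 0 || quantity < 0)
        {
            throw new SerializationException("Price and quantity must not be negative.");
        }
    }

    public virtual void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("Product ID", productId);
        info.AddValue("Price", price);
        info.AddValue("Quantity", quantity);
    }

    // Simulates a formatter handing (possibly hostile) data to the constructor.
    public static bool TryRestore(decimal price, int quantity)
    {
        SerializationInfo info =
            new SerializationInfo(typeof(ValidatedCartItem), new FormatterConverter());
        info.AddValue("Product ID", 1);
        info.AddValue("Price", price);
        info.AddValue("Quantity", quantity);

        try
        {
            new ValidatedCartItem(info, new StreamingContext(StreamingContextStates.All));
            return true;
        }
        catch (SerializationException)
        {
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(TryRestore(10.25m, 2));   // valid data
        Console.WriteLine(TryRestore(10.25m, -1));  // rejected by the validation
    }
}
```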

Responding to Serialization Events

BinaryFormatter raises the following four events:

  1. Serializing - raised just before serialization takes place. Apply the OnSerializing attribute to the method that should run during this event.
  2. Serialized - raised just after serialization has taken place. Apply the OnSerialized attribute to the method that should run during this event.
  3. Deserializing - raised just before de-serialization takes place. Apply the OnDeserializing attribute to the method that should run during this event.
  4. Deserialized - raised just after de-serialization takes place and after IDeserializationCallback.OnDeserialization has been called. Apply the OnDeserialized attribute to the method that should run during this event. Note that IDeserializationCallback.OnDeserialization should be used instead with formatters other than BinaryFormatter.

The methods do not access the serialization stream, but allow the object to be altered before and after serialization has taken place. The attributes can be applied at all levels within the inheritance hierarchy, with each method being called in the hierarchy from the base to the most derived. This approach avoids the complexity of implementing the ISerializable interface by giving responsibility for serialization / de-serialization to the most derived implementation.

Methods handling these events must

  • Accept a StreamingContext object
  • Return void

e.g. In the shopping cart example

[Serializable]
class ShoppingCart
{
  public int productId;

  public decimal price;

  public int quantity;

  [NonSerialized] public decimal total;

  [OnSerializing]
  void CalculateTotal(StreamingContext sc)
  {
    total = price * quantity;
  }

  [OnDeserialized]
  void CheckTotal(StreamingContext sc)
  {
    if (total == 0) { CalculateTotal(sc);}
  }
}

Change serialization based on context

Normally the destination for serialization does not matter, but there may be situations where it is important. For example, typically members that contain information about the current process should not be serialized as they will not make any sense to the de-serializing process. However, this information may be useful if the de-serializing process is the same as that performing the serialization.

The StreamingContext structure provides information about the destination of a serialized object. It provides two properties:

  • Context - reference to object containing user-defined information
  • State - bit flags indicating source / destination of object being serialized / de-serialized
    • CrossProcess - different process on same machine
    • CrossMachine - different process on different machine
    • File - source or destination is a file (do not assume which process will read the file)
    • Persistence - source or destination is a store such as a database, file, etc. (do not assume which process will read the file)
    • Remoting - source or destination is remoting to an unknown location - may be on same machine, may not.
    • Other - source or destination is unknown
    • Clone - the object graph is being cloned. The serialization code can assume the same process will de-serialize the data so it is safe to access handles and other unmanaged resources.
    • CrossAppDomain - source or destination is in another AppDomain
    • All - source or destination may be any of the previous contexts. The default value.

If serializing / de-serializing an object and want to provide context information then set the formatter's Context property (declared by IFormatter, of type StreamingContext) before calling the formatter's Serialize or Deserialize methods. The property is implemented by both the BinaryFormatter and SoapFormatter classes. By default the StreamingContext has Context set to null and State set to All.
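
One way a class might act on this information is sketched below: process-specific data is written only when the state says the graph is being cloned within the same process. WorkerState and CountValues are hypothetical names. The helper calls GetObjectData directly with a hand-built StreamingContext so the sketch does not depend on a particular formatter; with a real formatter the equivalent step would be assigning a new StreamingContext to its Context property before calling Serialize.

```csharp
using System;
using System.Runtime.Serialization;

[Serializable]
class WorkerState : ISerializable
{
    public string name;
    public int processId;   // only meaningful inside the current process

    public WorkerState(string name, int processId)
    {
        this.name = name;
        this.processId = processId;
    }

    public WorkerState(SerializationInfo info, StreamingContext context)
    {
        name = info.GetString("Name");

        // The process id is only present when the graph was cloned in-process.
        if (context.State == StreamingContextStates.Clone)
        {
            processId = info.GetInt32("ProcessId");
        }
    }

    public virtual void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("Name", name);

        // Exact comparison: the default state All has every flag set,
        // so a bitwise test for Clone would also match All.
        if (context.State == StreamingContextStates.Clone)
        {
            info.AddValue("ProcessId", processId);
        }
    }

    // Returns how many values GetObjectData writes for the given state.
    public static int CountValues(WorkerState worker, StreamingContextStates state)
    {
        SerializationInfo info =
            new SerializationInfo(typeof(WorkerState), new FormatterConverter());
        worker.GetObjectData(info, new StreamingContext(state));
        return info.MemberCount;
    }

    static void Main()
    {
        WorkerState worker = new WorkerState("worker", 1234);

        Console.WriteLine(CountValues(worker, StreamingContextStates.File));
        Console.WriteLine(CountValues(worker, StreamingContextStates.Clone));
    }
}
```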
