Input and Output Classes

A complete PDF version of the text book is now available. The PDF version is an almost complete subset of the HTML version (where only a few, long program listings have been removed).

39.  Serialization

In this material we care about object-oriented programming. All our data are encapsulated in objects. When we deal with IO it is therefore natural to look for solutions that help us with output and input of objects.

For each class C it is possible to decide a storage format. The storage format of class C tells which pieces of data in C instances to save on secondary storage. The details of the storage format need to be decided. This involves (1) which fields to store, (2) the sequence of fields in the stored representation, and (3) use of a binary or a textual representation. However, as long as we have pairs of WriteObject and ReadObject operations for which ReadObject(WriteObject(C-object)) is equivalent to C-object the details of the storage format are of secondary interest.
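
As a minimal sketch of such a pair of operations, assume a hypothetical class C with just two fields, Name and Count (neither is taken from the material). The storage format chosen here is binary, with Name written before Count. The names WriteObject and ReadObject follow the text above; everything else is an illustrative assumption.

```csharp
using System;
using System.IO;

// Hypothetical class C with two fields, used only for illustration.
public class C {
  public string Name;
  public int Count;

  public C(string name, int count) { Name = name; Count = count; }

  // Storage format: Name first, then Count, both in binary form.
  public static void WriteObject(Stream s, C obj) {
    BinaryWriter w = new BinaryWriter(s);
    w.Write(obj.Name);     // length-prefixed string
    w.Write(obj.Count);    // 4-byte integer
    w.Flush();
  }

  // Reads the fields back in exactly the order they were written.
  public static C ReadObject(Stream s) {
    BinaryReader r = new BinaryReader(s);
    string name = r.ReadString();
    int count = r.ReadInt32();
    return new C(name, count);
  }
}

public class RoundTripDemo {
  public static void Main() {
    C original = new C("sample", 42);
    using (MemoryStream ms = new MemoryStream()) {
      C.WriteObject(ms, original);
      ms.Position = 0;                 // rewind before reading
      C copy = C.ReadObject(ms);
      // The copy is a new object with the same field values.
      Console.WriteLine(copy.Name == original.Name &&
                        copy.Count == original.Count);
    }
  }
}
```

The reader only has to agree with the writer on the three decisions listed above. If that agreement holds, the concrete format can be changed without affecting clients of the two operations.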

Instances of class C may have references to instances of other classes, say D and E. In general, an instance of class C may be part of an object graph in which we find C-objects, D-objects, and E-objects as well as objects of other types. We soon realize that the real problem is not how to store instances of C in isolation. Rather, the problem is how to store an object network in which C-objects take part (or in which a C-object is a root).
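
To see why the object network is the hard part, consider the following sketch. It assumes a hypothetical class Node in which each object refers to at most one other Node, so the graph is a chain that may end in a cycle. The idea is to give each object a number the first time it is met, and to store references as numbers. The class and method names are illustrative assumptions, and the flattened records are kept in memory rather than written to a file.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node type: each node may refer to one other node.
public class Node {
  public string Name;
  public Node Next;      // may be null, or may lead back into a cycle
  public Node(string name) { Name = name; }
}

public static class GraphIO {
  // Flattens the nodes reachable from root into records of
  // (name, index of Next), where index -1 stands for null.
  public static List<KeyValuePair<string,int>> Write(Node root) {
    // Node does not override Equals, so keys are compared by reference.
    var ids = new Dictionary<Node,int>();
    var order = new List<Node>();
    for (Node n = root; n != null && !ids.ContainsKey(n); n = n.Next) {
      ids[n] = order.Count;
      order.Add(n);
    }
    var records = new List<KeyValuePair<string,int>>();
    foreach (Node n in order)
      records.Add(new KeyValuePair<string,int>(
                    n.Name, n.Next == null ? -1 : ids[n.Next]));
    return records;
  }

  // Rebuilds the nodes in two passes and returns the root.
  public static Node Read(List<KeyValuePair<string,int>> records) {
    if (records.Count == 0) return null;
    var nodes = new List<Node>();
    foreach (var r in records)                  // pass 1: create the objects
      nodes.Add(new Node(r.Key));
    for (int i = 0; i < records.Count; i++)     // pass 2: restore references
      if (records[i].Value >= 0)
        nodes[i].Next = nodes[records[i].Value];
    return nodes[0];
  }
}

public class GraphDemo {
  public static void Main() {
    Node a = new Node("a"), b = new Node("b");
    a.Next = b; b.Next = a;                     // a two-node cycle
    Node copy = GraphIO.Read(GraphIO.Write(a));
    Console.WriteLine(copy.Name + " -> " + copy.Next.Name);
    Console.WriteLine(ReferenceEquals(copy.Next.Next, copy));
  }
}
```

A naive writer that simply followed Next would never terminate on the cycle. Numbering the objects avoids that and also preserves the cycle in the reestablished network. A general solution must in addition handle objects with several references and of different types.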

People who have devised a storage format for a class C, who have then written WriteObject and ReadObject operations for class C, and who have dealt with the IO problem of object graphs quickly realize that the invented solutions generalize to arbitrary classes. Thus, instead of solving the object IO problem again and again for specific classes, it is attractive to solve the problem at a general level, and make the solution available for arbitrary classes. This is exactly what serialization is about. The serialization problem has been solved by the implementers of C#. It is therefore easy for the C# programmer to save and retrieve objects via serialization.

39.1 Serialization
39.2 Examples of Serialization in C#
39.3 Custom Serialization
39.4 Considerations about Serialization
39.5 Serialization and Alternatives
39.6 Attributes
 

39.1.  Serialization

Serialization provides for input and output of a network of objects. Serialization is about object output, and deserialization is about object input.

  • Serialization

    • Writes an object o to a file

    • Also writes the objects referred from o

  • Deserialization

    • Reads a serialized file in order to reestablish the serialized object o

    • Also reestablishes the network of objects originally referred from o

Serialization of objects is, in principle, simple to deal with from C#. There are, however, two circumstances that complicate matters.

First, the need to control (customize) the details of serialization and deserialization is unavoidable, at least when the ideas are applied to real-life examples.

Second, several different techniques for doing serialization are supported, which is due to the development of C#. In C# 2.0 serialization relies almost exclusively on the use of serialization and deserialization attributes. In C# 1.0 it was also necessary to implement certain interfaces to control and customize the serialization. In this version of the material, we only describe serialization controlled by attributes.

  • Serialization and deserialization are supported via classes that implement the IFormatter interface:

    • BinaryFormatter and SoapFormatter

  • Methods in IFormatter:

    • Serialize and Deserialize

In the following section we will discuss an example that uses BinaryFormatter.

 

39.2.  Examples of Serialization in C#

Below we show the class Person and class Date, similar to the ones we used for illustration of privacy leaks in Section 16.5. Class Person in Program 39.1 encapsulates a name and two date objects: birth date and death date. For a person still alive, the death date is null. Redundantly, the age instance variable holds the age of the person. The Update method can be used to update the age variable.

The Date class shown in Program 39.2 is a very simple implementation of a date class. (In the paper version of the material we only show an outline of the Date class. The complete version is available in the web version.) The Person class relies on the Date class. We use class Date for illustration of serialization; in real life you should always use the struct DateTime. The Date class encapsulates year, month, and day. In addition it holds a nameOfDay instance variable (with values such as Sunday or Monday), which is redundant. With appropriate calendar knowledge, the nameOfDay can be calculated from year, month, and day. The Person class needs age calculation, which is provided by the YearDiff method of class Date. Internally in class Date, YearDiff relies on the methods IsBefore and Equals. (Equals is defined according to the standard recommendations, see Section 28.16. We have not, in this class, included a redefinition of GetHashCode, and therefore we get a warning from the compiler when class Date is compiled.)

The redundancy in class Person and class Date is introduced on purpose, because it helps us illustrate the serialization control in Section 39.3. In most circumstances we would avoid such redundancy, at least in simple classes.

The preparation of class Person and class Date for serialization is very simple. We mark both classes with the attribute [Serializable], see line 3 in both classes. As of now you can consider [Serializable] as some magic, special purpose notation. In reality [Serializable] represents application of an attribute. When we are done with serialization we have seen several uses of attributes, and therefore we will be motivated to understand the general ideas of attributes in C#. We discuss the general ideas behind attributes in Section 39.6.

using System;

[Serializable]
public class Person{

  private string name;
  private int age;    // Redundant
  private Date dateOfBirth, dateOfDeath;

  public Person (string name, Date dateOfBirth){
    this.name = name;
    this.dateOfBirth = dateOfBirth;
    this.dateOfDeath = null;
    age = Date.Today.YearDiff(dateOfBirth);
  }

  public Date DateOfBirth {
    get {return new Date(dateOfBirth);}
  }

  public int Age{
    get {return Alive ? age : dateOfDeath.YearDiff(dateOfBirth);}
  }

  public bool Alive{
    get {return dateOfDeath == null;}
  }

  public void Died(Date d){
    dateOfDeath = d;
  }

  public void Update(){
    age = Date.Today.YearDiff(dateOfBirth);
  }

  public override string ToString(){
    return "Person: " + name + 
            "  *" + dateOfBirth + 
            (Alive ? "" : "  +" + dateOfDeath) +
            "  Age: " + age;
  }

}
Program 39.1    The Person class - Serializable.

using System;

[Serializable]
public class Date{
  private ushort year;
  private byte month, day;
  private DayOfWeek nameOfDay;    // Redundant

  public Date(int year, int month, int day){
    this.year =  (ushort)year; 
    this.month = (byte)month; 
    this.day =   (byte)day;
    this.nameOfDay = (new DateTime(year, month, day)).DayOfWeek;
  }

  public Date(Date d){
    this.year = d.year; this.month = d.month; 
    this.day = d.day; this.nameOfDay = d.nameOfDay;
  }

  public int Year{get{return year;}}
  public int Month{get{return month;}}
  public int Day{get{return day;}}

  // return this minus other, as of usual birthday calculations.
  public int YearDiff(Date other){
    if (this.Equals(other))
       return 0;
    else if ((new Date(other.year, this.month, this.day)).IsBefore(other))
      return this.year - other.year - 1; 
    else
      return this.year - other.year;
  }

  public override bool Equals(Object obj){
     if (obj == null)
       return false;
     else if (this.GetType() != obj.GetType())
       return false;
     else if (ReferenceEquals(this, obj))
       return true;
     else if (this.year == ((Date)obj).year &&
              this.month == ((Date)obj).month &&
              this.day == ((Date)obj).day)
       return true;
     else return false;
   }

  // Is this date less than other date
  public bool IsBefore(Date other){
    return 
      this.year < other.year ||
      this.year == other.year && this.month < other.month ||
      this.year == other.year && this.month == other.month && this.day < other.day;
  }


  public static Date Today{
    get{
      DateTime now = DateTime.Now;
      return new Date(now.Year, now.Month, now.Day);}
  }

  public override string ToString(){
    return string.Format("{0} {1}.{2}.{3}", nameOfDay, day, month, year);
  }
}
Program 39.2    The Date class - Serializable.

Program 39.3 illustrates how to serialize and deserialize a graph of objects. The graph, which we serialize, consists of one Person and the two Date objects referred to by the Person object. The serialization, which takes place in lines 13-17, is done by sending the Serialize message to the BinaryFormatter object. The serialization relies on a binary stream, as represented by an instance of class FileStream, see Section 37.4.

The deserialization, as done in lines 24-28, will in most real-life settings be done in another program. In our example we reset the program state in lines 19-22 before the deserialization. The actual deserialization is done by sending the Deserialize message to the BinaryFormatter object. As in the serialization, the file stream with the binary data is passed as a parameter.

using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

class Client{

  public static void Main(){
    Person p = new Person("Peter", new Date(1936, 5, 11));
    p.Died(new Date(2007,5,10));
    Console.WriteLine("{0}", p);

    using (FileStream strm = 
               new FileStream("person.dat", FileMode.Create)){
      IFormatter fmt = new BinaryFormatter();
      fmt.Serialize(strm, p);
    }

    // -----------------------------------------------------------
    p = null;
    Console.WriteLine("Reseting person");
    // -----------------------------------------------------------
    
    using (FileStream strm = 
               new FileStream("person.dat", FileMode.Open)){
      IFormatter fmt = new BinaryFormatter();
      p = fmt.Deserialize(strm) as Person;
    }

    Console.WriteLine("{0}", p);
  }

}
Program 39.3    The Person client class - applies serialization and deserialization.

The program output shown in Listing 39.4 shows that the Person object and the two Date objects have survived the serialization and deserialization processes. In between the two output statements in line 11 and line 30 of Program 39.3 the three objects have been transferred to and reestablished from the binary file.

Person: Peter  *Monday 11.5.1936  +Thursday 10.5.2007  Age: 71
Reseting person
Person: Peter  *Monday 11.5.1936  +Thursday 10.5.2007  Age: 71
Listing 39.4    Output of the Person client class.


Exercise 10.5. Serializing with an XML formatter

In the programs shown on the accompanying slide we have used a binary formatter for serialization of Person and Date objects.

Modify the client program to use a so-called Soap formatter in the namespace System.Runtime.Serialization.Formatters.Soap. SOAP is an XML language intended for exchange of XML documents. SOAP is related to the discipline of web services in the area of Internet technology.

After the serialization you should take a look at the file person.dat, which is written and read by the client program.



 

39.3.  Custom Serialization

In the Person and Date classes, shown in Section 39.2, the redundant instance variables do not need to be serialized. In class Person, age does not need to be serialized because it can be calculated from dateOfBirth and dateOfDeath. In class Date, nameOfDay does not need to be serialized because it can be calculated from calendar knowledge. In relation to serialization and persistence, we say that these two instance variables are transient. It is sufficient to serialize the essential information, and to reestablish the values of the transient instance variables after deserialization. Program 39.5 and Program 39.6 show how this is done for class Person and class Date, respectively.

The serialization is controlled by marking some fields (instance variables) as [NonSerialized], see line 9 of Program 39.5 and line 9 of Program 39.6.

The deserialization is controlled by a method marked with the attribute [OnDeserialized()], see line 21 of Program 39.5. This method is called when deserialization takes place. The method starting at line 21 of Program 39.5 assigns the redundant age variable of a Person object.

using System;
using System.Runtime.Serialization;

[Serializable]
public class Person{

  private string name;

  [NonSerialized()]
  private int age;    

  private Date dateOfBirth, dateOfDeath;

  public Person (string name, Date dateOfBirth){
    this.name = name;
    this.dateOfBirth = dateOfBirth;
    this.dateOfDeath = null;
    age = Date.Today.YearDiff(dateOfBirth);
  }

  [OnDeserialized()] 
  internal void FixPersonAfterDeserializing(
                           StreamingContext context){
    age = Date.Today.YearDiff(dateOfBirth);
  }

  public Date GetDateOfBirth(){
    return new Date(dateOfBirth);
  }

  public int Age{
    get {return Alive ? age : dateOfDeath.YearDiff(dateOfBirth);}
  }

  public bool Alive{
    get {return dateOfDeath == null;}
  }

  public void Died(Date d){
    dateOfDeath = d;
  }

  public void Update(){
    age = Date.Today.YearDiff(dateOfBirth);
  }

  public override string ToString(){
    return "Person: " + name + 
            "  *" + dateOfBirth + 
            (Alive ? "" : "  +" + dateOfDeath) +
            "  Age: " + age;
  }

}
Program 39.5    The Person class - Serialization control with attributes.

The Date class shown below in Program 39.6 follows the same pattern as the Person class of Program 39.5.

using System;
using System.Runtime.Serialization;

[Serializable]
public class Date{
  private ushort year;
  private byte month, day;

  [NonSerialized()]
  private DayOfWeek nameOfDay;

  public Date(int year, int month, int day){
    this.year =  (ushort)year; 
    this.month = (byte)month; 
    this.day =   (byte)day;
    this.nameOfDay = (new DateTime(year, month, day)).DayOfWeek;
  }

  public Date(Date d){
    this.year = d.year; this.month = d.month; 
    this.day = d.day; this.nameOfDay = d.nameOfDay;
  }

  [OnDeserialized()]
  internal void FixDateAfterDeserializing(
                                   StreamingContext context){
    nameOfDay = (new DateTime(year, month, day)).DayOfWeek;    
  }

  public int Year{get{return year;}}
  public int Month{get{return month;}}
  public int Day{get{return day;}}

  // return this minus other, as of usual birthday calculations.
  public int YearDiff(Date other){
    if (this.Equals(other))
       return 0;
    else if ((new Date(other.year, this.month, this.day)).IsBefore(other))
      return this.year - other.year - 1; 
    else
      return this.year - other.year;
  }

  public override bool Equals(Object obj){
     if (obj == null)
       return false;
     else if (this.GetType() != obj.GetType())
       return false;
     else if (ReferenceEquals(this, obj))
       return true;
     else if (this.year == ((Date)obj).year &&
              this.month == ((Date)obj).month &&
              this.day == ((Date)obj).day)
       return true;
     else return false;
   }

  // Is this date less than other date
  public bool IsBefore(Date other){
    return 
      this.year < other.year ||
      this.year == other.year && this.month < other.month ||
      this.year == other.year && this.month == other.month && this.day < other.day;
  }


  public static Date Today{
    get{
      DateTime now = DateTime.Now;
      return new Date(now.Year, now.Month, now.Day);}
  }

  public override string ToString(){
    return string.Format("{0} {1}.{2}.{3}", nameOfDay, day, month, year);
  }
}
Program 39.6    The Date class - Serialization control with attributes.

 

39.4.  Considerations about Serialization

We want to raise a few additional issues about serialization:

  • Security

    • Encapsulated and private data is made available in files

  • Versioning

    • The private state of class C is changed

    • It may not be possible to read serialized objects of type C

  • Performance

    • Some claim that serialization is relatively slow

 

39.5.  Serialization and Alternatives

As mentioned in the introduction of this chapter, serialization deals with input and output of objects and object graphs. It should be remembered, however, that there are alternatives to serialization. As summarized below, it is possible to program object IO at a low level (using binary or textual IO primitives from Chapter 37). At the other end of the spectrum it is possible to use database technology.

  • Serialization

    • An easy way to save and restore objects in between program sessions

    • Useful in many projects where persistency is necessary, but not a key topic

    • Requires only little programming

  • Custom programmed file IO

    • Full control of object IO

    • May require a lot of programming

  • Objects in Relational Databases

    • Impedance mismatch: "Circular objects in rectangular boxes"

    • Useful when the program handles large amounts of data

    • Useful if the data is accessed simultaneously from several programs

    • Not a topic in this course

 

39.6.  Attributes

In our treatment of serialization we made extensive use of attributes, see for instance Section 39.3. In this section we will discuss attributes at a more general level, and independent of serialization.

Attributes offer a mechanism that allows the programmer to extend the programming language in simple ways. Attributes allow the programmer to associate extra information (meta data) to selected and pre-defined constructs in C#. The constructs to which it is possible to attach attributes are assemblies, classes, structs, constructors, delegates, enumeration types, fields (variables), events, methods, parameters, properties, and returns.

We all know that members of a class in C# have associated visibility modifiers, see Section 11.16. If visibility modifiers were not part of C#, we could have used attributes as a way to extend the language with different kinds of member visibilities. Certain attributes can be accessed by the compiler, and thereby these attributes can affect the checking done by the compiler and the code generated by the compiler. Attributes can also be accessed at run-time. There are ways (using reflection) for the running program to access the attributes of given constructs, such that the attributes and attribute values can affect the program execution.
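
The run-time access mentioned above can be sketched as follows. The example attaches the predefined Obsolete attribute to a class C and then asks, via reflection, for the attribute object. The class names C and AttributeReader are chosen for illustration only.

```csharp
using System;

[Obsolete("Use class D instead")]
class C { }

class AttributeReader {
  public static void Main() {
    // Look up the Obsolete attribute attached to class C, if any.
    ObsoleteAttribute attr = (ObsoleteAttribute)
        Attribute.GetCustomAttribute(typeof(C), typeof(ObsoleteAttribute));

    if (attr != null)
      Console.WriteLine("{0}: {1} (error: {2})",
                        typeof(C).Name, attr.Message, attr.IsError);
    else
      Console.WriteLine("No Obsolete attribute found");
  }
}
```

The running program gets hold of an ordinary object of type ObsoleteAttribute, and it can read the properties Message and IsError from it like from any other object.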

Program 39.7 illustrates the use of the predefined Obsolete attribute. Being "obsolete" means "no longer in use". In line 3, the attribute is associated with class C. In line 9, another usage of the attribute is associated with method M in class D.

using System;

[Obsolete("Use class D instead")]
class C{
  // ...
}

class D{
  [Obsolete("Do not call this method.",true)]
  public void M(){
  }
}

class E{
  public static void Main(){
    C c = new C();
    D d = new D();
    d.M();
  }
}
Program 39.7    An obsolete class C, and a class D with an obsolete method M.

The compiler is aware of the Obsolete attribute. When we compile Program 39.7 we can see the effect of the attribute, see Listing 39.8. Use of the obsolete method M in class D leads to a compile-time error, because the second parameter of the Obsolete attribute in line 9 of Program 39.7 is true. If false is used instead, we will only get a warning.

>csc prog.cs
Microsoft (R) Visual C# 2005 Compiler version 8.00.50727.42
for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.

prog.cs(16,5): warning CS0618: 'C' is obsolete: 'Use class D instead'
prog.cs(16,15): warning CS0618: 'C' is obsolete: 'Use class D instead'
prog.cs(18,5): error CS0619: 'D.M()' is obsolete: 'Do not call this method.'
Listing 39.8    Compiling class C, D, and E.

C# comes with a lot of predefined attributes. Obsolete is one of them, and we encountered quite a few in Section 39.3 in the context of serialization. Unit testing frameworks for C# also heavily rely on attributes.

It is also possible to define our own attributes. An attribute is defined as a class. Attributes defined in this way are subclasses of the class System.Attribute. As a naming convention, the names of all attribute classes should have "Attribute" as a suffix. Thus, an attribute X is defined by a class XAttribute, which inherits from the class System.Attribute. The attribute usage notation [X(a,b,c)] in front of some C# construct C causes an instance of class XAttribute, made with the appropriate three-parameter constructor, to be associated with C. In the attribute usage notation [X(a,b,c,d=e)] d refers to a property of class XAttribute. The property d must be read-write (both gettable and settable), see Section 18.5. Thus, as it appears, an attribute accepts both positional parameters and keyword parameters.
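
The difference between positional and keyword parameters can be seen in a small sketch. The attribute Note, its class NoteAttribute, and the class Account are hypothetical names invented for this illustration. The string is a positional parameter passed to the constructor, whereas Priority is a keyword parameter that is assigned to a read-write property after construction.

```csharp
using System;

// Hypothetical attribute with one positional parameter (text)
// and one keyword parameter (Priority).
[AttributeUsage(AttributeTargets.Class)]
public sealed class NoteAttribute : Attribute {
  private readonly string text;

  public NoteAttribute(string text) { this.text = text; }

  public string Text { get { return text; } }

  // Read-write property, and therefore usable as a keyword parameter.
  public int Priority { get; set; }
}

// Usage: the suffix "Attribute" is left out in the attribute notation.
[Note("Needs review", Priority = 2)]
class Account { }

class NoteDemo {
  public static void Main() {
    NoteAttribute n = (NoteAttribute)
        Attribute.GetCustomAttribute(typeof(Account), typeof(NoteAttribute));
    Console.WriteLine("{0} / {1}", n.Text, n.Priority);
  }
}
```

If Priority is left out of the attribute usage, the property simply keeps its default value 0. Leaving out the string, on the other hand, is a compile-time error because NoteAttribute has no parameterless constructor.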

Below, in Program 39.9, we have reproduced the class behind the Obsolete attribute. You should notice the three different constructors and the read/write property IsError. The AttributeUsage attribute in lines 5-6 illustrates how attributes help define attributes. AttributeUsage defines the constructs to which it is possible to attach the MyObsolete attribute. The expression AttributeTargets.Method | AttributeTargets.Property combines two values of the enumeration type AttributeTargets, which carries the so-called Flags attribute. Combined enumerations are discussed in Focus box 6.3.

// In part, reproduced from the book "C# to the Point"

using System;

[AttributeUsage(AttributeTargets.Method | 
                AttributeTargets.Property)]
public sealed class MyObsoleteAttribute: Attribute{
  string message;
  bool isError;

  public string Message{
    get {
      return message;
    }
  }

  public bool IsError{
    get {
      return isError;
    }
    set {
      isError = value;
    }
  }

  public MyObsoleteAttribute(){
    message = ""; isError = false;
  }

  public MyObsoleteAttribute(string msg){
    message = msg; isError = false;
  }

  public MyObsoleteAttribute(string msg, bool error){
    message = msg; isError = error;
  }

}
Program 39.9    A reproduction of class ObsoleteAttribute.

In Program 39.10 we show a sample use of the attribute programmed in Program 39.9. The program does not compile because we attempt to attach the MyObsolete attribute to a class in line 3. As explained above, we have restricted MyObsolete so that it can only be attached to methods and properties.

using System;

[MyObsolete("Use class D instead")]
class C{
  // ...
}

class D{
  [MyObsolete("Do not call this method.",IsError=true)]
  public void M(){
  }
}

class E{
  public static void Main(){
    C c = new C();
    D d = new D();
    d.M();
  }
}
Program 39.10    Sample usage of the reproduced class - causes a compilation error.

Generated: Monday February 7, 2011, 12:20:06