
Wednesday, January 4, 2012

Java Serialization

Serialization involves saving the current state of an object to a stream, and restoring an equivalent object from that stream. The stream functions as a container for the object. Its contents include a partial representation of the object's internal structure, including variable types, names, and values. The container may be transient (RAM-based) or persistent (disk-based). A transient container may be used to prepare an object for transmission from one computer to another. A persistent container, such as a file on disk, allows storage of the object after the current session is finished. In both cases the information stored in the container can later be used to construct an equivalent object containing the same data as the original.
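To make the idea of a transient (RAM-based) container concrete, here is a minimal sketch that serializes an object into a byte array and restores it again. The class name InMemoryRoundTrip and its helper methods are made up for this illustration; only the java.io stream classes come from the standard library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: uses a byte array as the in-memory container.
public class InMemoryRoundTrip {

    // Serialize any Serializable object graph into a byte array.
    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Rebuild an equivalent object from the bytes produced by toBytes().
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> original = new ArrayList<>(Arrays.asList("alpha", "beta"));
        byte[] container = toBytes(original);
        Object restored = fromBytes(container);
        System.out.println("same contents: " + original.equals(restored));
        System.out.println("same instance: " + (original == restored));
    }
}
```

The restored list holds the same data as the original but is a separate object, which is what "equivalent object" means above. The same byte array could just as well be sent over a socket or written to a file.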

Serialized objects are JVM independent and can be deserialized by any JVM. During serialization, the in-memory state of a Java object is converted into a byte stream. The resulting file is not human-readable; it is a binary format meant to be read back by the JVM (Java Virtual Machine). The process of serializing an object is also called deflating or marshalling an object. The example code in this article will focus on persistence.

Implementation

For an object to be serialized, it must be an instance of a class that implements either the Serializable or Externalizable interface. Both interfaces only permit the saving of data associated with an object's variables. They depend on the class definition being available to the Java Virtual Machine at reconstruction time in order to construct the object.

The Serializable interface relies on the Java runtime default mechanism to save an object's state. Writing an object is done via the writeObject() method in the ObjectOutputStream class (or the ObjectOutput interface). Writing a primitive value may be done through the appropriate typed method, such as writeInt() or writeDouble(). Reading the serialized object is accomplished using the readObject() method of the ObjectInputStream class, and primitives may be read using the matching typed methods, such as readInt() and readDouble().
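As a rough illustration of mixing object and primitive writes, the sketch below writes an int, a double, and a String to one stream and reads them back in the same order. MixedStreamDemo and roundTrip() are names invented for this example; a byte array stands in for the file so the snippet has no side effects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical demo class, not part of the article's example program.
public class MixedStreamDemo {

    // Writes two primitives and one object, then reads them back in the
    // same order and returns them as a comma-separated string.
    static String roundTrip(int count, double ratio, String label)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeInt(count);     // primitive: typed write method
            out.writeDouble(ratio);  // primitive: typed write method
            out.writeObject(label);  // object: String implements Serializable
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            int c = in.readInt();
            double r = in.readDouble();
            String l = (String) in.readObject();
            return c + "," + r + "," + l;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(42, 2.5, "hello")); // prints 42,2.5,hello
    }
}
```

The reads must mirror the writes exactly; the stream does not label primitive values, so reading them in a different order yields wrong data or an exception.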

The Externalizable interface specifies that the implementing class will handle the serialization on its own, instead of relying on the default runtime mechanism. This includes which fields get written (and read), and in what order. The class must define a writeExternal() method to write out the stream, and a corresponding readExternal() method to read the stream. Inside these methods the class calls writeObject() on the ObjectOutput it is given, readObject() on the ObjectInput it is given, and any necessary typed write and read methods, for the desired fields. An Externalizable class also needs a public no-argument constructor, because the runtime creates the instance before calling readExternal().
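A minimal sketch of an Externalizable class might look like the following. The class Label, its fields, and the enclosing ExternalizableDemo are assumptions made up for illustration; the method signatures are those declared by java.io.Externalizable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical demo: the class itself decides what is written and in what order.
public class ExternalizableDemo {

    public static class Label implements Externalizable {
        private static final long serialVersionUID = 1L;
        private String text;
        private int count;

        // Required: the runtime instantiates the object through this
        // public no-argument constructor before calling readExternal().
        public Label() { }

        public Label(String text, int count) {
            this.text = text;
            this.count = count;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeObject(text); // object field
            out.writeInt(count);   // primitive field
        }

        @Override
        public void readExternal(ObjectInput in)
                throws IOException, ClassNotFoundException {
            text = (String) in.readObject(); // same order as writeExternal()
            count = in.readInt();
        }

        @Override
        public String toString() {
            return text + " x" + count;
        }
    }

    // Serialize to memory and read back, to keep the sketch self-contained.
    static Label roundTrip(Label original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Label) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new Label("widget", 3))); // prints widget x3
    }
}
```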


Hiding Data

Sometimes you may wish to prevent certain fields from being stored in the serialized object. The Serializable interface allows the implementing class to specify that some of its fields do not get saved or restored. This is accomplished by placing the keyword transient before the data type in the variable declaration. For example, you may have some data which is confidential and can be re-read from a master file later (as opposed to saving it with the serialized object). Or you decide (wisely) to preserve the privacy of file references by declaring any such variables as transient. Otherwise, all fields automatically get written without any additional effort by the class.

In addition to those fields declared as transient, static fields are not serialized (written out), and so cannot be deserialized (read back in).
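The effect of transient and static can be seen in a small sketch like this one. The Session class and its field names are hypothetical; the point is only which values survive the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo of which fields are written by the default mechanism.
public class TransientDemo {

    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        static int timeoutSeconds = 30; // static: belongs to the class, never written
        String user;                    // ordinary field: written
        transient String password;      // transient: skipped

        Session(String user, String password) {
            this.user = user;
            this.password = password;
        }
    }

    static byte[] toBytes(Session s) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(s);
        }
        return buffer.toByteArray();
    }

    static Session fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Session) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = toBytes(new Session("alice", "s3cret"));
        Session.timeoutSeconds = 99; // changed after the object was serialized
        Session restored = fromBytes(bytes);
        System.out.println(restored.user);          // alice
        System.out.println(restored.password);      // null: the transient field was not saved
        System.out.println(Session.timeoutSeconds); // 99: the stream did not restore the static
    }
}
```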

Another way to use Serializable, and control which fields get written, is to define a private writeObject() method in the implementing class. (Serializable itself declares no methods, so this is not an override in the usual sense; the runtime looks for a method with this exact signature and calls it.) Inside this method, you are responsible for writing out the appropriate fields. If you take this approach, you will want to define a private readObject() as well, to control the restoration process. This is similar to using Externalizable, except that interface requires writeExternal() and readExternal().
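One way this might look in practice is sketched below. The Order class and its fields are invented for the example. It lets the default mechanism handle one field, writes a second field by hand, and deliberately leaves a third out of the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo of private writeObject()/readObject() hooks.
public class CustomWriteDemo {

    static class Order implements Serializable {
        private static final long serialVersionUID = 1L;
        String item;            // handled by the default mechanism
        transient int quantity; // written manually below
        transient String note;  // never written

        Order(String item, int quantity, String note) {
            this.item = item;
            this.quantity = quantity;
            this.note = note;
        }

        // Called by the runtime instead of the default write logic.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes the non-transient, non-static fields
            out.writeInt(quantity);   // extra data chosen by the class
        }

        // Must read exactly what writeObject() wrote, in the same order.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            quantity = in.readInt();
        }
    }

    static Order roundTrip(Order original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Order) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Order restored = roundTrip(new Order("book", 2, "gift wrap"));
        System.out.println(restored.item + " " + restored.quantity + " " + restored.note);
        // prints: book 2 null
    }
}
```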

For the Externalizable interface, since both writeExternal() and readExternal() must be declared public, this increases the risk that a rogue object could use them to determine the format of the serialized object. For this reason, you should be careful when saving object data with this interface.

Versioning

The ability to save and restore objects leads to an interesting question: what happens when an object has been stored for so long that, upon restoration, it finds that its format has been superseded by a new, different version of the class?
The stream reading the serialized representation is responsible for accounting for any differences. The intent is that a newer version of a Java class should be able to interoperate with older representations of the same class, as long as there have not been certain changes in the class structure. The same does not necessarily hold true for an older version of the class, which may not be able to effectively deal with a newer representation.

So, we need some way to determine at runtime (or more appropriately, deserialization-time) whether we have the necessary backward compatibility.
In Java 1.1, changes to classes may be specified using a version number. A specific class variable, serialVersionUID (representing the Stream Unique Identifier, or SUID), may be used to specify the earliest version of the class that can be deserialized. The SUID is declared as follows:
  static final long serialVersionUID = 2L;

This particular declaration and assignment specifies that version 2 is as far back as this class can go. It is not compatible with an object written by version 1 of the class, and it cannot write a version 1 object. If it encounters a version 1 object in a stream (such as when restoring from a file), an InvalidClassException will be thrown.
The SUID is a measure of backward compatibility. The same SUID can be used for multiple representations of a class, as long as newer versions can still read the older versions.

If you do not explicitly assign a SUID, a default value will be assigned when the object gets serialized. This default SUID is a 64-bit hash computed from the class name, interfaces, methods, and fields, so any change to those details is likely to change it. The hash itself is produced with the Secure Hash Algorithm (SHA).

How can you obtain the SUID for a class at runtime to determine compatibility? First, query the Virtual Machine for information about the class represented in the stream, using methods of the class ObjectStreamClass. Here is how we can get the SUID of the current version of the class named MyClass, as known to the Virtual Machine:

// Class.forName() throws ClassNotFoundException if MyClass cannot be found
ObjectStreamClass myObject = ObjectStreamClass.lookup(
        Class.forName("MyClass"));
long theSUID = myObject.getSerialVersionUID();
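For readers who want something they can run, here is a self-contained sketch of the same lookup. The nested classes Versioned and Unversioned and the helper suidOf() are hypothetical names; ObjectStreamClass.lookup() and getSerialVersionUID() are the standard java.io calls used above.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical demo of querying the SUID of a class known to the JVM.
public class SuidLookup {

    // Declares its SUID explicitly.
    static class Versioned implements Serializable {
        private static final long serialVersionUID = 2L;
        int value;
    }

    // No explicit SUID: the runtime computes a default hash.
    static class Unversioned implements Serializable {
        int value;
    }

    // Returns the SUID, or null when the class is not serializable
    // (lookup() returns null in that case).
    static Long suidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        return descriptor == null ? null : descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Versioned:   " + suidOf(Versioned.class));   // the declared 2
        System.out.println("Unversioned: " + suidOf(Unversioned.class)); // a computed hash
        System.out.println("Object:      " + suidOf(Object.class));      // null, not Serializable
    }
}
```

Using a class literal such as Versioned.class avoids the checked ClassNotFoundException that Class.forName() can throw.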

Now when we restore an Externalizable object, we can compare its SUID to the class SUID just obtained. If there is a mismatch, we should take appropriate action. This may involve telling the user that we cannot handle the restoration, or we may have to assign and use some default values.
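The article does not show how to get at the SUID recorded in the stream. One possible approach, sketched here under that assumption, is to subclass ObjectInputStream and override its protected readClassDescriptor() method, which hands back the class descriptor as it was written. RecordingInputStream, Settings, and suidInStream() are invented names, and a plain Serializable class is used to keep the sketch short; the descriptor mechanism is the same for Externalizable classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: record the SUIDs found in a stream while reading it.
public class StreamSuidDemo {

    static class Settings implements Serializable {
        private static final long serialVersionUID = 1L;
        int volume = 7;
    }

    // Remembers the SUID of every class descriptor read from the stream.
    static class RecordingInputStream extends ObjectInputStream {
        final Map<String, Long> streamSuids = new HashMap<>();

        RecordingInputStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected ObjectStreamClass readClassDescriptor()
                throws IOException, ClassNotFoundException {
            ObjectStreamClass descriptor = super.readClassDescriptor();
            streamSuids.put(descriptor.getName(), descriptor.getSerialVersionUID());
            return descriptor;
        }
    }

    // Serializes obj to memory, reads it back, and returns the SUID that
    // the stream carried for obj's class.
    static long suidInStream(Serializable obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        try (RecordingInputStream in =
                 new RecordingInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            in.readObject();
            return in.streamSuids.get(obj.getClass().getName());
        }
    }

    public static void main(String[] args) throws Exception {
        long fromStream = suidInStream(new Settings());
        long local = ObjectStreamClass.lookup(Settings.class).getSerialVersionUID();
        System.out.println(fromStream == local ? "compatible" : "mismatch");
    }
}
```

Note that the runtime makes its own comparison while resolving each descriptor, so with a genuinely mismatched stream readObject() would be expected to fail with an InvalidClassException before this code reached its own check. The recorded values are mainly useful for reporting or for choosing a fallback.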

If we are restoring a Serializable object, the runtime will check the SUID for us when it attempts to read values from the stream. If you define your own private readObject() method, that is the place to perform any additional version handling.

How do you determine what changes between class versions are acceptable? For an earlier version, which may contain fewer fields, trying to read a serialized object from a later version of the same class may cause problems. There is a tendency to add fields to a class as that class evolves, which means that the earlier version does not know about the newer fields. In contrast, since a newer version of a class may look for fields that are not present in the older version, it assigns default values to those fields.

This can be seen in the example code when we add a new field to the MyVersionObject class, but don't update the SUID. The new class can still read the older stream representation, even though no values exist in that stream for the new fields. It assigns 0 to the new int, and null to the new String, but doesn't throw any exceptions. If we then increment the SUID (from 1 to 2) to indicate that we do not consider older class versions compatible with this version, we throw an InvalidClassException when attempting to read a version 1 object from the stream.

Example

The following program illustrates how to use object serialization and deserialization. It begins by instantiating an object of class MyClass. This object has three instance variables, of types String, int, and double. This is the information we want to save and restore.

A FileOutputStream is created that refers to a file named "serial," and an ObjectOutputStream is created for that file stream. The writeObject() method of ObjectOutputStream is then used to serialize our object. The object output stream is flushed and closed.

A FileInputStream is then created that refers to the file named "serial," and an ObjectInputStream is created for that file stream. The readObject() method of ObjectInputStream is then used to deserialize our object. The object input stream is then closed.

Note that MyClass is defined to implement the Serializable interface. If this is not done, a NotSerializableException is thrown. Try experimenting with this program by declaring some of the MyClass instance variables to be transient. That data is then not saved during serialization.

SerializationDemo.java

import java.io.*;

public class SerializationDemo {
    public static void main(String[] args) {
        // Object serialization: write object1 to the file "serial"
        try {
            MyClass object1 = new MyClass("Hello", -7, 2.7e10);
            System.out.println("object1: " + object1);
            FileOutputStream fos = new FileOutputStream("serial");
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            oos.writeObject(object1);
            oos.flush();
            oos.close();
        } catch (Exception e) {
            System.out.println("Exception during serialization: " + e);
            System.exit(1); // non-zero status signals failure
        }

        // Object deserialization: read the object back from "serial"
        try {
            MyClass object2;
            FileInputStream fis = new FileInputStream("serial");
            ObjectInputStream ois = new ObjectInputStream(fis);
            object2 = (MyClass) ois.readObject();
            ois.close();
            System.out.println("object2: " + object2);
        } catch (Exception e) {
            System.out.println("Exception during deserialization: " + e);
            System.exit(1); // non-zero status signals failure
        }
    }
}


MyClass.java

import java.io.Serializable;

class MyClass implements Serializable {
    String s;
    int i;
    double d;

    public MyClass(String s, int i, double d) {
        this.s = s;
        this.i = i;
        this.d = d;
    }

    @Override
    public String toString() {
        return "s=" + s + "; i=" + i + "; d=" + d;
    }
}

This program demonstrates that the instance variables of object1 and object2 are
identical. The output is shown here:

object1: s=Hello; i=-7; d=2.7E10
object2: s=Hello; i=-7; d=2.7E10
