System.Yaml.Serialization.YamlSerializer class

Japanese version is here

The YamlSerializer class has the instance methods Serialize and Deserialize, which convert native C# objects to and from YAML text without any preparation.

What kind of objects can be serialized?

YamlSerializer can serialize / deserialize most native C# objects, i.e. primitive types (bool, char, int, ...), enums, built-in non-primitive types (string, decimal), structures, classes and arrays of these types.

It does not handle IntPtr (even though it is a primitive type) or pointer types (void*, int*, ...), because these types are, by their nature, not persistent.

Classes without a default constructor can be deserialized only when the way to activate an instance is explicitly specified with AddActivator<T>(Func<object>).

object obj = new object[]{ 
    null,
    "abc", 
    true, 
    1, 
    (Byte)1,
    1.0, 
    "1",
    new double[]{ 1.1, 2, -3 },
    new string[]{ "def", "ghi", "1" },
    new System.Drawing.Point(1,3), 
    new System.Drawing.SolidBrush(System.Drawing.Color.Blue)
};

var serializer = new YamlSerializer();
string yaml = serializer.Serialize(obj);
// %YAML 1.2
// ---
// - null
// - abc
// - True
// - 1
// - !System.Byte 1
// - !!float 1
// - "1"
// - !<!System.Double[]> [1.1, 2, -3]
// - !<!System.String[]>
//   - def
//   - ghi
//   - "1"
// - !System.Drawing.Point 1, 3
// - !System.Drawing.SolidBrush
//   Color: Blue
// ...

object restored;
try {
    restored = serializer.Deserialize(yaml)[0];
} catch(MissingMethodException) {
    // default constructor is missing for SolidBrush
}

// Let the library know how to activate an instance of SolidBrush.
YamlNode.DefaultConfig.AddActivator<System.Drawing.SolidBrush>(
    () => new System.Drawing.SolidBrush(System.Drawing.Color.Black /* dummy */));

// Then, all the objects can be restored correctly.
restored = serializer.Deserialize(yaml)[0];

A YAML document generated by YamlSerializer always has a %YAML directive and explicit document start ("---") and end ("...") marks. This allows several documents to be written in a single YAML stream.

var yaml = "";
var serializer = new YamlSerializer();
yaml += serializer.Serialize("a");
yaml += serializer.Serialize(1);
yaml += serializer.Serialize(1.1);
// %YAML 1.2
// ---
// a
// ...
// %YAML 1.2
// ---
// 1
// ...
// %YAML 1.2
// ---
// 1.1
// ...

object[] objects = serializer.Deserialize(yaml);
// objects[0] == "a"
// objects[1] == 1
// objects[2] == 1.1

Since a YAML stream can consist of multiple YAML documents, as above, Deserialize(...) returns an array of objects (object[]), one element per document.

Type mapping from native types to YAML's standard types

C# Type                      YAML Type/Tag
bool                         !!bool
int                          !!int
double                       !!float
string                       !!str, !!null
N/A                          !!binary
DateTime                     !!timestamp
object[]                     !!seq
Dictionary<object,object>    !!map


!!binary is not supported in a general way. Instead, an array of any value type can be serialized / deserialized in base64 encoding by applying YamlSerializeAttribute with YamlSerializeMethod.Binary.
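As a rough illustration of what base64 encoding of a value-type array involves, the sketch below round-trips an int[] through base64 text using only the .NET base class library. It is not the library's own code; the class name Base64ArrayDemo is made up for this example, and the exact byte layout YamlSerializer uses is an assumption here.

```csharp
using System;

// Hypothetical helper for illustration only; not part of YamlSerializer.
public static class Base64ArrayDemo
{
    // Copies the raw bytes of an int[] and encodes them as base64 text.
    public static string Encode(int[] values)
    {
        var bytes = new byte[values.Length * sizeof(int)];
        Buffer.BlockCopy(values, 0, bytes, 0, bytes.Length);
        return Convert.ToBase64String(bytes);
    }

    // Reverses Encode: decodes the base64 text and copies the bytes back into an int[].
    public static int[] Decode(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        var values = new int[bytes.Length / sizeof(int)];
        Buffer.BlockCopy(bytes, 0, values, 0, values.Length * sizeof(int));
        return values;
    }
}
```

Encoding the raw bytes keeps large numeric arrays compact in the YAML text, at the cost of making them unreadable to a human.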

Serializing structures and classes

For structures and classes, all public fields and public properties are serialized by default. Non-public members (protected, private and internal) are always ignored.

Serialization methods

Read-only value-type members are also ignored because there is no way to assign a new value to them on deserialization, while read-only class-type members are serialized. When deserializing, instead of creating a new object and assigning it to the member, the child members of the existing class instance are restored individually. This deserializing method is referred to as YamlSerializeMethod.Content.

On the other hand, when writable fields / properties are deserialized, new objects are created from the parameters in the YAML description and assigned to the fields / properties. This deserializing method is referred to as YamlSerializeMethod.Assign. A writable property can be explicitly marked to use YamlSerializeMethod.Content for deserialization by adding YamlSerializeAttribute to its definition.

Another serializing method is YamlSerializeMethod.Binary. It is applicable only to an array-type field / property whose elements are all of a value type.

If the serializing method YamlSerializeMethod.Never is specified, the member is neither serialized nor deserialized.

public class Test1
{
    public int PublicProp { get; set; }         // processed (by assign)
    protected int ProtectedProp { get; set; }           // Ignored
    private int PrivateProp { get; set; }               // Ignored
    internal int InternalProp { get; set; }             // Ignored

    public int PublicField;                     // processed (by assign)
    protected int ProtectedField;                       // Ignored
    private int PrivateField;                           // Ignored
    internal int InternalField;                         // Ignored

    public List<string> ClassPropByAssign // processed (by assign)
    { get; set; }

    public int ReadOnlyValueProp { get; private set; }  // Ignored
    public List<string> ReadOnlyClassProp // processed (by content)
    { get; private set; }

    [YamlSerialize(YamlSerializeMethod.Content)]
    public List<string> ClassPropByContent// processed (by content)
    { get; set; }

    public int[] IntArrayField =                // processed (by assign)
       new int[10];

    [YamlSerialize(YamlSerializeMethod.Binary)]
    public int[] IntArrayFieldBinary =          // processed (as binary)
       new int[100];

    [YamlSerialize(YamlSerializeMethod.Never)]
    public int PublicPropHidden;                        // Ignored

    public Test1()
    {
        ClassPropByAssign = new List<string>();
        ReadOnlyClassProp = new List<string>();
        ClassPropByContent = new List<string>();
    }
}

public void TestPropertiesAndFields1()
{
   var test1 = new Test1();
   test1.ClassPropByAssign.Add("abc");
   test1.ReadOnlyClassProp.Add("def");
   test1.ClassPropByContent.Add("ghi");
   var rand = new Random(0);
   for ( int i = 0; i < test1.IntArrayFieldBinary.Length; i++ )
       test1.IntArrayFieldBinary[i] = rand.Next();

   var serializer = new YamlSerializer();
   string yaml = serializer.Serialize(test1);
   // %YAML 1.2
   // ---
   // !YamlSerializerTest.Test1
   // PublicProp: 0
   // ClassPropByAssign: 
   //   Capacity: 4
   //   ICollection.Items: 
   //     - abc
   // ReadOnlyClassProp: 
   //   Capacity: 4
   //   ICollection.Items: 
   //     - def
   // ClassPropByContent: 
   //   Capacity: 4
   //   ICollection.Items: 
   //     - ghi
   // PublicField: 0
   // IntArrayField: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
   // IntArrayFieldBinary: |+2
   //   Gor1XAwenmhGkU5ib9NxR11LXxp1iYlH5LH4c9hImTitWSB9Z78II2UvXSXV99A79fj6UBn3GDzbIbd9
   //   yBDjAyslYm58iGd/NN+tVjuLRCg3cJBo+PWMbIWm9n4AEC0E7LKXWV5HXUNk7I13APEDWFMM/kWTz2EK
   //   s7LzFw2gBjpKugkmQJqIfinpQ1J1yqhhz/XjA3TBxDBsEuwrD+SNevQSqEC+/KRbwgE6D011ACMeyRt0
   //   BOG6ZesRKCtL0YU6tSnLEpgKVBz+R300qD3/W0aZVk+1vHU+auzyGCGUaHCGd6dpRoEhXoIg2m3+AwJX
   //   EJ37T+TA9BuEPJtyGoq+crQMFQtXj1Zriz3HFbReclLvDdVpZlcOHPga/3+3Y509EHZ7UyT7H1xGeJxn
   //   eXPrDDb0Ul04MfZb4UYREOfR3HNzNTUYGRsIPUvHOEW7AaoplIfkVQp19DvGBrBqlP2TZ9atlWUHVdth
   //   7lIBeIh0wiXxoOpCbQ7qVP9GkioQUrMkOcAJaad3exyZaOsXxznFCA==
   // ...
}

Default values of fields and properties

YamlSerializer is aware of System.ComponentModel.DefaultValueAttribute. When a member of a structure / class instance has a value equal to its declared default value, the member is not written in the YAML text.

YamlSerializer also checks the result of a ShouldSerializeXXX method. For instance, just before serializing a property named Font, the method bool ShouldSerializeFont() is called if it exists. If the method returns false, the Font property is not written in the YAML text. The ShouldSerializeXXX method can be non-public.

using System.ComponentModel;

public class Test2
{
    [DefaultValue(0)]
    public int Default0 = 0;

    [DefaultValue("a")]
    public string Defaulta = "a";

    public int DynamicDefault = 0;

    bool ShouldSerializeDynamicDefault()
    {
        return Default0 != DynamicDefault;
    }
}

public void TestDefaultValue()
{
    var test2 = new Test2();
    var serializer = new YamlSerializer();

    // All members have default values.
    var yaml = serializer.Serialize(test2);
    // %YAML 1.2
    // ---
    // !YamlSerializerTest.Test2 {}
    // ...


    test2.Defaulta = "b";
    yaml = serializer.Serialize(test2);
    // %YAML 1.2
    // ---
    // !YamlSerializerTest.Test2
    // Defaulta: b
    // ...

    test2.Defaulta = "a";
    yaml = serializer.Serialize(test2);
    // %YAML 1.2
    // ---
    // !YamlSerializerTest.Test2 {}
    // ...

    test2.DynamicDefault = 1;
    yaml = serializer.Serialize(test2);
    // %YAML 1.2
    // ---
    // !YamlSerializerTest.Test2
    // DynamicDefault: 1
    // ...

    test2.Default0 = 1;
    yaml = serializer.Serialize(test2);
    // %YAML 1.2
    // ---
    // !YamlSerializerTest.Test2
    // Default0: 1
    // ...
}
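The two skip rules above can be pictured with a small reflection sketch. This is not YamlSerializer's implementation; Sample, SkipRuleDemo and ShouldWrite are hypothetical names, and the order in which the rules are checked is an assumption. It only shows how DefaultValueAttribute and a non-public ShouldSerializeXXX method can be discovered at run time.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

// Hypothetical sample type, modeled on Test2 above.
public class Sample
{
    [DefaultValue(0)]
    public int Default0 = 0;

    public int DynamicDefault = 0;

    // Non-public on purpose; it is found through reflection below.
    bool ShouldSerializeDynamicDefault()
    {
        return Default0 != DynamicDefault;
    }
}

// Hypothetical helper for illustration only; not part of YamlSerializer.
public static class SkipRuleDemo
{
    // Returns true when the named public field would be written under the two rules.
    public static bool ShouldWrite(object target, string fieldName)
    {
        Type type = target.GetType();
        FieldInfo field = type.GetField(fieldName);
        if (field == null)
            throw new ArgumentException("No public field named " + fieldName);
        object value = field.GetValue(target);

        // Rule 1: skip when the value equals the declared default value.
        var attr = field.GetCustomAttribute<DefaultValueAttribute>();
        if (attr != null && object.Equals(attr.Value, value))
            return false;

        // Rule 2: skip when a parameterless bool ShouldSerializeXXX() returns false.
        MethodInfo method = type.GetMethod(
            "ShouldSerialize" + fieldName,
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic,
            null, Type.EmptyTypes, null);
        if (method != null && method.ReturnType == typeof(bool))
            return (bool)method.Invoke(target, null);

        return true;
    }
}
```

With a fresh Sample instance, ShouldWrite returns false for both fields, which corresponds to the empty mapping `{}` shown in the first output of the example above.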

TypeConverter

If a type has a TypeConverterAttribute, serialization is done through that type converter.

So, when an instance of System.Drawing.Point is serialized, the result is not like !System.Drawing.Point { x: 1, y: 2 } but like !System.Drawing.Point 1, 2.
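The compact form comes from the converter registered for the type. The sketch below uses only standard .NET calls (TypeDescriptor.GetConverter and the invariant-string conversion methods) to show the text a Point converter produces and accepts. TypeConverterDemo is a hypothetical name, and that YamlSerializer calls the converter in exactly this way is an assumption.

```csharp
using System.ComponentModel;
using System.Drawing;

// Hypothetical helper for illustration only; not part of YamlSerializer.
public static class TypeConverterDemo
{
    // Converts a Point to text through the converter registered for the type.
    public static string PointToText(Point p)
    {
        TypeConverter converter = TypeDescriptor.GetConverter(typeof(Point));
        return converter.ConvertToInvariantString(p);
    }

    // Converts text back to a Point through the same converter.
    public static Point TextToPoint(string text)
    {
        TypeConverter converter = TypeDescriptor.GetConverter(typeof(Point));
        return (Point)converter.ConvertFromInvariantString(text);
    }
}
```

Calling TypeConverterDemo.PointToText(new Point(1, 2)) yields the same "1, 2" text that appears after the !System.Drawing.Point tag.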

Culture

YamlSerializer always uses System.Globalization.CultureInfo.InvariantCulture when converting native objects to and from their string representations. So even when System.Globalization.CultureInfo.CurrentCulture would write the floating-point value 1.234 as 1,234, YamlSerializer writes it in the form the YAML standard expects (1.234).
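The difference is easy to reproduce with plain .NET formatting calls. The sketch below builds a number format with a decimal comma (standing in for a culture that writes 1,234) and compares it with InvariantCulture. CultureDemo and its members are hypothetical names used only for this illustration.

```csharp
using System.Globalization;

// Hypothetical helper for illustration only; not part of YamlSerializer.
public static class CultureDemo
{
    // A number format that uses a decimal comma, as many European cultures do.
    public static NumberFormatInfo DecimalCommaFormat()
    {
        var format = (NumberFormatInfo)CultureInfo.InvariantCulture.NumberFormat.Clone();
        format.NumberDecimalSeparator = ",";
        return format;
    }

    // Culture-dependent formatting: the decimal separator follows the given format.
    public static string FormatWithDecimalComma(double value)
    {
        return value.ToString(DecimalCommaFormat());
    }

    // Culture-independent formatting and parsing.
    public static string FormatInvariant(double value)
    {
        return value.ToString(CultureInfo.InvariantCulture);
    }

    public static double ParseInvariant(string text)
    {
        return double.Parse(text, CultureInfo.InvariantCulture);
    }
}
```

Text produced by FormatInvariant can be read back with ParseInvariant on any machine, whatever the current culture is.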

Collection classes

If an object implements ICollection<T>, IList or IDictionary, its child objects are serialized along with its other public members. A pseudo-property, ICollection.Items or IDictionary.Entries, appears to hold the child objects.

Multiple appearances of the same object

YamlSerializer preserves the graph structure of C# objects. When the same object is referred to from several points in the object graph, the structure is correctly described in the YAML text and the restored objects preserve it. YamlSerializer can also safely handle objects that refer to themselves, directly or indirectly.

public class TestClass
{
    public List<TestClass> list = 
        new List<TestClass>();
}

public class ChildClass: TestClass
{
}

void RecursiveObjectsTest()
{
    var a = new TestClass();
    var b = new ChildClass();
    a.list.Add(a);
    a.list.Add(a);
    a.list.Add(b);
    a.list.Add(a);
    a.list.Add(b);
    b.list.Add(a);
    var serializer = new YamlSerializer();
    string yaml = serializer.Serialize(a);
    // %YAML 1.2
    // ---
    // &A !TestClass
    // list: 
    //   Capacity: 8
    //   ICollection.Items: 
    //     - *A
    //     - *A
    //     - &B !ChildClass
    //       list: 
    //         Capacity: 4
    //         ICollection.Items: 
    //           - *A
    //     - *A
    //     - *B
    // ...

    var restored = (TestClass)serializer.Deserialize(yaml)[0];
    Assert.IsTrue(restored == restored.list[0]);
    Assert.IsTrue(restored == restored.list[1]);
    Assert.IsTrue(restored == restored.list[3]);
    Assert.IsTrue(restored == restored.list[2].list[0]);
    Assert.IsTrue(restored.list[2] == restored.list[4]);
}

Strings are an exception. The same string instance is written repeatedly in the YAML text and restored as separate string instances when deserialized, unless the string is extremely long (longer than 999 characters).

// 1000 chars
string long_str =
    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789" +
    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789" +
    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789" +
    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789" +
    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789" +
    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789" +
    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789" +
    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789" +
    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789" +
    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789";
string short_str = "12345";
object obj = new object[] { long_str, long_str, short_str, short_str };
var serializer = new YamlSerializer();
string yaml = serializer.Serialize(obj);
// %YAML 1.2
// ---
// - &A 01234567890123456789012345678901234567890123456789 ... (snip) ... 789
// - *A
// - "12345"
// - "12345"
// ...

YAML text written / read by YamlSerializer

When serializing, YamlSerializer intelligently uses various YAML 1.2 styles, namely the block style, flow style, explicit mapping and implicit mapping, to maximize readability of the YAML stream.

[Flags]
enum TestEnum: uint 
{ 
    abc = 1, 
    あいう = 2 
} 

public void TestVariousFormats()
{
    var dict = new Dictionary<object, object>();
    dict.Add(new object[] { 1, "a" }, new object());
    object obj = new object[]{
        dict,
        null,
        "abc",
        "1",
        "a ",
        "- a",
        "abc\n", 
        "abc\ndef\n", 
        "abc\ndef\n  ghi", 
        new double[]{ 1.1, 2, -3, 3.12, 13.2 },
        new int[,] { { 1, 3}, {4, 5}, {10, 1} },
        new string[]{ "jkl", "mno\npqr" },
        new System.Drawing.Point(1,3),
        TestEnum.abc,
        TestEnum.abc | TestEnum.あいう,
    };
    var config = new YamlConfig();
    config.ExplicitlyPreserveLineBreaks = false;
    var serializer = new YamlSerializer(config);
    string yaml = serializer.Serialize(obj);

    // %YAML 1.2
    // ---
    // - !<!System.Collections.Generic.Dictionary%602[[System.Object,...],[System.Object,...]]>
    //   IDictionary.Entries: 
    //     ? - 1
    //       - a
    //     : !System.Object {}
    // - null
    // - abc
    // - "1"
    // - "a "
    // - "- a"
    // - "abc\n"
    // - |+2
    //   abc
    //   def
    // - |-2
    //   abc
    //   def
    //     ghi
    // - !<!System.Double[]> [1.1, 2, -3, 3.12, 13.2]
    // - !<!System.Int32[,]> [[1, 3], [4, 5], [10, 1]]
    // - !<!System.String[]>
    //   - jkl
    //   - |-2
    //     mno
    //     pqr
    // - !System.Drawing.Point 1, 3
    // - !TestEnum abc
    // - !TestEnum abc, あいう
    // ...
}

When deserializing, YamlSerializer accepts any valid YAML 1.2 document. TAG directives, comments, flow / block styles and implicit / explicit mappings can be used freely to express valid C# objects. For example, the members of an array can be given either in flow style or in block style.

Line breaks in YAML text

By default, YamlSerializer outputs a YAML stream with "\r\n" line breaks. This can be customized either by setting YamlNode.DefaultConfig.LineBreakForOutput or by passing a customized instance of YamlConfig to the YamlSerializer constructor.

var serializer = new YamlSerializer();
var yaml = serializer.Serialize("abc");
// %YAML 1.2\r\n    // line breaks are explicitly shown in this example
// ---\r\n
// abc\r\n
// ...\r\n

// Customized configuration
var config = new YamlConfig();
config.LineBreakForOutput = "\n";
serializer = new YamlSerializer(config);
yaml = serializer.Serialize("abc");
// %YAML 1.2\n
// ---\n
// abc\n
// ...\n

// Change the default behavior
YamlNode.DefaultConfig.LineBreakForOutput = "\n";

serializer = new YamlSerializer();
yaml = serializer.Serialize("abc");
// %YAML 1.2\n
// ---\n
// abc\n
// ...\n

By default, line breaks in multiline values are written explicitly in escaped form. Although this makes the text harder to read, it is necessary to preserve the exact content of the data, because the YAML specification requires a parser to normalize every unescaped line break in a YAML document to a single line feed "\n" when deserializing.

To make YAML documents easier to read, set YamlConfig.ExplicitlyPreserveLineBreaks to false. Multiline values are then written in literal style.

As a consequence, all line breaks are normalized to single line feeds "\n" when deserialized.

var serializer = new YamlSerializer();
var text = "abc\r\n  def\r\nghi\r\n";
// abc
//   def
// ghi

// By default, line breaks explicitly appear in escaped form.
var yaml = serializer.Serialize(text);
// %YAML 1.2
// ---
// "abc\r\n\
// \  def\r\n\
// ghi\r\n"
// ...

// Original line breaks are preserved
var restored = (string)serializer.Deserialize(yaml)[0];
// "abc\r\n  def\r\nghi\r\n"


YamlNode.DefaultConfig.ExplicitlyPreserveLineBreaks = false;

// Literal style is easier to be read.
yaml = serializer.Serialize(text);
// %YAML 1.2
// ---
// |+2
//   abc
//     def
//   ghi
// ...

// Original line breaks are lost.
restored = (string)serializer.Deserialize(yaml)[0];
// "abc\n  def\nghi\n"

This library offers two workarounds for this problem, although both violate the parser behavior defined in the YAML specification.

One is to set YamlConfig.LineBreakForInput to "\r\n". The YAML parser then normalizes all line breaks to "\r\n" instead of "\n".

The other is to set YamlConfig.NormalizeLineBreaks to false. This disables line break normalization on both output and input, so line breaks are written and read as-is when serializing / deserializing.

var serializer = new YamlSerializer();

// text with mixed line breaks
var text = "abc\r  def\nghi\r\n"; 
// abc\r        // line breaks are explicitly shown in this example
//   def\n
// ghi\r\n

YamlNode.DefaultConfig.ExplicitlyPreserveLineBreaks = false;

// By default, all line breaks are normalized to "\r\n" when serialized.
var yaml = serializer.Serialize(text);
// %YAML 1.2\r\n
// ---\r\n
// |+2\r\n
//   abc\r\n
//     def\r\n
//   ghi\r\n
// ...\r\n

// When deserialized, line breaks are normalized into "\n".
var restored = (string)serializer.Deserialize(yaml)[0];
// "abc\n  def\nghi\n"

// Line breaks are normalized into "\r\n" instead of "\n" when deserializing.
YamlNode.DefaultConfig.LineBreakForInput = "\r\n";
restored = (string)serializer.Deserialize(yaml)[0];
// "abc\r\n  def\r\nghi\r\n"

// Line breaks are written as is,
YamlNode.DefaultConfig.NormalizeLineBreaks = false;
yaml = serializer.Serialize(text);
// %YAML 1.2\r\n
// ---\r\n
// |+2\r\n
//   abc\r
//     def\n
//   ghi\r\n
// ...\r\n

// and are read as is.
restored = (string)serializer.Deserialize(yaml)[0];
// "abc\r  def\nghi\r\n"

// Note that when the line breaks of the YAML stream are changed
// between serialization and deserialization, the original
// line breaks are lost.
yaml = yaml.Replace("\r\n", "\n").Replace("\r", "\n");
restored = (string)serializer.Deserialize(yaml)[0];
// "abc\n  def\nghi\n"
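The effect of line break normalization on input can be mimicked with plain string operations. The sketch below is not the parser's code; LineBreakDemo and Normalize are hypothetical names. Passing "\n" corresponds to the behavior required by the specification, passing "\r\n" corresponds to LineBreakForInput = "\r\n", and not calling Normalize at all corresponds to NormalizeLineBreaks = false.

```csharp
// Hypothetical helper for illustration only; not part of YamlSerializer.
public static class LineBreakDemo
{
    // Replaces every "\r\n", lone "\r" and lone "\n" with the given line break.
    public static string Normalize(string text, string lineBreak)
    {
        return text
            .Replace("\r\n", "\n")   // collapse CR LF pairs first
            .Replace("\r", "\n")     // then any remaining lone CR
            .Replace("\n", lineBreak);
    }
}
```

For the mixed-line-break text used above, Normalize(text, "\n") gives "abc\n  def\nghi\n" and Normalize(text, "\r\n") gives "abc\r\n  def\r\nghi\r\n".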

To repeat: although these two options are useful in many situations, they make the YAML parser violate the YAML specification.

Last edited Oct 15, 2009 at 2:43 PM by osamu, version 26
