fastJSON

Preface

The code is now on :

Introduction

This is the smallest and fastest polymorphic JSON serializer: smallest because it's only 25kb when compiled, fastest because most of the time it is (see the performance tests section), and polymorphic because it can serialize and deserialize the following situation correctly at run-time with whatever object you throw at it:
class animal { public string Name { get; set;} }
class cat: animal { public int legs { get; set;} }
class dog : animal { public bool tail { get; set;} }
class zoo { public List<animal> animals { get; set;} }

var zoo1 = new zoo();

zoo1.animals = new List<animal>();
zoo1.animals.Add(new cat());
zoo1.animals.Add(new dog());
This is a very important point because it simplifies your coding immensely and is a cornerstone of object oriented programming. Strangely, few serializers handle this situation; even the XmlSerializer in .NET doesn't do this and you have to jump through hoops to get it to work. It is also a must if you want to replace the BinaryFormatter serializer, which is what most transport protocols use in applications and which can handle any .NET object structure (see my WCF Killer article).

The What and Why of JSON

JSON (JavaScript Object Notation) is a human readable text format invented by Douglas Crockford around 1999, primarily as a data exchange format for web applications (see www.JSON.org). Its benefits (compared to XML, which was used before) are:
  • Structured data format like XML
  • High signal to noise ratio, in other words it does away with extra characters which are not conducive to the data (angle brackets and slashes in XML)
  • Compact data format
  • Simple parsing rules which makes the processing of data easy and fast
So it's good for the following scenarios:
  • Data exchange between same or different platforms like Java, .NET services over the wire.
  • Data storage: MongoDB (www.mongodb.org) uses JSON as an internal storage format.

Features of this implementation

  • Just 3 classes + 2 helpers : 1158 lines of code
  • JSON standard compliant with the following additions
    • "$type" is used to denote object type information [ Json.NET does this as well ].
    • "$schema" is used to denote the dataset schema information
    • "$map" is used for post processing runtime types when assigned to the object type.
    • "$types" is used for global type definition where the instances reference this dictionary of types via a number ( reduces JSON size for large number of embedded types)
  • Works on .NET 2.0+ : some implementations in the list of alternatives below require at least .NET 3.5
  • Extremely small size : 25kb when compiled
  • Blazingly fast (see the performance tests section)
  • Can dynamically create types
  • Handles Guid, Dataset, Dictionary, Hashtable and Generic lists
  • Handles Nullable types
  • Handles byte arrays as base64 strings
  • Handles polymorphic collections of objects
  • Thread safe
  • Handles value type arrays (e.g. int[] char[] etc.)
  • Handles value type generic lists (e.g. List<int> etc.)
  • Handles special case List<object[]> (useful for bulk data transfer)
  • Handles Embedded Classes (e.g. Sales.Customer)
  • Handles polymorphic object type deserialized to original type (e.g. object ReturnEntity = Guid, DataSet, valuetype, new object[] { object1, object2 } ) [needed for wire communications].
  • Ability to disable extensions when serializing for the JSON purists (e.g. no $type, $map in the output).
  • Ability to deserialize standard JSON into a type you give to the deserializer, no polymorphism is guaranteed.
  • Special case optimized output for Dictionary<string,string>.
  • Override null value outputs.
  • Handles XmlIgnore attributes on properties.
  • Datatable support.
  • Indented JSON output via IndentOutput property.
  • Support for Silverlight 4.0+.
  • RegisterCustomType() for user defined and non-standard types that are not built into fastJSON (like TimeSpan, Point, etc.).
    • This feature must be enabled via the CUSTOMTYPE compiler directive as there is about a 1% performance hit.
    • You supply the serializer and deserializer routines as delegates.
  • Added support for public Fields.
  • Added ShowReadOnlyProperties to control the output of readonly properties (default is false = won't be outputted).
  • Automatic UTC datetime conversion if the date ends in "Z" (JSON standard compliant now).
  • Added UseUTCDateTime property to control the output of UTC datetimes.
  • Dictionary<string, > are now stored optimally not in K V format.
  • Support for Anonymous Types in the serializer (deserializer is not possible at the moment)
  • Support for dynamic types
  • Support for circular references in object structures
  • Support for multi dimensional arrays i.e. int[][]

Limitations

  • Currently can't deserialize value type array properties (e.g. int[] char[] etc.)
  • Currently can't handle multi dimensional arrays.
  • Silverlight 4.0+ support lacks Hashtable, DataSet, DataTable as they are not part of the runtime.

What's out there

In this section I will discuss some of the JSON alternatives that I have personally used. Although I can't say it is a comprehensive list, it does however showcase the best of what is out there.

XML

If you are using XML, then don't. It's too slow and bloated. It does deserve an honorable mention as being the first thing everyone uses, but seriously, don't. It's about 50 times slower than the slowest JSON serializer in this list. The upside is that you can convert to and from JSON easily.

BinaryFormatter

Probably the most robust format for computer to computer data transfer. It has pretty good performance, although some implementations here beat it.
Pros:
  • Can handle anything with a Serializable attribute on it
  • Pretty compact output
Cons:
  • Version unfriendly : must be deserialized into the exact class that was serialized
  • Not good for storing of data because of the versioning problem
  • Not human readable
  • Not for communication outside of the same platform (e.g. both sides must be .NET)

Json.NET

The most referenced JSON serializer for the .NET framework is Json.NET from (http://JSON.codeplex.com/) and the blog site (http://james.newtonking.com/pages/JSON-net.aspx). It was the first JSON implementation I used in my own applications.
Pros:
  • Robust output which can handle datasets
  • First implementation I saw which could handle polymorphic object collections
Cons:
  • Large dll size ~320kb
  • Slow in comparison to the rest in the list
  • Source code is hard to follow as it is large

LitJSON

I had to look around a lot to find this gem (http://litjson.sourceforge.NET/), which is still at version 0.5 since 2007. This was what I was using before my own implementation and it replaced the previous JSON serializer which was Json.NET. Admittedly I had to change the original to fit the requirements stated above.
Pros:
  • Can do all that Json.NET does (after my changes).
  • Small dll size ~57kb
  • Relatively fast
Cons:
  • Didn't handle datasets in the original source code (I wrote it myself afterwards in my own application)
  • The lexer class is difficult to follow
  • Requires .NET 3.5 (I got around this limitation by implementing a Linqbridge class which works with .NET 2.0)

ServiceStack Serializer

An amazingly fast JSON serializer from Demis Bellot found at (http://www.servicestack.NET/mythz_blog/?p=344). The serializer speed is astonishing, although it does not support what is needed from the serializer. I have included it here as a measure of performance.
Pros:
  • Amazingly fast serializer
  • Pretty small dll size ~91kb
Cons:
  • Can't handle polymorphic object collections
  • Requires at least .NET 3.5
  • Fails on Nullable types
  • Fails on Datasets
  • Fails on other "exotic" types like dictionaries, hash tables etc.

Microsoft Json Serializer (v1.7 update)

By popular demand, and thanks to everyone who pointed out my previous ignorance about the Microsoft JSON implementation, I have added it here.
Pros:
  • Included in the framework
  • Can serialize basic polymorphic objects
Cons:
  • Can't deserialize polymorphic objects
  • Fails on Datasets
  • Fails on other "exotic" types like dictionaries, hash tables etc.
  • 4x slower than fastJSON in serialization

Using the code

To use the code do the following:
// to serialize an object to string
string jsonText = fastJSON.JSON.Instance.ToJSON(c);

// to deserialize a string to an object
var newobj = fastJSON.JSON.Instance.ToObject(jsonText);
The main class is JSON which is implemented as a singleton so it can cache type and property information for speed.

Additions in v1.7.5

// you can set the defaults for the Instance which will be used for all calls
JSON.Instance.UseOptimizedDatasetSchema = true; // you can control the serializer dataset schema
JSON.Instance.UseFastGuid = true;               // enable/disable fast GUID serialization
JSON.Instance.UseSerializerExtension = true;    // enable/disable the $type and $map in the output

// you can do the same as the above on a per call basis
public string ToJSON(object obj, bool enableSerializerExtensions)
public string ToJSON(object obj, bool enableSerializerExtensions, bool enableFastGuid)
public string ToJSON(object obj, bool enableSerializerExtensions, bool enableFastGuid, bool enableOptimizedDatasetSchema)

// Parse will give you a Dictionary<string,object> with ArrayList representation of the JSON input
public object Parse(string json)

// if you have disabled extensions or are getting JSON from other sources then you must specify
// the deserialization type in one of the following ways
public T ToObject<T>(string json)
public object ToObject(string json, Type type)

Additions v1.7.6

JSON.Instance.SerializeNullValues = true;    // enable/disable the output of null values

public string ToJSON(object obj, bool enableSerializerExtensions, bool enableFastGuid, bool enableOptimizedDatasetSchema, bool serializeNulls)

Additions v1.8

For all those who asked why there is no support for type "X", I have implemented an open/closed principle extension to fastJSON which allows you to implement your own routines for unsupported types without going through the code.
To allow this extension you must compile with the CUSTOMTYPE compiler directive as there is a performance hit associated with it.
public void main()
{
     fastJSON.JSON.Instance.RegisterCustomType(typeof(TimeSpan), tsser, tsdes);
     // do some work as normal
}

private static string tsser(object data)
{
     return ((TimeSpan)data).Ticks.ToString();
}

private static object tsdes(string data)
{
     return new TimeSpan(long.Parse(data));
}

Performance Tests

All tests were run on the following computer:
  • AMD K625 1.5Ghz Processor
  • 4Gb Ram DDR2
  • Windows 7 Home Premium 64bit
  • Windows Rating of 3.9
The tests were conducted under three different .NET compilation versions
  • .NET 3.5
  • .NET 4 with processor type set to auto
  • .NET 4 with processor type set to x86
The Excel screenshots below are the results of these tests, with the following descriptions:
  • The numbers are elapsed time in milliseconds.
  • The more red the background the slower the times
  • The more green the background the faster the times.
  • 5 tests were conducted for each serializer.
  • The "AVG" column is the average of the last 4 tests. It excludes the first test, which is basically the serializer setting up its internal caching structures, so its times are off.
  • The "min" row is the minimum of the numbers in the respective columns below.
  • The Json.NET serializer was tested with two versions: 3.5r6 and 4.0r1, which is the current one.
  • "bin" is the BinaryFormatter tests, which are included for reference.
  • The test structure is the code below, which is a 5 time loop with an inner processing of 1000 objects.
  • Some data types were removed from the test data structure so all serializers could work.

The test code template

The following is the basic test code template. As you can see, it is a loop of 5 tests of what we want to test, each repeated count times (1000 times). The elapsed time is written out to the console with tab formatting so you can pipe it to a file for easier viewing in an Excel spreadsheet.
static int count = 1000; // static so the static test method below can read it
private static void fastjson_serialize()
{
    Console.WriteLine();
    Console.Write("fastjson serialize");
    for (int tests = 0; tests < 5; tests++)
    {
        DateTime st = DateTime.Now;
        colclass c;
        string jsonText = null;
        c = CreateObject();
        for (int i = 0; i < count; i++)
        {
            jsonText = fastJSON.JSON.Instance.ToJSON(c);
        }
        Console.Write("\t" + DateTime.Now.Subtract(st).TotalMilliseconds + "\t");
    }
}

The test data structure

The test data are the following classes which show the polymorphic nature we want to test. The "colclass" is a collection of these data structures. In the attached source files more exotic data structures like Hashtables, Dictionaries, Datasets etc. are included.
[Serializable()]
public class baseclass
{
    public string Name { get; set; }
    public string Code { get; set; }
}

[Serializable()]
public class class1 : baseclass
{
    public Guid guid { get; set; }
}

[Serializable()]
public class class2 : baseclass
{
    public string description { get; set; }
}

[Serializable()]
public class colclass
{
    public colclass()
    {
        items = new List<baseclass>();
        date = DateTime.Now;
        multilineString = @"
        AJKLjaskljLA
   ahjksjkAHJKS
   AJKHSKJhaksjhAHSJKa
   AJKSHajkhsjkHKSJKash
   ASJKhasjkKASJKahsjk
        ";
        gggg = Guid.NewGuid();
        //hash = new Hashtable();
        isNew = true;
        done= true;
    }
    public bool done { get; set; }
    public DateTime date {get; set;}
    //public DataSet ds { get; set; }
    public string multilineString { get; set; }
    public List<baseclass> items { get; set; }
    public Guid gggg {get; set;}
    public decimal? dec {get; set;}
    public bool isNew { get; set; }
    //public Hashtable hash { get; set; }

} 

.NET 3.5 Serialize

  • fastJSON takes second place in this test, nearly 35% slower than Stacks.
  • fastJSON is nearly 2.9x faster than binary formatter.
  • Json.NET is nearly 1.9x slower in the new version 4.0r1 against its previous version of 3.5r6
  • Json.NET v3.5r6 is nearly 20% faster than binary formatter.

.NET 3.5 Deserialize

  • fastJSON takes first place in this test, ahead of Stacks by a margin of 10%.
  • fastJSON is nearly 4x faster than nearest other JSON.
  • Json.NET is nearly 1.5x faster in version 4.0r1 than its previous version of 3.5r6

.NET 4 Auto Serialize

  • fastJSON is first place in this test by a margin of nearly 20% against Stacks.
  • fastJSON is nearly 4.9x faster than binary formatter.
  • Json.NET v3.5r6 is on par with binary formatter.

.NET 4 Auto Deserialize

  • fastJSON is first place by a margin of 11%.
  • fastJSON is 1.7x faster than binary formatter.
  • Json.NET v4 is 1.5x faster than its previous version.

.NET 4 x86 Serialize

  • fastJSON is first place in this test by a margin of nearly 21% against Stacks.
  • fastJSON is 4x faster than binary formatter.
  • Json.NET v3.5r6 is 1.7x faster than the previous version.

.NET 4 x86 Deserialize

  • fastJSON is first place by a margin of 5% against Stacks.
  • fastJSON is 1.7x faster than binary formatter which is third.

Exotic data type tests

In this section we will see the performance results for exotic data types like datasets, hash tables, dictionaries, etc. The comparison is between fastJSON and the BinaryFormatter, as most of the other serializers can't handle these data types. These include the following:
  • Datasets
  • Nullable types
  • Hashtables
  • Dictionaries
fastJSON/exotic.png
  • fastJSON is 5x faster than BinaryFormatter in serialization
  • fastJSON is 20% faster than BinaryFormatter in deserialization
  • Datasets are performance killers by a factor of 10

Performance Conclusions

  • fastJSON is faster in all tests except when running the serializer under .NET 3.5, for which Stacks is faster by only 35% (note that Stacks is not polymorphic and can't handle all types, so it is not outputting data correctly within the tests).
  • .NET 4 is faster than .NET 3.5 by around 15% in these tests, except for the fastJSON serializer, which is 90% faster.
  • You can replace BinaryFormatter with fastJSON for a huge performance boost (this lean format also lends itself to compression techniques on the text output).
  • Start-up for fastJSON is on average 2x faster than Stacks and consistently faster than everyone else.

Performance Conclusions v1.4

fastJSON/v1.4.png
As you can see from the above picture, v1.4 is noticeably faster. The speed boost makes fastJSON faster than ServiceStack in all tests, even on .NET 3.5.
  • fastJSON serializer is 6.7x faster than binary with a dataset.
  • fastJSON deserializer is 2.1x faster than binary with a dataset.
  • fastJSON serializer is 6.9x faster than binary without a dataset.
  • fastJSON deserializer is 1.6x faster than binary without a dataset.

Performance Conclusions v1.5

fastJSON/v1.5.png
  • The numbers speak for themselves: the fastJSON serializer is 6.65x faster without dataset and 6.88x faster than binary, and the deserializer is 2.7x faster than binary.
  • The numbers in v1.5 are slower than in v1.4 because of extra properties in the test for Enums etc.

Performance Conclusions v1.6

fastJSON/v1.6.png
  • Guids are 2x faster now with base64 encoding. You can revert back to the old style by setting UseFastGuid = false on the JSON.Instance.
  • Datasets are ~40% smaller and ~35% faster.
  • fastJSON serializer is now ~2.3x faster than deserializer and the limit seems to be 2x.
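
To give a feel for the base64 Guid idea, here is a minimal sketch using only framework calls. It is an assumption about the general technique, not the actual fastJSON source, and the GuidEncoding class name is made up for illustration.

```csharp
using System;

// Hypothetical helper (not part of fastJSON): encodes a Guid as base64 text.
public static class GuidEncoding
{
    // 16 raw bytes become 24 base64 characters, versus 36 for Guid.ToString().
    public static string ToBase64(Guid g)
    {
        return Convert.ToBase64String(g.ToByteArray());
    }

    // Reverses ToBase64 by decoding the bytes and rebuilding the Guid.
    public static Guid FromBase64(string s)
    {
        return new Guid(Convert.FromBase64String(s));
    }
}
```

The shorter text means less to write and less to parse, which is one plausible reason for the speed up.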

Performance Conclusions v1.7

fastJSON/v1.7.png
  • int, long parse are 4x faster.
  • Unicode string optimizations: reading and writing non-English strings is faster.
  • ChangeType method optimized
  • Dictionary optimized using TryGetValue
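
The TryGetValue point above can be sketched as follows. This is a generic illustration of the pattern, not the fastJSON code itself, and the LookupSketch class and method names are hypothetical.

```csharp
using System.Collections.Generic;

// Hypothetical illustration of the two lookup styles.
public static class LookupSketch
{
    // ContainsKey followed by the indexer hashes and searches the key twice.
    public static string TwoLookups(Dictionary<string, string> map, string key)
    {
        if (map.ContainsKey(key))
            return map[key];
        return null;
    }

    // TryGetValue does the same work with a single lookup.
    public static string OneLookup(Dictionary<string, string> map, string key)
    {
        string value;
        return map.TryGetValue(key, out value) ? value : null;
    }
}
```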

Points of Interest

I did a lot of performance tuning with a profiler and here are my results:
  • Always use a StringBuilder and never strings concats.
  • Never do the following: stringbuilder.Append("string1" + "string2"), because it kills performance. Replace it with two stringbuilder appends. This point blew my mind and was 50% faster in my tests with the profiler.
  • Never give the stringbuilder a capacity value to start with, e.g. var stringbuilder = new StringBuilder(4096); . Strange, but it is faster without it.
  • I tried replacing the StringBuilder with a MemoryStream but it was too slow (100% slower).
  • The simplest and the most direct way is probably the fastest as well, case in point reading values as opposed to lexer parser implementations.
  • Always use cached reflection properties on objects.
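
Two of the points above (separate appends and cached reflection) can be sketched together. This is a simplified illustration under my own assumptions, not the fastJSON source: the CachedPropertyWriter and Pet names are hypothetical, every value is written as a quoted string, and no escaping is done.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Text;

// Hypothetical sample type used only for this sketch.
public class Pet
{
    public string Name { get; set; }
}

public static class CachedPropertyWriter
{
    // Reflection results are looked up once per type and reused afterwards.
    private static readonly Dictionary<Type, PropertyInfo[]> cache =
        new Dictionary<Type, PropertyInfo[]>();
    private static readonly object cacheLock = new object();

    private static PropertyInfo[] GetProperties(Type type)
    {
        lock (cacheLock)
        {
            PropertyInfo[] props;
            if (!cache.TryGetValue(type, out props))
            {
                props = type.GetProperties(BindingFlags.Public | BindingFlags.Instance);
                cache.Add(type, props);
            }
            return props;
        }
    }

    // Writes each property with separate Append calls instead of
    // building intermediate strings with the + operator.
    public static string Write(object obj)
    {
        var sb = new StringBuilder();
        sb.Append('{');
        bool first = true;
        foreach (PropertyInfo p in GetProperties(obj.GetType()))
        {
            if (!first) sb.Append(',');
            first = false;
            sb.Append('"');
            sb.Append(p.Name);
            sb.Append("\":\"");
            sb.Append(p.GetValue(obj, null));
            sb.Append('"');
        }
        sb.Append('}');
        return sb.ToString();
    }
}
```

Calling CachedPropertyWriter.Write(new Pet { Name = "Tom" }) a second time skips the GetProperties reflection call entirely, which is where the saving comes from.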

Appendix v1.9.8

Some reformatting was done to make the use of fastJSON easier in this release which will break some code but is ultimately better in the long run. To use the serializer in this version you can do the following :
// per call customization of the serializer
string str = fastJSON.JSON.Instance.ToJSON(obj, 
                 new fastJSON.JSONParamters { EnableAnonymousTypes = true }); // using the parameters

fastJSON.JSON.Instance.Parameters.UseExtensions = false; // set globally
This removes a lot of the ToJSON overloads and gives you more readable code.
Support for anonymous types has also been added in this release. This will give you a JSON string for the type, but deserialization is not possible at the moment since anonymous types are compiler generated.
DeepCopy has been added which allows you to create an exact copy of your objects which is useful for business application rollback/cancel semantics.

Appendix v2.0.0

Finally got round to adding Unit Tests to the project (mostly because of some embarrassing bugs that showed up in the changes), hopefully the tests cover the majority of use cases, and I will add more in the future.
Also by popular demand you can now deserialize root level basic value types, Lists and Dictionaries. So you can use the following style code :
var o = fastJSON.JSON.Instance.ToObject<List<Retclass>>(s); // return a generic list

var o = fastJSON.JSON.Instance.ToObject<Dictionary<Retstruct, Retclass>>(s); // return a dictionary
A breaking change in this version is that the Parse() method now returns number formats as long and decimal, not string values. This was necessary for array returns and compliance with the JSON format (keep the type information in the original JSON, and not lose it to strings). So the following code now works :
List<int> ls = new List<int>();
ls.AddRange(new int[] { 1, 2, 3, 4, 5, 10 }); 
var s = fastJSON.JSON.Instance.ToJSON(ls);
var o = fastJSON.JSON.Instance.ToObject(s); // long[] {1,2,3,4,5,10}
Be aware that if you do not supply the type information the return will be longs not ints. To get what you expect use the following style code:
var o = fastJSON.JSON.Instance.ToObject<List<int>>(s); // you get List<int>
Check the unit test project for sample code regarding the above cases.

Appendix v2.0.3 - Silverlight Support

Microsoft in their infinite wisdom has removed some functionality from Silverlight 5 which was in Silverlight 4. So fastJSON will not build or work on Silverlight 5.

Appendix v2.0.10 - MonoDroid Support

In this release I have added a MonoDroid project file, and fastJSON now compiles and works on Android devices thanks to the excellent work done by Miguel de Icaza and his team at Xamarin. This is what Silverlight should have been, and I am really excited about this as it will open a lot of opportunities, one of which is the new RaptorDB.

Appendix v2.0.11 - Unicode Changes



My apologies to everyone for my misreading of the JSON standard regarding Unicode. My interpretation was that the output should be in ASCII format and hence all non ASCII characters should be in the \uxxxx format.
In this version you can control the output format with the UseEscapedUnicode parameter. When it is set to false, all the strings will be in Unicode format (no \uxxxx). The default is true for backward compatibility.
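
As a rough illustration of what such a switch does, here is a small string writer with a flag. This is a sketch under my own assumptions and not the fastJSON implementation: the UnicodeEscapeSketch class and its parameter name are hypothetical, and only the quote, backslash and control characters get special handling.

```csharp
using System.Text;

// Hypothetical helper showing escaped versus raw Unicode output.
public static class UnicodeEscapeSketch
{
    public static string WriteString(string s, bool useEscapedUnicode)
    {
        var sb = new StringBuilder();
        sb.Append('"');
        foreach (char c in s)
        {
            if (c == '"' || c == '\\')
            {
                // Quote and backslash are always escaped.
                sb.Append('\\');
                sb.Append(c);
            }
            else if (c < 32 || (useEscapedUnicode && c > 126))
            {
                // Control characters always, non ASCII only when requested.
                sb.Append("\\u");
                sb.Append(((int)c).ToString("X4"));
            }
            else
            {
                sb.Append(c);
            }
        }
        sb.Append('"');
        return sb.ToString();
    }
}
```

With the flag set to true a character like é is written as \u00E9, and with false it is written as is.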

Appendix - fastJSON vs Json.net rematch

After being contacted by James Newton King for a retest with his new version of Json.net, which is v5r2, I redid the tests and here are the results (times are in milliseconds):
As you can see there are 5 tests, and the AVG column is the average of the last 4 tests so as to exclude the startup of each library. The DIFF column is the difference between the two libraries, with fastJSON as the base of the test.
Things to note :
  • fastJSON is about 2x faster than Json.net in both serialize and deserialize.
  • Json.net is about 1.5-2x faster than its previous versions, which is a great job of optimization, and congratulations are in order.

Appendix v2.0.17 - dynamic objects

By popular demand, support for dynamic objects has been added so you can do the following with fastJSON:
string s = "{\"Name\":\"aaaaaa\",\"Age\":10,\"dob\":\"2000-01-01 00:00:00Z\",\"inner\":{\"prop\":30}}";
dynamic d = fastJSON.JSON.Instance.ToDynamic(s);
var ss = d.Name;
var oo = d.Age;
var dob = d.dob;
var inp = d.inner.prop; 

Appendix v2.0.28.1 - Parametric Constructors

As of this version fastJSON can handle deserializing classes that have a parametric constructor and no default constructor, like:
public class pctor
{
      public pctor(int a) // pctor() does not exist
      {
      }
}
To do this fastJSON uses FormatterServices.GetUninitializedObject(type) in the framework, which essentially just allocates a memory region for your type and gives it to you as an object, bypassing all initializations including the constructor. While this is really fast, it has the unfortunate side effect of ignoring all class initialization like default values for properties etc., so you should be aware of this if you are restoring partial data to an object (if all the data is in the JSON and matches the class structure then you are fine).
To control this you can set the ParametricConstructorOverride to true in the JSONParameters.
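
The side effect described above is easy to see with a few lines that use only the framework call. The Pctor members and the UninitializedSketch class here are hypothetical, added just for the demonstration.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical class with initializers and no default constructor.
public class Pctor
{
    public int Value = 42;               // field initializer, runs as part of a constructor
    public string Label { get; set; }

    public Pctor(int a)
    {
        Label = "set in constructor";
    }
}

public static class UninitializedSketch
{
    // Allocates a Pctor without running any constructor or initializer.
    public static Pctor Create()
    {
#pragma warning disable SYSLIB0050 // newer runtimes flag FormatterServices as obsolete
        return (Pctor)FormatterServices.GetUninitializedObject(typeof(Pctor));
#pragma warning restore SYSLIB0050
    }
}
```

An object created this way has Value equal to 0 and Label equal to null, so anything not present in the JSON stays at the type's zero value rather than the class default.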

Appendix v2.1.0 - Circular References & Breaking Changes

As of this version I fixed a design flaw which had been bugging me since the start, by removing the JSON.Instance singleton. This means you type less to use the library, which is always a good thing. The bad thing is that you need to do a find and replace in your code, and the nuget package will not be a drop-in replacement, so you have to build with the new version.
Also I found a really simple and fast way to support circular reference object structures. So a complex structure like the following will serialize and deserialize properly ( the unit test is CircularReferences()):
var o = new o1 { o1int = 1, child = new o3 { o3int = 3 }, o2obj = new o2 { o2int = 2 } };
o.o2obj.parent = o;
o.child.child = o.o2obj;
To do this fastJSON replaces the circular reference with :
{"$i" : number } // number is an index for the internal reference
Also, a $circular : true is added to the top of the JSON for the deserializer to know, so the above structure yields the following JSON :
{
   "$circular" : true,
   "$types" : {
      "UnitTests.Tests+o1, UnitTests, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" : "1",
      "UnitTests.Tests+o2, UnitTests, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" : "2",
      "UnitTests.Tests+o3, UnitTests, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" : "3"
   },
   "$type" : "1",
   "o1int" : 1,
   "o2obj" : {
      "$type" : "2",
      "o2int" : 2,
      "parent" : {
         "$i" : 1
      }
   },
   "child" : {
      "$type" : "3",
      "o3int" : 3,
      "child" : {
         "$i" : 2
      }
   }
}
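
The reference replacement described above can be sketched with a dictionary that remembers every object already written. This is a hand-rolled illustration under my own assumptions and not the fastJSON implementation: the O1, O2 and O3 classes mirror the unit test shapes, the CircularSketch name is hypothetical, and the $circular, $types and $type entries are left out to keep it short.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-ins for the o1, o2 and o3 classes in the unit test.
public class O1 { public int o1int; public O2 o2obj; public O3 child; }
public class O2 { public int o2int; public O1 parent; }
public class O3 { public int o3int; public O2 child; }

public static class CircularSketch
{
    public static string Serialize(O1 root)
    {
        // Plain classes are compared by reference when used as dictionary keys.
        var seen = new Dictionary<object, int>();
        var sb = new StringBuilder();
        Write(root, seen, sb);
        return sb.ToString();
    }

    // Writes {"$i":n} for an object seen before, otherwise assigns the next index.
    private static bool TryWriteReference(object obj, Dictionary<object, int> seen, StringBuilder sb)
    {
        int index;
        if (seen.TryGetValue(obj, out index))
        {
            sb.Append("{\"$i\":").Append(index).Append('}');
            return true;
        }
        seen.Add(obj, seen.Count + 1);
        return false;
    }

    private static void Write(O1 o, Dictionary<object, int> seen, StringBuilder sb)
    {
        if (o == null) { sb.Append("null"); return; }
        if (TryWriteReference(o, seen, sb)) return;
        sb.Append("{\"o1int\":").Append(o.o1int).Append(",\"o2obj\":");
        Write(o.o2obj, seen, sb);
        sb.Append(",\"child\":");
        Write(o.child, seen, sb);
        sb.Append('}');
    }

    private static void Write(O2 o, Dictionary<object, int> seen, StringBuilder sb)
    {
        if (o == null) { sb.Append("null"); return; }
        if (TryWriteReference(o, seen, sb)) return;
        sb.Append("{\"o2int\":").Append(o.o2int).Append(",\"parent\":");
        Write(o.parent, seen, sb);
        sb.Append('}');
    }

    private static void Write(O3 o, Dictionary<object, int> seen, StringBuilder sb)
    {
        if (o == null) { sb.Append("null"); return; }
        if (TryWriteReference(o, seen, sb)) return;
        sb.Append("{\"o3int\":").Append(o.o3int).Append(",\"child\":");
        Write(o.child, seen, sb);
        sb.Append('}');
    }
}
```

Because each object gets its index the first time it is visited, the back references come out as 1 for the root and 2 for the o2 object, the same numbers as in the JSON above.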

Appendix v2.1.3 - Milliseconds and Raspberry Pi

After many requests I have added support for millisecond resolution to the DateTime serialization. The JSON standard does not explicitly state the format, but there seems to be a general consensus about it. So if you enable the JSONParameters.DateTimeMilliseconds flag then you will get :
"2014-09-15 09:40:16.006Z"
Important Note : when deserializing the above, the resulting DateTime will not equal the original value, since the DateTime object also has a Tick value which is not serialized and will be 0, so an object comparison will be false.
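
The note above can be reproduced with plain framework calls. This is a sketch that assumes the format string shown here matches the output above; the MillisecondRoundTrip class is hypothetical and is not how fastJSON itself reads or writes dates.

```csharp
using System;
using System.Globalization;

// Hypothetical helper that writes and reads a UTC DateTime at millisecond resolution.
public static class MillisecondRoundTrip
{
    private const string Format = "yyyy-MM-dd HH:mm:ss.fff'Z'";

    public static string Write(DateTime utc)
    {
        // "fff" keeps three fractional digits and drops the remaining ticks.
        return utc.ToString(Format, CultureInfo.InvariantCulture);
    }

    public static DateTime Read(string text)
    {
        return DateTime.ParseExact(text, Format, CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
    }

    // Compares two values after discarding everything below one millisecond.
    public static bool EqualToMillisecond(DateTime a, DateTime b)
    {
        return a.Ticks / TimeSpan.TicksPerMillisecond == b.Ticks / TimeSpan.TicksPerMillisecond;
    }
}
```

If your code needs to compare a restored value with the original, a comparison like EqualToMillisecond is one way to do it instead of relying on direct equality.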
On a different note, I recently got a Raspberry Pi, installed mono on it and copied fastJSON on it, the results are below:
This is quite incredible and says a lot for the mono team, when it works by just copying my DLL files from a Windows system.
As a comparison I have added the results for my new dev notebook which has an i7 4702MQ and 8gb of ram below:
