DataSet.ReadXml too slow (compared to the binary format)

Marc Al 206 Reputation points
2021-01-22T08:09:50.417+00:00

Hello,

I have a project in .NET Framework 4.6 that I would like to upgrade to .NET 5.
One of the problems I have is that I use the binary format, which is now obsolete.

  • In .NET Framework 4.6, the dataset takes a little less than 3 sec. to load (with the binary format).
  • In .NET 5, DataSet.ReadXml takes 28 sec. to load the same dataset.
  • I ran tests with JsonConvert (from Newtonsoft). It is a little faster (a little more than 16 sec.), but relations and the columns of empty tables are not copied.
    • With the JSON serializer built into .NET, it doesn't work (recursion problem).
  • I can force the use of the BinaryFormatter to import a DataSet (but the format is incompatible between both versions), and it is not possible to export a DataSet to the binary format in .NET 5.

Is there a way to speed up the reader? 7-8 sec. would be great, but 28 sec. (9 times slower) is a bit too slow.

Thank you,
Marc

Developer technologies | .NET | Other

5 answers

  1. Marc Al 206 Reputation points
    2021-01-22T11:38:06.223+00:00

    Hello,

    The XML file is 360 MB (while the binary file is a little under 70 MB).
    The dataset is used to transfer data between computers or as a cache (in my case it has 170 tables and a lot of relations), so it is not really possible to use classes.

    At the start of the program, I merge all the datasets so that the user can see a total across all the data.

    Thank you
    Marc

  2. Ken Tucker 5,861 Reputation points
    2021-01-23T16:22:07.067+00:00

    That file is pretty large. The BinaryFormatter has security issues and is no longer recommended. It is marked as obsolete, and in ASP.NET applications it is disabled by default. This article shows how to enable it in ASP.NET and how to suppress the warnings in other project types. If you are OK with the security risk, you can use it for now, but I highly recommend that you find another way to pass the data. Maybe you can use a SQLite database instead, but without knowing more about the application it is hard to give a good recommendation.
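
    One XML-based way to pass the data without the BinaryFormatter can be sketched as follows. It assumes the slow load comes partly from schema inference and constraint checking, which has not been measured on the 360 MB file in this thread, so the gain is not guaranteed. The class name `DataSetXml` and its method names are hypothetical; the calls themselves (`WriteXml` with `XmlWriteMode.WriteSchema`, `ReadXml` with `XmlReadMode.ReadSchema`, `EnforceConstraints`) are standard `System.Data` members.

    ```csharp
    using System.Data;

    // Hypothetical helper, not taken from the thread.
    static class DataSetXml
    {
        // Write the data together with its inline XSD schema so the reader
        // does not have to infer tables, columns, and relations.
        public static void Save(DataSet ds, string path)
        {
            ds.WriteXml(path, XmlWriteMode.WriteSchema);
        }

        public static DataSet Load(string path)
        {
            var ds = new DataSet();

            // Skip constraint checks while rows are being added.
            ds.EnforceConstraints = false;

            // Use the embedded schema instead of inferring one from the data.
            ds.ReadXml(path, XmlReadMode.ReadSchema);

            // Validate once, after everything is loaded (throws if the data is inconsistent).
            ds.EnforceConstraints = true;
            return ds;
        }
    }
    ```

    Because the schema travels inside the file, relations and the columns of empty tables are restored on load. Calling `DataSetXml.Save(ds, "cache.xml")` and later `DataSetXml.Load("cache.xml")` is enough to try it against the real dataset and time it.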

  3. Marc Al 206 Reputation points
    2021-01-24T06:35:08.67+00:00

    Hello,

    Thank you for the help.

    Even if you allow the BinaryFormatter as described in the article, the DataSet is no longer serializable in .NET 5, so you can't serialize it (and therefore you have nothing to deserialize).

    I will try a test with a SQLite database when I have the time. It is a good idea that I hadn't thought of.

  4. Ken Tucker 5,861 Reputation points
    2021-01-24T11:40:30.307+00:00

    Interesting. I was able to serialize and deserialize a simple DataSet with the BinaryFormatter:

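            // Requires: using System.Data; using System.IO;
            //           using System.Runtime.Serialization.Formatters.Binary;
            DataSet customerDS = new DataSet("CustomerOrders");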
            DataTable ordersTable = customerDS.Tables.Add("Orders");
            DataColumn pkCol = ordersTable.Columns.Add("OrderID", typeof(Int32));
            ordersTable.Columns.Add("OrderQuantity", typeof(Int32));
            DataColumn dcCompany = new DataColumn("CompanyName", typeof(string));
            ordersTable.Columns.Add(dcCompany);
            ordersTable.PrimaryKey = new DataColumn[] { pkCol };
    
            DataTable companyTable = new DataTable();
            DataColumn pk = new DataColumn("CompanyName", typeof(string));
            companyTable.Columns.Add(pk);
            companyTable.Columns.Add("Address", typeof(string));
            companyTable.PrimaryKey = new DataColumn[] { pk };
            customerDS.Tables.Add(companyTable);
    
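            // DataRelation takes the parent column first, then the child column.
            var relCompanyOrder = new DataRelation("CustomersOrders",
                pk, dcCompany);
            // Attach the relation to the DataSet so that it is serialized too.
            customerDS.Relations.Add(relCompanyOrder);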
    
            var newRow = companyTable.NewRow();
            newRow["CompanyName"] = "Microsoft";
            newRow["Address"] = "1 microsoft way";
    
            companyTable.Rows.Add(newRow);
    
            var newOrder = ordersTable.NewRow();
            newOrder["CompanyName"] = "Microsoft";
            newOrder["OrderID"] = 1;
            newOrder["OrderQuantity"] = 1;
    
            ordersTable.Rows.Add(newOrder);
            // The using blocks close the streams, so no explicit Close() is needed.
            using (FileStream sw = new FileStream("Test.bin", FileMode.Create))
            {
                BinaryFormatter fmt = new BinaryFormatter();
                #pragma warning disable SYSLIB0011
                fmt.Serialize(sw, customerDS);
                #pragma warning restore SYSLIB0011
            }
    
            DataSet cloned = new DataSet();
    
            using (FileStream sw = new FileStream("Test.bin", FileMode.Open))
            {
                BinaryFormatter fmt = new BinaryFormatter();
                #pragma warning disable SYSLIB0011
                cloned = fmt.Deserialize(sw) as DataSet;
                #pragma warning restore SYSLIB0011
            }
    

  5. Marc Al 206 Reputation points
    2021-01-25T18:19:14.527+00:00

    Hello,

    I ran some tests to understand why my DataSet didn't work, and I found the problem:
    when you fill a DataSet with a DataAdapter, the string columns have the MaxLength property set.
    You must reset every MaxLength value (set it to -1), and then you can export to the binary format.
    So you need the following code:

            // Resets MaxLength to -1 (no limit) on every column that has a limit set.
            private void ClearMaxLength(DataSet ATst)
            {
                foreach (DataTable dtbTmp in ATst.Tables)
                {
                    foreach (DataColumn CurCol in dtbTmp.Columns)
                    {
                        if (CurCol.MaxLength > 0)
                            CurCol.MaxLength = -1;
                    }
                }
            }
    

    After that, you can serialize (in my case, directly into a gzip stream):

                // Requires: using System.Data; using System.IO;
                //           using System.IO.Compression; using System.Runtime.Serialization;
                ds.RemotingFormat = SerializationFormat.Binary;
                IFormatter myFormatter = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
    
                // The using statements flush and close both streams, even if Serialize throws.
                using (Stream myStreamFina = new FileStream(FileName, FileMode.Create, FileAccess.Write))
                using (var ZipStream = new GZipStream(myStreamFina, CompressionMode.Compress, false))
                {
                    myFormatter.Serialize(ZipStream, ds);
                }
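
    The matching load side is not shown in this thread. A minimal sketch, assuming the file was written with the gzip and BinaryFormatter code above and that the target is .NET 5 (where the BinaryFormatter is obsolete but still works), could look like this. The class name `DataSetBinary` and its `Load` method are hypothetical.

    ```csharp
    using System.Data;
    using System.IO;
    using System.IO.Compression;
    using System.Runtime.Serialization;
    using System.Runtime.Serialization.Formatters.Binary;

    // Hypothetical helper, not taken from the thread.
    static class DataSetBinary
    {
        // Reads a DataSet from a gzip-compressed BinaryFormatter file.
        public static DataSet Load(string fileName)
        {
            IFormatter formatter = new BinaryFormatter();

            using (Stream file = new FileStream(fileName, FileMode.Open, FileAccess.Read))
            using (var unzip = new GZipStream(file, CompressionMode.Decompress))
            {
                // Deserialize returns object, so cast back to DataSet.
                return (DataSet)formatter.Deserialize(unzip);
            }
        }
    }
    ```

    Given the security issues mentioned earlier in the thread, this should only be used on files that the application itself produced.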
    

    I don't understand why this is a problem in .NET 5 but not in .NET Framework 4.6.

    Thanks again for the help.
