Inserting multiple items at once in Azure Mobile Services

With the Azure Mobile Services client SDK, it’s fairly simple to insert an item into a table in Azure. Inserting multiple items can be also done fairly easily, simply inserting one at a time. But there are some scenarios where we want to actually do multiple insertions at once, to minimize the number of networking requests made from the client to the service. In the E2E Test Application that we have for our iOS client, the first thing we do in our query tests is to pre-populate a table with some data (if the table was empty), and since we don’t that to affect the test performance (we use about 250 items), we decided to insert them all at once. This has also been asked in our forums, so I decided to post how I went out to implement it.

The idea is fairly simple: instead of sending one item, we send a list of items which we want to be inserted. At the server side, instead of letting the default behavior for the insert operation kick in, we instead loop through the items and insert them one-by-one. Since the database and the instance running the mobile services are often located in the same datacenter, the latency between the two components is a lot smaller than from the client making the insert call and the mobile service, so we have some significant performance gains. And by having the logic to do multiple insertions at the server side, we don’t need to implement that at the client (doing it in managed code is trivial with the async / await keywords; doing that in iOS or in JavaScript not so much).

The way that I chose to implement the logic was that the service would receive not an array of items directly, but an object with one of its members being the array to be inserted. That’s easier to implement in the managed client (using an IDataMemberJsonConverter instead of an ICustomMobileServiceTableSerialization). For JavaScript it really doesn’t matter, since we’re dealing with JS objects in either case. And for the iOS client, we currently can only do that, since the insert operation only takes a dictionary (not an array) as an argument.

Let’s look at the server script. In this example, we’re adding populating our table called ‘iosMovies’, and the data to be inserted is an array in the “movies” property of the received item. The first thing we do is to check whether we actually need to do the insertion (that’s something specific to the scenario of my app; maybe in your case you always want to do the insertion, so that step won’t be necessary). By reading only the first element of the table, we can check whether the table is populated or not (since this script runs in the “insert’ operation, we assume that the table is either fully populated or it needs the items to be inserted). If the data is already there, we respond with an appropriate status. Otherwise, we proceed with the multiple insert operations.

  1. function insert(item, user, request) {
  2.     var table = tables.getTable('iosMovies');
  3.     table.take(1).read({
  4.         success: function (items) {
  5.             if (items.length > 0) {
  6.                 // table already populated
  7.                 request.respond(200, { id: 1, status: 'Already populated' });
  8.             } else {
  9.                 // Need to populate the table
  10.                 populateTable(table, request, item.movies);
  11.             }
  12.         }
  13.     });
  14. }

Populating the table with multiple items is done by looping through the items to be inserted, and inserting them one by one (*). Since the insert operation in the table object is asynchronous, we can’t use a simple sequential loop; instead, we’ll insert each element when the callback for the previous one is called. After the last one is inserted, we can then respond to the client with an appropriate status.

There’s one extra step that we had to do in this example, which caught me by surprise at first – updating a field of type Date (or DateTime in the managed client, or NSDate in the iOS client). When I first ran that code, the table was created, and the data was seemingly correct. But when I checked the type of the columns in the database, my field ‘ReleaseDate’ had been translated into a column of type ‘string’. ‘datetime‘ is a supported primitive type in Azure Mobile Services, so I was expecting the column to have the appropriate value. The problem is that dates and strings are transmitted using the same type in JSON, string (there’s no such a thing as a “JSON date”). If a string value arrives with a specific format (the ISO 8601 date format, with millisecond precision), then the Mobile Service runtime converts that value into a JavaScript Date object. However, that conversion is only done in top-level objects – it doesn’t traverse the whole incoming object graph (for performance reasons), so in my case my “date” value ended up as a normal string. To fix that, prior to inserting the data we fixed that by converting the ‘ReleaseDate’ field in in the input array.

  1. function populateTable(table, request, films) {
  2.     var index = 0;
  3.     films.forEach(changeReleaseDate);
  4.     var insertNext = function () {
  5.         if (index >= films.length) {
  6.             request.respond(201, { id: 1, status: 'Table populated successfully' });
  7.         } else {
  8.             var toInsert = films[index];
  9.             table.insert(toInsert, {
  10.                 success: function () {
  11.                     index++;
  12.                     if ((index % 20) === 0) {
  13.                         console.log('Inserted %d items', index);
  14.                     }
  15.  
  16.                     insertNext();
  17.                 }
  18.             });
  19.         }
  20.     };
  21.  
  22.     insertNext();
  23. }
  24.  
  25. function changeReleaseDate(obj) {
  26.     var releaseDate = obj.ReleaseDate;
  27.     if (typeof releaseDate === 'string') {
  28.         releaseDate = new Date(releaseDate);
  29.         obj.ReleaseDate = releaseDate;
  30.     }
  31. }

So that’s the script for the server. Now for the client code. Let’s look at the flavors we have. First, managed code, where we have the classes for that data:

  1. [DataTable(Name = "w8Movies")]
  2. public class AllMovies
  3. {
  4.     [DataMember(Name = "id")]
  5.     public int Id { get; set; }
  6.     [DataMember(Name = "status")]
  7.     public string Status { get; set; }
  8.     [DataMember(Name = "movies")]
  9.     [DataMemberJsonConverter(ConverterType = typeof(MovieArrayConverter))]
  10.     public Movie[] Movies { get; set; }
  11. }
  12.  
  13. public class MovieArrayConverter : IDataMemberJsonConverter
  14. {
  15.     public object ConvertFromJson(IJsonValue value)
  16.     {
  17.         // unused
  18.         return null;
  19.     }
  20.  
  21.     public IJsonValue ConvertToJson(object instance)
  22.     {
  23.         Movie[] movies = (Movie[])instance;
  24.         JsonArray result = new JsonArray();
  25.         foreach (var movie in movies)
  26.         {
  27.             result.Add(MobileServiceTableSerializer.Serialize(movie));
  28.         }
  29.  
  30.         return result;
  31.     }
  32. }
  33.  
  34. [DataTable(Name = "w8Movies")]
  35. public class Movie
  36. {
  37.     public int Id { get; set; }
  38.     public string Title { get; set; }
  39.     public int Duration { get; set; }
  40.     public DateTime ReleaseDate { get; set; }
  41.     public int Year { get; set; }
  42.     public bool BestPictureWinner { get; set; }
  43.     public string Rating { get; set; }
  44. }

And to insert multiple movies at once, we create an instance of our type which holds the array, and call insert on that object:

  1. Func<int, int, int, DateTime> createDate =
  2.     (y, m, d) => new DateTime(y, m, d, 0, 0, 0, DateTimeKind.Utc);
  3. AllMovies allMovies = new AllMovies
  4. {
  5.     Movies = new Movie[]
  6.     {
  7.         new Movie {
  8.             BestPictureWinner = false,
  9.             Duration = 142, Rating = "R",
  10.             ReleaseDate = createDate(1994, 10, 14),
  11.             Title = "The Shawshank Redemption",
  12.             Year = 1994 },
  13.         new Movie {
  14.             BestPictureWinner = true,
  15.             Duration = 175, Rating = "R",
  16.             ReleaseDate = createDate(1972, 3, 24),
  17.             Title = "The Godfather",
  18.             Year = 1972 },
  19.         new Movie {
  20.             BestPictureWinner = true,
  21.             Duration = 200, Rating = "R",
  22.             ReleaseDate = createDate(1974, 12, 20),
  23.             Title = "The Godfather: Part II",
  24.             Year = 1974 },
  25.         new Movie {
  26.             BestPictureWinner = false,
  27.             Duration = 168, Rating = "R",
  28.             ReleaseDate = createDate(1994, 10, 14),
  29.             Title = "Pulp Fiction",
  30.             Year = 1994 },
  31.     }
  32. };
  33. try
  34. {
  35.     var table = MobileService.GetTable<AllMovies>();
  36.     await table.InsertAsync(allMovies);
  37.     AddToDebug("Status: {0}", allMovies.Status);
  38. }
  39. catch (Exception ex)
  40. {
  41.     AddToDebug("Error: {0}", ex);
  42. }

For Objective-C, we don’t need to create the types, so we can use the NSDictionary and NSArray classes directly. The implementation of the ‘getMovies’ method can be found in our GitHub repository.

  1. NSArray *movies = [ZumoQueryTestData getMovies];
  2. NSDictionary *item = @{@"movies" : movies};
  3. MSTable *table = [client getTable:queryTestsTableName];
  4. [table insert:item completion:^(NSDictionary *item, NSError *error) {
  5.     if (error) {
  6.         NSLog(@"Error populating table: %@", error);
  7.     } else {
  8.         NSLog(@"Table is populated and ready for query tests");
  9.     }
  10. }];

Similarly for JavaScript, we can just use “regular” objects and arrays to insert the data:

  1. var table = client.getTable('w8Movies');
  2. var allMovies = {
  3.     movies: [
  4.         {
  5.             BestPictureWinner: false,
  6.             Duration: 142,
  7.             Rating: "R",
  8.             ReleaseDate: new Date(Date.UTC(1994, 10, 14)),
  9.             Title: "The Shawshank Redemption",
  10.             Year: 1994
  11.         },
  12.         {
  13.             BestPictureWinner: true,
  14.             Duration: 175,
  15.             Rating: "R",
  16.             ReleaseDate: new Date(Date.UTC(1972, 3, 24)),
  17.             Title: "The Godfather",
  18.             Year: 1972
  19.         },
  20.         {
  21.             BestPictureWinner: true,
  22.             Duration: 200,
  23.             Rating: "R",
  24.             ReleaseDate: new Date(Date.UTC(1974, 12, 20)),
  25.             Title: "The Godfather: Part II",
  26.             Year: 1974
  27.         },
  28.         {
  29.             BestPictureWinner: false,
  30.             Duration: 168,
  31.             Rating: "R",
  32.             ReleaseDate: new Date(Date.UTC(1994, 10, 14)),
  33.             Title: "Pulp Fiction",
  34.             Year: 1994
  35.         }
  36.     ]
  37. };
  38. table.insert(allMovies).done(function (inserted) {
  39.     document.getElementById('result').innerText = inserted.status || allMovies.status;
  40. });

That’s about it. This is one of the ways we have to prevent multiple networking requests between the client and the mobile service to perform multiple insertions at once.

(*) Bonus info: As I mentioned before, this still does multiple calls, between the mobile service and the database, but since they’re co-located, the latency is small. You also can make really only one call, from the service to the database, by using the mssql object, and creating one insert request for multiple rows at once. You’d need to do the translation between the data types and the SQL expression yourself, and use the “union all” trick to create a temporary table and insert from that table – see an example below. Notice that you’d also need to create the table columns (if they don’t exist yet), since the dynamic schema feature doesn’t work with the mssql object – once you go down to that level, you’re in full control of the database communication.

  1. INSERT INTO w8Movies (Title, [Year], Duration, BestPictureWinner, Rating, ReleaseDate)
  2. SELECT 'The Shawshank Redemption', 1994, 142, FALSE, 'R', '1994-10-14'
  3. UNION ALL
  4. SELECT 'The Godfather', 1972, 175, TRUE, 'R', '1972-03-24'
  5. UNION ALL
  6. SELECT 'The Godfather: Part II', 1974, 200, TRUE, 'R', '1974-12-20'
  7. UNION ALL
  8. SELECT 'Pulp Fiction', 1994, 168, TRUE, 'R', '1994-10-14'

So which one to use? I really didn’t see much advantage in going full SQL for my scenario. Once a communication between the mobile service and the database is successful, chances are that over the next second or so the subsequent ones will be as well, so doing ~250 insert operations, although not really an atomic transaction, has a very good chance of being one. As usual, it may vary according to the scenario of your application.