Retrieving more than 1000 rows from windows azure storage
I recently hit an issue with my code when trying to retrieve
data from Windows Azure Diagnostics table,
I noticed that I'm getting only 1000 rows every time even though I have
not set any limit on the number of rows I want to retrieve, it turns out that
there is a limit to 1000 rows for windows azure storage.
The limit is not only on the number of rows, there is also a limit on the amount of time it takes the query to execute, the
Table service may return a maximum of 1,000 items at one time and make take only up to 5 seconds. If the result set contains more than 1,000 items or takes too long to execute the table service will return a Continuation Token that you can use to retrieve the rest of your
rows data, when you received all your data your token is null.
Originally this was my code:
1: WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);
2:
3: var items = from x in wadContext.WADPerformanceCountersTable
4: where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
5: select x;
6:
7: foreach (WADPerformanceCountersTable item in items)
8: {
9: //use item
10: }
This is the code after adding the use for continuation token:
1: ListRowsContinuationToken continuationToken = null;
2:
3: CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse(connectionString);
4:
5: WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);
6:
7:
8: do
9: {
10: //Query using the start time and end time, take only n rows
11: //The response headers will contain a continuation key that can be used to retrieve the next n rows
12: var allItems = (from x in wadContext.WADPerformanceCountersTable
13: where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
14: select x).Take(c_pageSize);
15:
16: var query = allItems as DataServiceQuery<WADPerformanceCountersTable>;
17:
18: if (continuationToken != null)
19: {
20: query = query.AddQueryOption("NextPartitionKey", continuationToken.PartitionKey);
21: if (continuationToken.RowKey != null)
22: {
23: query = query.AddQueryOption("NextRowKey", continuationToken.RowKey);
24: }
25: }
26:
27: var response = query.Execute() as QueryOperationResponse;
28:
29: foreach (WADPerformanceCountersTable item in allItems)
30: {
31: //use item
32: }
33: //now check if there are more rows left to be retrieved, if there are more rows execute another request to get the remaining items
34: if (response.Headers.ContainsKey("x-ms-continuation-NextPartitionKey"))
35: {
36: continuationToken = new ListRowsContinuationToken();
37: continuationToken.PartitionKey = response.Headers["x-ms-continuation-NextPartitionKey"];
38: if (response.Headers.ContainsKey("x-ms-continuation-NextRowKey"))
39: {
40: continuationToken.RowKey = response.Headers["x-ms-continuation-NextRowKey"];
41: }
42: }
43: else
44: {
45: continuationToken = null;
46: }
47:
48: } while (continuationToken != null);
49:
ListRowsContinuationToken is a simple class defined as
follows:
1: public class ListRowsContinuationToken
2: {
3: public string PartitionKey { get; set; }
4: public string RowKey { get; set; }
5: }
Finally, I found an even easier way to get this done! The trick is to convert the query into a CloudTableQuery - once you do this you will be able to retrieve all rows with a single call (just make sure that this is what you really want to do)
1: CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse(connectionString);
2: WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);
3:
4: Console.Out.WriteLine("Reading data from WAD storage");
5:
6: //Query using the start time and end time, take only 1000 rows
7: //The response headers will contain a continuration key that can be used to retrieve the next 1000 rows
8: var query = (from x in wadContext.WADPerformanceCountersTable
9: where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
10: select x);
11:
12:
13: var allItemsAsTableService = query.AsTableServiceQuery();
14: IEnumerable<WADPerformanceCountersTable> allItems = allItemsAsTableService.Execute();
15:
16: foreach (WADPerformanceCountersTable item in allItems)
17: {
18: //use item
19: }
I chose the last solution because I needed minimal changes to my code, and I didn't really need paging, if you need paging, continuation token will be the way to go!