Google’s Chart API
Throughout the software development cycle, there are many places where you need to generate reports that include nifty-looking charts. Whether you are generating performance reports, charting bug trends, tracking memory or CPU utilization, or following code coverage results, a meaningful graph makes key pieces of information easy to deliver and easy to consume. Traditionally I would use the Excel COM objects to automate the generation of graphs from our data sets, but Google recently released their charting API (https://code.google.com/apis/chart/), so I figured I would take a look at writing a tool that uses Google’s interface to generate a chart and save the resulting image. Being on vacation provided a good opportunity to delve into this project, so here I will walk through some snippets of code that read in a comma-delimited data file and generate the inputs to Google’s interface.
Standard Disclaimer: The code snippets below are for illustrative purposes and may not be "production" quality – this was a "quick and dirty" application for private usage.
Program Outline
This gives you an outline of what the main flow of my application looks like:
// parse through the command line options
// read in the entire text file
// count # of columns by scanning the first line for the delimiters
// count # of rows by counting all the new lines in the file
// create 2d array of doubles
// parse in the values, and keep track of min/max values
// build a google encoded value string for each row
// build the axis labels
// create the color string
// build the entire url string
// send http request
// save the returned file on 200 Ok
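The URL-building step near the end of that outline is not covered by the snippets in this post, so here is a minimal sketch of what it might look like. The helper name BuildChartUrl and its parameters are hypothetical. The endpoint and the parameter names (cht, chs, chd, chco, chxt, chxl) are taken from Google's Chart API documentation as I understand it, so check them against the docs before relying on them.

```cpp
#include <string>

// Hypothetical helper: assembles a line-chart request URL from pieces
// that the rest of the tool has already built.
std::string BuildChartUrl(const std::string& size,     // e.g. "400x200"
                          const std::string& data,     // e.g. "s:Ae9,eee"
                          const std::string& colors,   // e.g. "ff0000,0000ff"
                          const std::string& yLabels)  // e.g. "|0.0|0.5|1.0"
{
    // Assumed endpoint; verify against the Chart API documentation.
    std::string url = "http://chart.apis.google.com/chart";
    url += "?cht=lc";               // chart type: line chart
    url += "&chs=" + size;          // chart size in pixels, width x height
    url += "&chd=" + data;          // encoded data series
    url += "&chco=" + colors;       // one hex color per series
    url += "&chxt=y";               // show a y axis (axis index 0)
    url += "&chxl=0:" + yLabels;    // labels for axis 0
    return url;
}
```

Note that the sketch does no URL escaping; some HTTP clients may require the '|' characters in the label string to be percent-encoded.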
Notes and Interesting Snippets
To start with, we read the entire file into memory (as opposed to reading it in line by line); this makes it easier to handle cases where the input file is in Unicode, UTF-8, or ANSI. It also gives us the opportunity to do some pre-processing: we can determine how many rows and columns of data we have, so we know the limits of our data set. Then we can allocate a 2D array up front without having to worry about resizing.
Once we have our 2D array, we can parse through the buffer and read the data values:
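As a rough illustration of that pre-processing pass, the sketch below counts rows and columns in a buffer that has already been converted to plain narrow characters. The GridSize struct and the CountRowsAndColumns name are hypothetical, not part of the original tool, and the sketch assumes every line has the same number of fields as the first one.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type: dimensions of a delimited text buffer.
struct GridSize
{
    std::size_t rows;
    std::size_t cols;
};

GridSize CountRowsAndColumns(const std::string& text, char delim)
{
    GridSize size = {0, 0};
    if (text.empty()) return size;

    // Columns: number of delimiters on the first line, plus one.
    std::size_t firstLineEnd = text.find('\n');
    if (firstLineEnd == std::string::npos) firstLineEnd = text.size();
    size.cols = 1;
    for (std::size_t i = 0; i < firstLineEnd; ++i)
        if (text[i] == delim) ++size.cols;

    // Rows: number of newlines, plus one if the last line is unterminated.
    for (std::size_t i = 0; i < text.size(); ++i)
        if (text[i] == '\n') ++size.rows;
    if (text[text.size() - 1] != '\n') ++size.rows;

    return size;
}
```

The row count includes a header line if the file has one, so the caller would subtract it before allocating the data array.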
 1  // parse in the values, and keep track of min/max values
 2  char* index = dataStart;
 3  unsigned row = 0;
 4  unsigned col = 0;
 5  char dataBreak[3] = {0};
 6  dataBreak[0] = delim;
 7  dataBreak[1] = '\n';
 8  while(*index)
 9  {
10      if(index >= fileBuffer + dataLen) break;
11      rawData[col][row] = atof(index);
12      maxData = max(rawData[col][row], maxData);
13      minData = min(rawData[col][row], minData);
14
15      size_t offset = strcspn(index, dataBreak);
16      if(offset == 0) break;
17      index += offset;
18
19      if(*index == '\0') break;
20      if(*index == '\n')
21      {
22          row++;
23          col = 0;
24      }
25      if(*index == delim) ++col;
26      ++index;
27  }
On line 6 we pull the data delimiter from a variable, which allows us to support files that are, for example, tab-delimited instead of comma-delimited. Lines 12 and 13 keep track of the minimum and maximum values; we need these because the data must be scaled into the range 0 to 61 for input into Google's API. Line 15 scans past the current value until it hits a newline or the delimiter. If we were super cool we would also skip over a column of text (if present) which contains x-axis information.
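One way to approach that last idea is to test each field with strtod, which reports how much of the text it consumed, instead of atof, which silently returns 0 for text. This is only a sketch: ParseNumericField is a hypothetical helper, not part of the original tool, and it assumes the delimiter is not a space.

```cpp
#include <cstdlib>

// Hypothetical helper: returns true and stores the number in *value when the
// field starting at `field` is numeric; returns false for text or empty fields.
bool ParseNumericField(const char* field, char delim, double* value)
{
    // Skip leading spaces by hand so strtod cannot run past an empty field.
    while (*field == ' ') ++field;
    if (*field == '\0' || *field == '\n' || *field == '\r' || *field == delim)
        return false;                       // empty field

    char* end = 0;
    const double parsed = std::strtod(field, &end);
    if (end == field) return false;         // nothing numeric at the start

    // The number must run up to the end of the field.
    while (*end == ' ') ++end;
    if (*end != '\0' && *end != '\n' && *end != '\r' && *end != delim)
        return false;                       // e.g. "12abc" or "Info"

    *value = parsed;
    return true;
}
```

A parse loop could call this on the first field of each row and, when it returns false, treat the field as an x-axis label instead of storing it as data.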
 1  // build a google encoded value string for each row
 2  dataRows.strcpy("s:");
 3  for(unsigned c=0; c<numCol; c++)
 4  {
 5      for(unsigned r=0; r<numRows; r++)
 6      {
 7          dataRows.scatf("%c", GoogleEncode(rawData[c][r], minData, maxData));
 8      }
 9      if(c < numCol-1)
10          dataRows.strcat(",");
11  }
Here we loop through our data arrays and convert the raw data into the Google number scheme needed for a line chart. The "dataRows" variable is an instance of a string class that supports printf-style concatenation (using the scatf member function, as seen on line 7), so be sure to substitute your own favorite string class.
Here is the GoogleEncode helper function:
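char GoogleEncode(double data, double minData, double maxData)
{
    // 62 symbols: index 0 is 'A', index 61 is '9'
    const char baseGoogle[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    double range = maxData - minData;
    if(range == 0.0)
        return baseGoogle[0];   // all values identical; avoid dividing by zero
    // slide the data so that minData maps to 0; subtracting is correct
    // whether minData is negative or positive
    data -= minData;
    data = data * 61 / range;   // scale into the range 0 to 61
    return baseGoogle[(int)data];
}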
If "range" is 0.0, then all the input values are the same so we can just return a constant value (and avoid the divide by zero). Otherwise we slide the data into the positive dimension, and then we scale them into a base 62 value and return the appropriate character. The only question I have for Google is why they put 0-9 at the end of their number system - I would definitely be interested to hear the story behind that decision.
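For readers without a printf-style string class, here is a sketch of the same encoding step using std::string. EncodeSimple and EncodeSeries are hypothetical names; EncodeSimple follows the scaling described above but also clamps the result, which is my own addition to guard against values outside the min/max range.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Map one value onto the 62-character alphabet ('A' = 0 ... '9' = 61).
char EncodeSimple(double data, double minData, double maxData)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    const double range = maxData - minData;
    if (!(range > 0.0)) return alphabet[0];    // flat (or invalid) range

    double scaled = (data - minData) * 61.0 / range;
    if (!(scaled > 0.0)) scaled = 0.0;         // clamp low; also catches NaN
    if (scaled > 61.0) scaled = 61.0;          // clamp high
    return alphabet[static_cast<int>(scaled)]; // truncates toward zero
}

// Build "s:<series>,<series>,..." with one series per column.
std::string EncodeSeries(const std::vector<std::vector<double> >& columns,
                         double minData, double maxData)
{
    std::string out = "s:";
    for (std::size_t c = 0; c < columns.size(); ++c)
    {
        if (c > 0) out += ',';
        for (std::size_t r = 0; r < columns[c].size(); ++r)
            out += EncodeSimple(columns[c][r], minData, maxData);
    }
    return out;
}
```

As a worked example, with minData = 0 and maxData = 1 the values 0, 0.5, and 1 scale to 0, 30.5, and 61, which truncate to indexes 0, 30, and 61 and encode as 'A', 'e', and '9'.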
Generating the y-axis labels is just a matter of pulling out some evenly spaced values between "minData" and "maxData". A simple loop accomplishes this:
double range = maxData - minData;
for(double r=minData; r<=maxData; r+=range/min(10.0, numRows-1))
{
    yAxis.scatf("|%1.1lf", r);
}
We cap the number of labels at 10 to prevent too many entries. You may want to scale this value up or down depending on the actual pixel height of the graph being generated. Also be sure to handle the case where minData == maxData.
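Here is one possible way to handle that edge case, sketched with std::string and snprintf in place of the custom string class. BuildYAxisLabels is a hypothetical helper. It steps by an integer index instead of accumulating a floating-point increment, so the last label always lands exactly on maxData, and it emits a single label when the data is flat.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: builds "|min|...|max" using at most maxSteps intervals.
std::string BuildYAxisLabels(double minData, double maxData,
                             unsigned numRows, unsigned maxSteps = 10)
{
    // Number of intervals: one per gap between rows, capped at maxSteps.
    unsigned steps = (numRows > 1) ? numRows - 1 : 1;
    if (steps > maxSteps) steps = maxSteps;
    if (steps == 0) steps = 1;              // guards against maxSteps == 0

    const double range = maxData - minData;
    if (!(range > 0.0)) steps = 0;          // flat data: emit one label only

    std::string labels;
    char buf[64];
    for (unsigned i = 0; i <= steps; ++i)
    {
        const double value =
            (steps == 0) ? minData : minData + range * i / steps;
        std::snprintf(buf, sizeof(buf), "|%1.1f", value);
        labels += buf;
    }
    return labels;
}
```

Note that 10 intervals produce 11 labels, since both endpoints are included.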
Wrapping It Up
So given the following input file, my tool generated this line chart:
Normal, Linear
0, 0.5
0.01, 0.5
0.03, 0.5
0.1, 0.5
0.3, 0.5
0.7, 0.5
0.9, 0.5
0.97, 0.5
1, 0.5
0.97, 0.5
0.9, 0.5
0.7, 0.5
0.3, 0.5
0.1, 0.5
0.03, 0.5
0.01, 0.5
0, 0.5

[Image: line chart generated from the input file above]
Given the flexibility of the Google API, there are plenty of options/settings which a tool like this could take into consideration, but we've definitely gone on long enough for a single post.