February 2017

Volume 32 Number 2


Serverless Architecture with Azure Functions

By Joseph Fultz | February 2017

From tools to machines to computers, we look for ways to automate repetitive work and standardize the context in which we work so that we can focus on high-value specialized contributions to complete tasks and solve problems. In parallel, it’s clear that as the IT industry has evolved, we’ve strived to achieve higher density at every level of a system, from the CPU to the server farm, in hopes of attaining the highest efficiency output from our systems. A serverless architecture is the point at which those two streams converge. It’s the point at which an individual’s effort is most granularly focused on the specific task and the waste in the system is at a minimum.

In a serverless world, developers create solutions instead of infrastructures and monitor execution and not environment health. Finance pays for time slices, not a mostly idle virtual machine (VM) farm. In Microsoft Azure, there are many consumable services that can be chained together to form an entire solution. A new key component of this is Azure Functions, which provides the serverless compute capability in a complete solution ecosystem. In this article, we’ll explore what it means to have a serverless architecture and explore Azure Functions tooling.

Serverless Architecture with Azure Functions

There are many definitions for serverless architecture. Despite the nomenclature, serverless architecture isn’t code that runs without servers. In fact, servers are still very much required; you just don’t have to think about them. You might think it’s the next iteration of Platform as a Service (PaaS), and while close, it’s not really that, either. So what, exactly, is it? Fundamentally, serverless architecture is the next evolution of cloud services, built on top of PaaS, abstracting away VMs, application frameworks, and external dependencies via bindings so that developers can focus simply on the code to implement business logic.

It’s prudent to discuss the typical properties of Azure Functions and serverless architectures. Azure Functions provides the serverless compute component of a serverless architecture. As shown in Figure 1, Azure Functions builds on top of Azure App Service and the WebJobs SDK, adding a bit of extra magic to host and run the Azure Function code and provide some niceties such as runtime binding.

Figure 1 Azure Functions Architecture

Thus, all the benefits you get by using Azure App Service are there, just below the surface, and can be found in the function app's Settings. It also means that the WebJobs SDK can be used to create a local runtime environment. While not an architectural benefit, Azure Functions is a polyglot platform that supports a variety of languages, including Node.js, C#, PHP, Python and even Windows PowerShell. Runtimes and dependencies are handled by the platform. This is an advantage for those working in mixed environments, as it lets teams with differing preferences and skills leverage the same platform for building and delivering functionality. Using Azure Functions as the serverless compute component of a serverless architecture provides several key benefits:

  • Reduced time to market: Because the underlying infrastructure is managed by the platform, developers are free to focus on application code that implements business logic. Azure Functions might be considered a further decomposition of microservices to nanoservices. Both paradigms provide a boon to the development process as development, testing and deployment activities are narrowly focused on a small bit of functionality that’s managed separately from other related, but discrete services.
  • Lower total cost of ownership: Because there's no infrastructure or OS to maintain, developers can focus on providing value to the business. Additionally, the investment needed for DevOps and maintenance is significantly simplified and reduced.
  • Pay per execution: A key cost savings is typically found by only paying for the consumed cycles in conjunction with increasing the functional density of the compute resources used.

As for your approach when building using Azure Functions, a key to realizing the full value is to write only the smallest unit of logic to do a single scope of work and to keep the dependencies to a minimum. When working with Azure Functions, it’s best to keep to these practices:

  • An Azure Function should be single-purpose in nature. Think of it as a short, concise statement rather than a compound sentence.
  • Operations should be idempotent in nature. That means the resulting state of the system will be the same no matter how many times an API endpoint is called with the same parameters.
  • Keep execution times brief. The goal should be to receive input, perform the desired operations and get the results to the downstream consumers. For long-running processes, you might consider Azure WebJobs or even hosting the service in Azure Service Fabric.
  • In an attempt to keep the overall complexity low and execution quick, it’s best to minimize internal dependencies. Adding too much runtime weight will slow initial load times and add complexity to the system.
  • Integrate with external systems through input and output bindings. Common guidance for high-performance sites is to write stateless services, and it applies here, too. Statelessness keeps a service from being complicated or slowed by having to keep, serialize and deserialize runtime state, and it simplifies debugging because you don't have to discover and reproduce state to figure out what happened; you just pass the same parameter values back through. A sketch of a function that follows these practices appears after this list.
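To make the single-purpose, idempotent and stateless points concrete, here's a minimal sketch of an HTTP-triggered C# function that upserts a document keyed by a caller-supplied id. The binding and parameter names are illustrative rather than taken from our sample; because the DocumentDB output binding upserts on a matching id, replaying the same request leaves the system in the same state:

using System.Net;

public static async Task<HttpResponseMessage> Run(
  HttpRequestMessage req, IAsyncCollector<object> outputDocument)
{
  // Single purpose: record (or re-record) one reading, keyed by id
  dynamic body = await req.Content.ReadAsAsync<object>();
  if (body == null || body.id == null)
    return req.CreateResponse(HttpStatusCode.BadRequest,
      "Please pass an id in the request body");

  // The DocumentDB output binding upserts on a matching id, so calling
  // this again with the same payload produces the same system state
  await outputDocument.AddAsync(new { id = (string)body.id, reading = body });

  // Stateless: nothing is held between invocations
  return req.CreateResponse(HttpStatusCode.OK);
}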

Properties of serverless architecture include the following:

  • The unit of work in a serverless architecture takes the form of a stateless function invoked by events.
  • Scaling, capacity and infrastructure management are provided as a service.
  • Execution-based billing model where you only pay for the time your code is running.

Some challenges of serverless architecture include the following:

  • Complexity: While serverless architecture simplifies many things for the developer, the platform abstractions require you to rethink the way you build applications. Managing a monolithic application as a single unit is more straightforward than managing a fleet of purpose-built functions and the dependencies between them. This is where capabilities such as those provided by Azure API Management come into play, giving you a way to create a consistent, outward-facing namespace that ties together all of the discretely defined and managed functions.
  • Tooling: Because the platform is relatively new, the tools to write, debug and test functions are still under development. As of this writing, they're in preview, but Visual Studio 2017 will have the tooling built in. Management and monitoring tools are also still evolving, but there's a basic interface for Azure Functions where you can see requests, successes, errors and request-response details. There's plenty of work to do when it comes to tying the platform into application monitoring tools, and the Azure Functions team is working to grow and evolve such support.
  • Organizational Support: For some, moving to a serverless paradigm is a non-trivial consideration. Many organizations already find it challenging to move to a fully automated continuous integration (CI)/continuous delivery (CD) pipeline and a microservices architecture. The move to a serverless design can add to those difficulties, as it will often challenge current standards and require educating teams on what's available, how to tie it together and how to manage it.
  • No Runtime Optimization: In a traditional design, you might optimize the execution environment to the workload, changing the type and amount of resources such as RAM, swap, disk and network connectivity. With Azure Functions, only minimal changes can be made, such as which storage account to use.

Traditional Architecture vs. Serverless Architecture

A typical set of design artifacts for a system includes a logical design, a technical design and a software architecture. The logical design typically defines the capabilities of a system and what it does, whereas the technical design typically defines what the system is. As you move to a serverless architecture, you become more concerned with what a system does than with what it is. In many IT shops you'll find architecture diagrams that look something like Figure 2.

Figure 2 Traditional Technical Architecture

This type of artifact is particularly important to the folks in hosting, networking and the DBA group, as they need to know what to provision and configure. However, in a PaaS environment, you're focused on the functional properties and you leave the provisioning details to the platform, having only to define the configuration of the PaaS services themselves. That configuration input is kept in templates, and the conversations you have focus on capabilities, integration and functionality instead of on the optimal amounts of RAM, CPU and disk.

The resulting diagrams might be a bit closer to what you typically see in a logical diagram, as depicted in Figure 3. Such diagrams can focus on the functional parts and how they tie together, instead of on the configuration of application hosts. This simplification of message and clarification of purpose helps not only in technical discussions about intent and implementation, but inevitably also in conveying the value of a system to a non-technical audience, as the focus turns from what the system is to what it does for someone.

Figure 3 Serverless Architecture

Azure Functions in Action

The Azure IoT platform provides a rich set of services to support the collection and analysis of data. Our example entails an existing IoT implementation built to collect vehicle telemetry and store it in the cloud, with basic analytic functionality. We now want to explore adding new capabilities to the solution, such as being able to query the data in real time in order to find the closest vehicle to a given location. You might use this type of functionality in a free-floating car-sharing program or even to find your car in a parking lot.

We must note some caveats in this example implementation, as we're purposefully taking some shortcuts in order to demonstrate the tools and platform. First, we're working directly with a single-partition DocumentDB collection. In a more prime-time implementation we would, at minimum, have partitioned the DocumentDB collection based on the expected volume of data, and we might also have added Azure Redis Cache and a search service such as Elasticsearch to optimize some of the read paths. Second, because the current DocumentDB API would require a bit more client-side processing to get to the records we want to compare, we've taken the shortcut of simply asking for the top 100 records. Partitioning and search capabilities would be the more typical path to finding the needed records in a large set. In any case, you'd still use the function to find the candidate records, compare them to the given location and return the closest vehicle in the set.

Creating the Function

At the time of this writing, the Visual Studio tooling isn't available to help speed the development process. To that end, we'll start by creating the Azure Function via the portal interface. After you start the Azure Functions creation in the Azure portal, you'll be presented with the blade to set a few settings for your function. Note the options for the App Service plan selection: Consumption Plan and App Service Plan. Choosing between these two seems like the simplest of choices, but in reality you're choosing between the old style of managing resources and the idealistic goals of a serverless architecture. In choosing App Service Plan, you must make guesses about how much processing power is needed. If you choose too much, you're paying for resources you're not using; choose too little and you might have issues with your implementation that ultimately impact your customers and your bottom line. Consumption Plan is the preferred style, as it leaves the scaling up and down to the system and you, as the consumer, pay for only as much as your app consumes.

Your final step is to actually create the function itself. You're going to start with the WebHook premade function using C#. This preconfigures a Trigger to execute the function upon receiving an HTTP request. By selecting the Integrate item in the left menu, several options can be configured for the Trigger, Input and Output. The Trigger selection page has some useful information on it, including information about the WebHook bindings, how to use the API keys, and sample code for calling the WebHook from both C# and Node.js. Once presented with the dialog, you'll configure the Input and Output bindings, as shown in Figure 4.

Figure 4 Input Binding Configuration

While you can include libraries and adapters to make calls to external systems, the system works via Bindings and has first-class support for many data sources such as EventHubs and DocumentDB. The development experience is enhanced through the Bindings infrastructure that’s part of Azure Functions. Bindings abstract the actual target software infrastructure (DocumentDB, EventHubs, Storage and so on), letting the developer code to the parameter representing the target, while remaining flexible because the target can be changed by changing the configuration. Here, you’ve configured it to talk to the source DocumentDB. From that configuration you’ll be able to write code directly against the DocumentDB client within the function and you won’t have to make the connection yourself. Note the document parameter name: inputDocument. That’s the variable that’s passed into the function you’ll use to make calls to the DocumentDB.

The Output options include a number of storage, queuing and other external systems. All of the items that can be selected and configured through the UI can be accessed later via the function app's Settings UI, and can also be configured as part of a template or through the programmatic interfaces. Here, you're simply returning the JSON result over HTTP to the caller. Because the HTTP(res) output is already set up with an output parameter defined, you'll accept it as is.
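Behind the portal UI, these trigger and binding selections are persisted in the function's function.json file. A sketch of what the combined configuration might look like follows; the database, collection and connection names are placeholders, not the actual settings from our sample:

{
  "bindings": [
    {
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "authLevel": "function"
    },
    {
      "type": "documentDB",
      "direction": "in",
      "name": "inputDocument",
      "databaseName": "vehicletelemetry",
      "collectionName": "telemetry",
      "connection": "vehicletelemetry_DOCUMENTDB"
    },
    {
      "type": "http",
      "direction": "out",
      "name": "res"
    }
  ],
  "disabled": false
}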

Developing the Code

Once Develop is selected in the left menu, you're presented with an online editor. This is good for quick iterative changes and seeing your log in real time, and it gives you an interface for triggering the WebHook Function. Tooling that will give a richer experience is underway for both Visual Studio and Visual Studio Code. If you want to use an IDE with Azure Functions today, Visual Studio easily connects to the Function App through the Server Explorer.

There are a few things you need to do in editing the files. Right below the Code window there's a link for View Files that opens the File Explorer directly to the right of the Code window. First, you'll need the DocumentDB client. There are a number of libraries that are automatically available and can be referenced by using the #r directive at the top of the file. The list of those libraries, as well as a number of other pieces of developer information, can be found in the online developer reference at bit.ly/2gaWT9x. Because there's first-class support for DocumentDB, gaining access to the DocumentDB client objects is simply a matter of adding the #r directive for that assembly at the top of the file. If a library that you need isn't included, say one you maintain and publish to NuGet, it can be added to a project.json file (package.json in Node.js).
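For example, a project.json that pulls in a NuGet package would look something like the following; the package name and version here are placeholders for whatever you publish:

{
  "frameworks": {
    "net46": {
      "dependencies": {
        "MyCompany.HelperLib": "1.0.0"
      }
    }
  }
}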

At this point you’re ready to edit the run.csx file, which will hold all of your code. To do this you’ll edit it directly in the online IDE for Azure Functions, as shown in Figure 5.

Editing Function Code
Figure 5 Editing Function Code

Starting with the template code, first add your own custom external library to the function, as it contains the code for the haversine function. If you have custom classes or functions that aren't too large and are specific to the function, you can add them directly into the run.csx file. However, if you have reusable pieces, you'll want to go a different route and include their compiled version in a \bin folder, or reference them as NuGet packages via the project.json file and reference the library with #r. Alternatively, you could place the code into a different .csx file and use the #load directive.
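For instance, had we kept the helper as script rather than as a compiled library, pulling it in would be a one-line directive at the top of run.csx (the file name here is illustrative):

#load "haversine.csx"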

You need a couple of functions to help determine the distance between the vehicles for which you're checking proximity and the point passed into the function. It's been a while since school days, and it turns out that you don't regularly need a haversine formula. Wikipedia provides a good reference for it at bit.ly/2gCWrgb and we borrowed the C# function from bit.ly/2gD26mK and made a few changes. We've created the necessary functions as static members of a haversine class:

namespace SphericalDistanceLib
{
  public class Haversine
  {
    public static double CalculateDistance(
      double currentLong, double currentLat,
      double vehicleLong, double vehicleLat){…}
    public static double ToRadians(double degrees){...}
  }
}
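The bodies are elided here for brevity; a minimal implementation based on the standard haversine formula would look something like the following sketch, which uses the Earth's mean radius in kilometers (our borrowed-and-modified version may differ in detail):

using System;

namespace SphericalDistanceLib
{
  public class Haversine
  {
    // Mean radius of the Earth in kilometers; using kilometers here
    // means CalculateDistance returns kilometers
    const double EarthRadiusKm = 6371.0;

    public static double CalculateDistance(
      double currentLong, double currentLat,
      double vehicleLong, double vehicleLat)
    {
      double dLat = ToRadians(vehicleLat - currentLat);
      double dLon = ToRadians(vehicleLong - currentLong);
      double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2) +
                 Math.Cos(ToRadians(currentLat)) *
                 Math.Cos(ToRadians(vehicleLat)) *
                 Math.Sin(dLon / 2) * Math.Sin(dLon / 2);
      double c = 2 * Math.Atan2(Math.Sqrt(a), Math.Sqrt(1 - a));
      return EarthRadiusKm * c;
    }

    public static double ToRadians(double degrees)
    {
      return degrees * (Math.PI / 180.0);
    }
  }
}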

The code is compiled and then uploaded into a bin folder relative to the root folder of the function. In our case the path is FindNearVehicle/bin, but the bin folder must be created as it’s not there by default. With that bit of setup completed, you turn your focus to the code you need. At the top of the function you’ll need to ensure that you’re referencing any libraries you need. In particular, you need the DocumentDB client object types, Newtonsoft, and the custom library that was uploaded to the /bin folder. These get added at the top of the file using the #r directive:

#r "Microsoft.Azure.Documents.Client"
#r "Newtonsoft.Json"
#r "SphericalDistanceLib.dll"

As previously noted, you’ll grab the last 100 records based on the created_at timestamp field. DocumentDB has a nice SQL syntax that makes that pretty easy:

IQueryable<Document> telemDocs = inputDocument.CreateDocumentQuery<Document>(
  UriFactory.CreateDocumentCollectionUri(dbName, collectionName),
  new SqlQuerySpec("SELECT TOP 100 c.vehicle.vin, c.vehicle.model, " +
    "c.location FROM c ORDER BY c.created_at DESC"), queryOptions);

You’re using the Document type, which eases things a bit, because you’ll cast it to a Dynamic type to add properties to the object easily. The SQL is passed in the form of a SqlQuerySpec and you’ll project only the vin, model and location into your object. At this point you have to iterate the list of documents, calculate the distance using the haversine function in the external library, and determine the nearest one and return it. However, it gets a little tricky.

You need to keep track of all of the vins you've seen, because you want only the latest location record for each vin. Because the query orders documents by created_at in descending order, the first document you encounter for a given vin is that vehicle's most recent record. You'll check the vin for null, because you're looking at vehicles that are being driven, and if the vin is null you can assume the document is invalid. If it has a non-null value, you'll simply attempt to add the vin to a HashSet. If it's new to the set, the addition succeeds; if not, it fails and you know you already have the most recent record for that vehicle, as shown in Figure 6.

Figure 6 Last Known Location for Distinct Vins

HashSet<string> seen = new HashSet<string>();
foreach(dynamic doc in telemDocs){
  // Add returns false if the vin is already in the set, meaning a more
  // recent record for this vehicle has already been captured
  if(seen.Add(doc.vin)){
    // Calculate the distance from the given point
    if (doc.location != null) {
      doc.distance =
        Haversine.CalculateDistance(double.Parse(currentLong),
        double.Parse(currentLat), double.Parse(
        doc.location.lon.ToString()),
        double.Parse(doc.location.lat.ToString()));
      lastKnownLocations.Add(doc);
    }
    else{
      // Location is null, so we won't take this
      // record as last known location
      seen.Remove(doc.vin);
    }
  }
}

You add the entire document to the lastKnownLocations list, making it easy to turn around and query out the first document based on ordering by the least distance value:

// The guard belongs before the distance calculations; left until after
// them, it would come too late to prevent a double.Parse failure in the
// loop above
if (currentLat == null || currentLong == null)
  return req.CreateResponse(HttpStatusCode.BadRequest,
    "Please pass a lat and lon on the query string or in the request body");

var nearestVehicle = lastKnownLocations.OrderBy(
  x => ((dynamic)x).distance).First();
return req.CreateResponse(HttpStatusCode.OK, nearestVehicle);

Once the list is ordered by the distance value, the nearest vehicle is simply the first document, as seen in the last line; the guard clause handles a missing lat or lon parameter, and serialization of the document is handled for you.

Running the Example

The last step of this is to see it in action. At the top right of the development view there’s a link with a flask icon attached to it labeled Test. Clicking this will open the testing UI, as shown in Figure 7. Within it you can change the HTTP method, add parameters, add headers and set the body to be passed to the selected HTTP method. We typically use Postman when we’re doing a lot of this work, because we have a library of tests. However, the built-in testing facilities for HTTP functions are excellent and immediately available for quick iterative work.

Figure 7 Testing a Function
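The request body itself is just a small JSON document. Assuming the function reads lat and lon from the query string or request body, as sketched earlier, a test payload would look something like this (the coordinates are illustrative):

{
  "lat": "30.40",
  "lon": "-97.80"
}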

We’ve grabbed the latitude and longitude and formatted it into the expected JSON format in the Run window. Once Run is clicked, any output from calls to the log object and general log output can be seen in the Logs window with the output and the status seen on the bottom right in the Output window. Taking note of the timing on the log output it looks like it took around 250 ms for our function to run. That’s well within the execution model for which we’re striving with Azure Functions: singlepurpose and relatively short execution times. Pulling the content out of the Output window and formatting it, we can see a lot more clearly that we have the vehicle, the timestamp when the location was recorded, the location and the distance:

{
  "vin": "wmwrxxxxxxxx54251",
  "model": "Cooper Hardtop",
  "location": {
    "lat": 30.4xxxxxx,
    "lon": -97.8xxxxxx,
    "accuracy_m": 22.505,
    "ts": 1473116792970
  },
  "distance": 13.552438042837085
}

Because we used the Earth's radius in kilometers in the distance calculation, the distance represented in the return is about 13.6 km.

Wrapping Up

In the example, we used a mixture of online tools to develop and execute Azure Functions, but a more realistic approach for a development team would be to develop locally and set up continuous integration and deployment via a Git repo. Using the WebJobs.Script and WebJobs SDK, you can set up the ability to develop and run Azure Functions locally. A good walk-through of how to do that can be found at bit.ly/2hhUCt2. You'll also find that a number of different sources can be configured as the source for deployments.

Azure Functions is the new kid on the block in the Azure platform. It's the key serverless compute ingredient needed to achieve the benefits of a cloud PaaS implementation. Azure Functions turns the focus to the value of the code and away from managing infrastructure. The platform, tooling and language support continue to evolve, but Azure Functions already supports a number of languages and can be tied into your CI/CD pipeline. More information can be found at bit.ly/2ktywWE. If you're not already working with Azure Functions, you should give it a try. You can get started at bit.ly/2glBJnC.

The Shift to Serverless Compute

Serverless compute represents a fundamental paradigm shift in how we think about cloud computing and building applications for the cloud. It provides developers with a completely abstracted and infinitely scalable environment in which to execute their code, along with a payment model that's purely focused on code execution.

Serverless compute can be looked at as the next step in the evolution of Platform as a Service (PaaS). It fulfills the PaaS promise of abstracting the application infrastructure from the code and providing auto-scale capabilities. The major differentiators for serverless are the per-execution pricing model (rather than paying for the time the code is hosted) and instant, unlimited scale. Serverless compute provides a fully managed compute experience (zero administrative tasks), instant unlimited scale (zero scale configuration) and reaction to events (real-time processing). This enables developers to build a new class of applications that scale by design and, ultimately, are more resilient.

Developers building serverless applications use serverless compute offerings such as Azure Functions, but also an ever-growing number of fully managed Azure and third-party services, to compose working end-to-end serverless solutions. Developers today can easily set up and run complete serverless architectures that scale with ease, by design and at little cost. Developers are also freed from the burden of managing and monitoring infrastructure; they can focus on their business logic and solve problems related to the business, not to the maintenance of the infrastructure running their code.

Serverless is here to change the way we think about building applications and is here to stay for a long time. Join the conversation at bit.ly/2hhP41O and at bit.ly/2gcbPzr. Submit feature requests to bit.ly/2hhY3jq and reach out to us on Twitter: @Azure with the hashtag #AzureFunctions.    —Yochay Kiriaty


Darren Brust is a cloud solution architect at Microsoft where he spends most of his time assisting enterprise customers as they transition to Microsoft Azure. In his free time, you’re likely to find him at one of his three children’s sporting events or a local Austin coffee shop consuming copious amounts of coffee. You can reach him at dbrust@microsoft.com or on Twitter: @DarrenBrust.

Joseph Fultz is a cloud solution architect at Microsoft. He works with Microsoft customers developing architectures for solving business problems leveraging Microsoft Azure. Formerly, Fultz was responsible for the development and architecture of GM’s car-sharing program (mavendrive.com). Contact him on Twitter: @JosephRFultz or via e-mail at jofultz@microsoft.com.

Thanks to the following Microsoft technical expert who reviewed this article: Fabio Cavalcante

