F# on Windows Azure
Windows Azure was announced yesterday, along with the first CTP of the SDK and Visual Studio tools. If you haven’t yet tried it, go take a look. On top of serving as a hosting service for web applications, Azure also provides a really simple way to do distributed compute and storage in the cloud.
Azure supports running .NET applications, which means you can build Azure worker roles using F#! The tools released with Azure don’t have F# support out of the box though, so I’ve posted a few simple templates and samples up on Code Gallery.
Download
F# Templates and Samples for Windows Azure
Templates
Cloud WebCrawl Sample
namespace SearchEngine_WorkerRole

open System
open System.Threading
open Microsoft.ServiceHosting.ServiceRuntime
open System.Net
open System.IO
open System.Text.RegularExpressions
open Microsoft.Samples.ServiceHosting.StorageClient
open System.Web
open System.Runtime.Serialization.Formatters.Binary

type WorkerRole() =
    inherit RoleEntryPoint()

    // The page to start crawling from
    let startpage = @"https://blogs.msdn.com/lukeh"

    // The filter to apply to links while crawling
    let pageFilter = fun (url:string) -> url.StartsWith("https://blogs.msdn.com/")

    /// Get the contents of a given url
    let http(url: string) =
        let req = WebRequest.Create(url)
        use resp = req.GetResponse()
        use stream = resp.GetResponseStream()
        use reader = new StreamReader(stream)
        let html = reader.ReadToEnd()
        html

    /// Get the links from a page of HTML
    let linkPat = "href=\s*\"[^\"h]*(https://[^&\"]*)\""
    let getLinks text = [ for m in Regex.Matches(text,linkPat) -> m.Groups.Item(1).Value ]

    /// Handle the message msg using the given queue and blob container
    let HandleMessage (msg : Message) (queue : MessageQueue, container: BlobContainer) =
        // There was a new item, get the contents
        let url = msg.ContentAsString()
        let urlBlobName = HttpUtility.UrlEncode(url)
        // Don't get the page if we've already seen it
        if not(container.DoesBlobExist(urlBlobName))
        then
            do RoleManager.WriteToLog("Information", String.Format("Handling new url: '{0}'", url))
            try
                // Get the contents of the page
                let content = http url
                // Store the page into the blob store, encoded as UTF-8
                // (Encoding.Default would use the system ANSI code page instead)
                let props = new BlobProperties(urlBlobName)
                let _ = container.CreateBlob(props, new BlobContents(System.Text.Encoding.UTF8.GetBytes(content)), true)
                // Get the links from the page
                let links = getLinks content
                // Filter down the links and then create a new work item for each
                links
                |> Seq.filter pageFilter
                |> Seq.distinct
                |> Seq.filter (fun link -> not(container.DoesBlobExist(HttpUtility.UrlEncode(link))))
                |> Seq.iter (fun link -> queue.PutMessage(new Message(link)) |> ignore)
                queue.DeleteMessage(msg) |> ignore
            with
            // Ignore any failure; the message is left undeleted in that case
            | _ -> ()

    /// Main loop of worker process
    let rec Loop (queue : MessageQueue, container: BlobContainer) =
        // Get the next page to crawl from the queue
        let msg = queue.GetMessage(240)
        if msg = null
        then Thread.Sleep(1000)
        else HandleMessage msg (queue, container)
        Loop(queue,container)

    override wp.Start() =
        // Initialize the Blob storage
        let blobStorage = BlobStorage.Create(StorageAccountInfo.GetDefaultBlobStorageAccountFromConfiguration())
        let container = blobStorage.GetBlobContainer("searchengine")
        container.CreateContainer(null, ContainerAccessControl.Public) |> ignore
        // Initialize the Queue storage
        let queueStorage = QueueStorage.Create(StorageAccountInfo.GetDefaultQueueStorageAccountFromConfiguration())
        let queue = queueStorage.GetQueue("searchworker")
        queue.CreateQueue() |> ignore
        // Put an initial message in the queue, using the start page
        queue.PutMessage(new Message(startpage)) |> ignore
        // Begin the main loop, processing messages in the queue
        Loop(queue, container)

    override wp.GetHealthStatus() = RoleStatus.Healthy
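The link-extraction step of the sample can be tried on its own, without any Azure dependencies. The following is a minimal sketch that reuses the sample's regular expression and page filter; the `sampleHtml` value and the `toCrawl` name are made up for illustration and are not part of the original sample.

```fsharp
open System.Text.RegularExpressions

// Same pattern as the sample, written as a verbatim string
let linkPat = @"href=\s*""[^""h]*(https://[^&""]*)"""

// Same filter as the sample: only follow links on blogs.msdn.com
let pageFilter (url: string) = url.StartsWith("https://blogs.msdn.com/")

/// Get the links from a page of HTML
let getLinks (text: string) =
    Regex.Matches(text, linkPat)
    |> Seq.cast<Match>
    |> Seq.map (fun m -> m.Groups.[1].Value)
    |> List.ofSeq

// Hypothetical page content, used only to exercise the functions above
let sampleHtml =
    """<a href="https://blogs.msdn.com/lukeh">home</a>
<a href="https://blogs.msdn.com/lukeh">home again</a>
<a href="https://example.com/other">elsewhere</a>"""

// Apply the same filter-then-deduplicate steps as the worker
let toCrawl =
    getLinks sampleHtml
    |> Seq.filter pageFilter
    |> Seq.distinct
    |> List.ofSeq
// toCrawl = ["https://blogs.msdn.com/lukeh"]
```

Note that the pattern stops at `&`, so links whose query string contains an ampersand are not matched at all.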
Worker Roles
The code above implements a Worker Role: a process that runs in the background, waits for work requests, and processes them as they arrive. The worker role is configured to run 4 instances simultaneously, so there will be 4 copies of this worker processing work items as they come in. This gives implicit parallelism. In fact, the initial release of Azure will run one process per core, so you really are getting effective parallelism this way. Notice also that this requires the worker processes to be inherently stateless. Both aspects make the functional design approaches that are common in F# a natural fit for developing these worker roles.
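To illustrate why statelessness matters here, the following is a rough in-process sketch, not Azure code: several identical workers drain one shared queue, and because each work item is handled by a pure function, it does not matter which worker picks up which item. The names `handle`, `runWorker`, and `runWorkers` are hypothetical, and a `ConcurrentQueue` stands in for Azure queue storage.

```fsharp
open System.Collections.Concurrent

// Hypothetical work item handler: a pure function with no worker-local state
let handle (item: int) = item * item

/// One worker: keep taking items from the shared queue until it is empty
let rec runWorker (queue: ConcurrentQueue<int>) (results: ConcurrentBag<int>) =
    match queue.TryDequeue() with
    | true, item ->
        results.Add(handle item)
        runWorker queue results
    | false, _ -> ()

/// Run workerCount identical workers against the same queue
let runWorkers (workerCount: int) (items: seq<int>) =
    let queue = ConcurrentQueue<int>(items)
    let results = ConcurrentBag<int>()
    [ for _ in 1 .. workerCount -> async { runWorker queue results } ]
    |> Async.Parallel
    |> Async.RunSynchronously
    |> ignore
    // Sort so the result does not depend on which worker handled which item
    results |> Seq.sort |> List.ofSeq

// Four workers, mirroring the four instances mentioned above
let squares = runWorkers 4 [ 1 .. 10 ]
```

Changing the worker count changes how the items are divided up, but not the result, which is the property that lets Azure scale the number of instances freely.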
Queues and Blobs
This sample uses two of the three storage types supported by Windows Azure: queues and blobs (table storage is not used here). The queue storage holds the work items. The blob storage holds the pages visited during the web crawl. When an instance of the worker role starts, it connects to the blob store and the queue store, then puts an initial work item in the queue and goes into a loop processing work items out of the queue.
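The interplay between the two stores can be sketched in memory. This is only an approximation under stated assumptions: a `Queue` stands in for queue storage, a `Dictionary` stands in for blob storage, and `fakeWeb` is a made-up table of pages and their links that replaces real HTTP requests. None of these names come from the Azure SDK.

```fsharp
open System.Collections.Generic

// Hypothetical in-memory "web": each URL maps to the links found on that page
let fakeWeb =
    dict [
        "a", [ "b"; "c" ]
        "b", [ "a"; "c" ]
        "c", [ "d" ]
        "d", []
    ]

/// Crawl from startPage and return the pages in the order they were stored
let crawl (startPage: string) =
    let queue = Queue<string>()                    // stands in for queue storage
    let blobs = Dictionary<string, string list>()  // stands in for blob storage
    let visited = ResizeArray<string>()
    // Put an initial work item in the queue
    queue.Enqueue startPage
    while queue.Count > 0 do
        let url = queue.Dequeue()
        // Skip work items for pages that have already been stored
        if not (blobs.ContainsKey url) then
            let links = fakeWeb.[url]
            blobs.[url] <- links
            visited.Add url
            // Queue a new work item for each link not yet stored
            links
            |> Seq.distinct
            |> Seq.filter (fun link -> not (blobs.ContainsKey link))
            |> Seq.iter (fun link -> queue.Enqueue link)
    List.ofSeq visited

let order = crawl "a"
// order = ["a"; "b"; "c"; "d"]
```

Page "c" is queued twice in this run (once from "a" and once from "b"), and the second work item is skipped by the existence check. That check plays the same role as `DoesBlobExist` in the sample.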
Conclusion
Ideas for any other interesting F# applications on Windows Azure? Download the templates and samples.
Comments
Anonymous
October 29, 2008
Can't wait to try this, but I haven't access to the cloud services
Anonymous
October 29, 2008
Digitake - The Windows Azure SDK actually includes a local cloud emulation environment, called the "dev fabric", which allows you to start experimenting with building Azure apps right now, even if you don't yet have access to deploy the app into the production cloud.