July 2010
Volume 25 Number 07
The Working Programmer - Going NoSQL with MongoDB, Part 3
By Ted Neward | July 2010
Last time, I continued my exploration of MongoDB via the use of exploration tests. I described how to start and stop the server during a test, then showed how to capture cross-document references and discussed some of the reasoning behind the awkwardness of doing so. Now it’s time to explore some more intermediate MongoDB capabilities: predicate queries, aggregate functions and the LINQ support provided by the MongoDB.Linq assembly. I’ll also provide some notes about hosting MongoDB in a production environment.
When We Last Left Our Hero . . .
For reasons of space, I won’t review much of the previous articles; instead, you can read them online in the May and June issues at msdn.microsoft.com/magazine. In the associated code bundle, however, the exploration tests have been fleshed out to include a pre-existing sample set of data to work with, using characters from one of my favorite TV shows. Figure 1 shows a previous exploration test, by way of refresher. So far, so good.
Figure 1 An Example Exploration Test
[TestMethod]
public void StoreAndCountFamilyWithOid()
{
    var oidGen = new OidGenerator();
    var peter = new Document();
    peter["firstname"] = "Peter";
    peter["lastname"] = "Griffin";
    peter["_id"] = oidGen.Generate();
    var lois = new Document();
    lois["firstname"] = "Lois";
    lois["lastname"] = "Griffin";
    lois["_id"] = oidGen.Generate();
    peter["spouse"] = lois["_id"];
    lois["spouse"] = peter["_id"];
    var cast = new[] { peter, lois };
    var fg = db["exploretests"]["familyguy"];
    fg.Insert(cast);
    Assert.AreEqual(peter["spouse"], lois["_id"]);
    Assert.AreEqual(
        fg.FindOne(new Document().Append("_id",
            peter["spouse"])).ToString(), lois.ToString());
    Assert.AreEqual(2,
        fg.Count(new Document().Append("lastname", "Griffin")));
}
Calling All Old People . . .
In previous articles, the client code has fetched all documents that match a particular criterion (such as having a “lastname” field matching a given String or an “_id” field matching a particular Oid), but I haven’t discussed how to do predicate-style queries (such as “find all documents where the ‘age’ field has a value higher than 18”). As it turns out, MongoDB doesn’t use a SQL-style interface to describe the query to execute; instead, it uses ECMAScript/JavaScript, and it can in fact accept blocks of code to execute on the server to filter or aggregate data, almost like a stored procedure.
This provides some LINQ-like capabilities, even before looking at the LINQ capabilities supported by the MongoDB.Linq assembly. By specifying a document containing a field named “$where” and a code block describing the ECMAScript code to execute, arbitrarily complex queries can be created:
[TestMethod]
public void Where()
{
    // The predicate runs on the server once per document.
    ICursor women =
        db["exploretests"]["familyguy"].Find(
            new Document().Append("$where",
                new Code("this.gender === 'F'")));
    bool found = false;
    foreach (var d in women.Documents)
        found = true;
    // The message is reported only when the assertion fails.
    Assert.IsTrue(found, "No matching people found");
}
As you can see, the Find call returns an ICursor instance. The ICursor itself isn’t IEnumerable (meaning it can’t be used directly in the foreach loop), but it exposes a Documents property that is. If the query would return too large a set of data, the ICursor can be limited to the first n results by setting its Limit property to n.
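To make the semantics concrete, here is a minimal sketch in plain JavaScript of what a “$where” predicate amounts to: the expression is evaluated once per document with “this” bound to that document, and a limit simply stops after n matches. This is neither the driver API nor MongoDB’s actual implementation; the sample documents, their ages, and the name findWhere are all hypothetical.

```javascript
// Hypothetical sample documents; the real data lives in the article's code bundle.
const family = [
  { firstname: "Peter", gender: "M", age: 42 },
  { firstname: "Lois", gender: "F", age: 40 },
  { firstname: "Meg", gender: "F", age: 17 },
];

// Illustration only: mimics a "$where" filter by compiling the expression into
// a function and calling it with `this` bound to each document in turn.
function findWhere(docs, expression, limit = Infinity) {
  const predicate = new Function("return (" + expression + ");");
  const results = [];
  for (const doc of docs) {
    if (results.length >= limit) break; // emulates the cursor's Limit setting
    if (predicate.call(doc)) results.push(doc);
  }
  return results;
}

const women = findWhere(family, "this.gender === 'F'");
const firstWoman = findWhere(family, "this.gender === 'F'", 1);
console.log(women.map(d => d.firstname));      // Lois and Meg
console.log(firstWoman.map(d => d.firstname)); // Lois only
```

Because the predicate is ordinary code, anything the language can express about a single document can serve as a filter, which is what gives “$where” its stored-procedure flavor.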
The predicate query syntax comes in four different flavors, shown in Figure 2.
Figure 2 Four Different Predicate Query Syntaxes
[TestMethod]
public void PredicateQuery()
{
    // CountDocuments is a test helper (not shown) that counts
    // the documents a cursor returns.
    // Form 1: an operator document ("$gt" means "greater than")
    ICursor oldFolks =
        db["exploretests"]["familyguy"].Find(
            new Document().Append("age",
                new Document().Append("$gt", 18)));
    Assert.AreEqual(6, CountDocuments(oldFolks));
    // Form 2: a "$where" field holding an ECMAScript expression
    oldFolks =
        db["exploretests"]["familyguy"].Find(
            new Document().Append("$where",
                new Code("this.age > 18")));
    Assert.AreEqual(6, CountDocuments(oldFolks));
    // Form 3: the expression passed directly as a string
    oldFolks =
        db["exploretests"]["familyguy"].Find("this.age > 18");
    Assert.AreEqual(6, CountDocuments(oldFolks));
    // Form 4: a "$where" field holding a complete function
    oldFolks =
        db["exploretests"]["familyguy"].Find(
            new Document().Append("$where",
                new Code("function(x) { return this.age > 18; }")));
    Assert.AreEqual(6, CountDocuments(oldFolks));
}
In the last three forms, “this” always refers to the document being examined.
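The equivalence of these forms can be sketched in plain JavaScript. The following is only an illustration of the idea, assuming a tiny hypothetical matcher that understands nothing but “$gt”; the documents, ages, and function names are made up, and MongoDB’s real query engine is far more capable.

```javascript
// Hypothetical documents, for illustration only.
const people = [
  { firstname: "Stewie", age: 1 },
  { firstname: "Chris", age: 14 },
  { firstname: "Brian", age: 30 },
  { firstname: "Peter", age: 42 },
];

// Form 1: an operator document such as { age: { $gt: 18 } }.
// Only "$gt" is handled here; MongoDB supports many more operators.
function matchesOperatorDoc(doc, query) {
  return Object.entries(query).every(([field, condition]) =>
    Object.entries(condition).every(([op, operand]) => {
      if (op === "$gt") return doc[field] > operand;
      throw new Error("operator not covered by this sketch: " + op);
    }));
}

// Forms 2 and 3: a bare expression string, evaluated with `this` set to the document.
function matchesExpression(doc, expression) {
  return Boolean(new Function("return (" + expression + ");").call(doc));
}

// Form 4: a complete function whose body reads `this`.
function matchesFunction(doc, fn) {
  return Boolean(fn.call(doc));
}

const names = docs => docs.map(d => d.firstname);
const byOperator = names(people.filter(d => matchesOperatorDoc(d, { age: { $gt: 18 } })));
const byExpression = names(people.filter(d => matchesExpression(d, "this.age > 18")));
const byFunction = names(people.filter(d =>
  matchesFunction(d, function () { return this.age > 18; })));
console.log(byOperator, byExpression, byFunction); // each lists Brian and Peter
```

The operator-document form describes the condition as data, while the other forms hand the server code to run against each document.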
In fact, you can send arbitrary commands through the driver to the database, using documents to convey the query or command. So, for example, the Count method provided by the IMongoCollection interface is really just a convenience around this more verbose snippet:
[TestMethod]
public void CountGriffins()
{
    var resultDoc = db["exploretests"].SendCommand(
        new Document()
            .Append("count", "familyguy")
            .Append("query",
                new Document().Append("lastname", "Griffin"))
        );
    Assert.AreEqual(6, (double)resultDoc["n"]);
}
This means that any of the aggregate operations described by the MongoDB documentation, such as “distinct” or “group,” are accessible via the same mechanism, even though they may not be surfaced as methods on the MongoDB.Driver APIs.
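As a rough illustration of how such command documents are shaped, the sketch below builds the “count” document used in CountGriffins and a comparable “distinct” document in plain JavaScript. The helper names countCommand and distinctCommand are hypothetical; the “key” field follows the MongoDB documentation for the distinct command.

```javascript
// Builds the same command document that CountGriffins sends from C#.
// The command name must be the first field; its value is the collection name.
function countCommand(collection, query) {
  return { count: collection, query: query };
}

// The "distinct" command has the same shape, with "key" naming the field
// whose unique values are wanted.
function distinctCommand(collection, key, query) {
  const command = { distinct: collection, key: key };
  if (query !== undefined) command.query = query;
  return command;
}

const countGriffins = countCommand("familyguy", { lastname: "Griffin" });
const distinctLastNames = distinctCommand("familyguy", "lastname");
console.log(JSON.stringify(countGriffins));
console.log(JSON.stringify(distinctLastNames));
// In the mongo shell, after "use exploretests", either could be run with
// db.runCommand(countGriffins) or db.runCommand(distinctLastNames).
```

From C#, the equivalent document would be assembled with the same chain of Append calls shown in CountGriffins and passed to SendCommand.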
You can send arbitrary commands outside of a query to the database via the “special-name” syntax “$eval,” which allows any legitimate ECMAScript block of code to be executed against the server, again essentially as a stored procedure:
[TestMethod]
public void UseDatabaseAsCalculator()
{
    var resultDoc = db["exploretests"].SendCommand(
        new Document()
            .Append("$eval",
                new CodeWScope {
                    Value = "function() { return 3 + 3; }",
                    Scope = new Document() }));
    TestContext.WriteLine("eval returned {0}", resultDoc.ToString());
    Assert.AreEqual(6, (double)resultDoc["retval"]);
}
Alternatively, call the Eval function provided on the database object directly. If that isn’t flexible enough, MongoDB permits the storage of user-defined ECMAScript functions on the database instance for execution during queries and server-side execution blocks by adding ECMAScript functions to the special database collection “system.js,” as described on the MongoDB Web site.
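As a hedged sketch of what such a stored function might look like, the JavaScript below defines a hypothetical isAdult helper and the document that would carry it into “system.js”. The function name, the “age” field, and the sample values are assumptions for illustration; the save call appears only as a comment because it needs a live mongo shell session.

```javascript
// A hypothetical helper one might want available to server-side code.
function isAdult(doc) {
  return doc.age > 18;
}

// Documents in "system.js" pair a function name (_id) with its code (value).
const storedFunction = { _id: "isAdult", value: isAdult };

// In the mongo shell, it could be saved with:
//   db.system.js.save(storedFunction)
// after which a "$where" expression such as "isAdult(this)" could call it.

// Exercising the function locally, outside the database:
console.log(storedFunction.value({ firstname: "Peter", age: 42 })); // true
console.log(storedFunction.value({ firstname: "Stewie", age: 1 })); // false
```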
The Missing LINQ
The C# MongoDB driver also has LINQ support, allowing developers to write MongoDB client code such as what’s shown in Figure 3.
Figure 3 An Example of LINQ Support
[TestMethod]
public void LINQQuery()
{
    var fg = db["exploretests"]["familyguy"];
    var results =
        from d in fg.Linq()
        where ((string)d["lastname"]) == "Brown"
        select d;
    bool found = false;
    foreach (var d in results)
    {
        found = true;
        TestContext.WriteLine("Found {0}", d);
    }
    Assert.IsTrue(found, "No Browns found?");
}
And, in keeping with the dynamic nature of the MongoDB database, this sample requires no code generation, just the call to Linq to return an object that “enables” the MongoDB LINQ provider. At the time of this writing, LINQ support is fairly rudimentary, but it’s being improved, and by the time this article reaches print it will be significantly better. Documentation of the new features and examples will be in the wiki of the project site.
Shipping Is a Feature
Above all else, if MongoDB is going to be used in a production environment, a few things need to be addressed to make it less painful for the poor chaps who have to keep the production servers and services running.
To begin, the server process (mongod.exe) needs to be installed as a service—running it in an interactive desktop session is typically not allowed on a production server. To that end, mongod.exe supports a service install option, “--install,” which installs it as a service that can then be started either by the Services panel or the command line: “net start MongoDB.” However, as of this writing, there’s one small quirk in the --install command—it infers the path to the executable by looking at the command line used to execute it, so the full path must be given on the command line. This means that if MongoDB is installed in C:\Prg\mongodb, you must install it as a service at a command prompt (with administrative rights) with the command C:\Prg\mongodb\bin\mongod.exe --install.
However, any command-line parameters, such as “--dbpath,” must also appear in that installation command, which means if any of the settings (port, path to the data files and so on) change, the service must be reinstalled. Fortunately, MongoDB supports a configuration file option, given by the “--config” command-line option, so typically the best approach is to pass the full config file path to the service install and do all additional configuration from there:
C:\Prg\mongodb\bin\mongod.exe --config C:\Prg\mongodb\bin\mongo.cfg --install
net start MongoDB
As usual, the easiest way to test to ensure the service is running successfully is to connect to it with the mongo.exe client that ships with the MongoDB download. And, because the server communicates with the clients via sockets, you need to poke the required holes in the firewall to permit communication across servers.
These Aren’t the Data Droids You’re Looking For
Of course, unsecured access to the MongoDB server isn’t likely to be a good thing, so securing the server against unwanted visitors becomes a key feature. MongoDB supports authentication, but the security system isn’t anywhere near as sophisticated as that found with “big iron” databases such as SQL Server.
Typically, the first step is to create a database admin login by connecting to the database with the mongo.exe client and adding an admin user to the admin database (a database containing data for running and administering the entire MongoDB server), like so:
> use admin
> db.addUser("dba", "dbapassword")
Once this is done, any further actions, even within this shell, will require authenticated access, which is done in the shell by explicit authentication:
> db.authenticate("dba", "dbapassword")
The DBA can now add users to a MongoDB database by changing databases and adding the user using the same addUser call shown earlier:
> use mydatabase
> db.addUser("billg", "password")
When connecting to the database via MongoDB.Driver, pass the authentication information as part of the connection string used to create the Mongo object and the same authentication magic will happen transparently:
var mongo = new Mongo("Username=billg;Password=password");
Naturally, passwords shouldn’t be hardcoded directly into the code or stored openly; use the same password discipline as befits any database-backed application. In fact, the entire configuration (host, port, password and so on) should be stored in a configuration file and retrieved via the ConfigurationManager class.
Reaching Out to Touch Some Code
Periodically, administrators will want to look at the running instance to obtain diagnostic information about the running server. MongoDB supports an HTTP interface for interacting with it, running on a port numerically 1,000 higher than the port it’s configured to use for normal client communication. Thus, because the default MongoDB port is 27017, the HTTP interface can be found on port 28017, as shown in Figure 4.
Figure 4 The HTTP Interface for Interacting with MongoDB
This HTTP interface also permits a more REST-style communication approach, as opposed to the native driver in MongoDB.Driver and MongoDB.Linq; the MongoDB Web site has full details, but essentially the HTTP URL for accessing a collection’s contents is given by adding the database name and collection name, separated by slashes, as shown in Figure 5.
Figure 5 The HTTP URL for Accessing a Collection’s Contents
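The port arithmetic and URL layout described above can be sketched in a few lines of JavaScript. This is only an illustration: the function names are hypothetical, and the “localhost” host and the trailing slash are assumptions rather than details given in this article.

```javascript
// The HTTP interface listens 1,000 ports above the client port.
function httpPort(clientPort) {
  return clientPort + 1000;
}

// Builds the URL for a collection's contents: host, HTTP port, then the
// database and collection names separated by slashes.
function collectionUrl(host, clientPort, database, collection) {
  return "http://" + host + ":" + httpPort(clientPort) + "/" +
    encodeURIComponent(database) + "/" + encodeURIComponent(collection) + "/";
}

const url = collectionUrl("localhost", 27017, "exploretests", "familyguy");
console.log(url); // http://localhost:28017/exploretests/familyguy/
```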
For more details on creating a REST client using WCF, refer to the MSDN article “REST in Windows Communication Foundation (WCF)”.
A Word from Yoda
MongoDB is a quickly evolving product and these articles, while exploring core parts of MongoDB’s functionality, still leave major areas unexamined. While MongoDB isn’t a direct replacement for SQL Server, it’s proving to be a viable storage alternative for areas where the traditional RDBMS doesn’t fare so well. Similarly, just as MongoDB is an evolution in progress, so is the mongodb-csharp project. At the time of this writing, many new improvements were going into beta, including enhancements for working with strongly typed collections using plain objects, as well as greatly improved LINQ support. Keep an eye on both.
In the meantime, however, it’s time to wave farewell to MongoDB and turn our attention to other parts of the developer’s world that the working programmer may not be familiar with (and arguably should be). For now, though, happy coding, and remember, as the great DevGuy Master Yoda once said, “A DevGuy uses the Source for knowledge and defense; never for a hack.”
Ted Neward is a principal with Neward & Associates, an independent firm specializing in enterprise Microsoft .NET Framework and Java platform systems. He’s written more than 100 articles, is a C# MVP, INETA speaker and the author and coauthor of a dozen books, including “Professional F# 2.0” (Wrox, 2010). He consults and mentors regularly. Reach him at ted@tedneward.com and read his blog at blogs.tedneward.com.
Thanks to the following technical experts for reviewing this article: Sam Corder and Craig Wilson