

Standard I/O

Console Applications in .NET, or Teaching a New Dog Old Tricks

Michael Brook

Code download available at: NETConsoleApps.exe (228 KB)

This article assumes you're familiar with C#

SUMMARY

The Microsoft .NET Framework is not just about Windows Forms and Web services. This article discusses the simplest kind of Framework-based application—the console app—along with the frequently overlooked constructs of standard input/output and the pipe. When designed carefully, console applications offer a surprisingly powerful way of solving complex programming problems. One of the more exciting aspects of this approach is that while each application in the pipe is fairly simple, the result of their interaction can be a relatively complex task. Here the author explores the ins and outs of writing these console apps.

Contents

The Command Prompt
Building Console Apps in .NET
A Design Pattern for Console Apps
Controlling Console Apps with the Process Class
Yes, But Can it Read an RSS Feed?
Conclusion

The Microsoft® .NET Framework is more than Windows® Forms and Web services. In this article, I'll explore the simplest kind of Framework-based application—the console application—along with the frequently overlooked constructs of standard input/output and the pipe. If you are anything like me, you'll find yourself full of new ideas once you become familiar with these venerable concepts.

So, what is a console app? It's tempting to say that console applications are applications invoked from the Windows command prompt (cmd.exe), though this wouldn't be strictly true. Simply put, a console app is any program with access to three basic data streams: standard input, standard output, and standard error. As the names suggest, standard input represents data flowing into the program, standard output represents data flowing out, and standard error represents a special kind of data flowing out (namely, error messages). Along with its command-line arguments, these data streams represent the runtime context of the console app.

It's easy to spot a console application running in Windows; it's the familiar black box with gray text. This is the way Windows makes the standard data streams available to you. Whatever you type (or paste) into the black window becomes standard input to the console application and whatever text you see displayed is actually the standard output and/or standard error (it can be difficult to tell them apart sometimes). Here's an example. If you select Run from the Windows Start menu and type the following, a console window pops up:

at.exe /delete

The console window states: "This operation will delete all scheduled jobs. Do you want to continue this operation?" This is actually the standard output of the program, which is paused while it waits for your response to appear on standard input. By typing "n" and pressing Enter, you are sending at.exe some standard input and causing it to perform some actions (in this case, you are saying "no," causing the program to end). I chose this example because the at.exe /delete command waits for standard input from the user. Other commands (at.exe by itself, for instance) simply send something to standard output and exit before you can see what it was.

The Command Prompt

Console applications, then, are just programs with access to the three basic data streams. Windows makes the data streams available to you through the console window. So far, this is probably more nostalgic than useful, but Windows comes with a special console application called cmd.exe that allows you to do some more interesting things with the three standard data streams. For example, open a command prompt and type the following ("C:\>" of course represents the actual command prompt and should not be typed into the box):

C:\> at.exe /?

You should see some reference information for the at.exe program. Now type the following:

C:\> at.exe /? >atref.txt

Although nothing appears on the screen, the atref.txt file now contains the reference information for the at.exe program. Cmd.exe provides the special > (greater-than) operator for sending standard output to a file. The important point is that the > operator is not part of the operating system, but part of cmd.exe. If you need proof, try typing the same command line into the Run box of your Windows Start menu (thus bypassing cmd.exe altogether). You will see the Windows console briefly appear, but atref.txt is nowhere to be found.

Cmd.exe isn't a magical window on the guts of the operating system—it is just a particularly useful console application that allows you to do some convenient (and powerful) things with the standard data streams of other console applications. This becomes very interesting when you consider that standard input and output are really compatible ideas and that standard output from one program can serve as standard input to another, and so on and on. This idea is nearly as old as the idea of standard data streams themselves and dates back to a version of Unix that first appeared in 1972. The mechanism for linking together console applications in such a way is called the "pipe." Its command-line representation (on Windows and Unix) is the vertical bar character (|). Since its inception, the pipe has had a profound influence on Unix software, which is famous for being composed of many simple, single-purpose, "pipeable" programs.

Cmd.exe supports pipes too. For example, the following command line sends the reference information for the at.exe program to a program called findstr.exe:

C:\> at.exe /? | findstr.exe "\/delete"

Findstr.exe is a pattern-matching text searcher that comes with Windows. In this case, you are telling it to output only the lines with the word "/delete" in them. The findstr.exe program is highly specific in its purpose. So are most of the basic Windows commands, for that matter. The pipe allows you to string them together to form complex commands. If cmd.exe didn't allow use of the pipe, you would end up with a proliferation of console applications that differed only in some small detail, such as the ability to sort or filter their output. Or, you would end up with prohibitively complex commands with lengthy and confusing command-line switches. Without the pipe, the command I just showed might have been written like so:

C:\> at.exe /? /filter:\/delete /write:atref.txt

This would require the developer of at.exe to understand string pattern matching and file writing in addition to what the at.exe program actually does.
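With the pipe and the > operator, the same result composes from tools that already exist, with no new switches needed:

C:\> at.exe /? | findstr.exe "\/delete" >atref.txt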

The world of console applications is like an ecosystem, in which programs forge symbiotic relationships and thus become useful. This is especially true on Unix, where that ecosystem has existed much longer. Some of the most useful programs on Unix ("sort," for example) are utterly useless on their own. For some reason, interoperability via the pipe and standard I/O is a relative rarity in the world of Windows. This might be because Windows has historically been more graphically oriented than Unix, or perhaps it is because Visual Basic® (through version 6.0) didn't make it easy to write console applications. Regardless, the .NET Framework takes a great leap forward by looking back. It is easier than ever to write applications that take advantage of standard I/O and the pipe. The rest of this article looks at two sides of console applications—first, how to build them using the .NET Framework, and second, how to interoperate with them from the .NET Framework.

Building Console Apps in .NET

Different programming environments have different semantics for producing console applications. In VBScript, for example, the script engine starts executing code at the top of the file and continues until there is no more code to execute. In .NET, the common language runtime (CLR) looks for a specific entry point in the compiled executable. Namely, it looks for a static method marked with the .entrypoint IL directive (often the Main method). The CLR calls the entry method and the program proceeds from there. When you create a new C# console application in Visual Studio® .NET, you are given a simple class with just a static Main method:

using System;

namespace ConsoleApplication1
{
    class Class1
    {
        [STAThread]
        static void Main(string[] args)
        {
        }
    }
}

When the C# compiler compiles this code, it marks Main with the .entrypoint IL directive, making it the application's starting method. This is the simplest possible console application—it compiles but doesn't actually do anything useful when you run it. Notice how the Main method takes an array of strings as its only argument. This represents the command-line parameters passed to the console application (unlike with standard C applications, the name of the executable is not passed to the application as the first argument). Also notice how the Main method doesn't return anything. Although console apps can return an int (which can then be queried via ERRORLEVEL), return values aren't needed for pipe applications.
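As a quick sketch of the exit-code mechanism (the ExitCodeDemo class and its Run helper are my own illustration, not part of the article's code download), a Main that reports failure through its return value might look like this:

```csharp
using System;

class ExitCodeDemo
{
    // The CLR uses the int returned from Main as the process exit code.
    // Main delegates to a helper so the logic is easy to exercise directly.
    static int Main(string[] args)
    {
        return Run(args, Console.Out, Console.Error);
    }

    public static int Run(string[] args,
        System.IO.TextWriter output, System.IO.TextWriter error)
    {
        if (args.Length == 0)
        {
            error.WriteLine("Usage: ExitCodeDemo <name>");
            return 1;   // nonzero: cmd.exe sees this as ERRORLEVEL 1
        }
        output.WriteLine("Hello, " + args[0]);
        return 0;       // zero conventionally means success
    }
}
```

Running ExitCodeDemo.exe with no arguments and then typing echo %ERRORLEVEL% at the next prompt would print 1.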

All communication with the outside world is done through the standard data streams. Using the System.Console.In, System.Console.Out, and System.Console.Error static properties, the .NET Framework makes the standard data streams available to the console application. System.Console.In is an instance of System.IO.TextReader; Out and Error are instances of System.IO.TextWriter. This proves to be a big advantage since you can work with the standard data streams just as you would a file or network stream.

Figure 1 shows a simple example of working with standard input and output. This is the code for a program called Passthrough, a console application that simply passes standard input through to standard output. You can compile it through Visual Studio .NET or using the command-line tools. For your reference, this and all other examples in this article will compile with a command-line call to the C# compiler, such as:

C:\> csc.exe /t:exe /out:Passthrough.exe Passthrough.cs

If you were to run this program as follows, it would appear as though you ran the dir command by itself:

C:\> dir | Passthrough.exe

Figure 1 Passthrough.exe

using System;

namespace ConsoleApps
{
    class Passthrough
    {
        [STAThread]
        static void Main(string[] args)
        {
            string currentLine = Console.In.ReadLine();
            while(currentLine != null)
            {
                Console.Out.WriteLine(currentLine);
                currentLine = Console.In.ReadLine();
            }
        }
    }
}

An interesting property of console applications that read from standard input is that they can be used (and tested) interactively. For example, if you run Passthrough by itself (Passthrough.exe), it appears as though the program is hung—cmd.exe doesn't give you back the command prompt. If you type something and press Enter, the program echoes it back to you. Passthrough is running just as before, except that it is getting standard input from the keyboard instead of from the dir command. If you were to check your Windows Task Manager, you would see a task for Passthrough.exe. This is to be expected—Passthrough is a normal executing program that Windows identifies with a process ID. Try running the following command:

C:\> Passthrough.exe | Passthrough.exe

Here, the output of your interactive session with Passthrough routes to another instance of Passthrough, whose output routes to the console. If you check Task Manager again, you will see two instances of Passthrough.exe. Why? Because both instances of Passthrough are running simultaneously. In fact, you would see an instance of every program that you included in a pipeline. What does this mean? Remember, pipelines are the domain of cmd.exe—they have no meaning to the underlying operating system. When you construct a pipeline, as in the example I just showed, cmd.exe does the work of launching the required applications and routing their standard data streams according to the specifications of the pipeline. To the operating system, they are separate processes; you can see this through Task Manager. To the user, however, cmd.exe makes it look as though there were a single executable at work.

Most console applications process one line of standard input at a time and send one line to standard output at a time. As soon as a program that is upstream in a pipeline is finished with a line of input, it is available to be processed by a downstream program. The upstream program might have moved well beyond the line that the downstream program is currently working on. While this design will not necessarily improve the overall performance of the entire pipeline, it will tend to result in a lower overall memory requirement as well as a shorter time until the first line of output is finished being processed by the whole pipeline. There are circumstances where line-at-a-time processing is impossible (such as a sorting program or a program that needs to work with an in-memory XML tree). In these cases, standard input needs to be read in its entirety before anything can be sent to standard output. Meanwhile, downstream programs in the pipeline sit idle, waiting for something to appear on their standard input.
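As a sketch of that buffering case (the SortFilter class here is my own example, not part of the article's code download), a sorting filter has to drain standard input completely before it can emit a single line:

```csharp
using System;
using System.Collections;
using System.IO;

class SortFilter
{
    // Reads the input to the end, sorts the lines, then writes them out.
    // Nothing reaches the writer until the reader is exhausted, which is
    // why downstream programs in the pipeline sit idle in the meantime.
    public static void SortLines(TextReader input, TextWriter output)
    {
        ArrayList lines = new ArrayList();
        string currentLine = input.ReadLine();
        while (currentLine != null)
        {
            lines.Add(currentLine);          // buffer the entire input
            currentLine = input.ReadLine();
        }
        lines.Sort();                        // needs the whole data set
        foreach (string line in lines)
            output.WriteLine(line);
    }

    static void Main()
    {
        SortLines(Console.In, Console.Out);
    }
}
```

Run as dir /b | SortFilter.exe, it would emit the directory listing in sorted order, but only after dir had finished.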

Just about anything can be accomplished in the Main routine of a console application. For example, the Main routine in Figure 2 builds on the idea of the Passthrough example but performs a regular expressions-based replacement on each line of standard input. If you were to run the compiled application as follows, you would see the normal output of dir, but with all instances of the string "<DIR>" replaced with "****":

C:\> dir | Replace.exe "<DIR>" "****"

Figure 2 Main

static void Main(string[] args)
{
    if(args.Length < 2)
    {
        Console.Error.WriteLine("Usage:");
        Console.Error.WriteLine("Replace \"search\" \"replace\"");
        return;
    }
    System.Text.RegularExpressions.Regex re =
        new System.Text.RegularExpressions.Regex(args[0]);
    string currentLine = Console.In.ReadLine();
    while(currentLine != null)
    {
        Console.Out.WriteLine(re.Replace(currentLine, args[1]));
        currentLine = Console.In.ReadLine();
    }
}

A Design Pattern for Console Apps

You could continue to put all processing logic in the Main routine of your console application, but eventually you would get tired of always typing the same basic code. All of your programs would start to look pretty similar to Passthrough and Replace. It would be nice if you could inherit some functionality and simply change the parts that you need to change. You can achieve such an effect with a design pattern known as the Template Method. In it, you create a skeleton algorithm in a base class and then allow child classes to fill in the details. Expressed in pseudocode, the skeleton algorithm might look like the following pseudocode snippet:

If the arguments to the program are valid then
    Do necessary pre-processing
    For every line in the input
        Do necessary input processing
    Do necessary post-processing
Otherwise
    Show the user a friendly usage message

It then becomes each subclass's responsibility to define what pre-processing, post-processing, and input processing actually mean to that particular subclass. You could put logic like this in the static Main method, effectively making it the skeleton algorithm, but this presents two problems. First, console apps compile to executables. While .NET executables can be referenced as class libraries, Visual Studio .NET makes that difficult to do. Thus, it's hard to allow subclasses to fill in the details because it's difficult to create subclasses in the first place.

The second problem is that Main is a static method. Static methods don't belong to instances of a class; they belong to the class itself. This means that all the steps of the algorithm (pre-process, post-process, and so on) would have to be implemented as static methods as well, preventing them from being overridden. Only instance methods can be overridden and, as such, the methods representing the steps of the skeleton algorithm need to be instance methods of a class.

The best way to get around these limitations is to separate each console application into two classes. The first class is just the normal console application class with the static Main entry point (call it the "chassis"). Instead of containing processing logic, however, the Main routine creates an instance of the second class (call it the "engine") and delegates processing to it. For the Passthrough example, the chassis would look like the code in Figure 3. Notice how the engine class has a similar entry point, Main, but that it is an instance method rather than a static method. The Main method of the engine class will become your skeleton algorithm, allowing you to employ the Template Method design pattern. ConsoleEngineBase is the name of the class that will hold the implementation of the skeleton algorithm and all engines will inherit from it, supplying their own versions of the algorithm's steps along the way. The complete code for ConsoleEngineBase is given in Figure 4.

Figure 4 ConsoleEngineBase

using System;

namespace Engines
{
    public abstract class ConsoleEngineBase
    {
        private string[] m_args;
        protected bool ReadInput = true;
        protected System.IO.TextReader In = null;
        protected System.IO.TextWriter Out = null;
        protected System.IO.TextWriter Error = null;

        public ConsoleEngineBase()
        {
            //by default, read from/write to standard streams
            this.In = System.Console.In;
            this.Out = System.Console.Out;
            this.Error = System.Console.Error;
        }

        public void Main(string[] args)
        {
            this.m_args = args;
            if(this.ValidateArguments())
            {
                this.PreProcess();
                if(this.ReadInput)
                {
                    string currentLine = this.In.ReadLine();
                    while(currentLine != null)
                    {
                        this.ProcessLine(currentLine);
                        currentLine = this.In.ReadLine();
                    }
                }
                this.PostProcess();
            }
            else
                this.Error.Write("Usage: " + this.Usage());
        }

        public void Main(string[] args, System.IO.TextReader In,
            System.IO.TextWriter Out, System.IO.TextWriter Error)
        {
            //this version of Main allows alternate streams
            this.In = In;
            this.Out = Out;
            this.Error = Error;
            this.Main(args);
        }

        protected virtual bool ValidateArguments()
        {
            //override this to add custom argument checking
            return true;
        }

        protected virtual string Usage()
        {
            //override this to add custom usage statement
            return "";
        }

        protected virtual void PreProcess()
        {
            //override this to add custom logic that
            //executes just before standard in is processed
            return;
        }

        protected virtual void PostProcess()
        {
            //override this to add custom logic that
            //executes just after standard in is processed
            return;
        }

        protected virtual void ProcessLine(string line)
        {
            //override this to add custom processing
            //on each line of input
            return;
        }

        protected string[] Arguments
        {
            get {return this.m_args;}
        }
    }
}

Figure 3 PassthroughChassis

using System;
using Engines;

namespace ConsoleApps
{
    class PassthroughChassis
    {
        [STAThread]
        static void Main(string[] args)
        {
            Engines.ConsoleEngineBase engine = new PassthroughEngine();
            engine.Main(args);
        }
    }
}

Since PassthroughEngine inherits from ConsoleEngineBase, it can be implemented very simply. The complete PassthroughEngine class is shown here:

public class PassthroughEngine : ConsoleEngineBase
{
    protected override void ProcessLine(string line)
    {
        this.Out.WriteLine(line);
    }
}

Passthrough doesn't need to bother with any special pre- or post-processing or checking of arguments, so it doesn't override these steps of the algorithm. It needs to do one thing—echo every line of input to output—and it does this by providing its own implementation of ProcessLine. Figure 5 summarizes the interactions between the classes in this model.

Figure 5 Class Relationships


The benefit of employing the Template Method design pattern is obvious: inherited classes only need to change the parts of the algorithm that they require, resulting in less overall coding. But what are the consequences of breaking every console application into chassis and engine classes? There are at least two negative consequences: first, you end up with twice as many classes as before, and second, you end up with a bunch of really simple, similar chassis classes. I think these drawbacks, however, are more than outweighed by the benefits.

The first, most important benefit is that engine classes are now free to inherit from one another. PassthroughEngine can be implemented with one line of code. A second, more subtle benefit is that the engine can be chosen at run time—they are pluggable. Since all engine classes derive from ConsoleEngineBase, the chassis doesn't really care which one is used. In fact, you could generalize PassthroughChassis to be configurable at run time. Figure 6 shows the code for a chassis class that requires an assembly name and class name as command-line arguments. It uses this information to create an instance of a particular engine class. It then calls its Main method, effectively turning over processing to it. For Passthrough, the command would look like this:

C:\> GenericChassis "Engines" "Engines.PassthroughEngine"

In this case, the PassthroughEngine class is a member of the Engines namespace and resides in an assembly called Engines.dll (note that the DLL extension must be omitted from the argument in this example).

Figure 6 GenericChassis

using System;
using Engines;

namespace ConsoleApps
{
    class GenericChassis
    {
        [STAThread]
        static void Main(string[] args)
        {
            if(args.Length < 2)
            {
                Console.Error.WriteLine("Two arguments required:");
                Console.Error.WriteLine("assembly and class name.");
                return;
            }
            Engines.ConsoleEngineBase engine = null;
            string assemName = args[0];
            string className = args[1];
            string[] newargs = new string[args.Length - 2];
            for(int i = 2; i < args.Length; i++)
            {
                newargs[i - 2] = args[i];
            }
            try
            {
                System.Runtime.Remoting.ObjectHandle engineHandle =
                    Activator.CreateInstance(assemName, className);
                engine = (ConsoleEngineBase)engineHandle.Unwrap();
            }
            catch(Exception e)
            {
                Console.Error.WriteLine(e.Message);
                return;
            }
            engine.Main(newargs);
        }
    }
}

This approach—separating the chassis from the engine—is usually called the Strategy design pattern (though I prefer the car metaphor). It allows an algorithm to change without having to create an entire subclass of its associated class. What's so bad about subclassing just to change one method? Having lots of classes that differ only in their implementation of a single algorithm can be confusing and hard to maintain. Families of algorithms are generally thought to be more intuitive than families of classes. In Figure 5, ConcreteEngine1 and ConcreteEngine2 are members of this family. Physically, they are implemented as classes; logically, they exist only to help the chassis implement its Main method. (This might have been more intuitively accomplished with delegates, though it would have complicated the use of the Template Method pattern.)

You may have noticed that there isn't anything console-specific about these engine classes. They simply know how to read and write System.IO.Streams. This is another benefit of this approach—the same logic that you use from the console can be used from (for example) a network server class. Just as you can swap out the engine, you can swap out the chassis, albeit with a little more work.

Even engines that are more complex than Passthrough are similarly straightforward. Think back to the Replace example. Replace is similar to Passthrough except that it performs a regular expressions-based replacement on each line of input. The complete listing for ReplaceEngine is shown in Figure 7. Notice how the PreProcess method sets up a single instance of the System.Text.RegularExpressions.Regex class and the ProcessLine method uses it on each line of input. I'll build further on these later in the article. For now, however, let's look at the other side of console apps: how to control them from within a .NET Framework-based app.

Figure 7 ReplaceEngine

using System;
using System.Text.RegularExpressions;

namespace Engines
{
    public class ReplaceEngine : ConsoleEngineBase
    {
        private Regex replacer = null;

        protected override void PreProcess()
        {
            this.replacer = new Regex(this.Arguments[0]);
        }

        protected override void ProcessLine(string line)
        {
            this.Out.WriteLine(
                this.replacer.Replace(line, this.Arguments[1]));
        }

        protected override string Usage()
        {
            return "Replace \"findexp\" \"replaceexp\"";
        }

        protected override bool ValidateArguments()
        {
            if(this.Arguments.Length == 2)
                return true;
            return false;
        }
    }
}

Controlling Console Apps with the Process Class

By now, hopefully you are thinking of console applications as building blocks that can be strung together with pipes. Indeed, this is typically how they are used. They can, however, be used in other ways. The .NET Framework class library's Process class allows .NET Framework-based programs to interoperate with console applications, regardless of their origin. The Process class communicates with the console application by way of—you guessed it—the standard input and output streams.

One occasion when this might be useful is when a console application already exists to perform a desired task. Rather than recode it in a CLR-compliant language, it might be easier to interoperate with it via the Process class. Or perhaps using the .NET Framework isn't the best way to implement a particular bit of functionality. Consider this example. Suppose you were asked to create a graphing calculator application. How would you do it using the .NET Framework? Aside from the expected tasks of laying out forms and working with the System.Drawing library, you would find yourself faced with having to figure out how to evaluate expressions. That is, how can you find out the value of (sin(x) + 3)/16 for all values of x?

It isn't an easy problem. You would probably find yourself having to write an expression parser or attempting to compile the expression to Microsoft intermediate language (MSIL) on the fly. Or, you could take the easy way out and use the venerable VBScript Eval function. It isn't part of the .NET Framework, but it sure is easy. It turns out that it takes about 20 lines of VBScript to implement a simple calculator that works on standard input and output. If you were to run this script in a console window, you would find that you can type expressions like "1+1" or "sin(23) + 232.23 / 17" into standard input and see results like "2" or "12.8143678311189" appear on standard output. Since it loops forever, you must type Ctrl-C to end the program when using it in interactive mode.

Meanwhile, back in the .NET world, you need a way of communicating with the VBScript calculator. This is where the Process class comes in. The following code snippet starts up an instance of the VBScript calculator:

System.Diagnostics.Process calcProc = new System.Diagnostics.Process();
System.Diagnostics.ProcessStartInfo i = new System.Diagnostics.ProcessStartInfo();
i.FileName = "cscript.exe";
i.Arguments = "//NoLogo calc.vbs";
i.RedirectStandardOutput = true;
i.RedirectStandardInput = true;
i.RedirectStandardError = true;
i.CreateNoWindow = true;
i.UseShellExecute = false;
calcProc.StartInfo = i;
calcProc.Start();

Let's look at what's going on here. The first step is to create an instance of the Process class, called calcProc. In order to specify detailed start-up parameters for Process, the .NET class library provides a helper class, ProcessStartInfo. The next seven lines create and populate an instance of this class. The properties are described in Figure 8. Finally, the instance of the Process class is handed the instance of the ProcessStartInfo class, and the Process is started.

Figure 8 System.Diagnostics.ProcessStartInfo Properties

Property Description
FileName The name of the executable to start when the Start method is called on the Process class.
Arguments Command-line arguments to pass to the executable specified in FileName.
RedirectStandardOutput Indicates whether the standard output for the target application (that is calc.vbs) should be routed to the instance of the Process class. The alternative is to have it go to the default location for standard output—the console.
RedirectStandardInput Same as RedirectStandardOutput, but for standard input instead.
RedirectStandardError Same as RedirectStandardOutput, but for standard error instead.
CreateNoWindow Causes the target application to execute invisibly. Not setting this property causes a command window to appear.
UseShellExecute Indicates whether the Windows shell or the Process class should be used to start the process. This property must be set to false in order to redirect the input and output streams.

Thereafter, the Framework-based application can communicate with the VBScript calculator via the StandardInput and StandardOutput properties of calcProc. For example, the following code sends an expression to the calculator and reads the result:

calcProc.StandardInput.WriteLine("1+1");
string result = calcProc.StandardOutput.ReadLine();

To turn the expression evaluator into a graphing calculator, you just need to send the expression multiple times with different values of x. Remember, the VBScript calculator stays running for the lifetime of the Process class. You just keep sending it standard input and it keeps returning standard output.
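A sketch of how those per-x expressions might be produced (the ExpressionBuilder helper and its naive textual substitution are my own simplification, not code from the article's download):

```csharp
using System;
using System.Globalization;

class ExpressionBuilder
{
    // Produces the expression strings sent to the calculator, one per
    // sample of x. The substitution is naive -- it would also rewrite an
    // "x" inside a function name like "exp" -- but it illustrates the idea.
    public static string[] BuildExpressions(
        string expression, double start, double step, int count)
    {
        string[] result = new string[count];
        for (int i = 0; i < count; i++)
        {
            double x = start + i * step;
            result[i] = expression.Replace(
                "x", x.ToString(CultureInfo.InvariantCulture));
        }
        return result;
    }
}
```

Each resulting string would then be written with calcProc.StandardInput.WriteLine, followed by a calcProc.StandardOutput.ReadLine to collect the corresponding y value.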

Figure 9 Graphing Calculator


Figure 9 shows the finished graphing calculator. The majority of the code deals with drawing the graph and is tangential to this article (it's included in the code download at the link at the top of this article). There is, however, one more important point to make: the VBScript calculator will continue to run even after you close the Framework-based application that invoked it. To ensure that you don't leave an orphaned process, you can send a Ctrl-Z on its standard input, or end it explicitly by invoking the Kill method:

calcProc.Kill();

I'll bet you never thought you'd see interoperability between the .NET Framework and VBScript! While this example is contrived, I don't think it is entirely far-fetched. There are many useful VBScript programs (not to mention Perl, Python, and C programs) already written for systems management tasks. While these will eventually get ported to managed code, it won't be for a long time. Moreover, when they do get ported, it won't be all at once. Interoperability with legacy code will always be important and standard input and output are simple, effective ways to achieve it.

Yes, But Can it Read an RSS Feed?

In case you aren't familiar with it, RSS is a way of syndicating information on the Internet in a machine-readable format. It takes advantage of the addressing and transport aspects of the Web (URLs and HTTP) while eschewing its markup language (HTML) in favor of something more structured (the RSS dialect of XML). Programs that read RSS feeds have become commonplace and are frequently considered archetypal examples of how to create XML and network-savvy applications. Figure 10 shows the XML code for a shortened RSS feed.

Figure 10 RSS Feed

<rss version="2.0">
  <channel>
    <title>MSDN: .NET Framework and CLR</title>
    <link>https://msdn.microsoft.com/netframework/</link>
    <description>
      The latest information for developers on the Microsoft .NET
      Framework and Common Language Runtime (CLR).
    </description>
    <language>en-us</language>
    <ttl>1440</ttl>
    <item>
      <title>
        Improving Web Application Security: Threats and Countermeasures
      </title>
      <pubDate>Thu, 12 Jun 2003 07:00:00 GMT</pubDate>
      <description>
        Bake security into your application lifecycle. You'll get
        guidance that is task-based and modular, with tons of
        implementation steps.
      </description>
      <link>...</link>
    </item>
  </channel>
</rss>

Reading an RSS feed is really a matter of fetching a URL, transforming its contents from machine-readable to human-readable form, and displaying it to the user. Usually there is a caching step between steps one and two so that the whole thing can work offline as well. This pattern of fetch, cache, transform, and display is repeated over and over for every RSS feed in which a particular user is interested. Fetch, cache, transform, and display. It starts to sound like a pipeline. Can the world's first command-line RSS reader be far behind?

Like any good pipeline, the steps are not RSS specific. Figure 11 provides a description of how the first three steps should work. I have left out the "display" step because the console takes care of it in this case. These three console applications fit together in a pipeline, like the one shown in Figure 12.

Figure 11 Reading an RSS Feed

Step: Fetch URL
Retrieves the contents of the URL given in the first argument and writes them to standard out. If an error is encountered, writes error information to standard error and writes nothing to standard out.

Step: Cache file name
Reads standard in. If it contains data, writes that data to the file given in the first argument and passes it through to standard out. If standard in is empty, attempts to read the file given in the first argument and sends its contents to standard out. File access errors are written to standard error.

Step: Transform stylesheet
Expects valid XML on standard in. If the XML is valid, transforms it with the XSLT stylesheet given in the first argument and sends the results to standard out. Writes errors to standard error.
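The pass-through/fallback behavior of the cache step is worth making concrete, since it is what lets the pipeline work offline. The following is a minimal sketch of the semantics described above; the `CacheStep` class and `Run` method are hypothetical names for illustration, not the CacheEngine from the download.

```csharp
using System;
using System.IO;

static class CacheStep
{
    // Sketch of the Cache step's semantics (names are hypothetical, not
    // the CacheEngine from the download): pass standard input through
    // while saving a copy to the named file; if standard input is empty,
    // fall back to the cached file instead.
    public static string Run(TextReader stdin, string cacheFile)
    {
        string incoming = stdin.ReadToEnd();
        if (incoming.Length > 0)
        {
            File.WriteAllText(cacheFile, incoming); // refresh the cache
            return incoming;                        // pass data through
        }
        // Nothing on standard in: serve the cached copy, if one exists
        return File.Exists(cacheFile) ? File.ReadAllText(cacheFile) : string.Empty;
    }
}
```

Run once while "online" and the file is written; run again with empty input and the cached copy comes back.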

Figure 12 An Application Pipeline


When implementing fetch, cache, and transform, you should continue with your design patterns as I've described them. To refresh your memory, all engine classes should be implemented as subclasses of the ConsoleEngineBase class. Figure 13 shows the class library so far. Code for all engines is included in the download for this article.

Figure 13 Class Library


As I've shown, it isn't very difficult to add functionality to the engine classes. In all cases, it is primarily a matter of overriding PreProcess, PostProcess, and ProcessLine. All three of these new engines take advantage of PreProcess to create instance-level private resources: in the case of FetchEngine, an instance of the WebClient class; in the case of CacheEngine, a file handle; in the case of TransformEngine, an instance of the XslTransform class. The code for all three engines is included in the download for this article.
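To make that lifecycle concrete, here is a toy engine following the same override pattern. The stand-in base class below only sketches where the three overrides fit; the real ConsoleEngineBase is defined earlier in the article, and the `Go` method and `UpperCaseEngine` class are hypothetical names used for illustration.

```csharp
using System.IO;

// Stand-in sketch of the engine lifecycle (the real ConsoleEngineBase
// appears earlier in the article; the Go method name is hypothetical).
abstract class ConsoleEngineBase
{
    protected virtual void PreProcess() { }   // acquire per-run resources
    protected abstract string ProcessLine(string line);
    protected virtual void PostProcess() { }  // release resources, flush

    public void Go(TextReader input, TextWriter output)
    {
        PreProcess();
        string line;
        while ((line = input.ReadLine()) != null)
            output.WriteLine(ProcessLine(line));
        PostProcess();
    }
}

// A toy engine that upper-cases each line, just to show the shape
class UpperCaseEngine : ConsoleEngineBase
{
    protected override string ProcessLine(string line)
    {
        return line.ToUpper();
    }
}
```

FetchEngine, CacheEngine, and TransformEngine all follow this shape, differing only in which of the three methods they override and what resources PreProcess acquires.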

Now that the engines are implemented, you can test them. The following command line uses the generic chassis class to put the RSS reader through its paces:

C:\> GenericChassis "Engines" "Engines.FetchEngine" "https://msdn.microsoft.com/netframework/rss.xml" | GenericChassis "Engines" "Engines.CacheEngine" "netframework.xml" | GenericChassis "Engines" "Engines.TransformEngine" "FormatForConsole.xslt"

That's a lot of typing to view an RSS feed. You might consider creating a batch file with that in it, like this:

GenericChassis "Engines" "Engines.FetchEngine" "%1" | GenericChassis "Engines" "Engines.CacheEngine" "%2" | GenericChassis "Engines" "Engines.TransformEngine" "FormatForConsole.xslt"

Assuming you called your batch file rss.bat, you could execute it by running:

C:\> rss.bat "https://msdn.microsoft.com/netframework/rss.xml" "netframework.xml"

If you run it while connected to the Internet, you should see the contents of the RSS feed scroll past your eyes in a somewhat readable format. If you disconnect, the cache engine should kick in, allowing you to still see the feed.

Now let's say you want to take advantage of Windows Forms to improve the user interface. At first, this sounds straightforward: create an instance of the Process class, feed it the command text for the pipeline, read standard output, and show it to the user. Unfortunately, it isn't that simple. Unlike the calculator example, this one involves three distinct processes: fetch, cache, and transform. Remember, when you run these via cmd.exe they appear in Task Manager as three distinct processes, and cmd.exe takes care of routing the standard data streams among them. The pipe operator is a feature of the shell's command parsing, not of the process-launching APIs, so the Process class has no notion of pipelines. Trying to send an entire pipeline as a single command to a Process instance will not get you what you want.

There are two different ways to get around this. First, you could invoke cmd.exe via the Process class and send commands to its standard input. Cmd.exe is, after all, just another console application. It differs from other console applications only in that it expects commands rather than data on standard input. In this scenario, you would let cmd.exe do all the work of parsing the command lines and routing the standard data streams. You could monitor cmd.exe's standard output to get the results.
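The setup for this first approach is mostly a matter of stream redirection. The fragment below sketches the ProcessStartInfo configuration required before cmd.exe can be driven over its standard streams; it stops short of calling Start, and the variable names are illustrative.

```csharp
using System.Diagnostics;

// Redirection setup needed before cmd.exe can be driven programmatically.
// UseShellExecute must be false or the Redirect* flags are ignored.
ProcessStartInfo psi = new ProcessStartInfo("cmd.exe");
psi.UseShellExecute = false;
psi.RedirectStandardInput = true;
psi.RedirectStandardOutput = true;
psi.RedirectStandardError = true;
psi.CreateNoWindow = true;

Process proc = new Process();
proc.StartInfo = psi;
// proc.Start() would launch cmd.exe; commands are then written to
// proc.StandardInput and results read from proc.StandardOutput.
```

With all three streams redirected, the hosting application, rather than a console window, owns cmd.exe's input and output.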

The second option is to do the work of routing the standard streams yourself. That is, you could create a class called Pipe that joins one Process to another. The Pipe class would basically have the job of looking for standard output of one Process and immediately sending it to standard input of the other Process. You could use a combination of processes and pipes to build complex pipelines in your code.
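The heart of such a Pipe class is a copy loop running on its own thread. The sketch below substitutes plain TextReader/TextWriter streams for the Process standard streams so the loop is visible in isolation; the real Pipe class in the download wraps processes rather than bare streams.

```csharp
using System.IO;
using System.Threading;

// Sketch of a Pipe's core: a background thread that copies lines from a
// "left side" reader to a "right side" writer. The real class connects
// one Process's StandardOutput to another's StandardInput.
class StreamPipe
{
    private readonly TextReader left;
    private readonly TextWriter right;
    private Thread worker;

    public StreamPipe(TextReader left, TextWriter right)
    {
        this.left = left;
        this.right = right;
    }

    public void Start()
    {
        worker = new Thread(delegate()
        {
            string line;
            // ReadLine blocks until the left side produces output or
            // closes, which is exactly why the loop needs its own thread
            while ((line = left.ReadLine()) != null)
            {
                right.WriteLine(line);
            }
            right.Flush();
        });
        worker.IsBackground = true;
        worker.Start();
    }

    public void WaitForExit() { worker.Join(); }
}
```

When the left side is a process's standard output, the loop simply stalls on ReadLine until that process produces more data, and exits when the process closes its output stream.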

I tried both options and can confirm that they both work. For the first option, I created a class called CommandRunner that references an instance of Process internally. It has a single method, called RunCommand, that takes a command string and returns the output of the command as a string. The internal Process class serves as a proxy to a running instance of cmd.exe. The RunCommand method sends commands to it using standard input and reads the results on standard output.

Figure 14 Reading Dir Listing


But there's a problem here. The standard output stream of cmd.exe never ends, though it frequently contains no data to read. What does this mean? Basically, that read operations on standard output will block indefinitely when the end of the output is reached, effectively locking up the program. Figure 14 shows a typical interaction with cmd.exe via the standard streams. Cmd.exe is sent a command (dir in this case) and responds, as expected, with a directory listing. Since a directory listing can contain any number of lines, the client program doesn't know when to stop reading. Eventually, the client program will attempt to read a line that doesn't exist and will wait for it, blocking execution of the entire program.

For this reason, the code shown in Figure 15 will not work. One way around this is to run an output "monitor" on another thread, but that still leaves ambiguity as to whether the current command has finished or is just in the middle of a long operation. Another is to run cmd.exe with the /c option, which executes a single command and then exits, terminating the output stream. If you want to keep one cmd.exe instance alive across many commands, though, you have to detect the end of each command's output yourself. Figure 16 shows an effective, if ugly, way to do so. The trick is to insert a known string into the output of the command and then look for it. When it is found, you know you have reached the end of the current command's output.

Figure 15 Reading Output

public string RunCommand(string cmd)
{
    this.InnerProcess.StandardInput.WriteLine(cmd);
    StringBuilder output = new StringBuilder();
    string currentLine = this.InnerProcess.StandardOutput.ReadLine();
    while(currentLine != null)
    {
        output.Append(currentLine);
        output.Append("\r\n");
        //the next line will block when there is no more
        //output from the command
        currentLine = this.InnerProcess.StandardOutput.ReadLine();
    }
    return output.ToString();
}

Figure 16 Looking for End String

public string RunCommand(string cmd)
{
    this.InnerProcess.StandardInput.WriteLine(cmd);
    this.InnerProcess.StandardInput.WriteLine("echo ---end---");
    StringBuilder output = new StringBuilder();
    string currentLine = this.InnerProcess.StandardOutput.ReadLine();
    while(currentLine != "---end---")
    {
        output.Append(currentLine);
        output.Append("\r\n");
        currentLine = this.InnerProcess.StandardOutput.ReadLine();
    }
    return output.ToString();
}

Because of all the subterfuge required to get cmd.exe to cooperate, it's tempting to try to implement your own piping functionality. To accomplish this, I created two new classes. The first class, Pipe, connects two Processes. It continuously reads the output from the "left-side" Process and writes it to the input of the "right-side" Process. Because the Read operation will block as it waits for more input, the read/write loop executes on a separate thread. The following example connects the two commands via the Pipe class:

ConsoleProcess atProc = new ConsoleProcess("at.exe", "/?");
ConsoleProcess findProc = new ConsoleProcess("findstr.exe", "/delete");
Pipe pipe = new Pipe(atProc, findProc);
atProc.Start();
findProc.Start();
pipe.Start();
string output = findProc.StandardOutput.ReadToEnd();

The complete code for the Pipe class is available as a part of the code download.

The second new class, Pipeline, automates some of these actions and allows an arbitrary number of Processes to be strung together. Internally, it keeps a list of Processes and Pipes. Each Pipe connects two of the Processes in the list. Processes are added to the Pipeline via the Add method. As you would expect, the Pipeline class exposes StandardInput and StandardOutput properties, which are just proxies for the StandardInput of the first Process and the StandardOutput of the last Process, respectively. The example I just showed could be recast using the Pipeline class as follows:

ConsoleProcess atProc = new ConsoleProcess("at.exe", "/?");
ConsoleProcess findProc = new ConsoleProcess("findstr.exe", "/delete");
Pipeline pipeline = new Pipeline();
pipeline.Add(atProc);
pipeline.Add(findProc);
pipeline.Start(); //pipes are started automatically
string output = pipeline.StandardOutput.ReadToEnd();

The code for the Pipeline class is also available in the code download.

Figure 17 An RSS Reader


Either of these techniques (sending commands to cmd.exe or using the Pipe and Pipeline classes) could be used to hook the RSS pipeline into another .NET Framework-based application. The Pipe/Pipeline version is cleaner, but is slightly harder to implement. The cmd.exe version feels like a workaround, but you get all the features of cmd.exe for free. I built a Windows Forms user interface to the RSS pipeline that works both ways (shown in Figure 17 and included as a download), complete with support for clickable hyperlinks.

Conclusion

The reason I like console applications so much is that they reduce software to its bare essentials. Console applications only need to worry about four things: command-line arguments and the three standard data streams. The simplicity of this model looks like a severe limitation, but the pipe allows independent programs to behave symbiotically, enabling them to do more together than the sum of what they could do separately.

Implementing console applications with the .NET Framework is easy. With judicious use of some design patterns, it becomes easier still. The Process class makes accessing console applications from .NET straightforward, as long as you understand its limitations: the pipe operator belongs to cmd.exe's command parsing, not to the Process class. As such, to get pipe functionality in a Framework-based program, you need to either route all commands through cmd.exe or build your own piping capability. Both approaches work, but each has trade-offs.

Console applications are currently an overlooked class of applications on Windows. This, however, is planned to change in the Longhorn timeframe. Microsoft is completely redesigning the console as a managed application for the next major release of its operating system. For more information on the Longhorn Command Shell (MSH), see the presentation from this year's PDC session ARC334, available at PDC 2003 Sessions.

For related articles see:
Design Patterns: Solidify Your C# Application Architecture with Design Patterns

For background information see:
The Unix Programming Environment by Brian W. Kernighan and Rob Pike (Prentice-Hall, 1984)
Design Patterns by Gamma, Helm, Johnson, and Vlissides (Addison-Wesley, 1995)

Michael Brook is an independent consultant based in Pasadena, CA. He spends his time mining the depths of the .NET Class Library in search of the simplest solutions. He can be reached at horst@alumni.princeton.edu.