Cutting Edge
Subclassing and Overriding ASP.NET Pages—Part I
Dino Esposito
Code download available at: Cutting Edge 2007_04.exe(874 KB)
Contents
Possible Scenarios
Possible Techniques
Intercepting the Page Handler
Adding New Event Handlers
Getting a Reference to the Control Tree
Page Override with HTTP Handlers
URL Rewriting
Summary
A client recently approached me and said, "We need to make changes to some ASP.NET pages. Can you help?" Like any consultant would, I promptly replied, "Sure. Tell me more." But the client didn't actually have much more to share than the URLs to the pages. In a nutshell, the client had to modify some ASP.NET pages, but didn't have the source code. At first, I saw only trouble in this, but the more I talked to the client, the more I started to see an interesting challenge.
You don't need a button's source code to create a derived class, and you don't need an ASP.NET page's source code to modify its behavior. In Windows programming, you typically hook up low level messages and subclass a window. In ASP.NET, you can try to hook up page events to override both the page behavior and output.
After some brainstorming, I came up with a list of realistic scenarios where you might need to modify the runtime behavior of a page without touching the source code. I also came up with a number of techniques you can use to get the job done. In this two-part article, I'll go through these techniques and show you how to change the user interface and behavior of ASP.NET pages with just read-only access or even no access at all to the source code.
Possible Scenarios
Say a certain group of users run a personalized version of a Web application, and that application requires some remote debugging or profiling. You can't interrupt the service, but you still need to figure out what's going on. Performance counters and built-in tracing capabilities may not be enough. And the deployed pages may not even have tracing instructions incorporated or turned on. If this is the case, at a minimum, you'll need to intervene remotely to inject tracing code.
Another situation where this may be necessary is when temporary changes to a variety of pages are required. For a single page, you can just make a copy of the old page and replace it. Updating several pages, though, can be more difficult and make maintenance significantly more complex.
Yet another scenario is when a corporate policy prohibits write access to the source code. In this case, the source code is available, you can read it, but you can't modify it. You can, however, add extensions to the site.
Finally, the most dramatic situation (one that I really did encounter once) is when the company runs a Web application for which the source code is no longer available for some reason.
As you can see, there are a number of scenarios in which you may need to modify the runtime behavior of pages without having access to the source code. So how would you proceed?
Possible Techniques
There are a number of techniques that allow you to modify a running ASP.NET page without touching its source code. Figure 1 lists a few approaches. Not all techniques are valid in all scenarios, and some of the techniques can be used together. The various techniques may require you to write a new HTTP module or HTTP handler, or enter changes to the web.config file. In most cases, you'll need to restart the application. In fact, changes to the web.config or global.asax files and additions to the Bin folder or the App_Code folder restart the application automatically.
Figure 1 Modify Pages without Source Code Access
Technique | Implementation |
---|---|
Access the control tree | HTTP module |
Modify the page base class | web.config |
Control replacement | web.config |
Build providers | Assembly |
Redirect a page | HTTP handler |
Override a page | HTTP handler or URL rewriting |
By using HTTP modules, you can hook up any events in the application life cycle as well as capture the HTTP handler used to render out the page. When you hold such a reference, you can manage handlers for page events (such as Page_Load and Page_PreRender) and also add or remove controls, change links to resources and images, and add or remove CSS classes. Likewise, an HTTP handler can be used to replace a single page. And URL rewriting can be used to redirect users to pages that don't even exist when the application is first deployed.
If you're having trouble with a particular control, you can use the <tagMapping> element of the web.config file to automatically redirect the page parser to a new control derived from the old one. I'll explain this technique in my next column. For now, let's focus on tools and techniques you can use to override page-level events and modify the UI of a page on the fly without changing the source code.
Intercepting the Page Handler
The processing of each ASP.NET request causes a series of application-level events, as shown in Figure 2. The request starts with the BeginRequest event and finishes with the EndRequest event. A system object called HttpApplication governs the request processing and ensures that special components, called HTTP modules, can register handlers for some of these application-level events. Once authenticated and authorized (and if not resolved through the output cache), the request is assigned its own handler component.
Figure 2 Requests
Each ASP.NET request needs a special component, the HTTP handler, to be processed. A reference to the HTTP handler in charge of the current request is available when the application lifecycle fires the PostMapRequestHandler event.
Application-level events can be wired up in two ways. You can write handlers in the global.asax file (one of the app's source files) or you can write a custom HTTP module that registers a listener for the specified event. Functionally, the two approaches are comparable. However, if you use HTTP modules, you can reach your goal by simply adding a new component and editing the web.config file in a less intrusive way. In both cases the app is restarted.
The first application event to fire after the page handler has been determined is PostMapRequestHandler. As shown in Figure 3, an HTTP module can register a listener for this event and be notified when the handler is found for each ASP.NET request. If you need to do this for only a few pages, you might want to filter out undesired pages in the module code.
Figure 3 HTTP Module to Listen for the Page Handler
Imports System
Imports System.Web
Imports System.Web.UI
Imports System.Web.Compilation
Imports System.Web.Hosting
Imports System.IO
Imports System.Text
Namespace Samples
    Public Class SubclassModule : Implements IHttpModule
        Private _app As HttpApplication
#Region "IHttpModule"
        Public Sub Dispose() Implements IHttpModule.Dispose
        End Sub
        Public Sub Init(ByVal context As HttpApplication) _
            Implements IHttpModule.Init
            If context Is Nothing Then
                Throw New ArgumentNullException("context")
            End If
            ' Cache the HTTP application reference
            _app = context
            ' Map the app event to hook up the page
            AddHandler context.PostMapRequestHandler, _
                AddressOf OnPostMapRequestHandler
        End Sub
#End Region
#Region "App Event Handlers"
        Private Sub OnPostMapRequestHandler( _
            ByVal source As Object, ByVal e As EventArgs)
            ' Get the just-mapped HTTP handler and cast to Page;
            ' TryCast returns Nothing if the handler is not a page
            Dim pageHandler As Page = TryCast(_app.Context.Handler, Page)
            ' If OK, register handlers for the page events
            If Not pageHandler Is Nothing Then
                HookUpPage(pageHandler)
            End If
        End Sub
#End Region
#Region "Helpers"
        Private Sub HookUpPage(ByVal pageHandler As Page)
            ' Wire up as many page events as needed
            AddHandler pageHandler.Load, AddressOf OnLoad
            ' ...
        End Sub
        Private Sub OnLoad(ByVal sender As Object, ByVal e As EventArgs)
            Dim pageHandler As Page = DirectCast(sender, Page)
            ' TODO: add your code here
        End Sub
#End Region
    End Class
End Namespace
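For the module in Figure 3 to run, it also has to be registered in the web.config file. The following fragment is a minimal sketch of that registration. The type name comes from Figure 3; the assembly name Subclass is an assumption borrowed from the handler registration shown later in this column, so replace it with the name of the assembly you actually drop into the Bin folder.

```xml
<system.web>
  <httpModules>
    <!-- name is an arbitrary label; type is "Namespace.Class, Assembly" -->
    <add name="SubclassModule"
         type="Samples.SubclassModule, Subclass" />
  </httpModules>
</system.web>
```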
An HTTP module is a class that implements the IHttpModule interface, which has two methods: Init and Dispose. The Init method is invoked once for each HttpApplication instance, when that instance is created. As in Figure 3, the Init method registers listeners that are then called for each ASP.NET request. Typically, you check the URL of the incoming request within the OnPostMapRequestHandler method. Here's an example:
Sub OnPostMapRequestHandler( _
    ByVal source As Object, ByVal e As EventArgs)
    Dim url As Uri
    url = _app.Context.Request.Url
    If Not url.AbsolutePath.Equals( _
        "/WebSite1/default.aspx") Then
        Return
    End If
    ...
End Sub
An ASP.NET request makes its way through the runtime pipeline together with a system object that represents the context of the ongoing request. This object is an instance of the HttpContext class. Handler, which is one of the properties on the HttpContext object, contains a reference to the HTTP handler object that will serve the request. Unless a custom HTTP handler has been defined for the requested resource, an ASPX request is managed by an instance of a class that inherits from the System.Web.UI.Page class. At this point, you hold a reference to the object that is used to process the page and you can dynamically add new handlers for common page events, such as PreRender and Load. Here's an example:
Private Sub HookUpPage(ByVal pageHandler As Page)
    ' Wire up as many page events as needed
    AddHandler pageHandler.Load, AddressOf OnLoad
    ...
End Sub
More precisely, the object bound to the Handler property is an instance of the dynamically created Page class modeled after the markup you wrote in the ASPX server resource. In other words, the page handler contains an event handler for the page's Load event if you placed a Page_Load method in the codebehind class.
The AddHandler keyword in Visual Basic® (and the += operator in C#) adds a new handler to the chain of delegates bound to the given event. As a result, any code you add to the Handler object executes after any equivalent handlers defined in the ASPX of the page. The code in the OnLoad method referenced above runs after the code in Page_Load.
At this point, what can you do in OnLoad? Just about anything, as long as you know the structure of the page and the elements in its control tree. If you can read the source code of the page, you can easily grab this information. Otherwise, you'll need a dump of the page's control tree. One way to get this is to turn on page tracing and redirect the output to the ASP.NET built-in trace.axd tool. You can turn on tracing via the web.config file, as shown here:
<system.web>
    <trace enabled="true" />
</system.web>
The next step is to invoke the trace.axd utility on the Web site you're working on and take a look at the control tree. Figure 4 shows the utility in action.
Figure 4 Trace.axd in Action
As an example, I created a new Web site using the Personal Starter Kit (see Figure 5). I'll show you how to modify some of its pages without touching the source code. In particular, I'll show you how to automatically move the input focus to the user-name field of the login box.
Figure 5 Default Version of Personal Web Site
Adding New Event Handlers
If you could edit the source code, you would probably add a line or two to the Page_Load handler and call the method Focus on the reference of the Login control. From the control tree (partially shown in Figure 4), you can figure out that the page contains a Login control named Login1. The following code retrieves a child control in the page tree, moves the input focus to its area, and also resets the title of the page:
Private Sub OnLoad(ByVal sender As Object, ByVal e As EventArgs)
    Dim pageHandler As Page = DirectCast(sender, Page)
    ' Move the input focus to control Login1 (if it exists on this page)
    Dim ctl As Control = FindControlRecursive(pageHandler, "Login1")
    If Not ctl Is Nothing Then ctl.Focus()
    ' Change title of the page
    pageHandler.Title = "This page has been hooked up"
End Sub
The FindControl method on each ASP.NET control searches only the control's own naming container. It doesn't descend into nested naming containers further down the subtree rooted in the control. To locate a control anywhere in the page, you have to write your own recursive find method, as I've done in my FindControlRecursive method.
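The listing for FindControlRecursive isn't reproduced in this column, so here is a minimal sketch of what such a helper could look like. It's a plain depth-first walk of the Controls collections. The ControlFinder module name and the case-insensitive ID comparison are my assumptions for this illustration, not necessarily what the code download does.

```vb
Imports System
Imports System.Web.UI

Namespace Samples
    ' Hypothetical helper module; members of a VB module can be
    ' called unqualified from code in the same namespace.
    Public Module ControlFinder
        ' Depth-first search for a control with the given ID,
        ' starting at root and descending into every child control.
        Public Function FindControlRecursive( _
            ByVal root As Control, ByVal id As String) As Control
            If root Is Nothing OrElse String.IsNullOrEmpty(id) Then
                Return Nothing
            End If
            ' Assumption: compare IDs ignoring case
            If String.Equals(root.ID, id, _
                StringComparison.OrdinalIgnoreCase) Then
                Return root
            End If
            For Each child As Control In root.Controls
                Dim found As Control = FindControlRecursive(child, id)
                If Not found Is Nothing Then
                    Return found
                End If
            Next
            ' No match anywhere in this subtree
            Return Nothing
        End Function
    End Module
End Namespace
```

The function returns Nothing when no control matches, so callers should check the result before using it.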
Is there a way to avoid having to walk down the control tree for every request of a given page? Not really. You can't cache the control instance because the page tree is rebuilt for each request. And without touching the source code, you can't add extra properties to the codebehind page to expose a given control through a property. Of course, it would be a lot easier if you could make copies of existing source files and replace classes or precompiled assemblies.
Figure 6 shows an altered version of the sample site. The title of the page has been changed via an HTTP module, the input focus has been moved to the login control, and the theme of the site has been changed via external code. To change the theme, you have to wire up the PreInit event:
Private Sub HookUpPage(ByVal pageHandler As Page)
    AddHandler pageHandler.PreInit, AddressOf OnPreInit
    AddHandler pageHandler.Load, AddressOf OnLoad
End Sub
Private Sub OnPreInit(ByVal sender As Object, _
    ByVal e As EventArgs)
    Dim pageHandler As Page = DirectCast(sender, Page)
    pageHandler.Theme = "Black"
End Sub
Figure 6 Altered Personal Web Site
Page manipulations require getting a reference to the page's control tree. The Controls collection of the page handler is the outermost collection of controls in the page and is where your navigation of the control tree begins.
Getting a Reference to the Control Tree
Say your client needs a quick way to add new content, such as late-breaking news, to the site. Holding a reference to the page handler is the key. As a first step, you can add a new handler for the PreRender event in the HookUpPage method of the sample HTTP module:
AddHandler pageHandler.PreRender, AddressOf OnPreRender
In the handler, you first get the page reference and then find the desired point of injection, as in Figure 7. Imagine that you want to provide notice that the Web site is driven by an external component and to do this, you want to add a new label at the very top of each page managed by the HTTP module. The following code snippet sets up a Label control and then adds it to the Controls collection of the page:
' Requires Imports for System.Configuration, System.Drawing,
' and System.Web.UI.WebControls
Sub AddStaticText(ByVal pageHandler As Page)
    Dim msg As String = ConfigurationManager.AppSettings("StaticMessage")
    Dim lbl As New Label
    lbl.ID = "SubclassModule_Label1"
    lbl.BackColor = Color.White
    lbl.ForeColor = Color.Red
    lbl.Text = String.Format("<div>{0}</div>", msg)
    pageHandler.Controls.AddAt(0, lbl)
End Sub
Figure 7 HTTP Module to Intercept the Page Handler
Private Sub OnLoad(ByVal sender As Object, ByVal e As EventArgs)
    ' TryCast returns Nothing (instead of throwing) on a type mismatch
    Dim pageHandler As Page = TryCast(sender, Page)
    If pageHandler Is Nothing Then Return
    ' Move the input focus to control Login1
    Dim controlName As String = "Login1"
    Dim ctl As Control = FindControlRecursive(pageHandler, controlName)
    If Not ctl Is Nothing Then ctl.Focus()
    ' Change the page title
    pageHandler.Title = "This page has been hooked up"
    ' Add a new control tree with postback support
    AddLinkButton(pageHandler)
End Sub
Private Sub OnPreRender(ByVal sender As Object, ByVal e As EventArgs)
    Dim pageHandler As Page = TryCast(sender, Page)
    If pageHandler Is Nothing Then Return
    ' Add static text (no postback)
    AddStaticText(pageHandler)
End Sub
Private Sub AddStaticText(ByVal pageHandler As Page)
    Dim msg As String = ConfigurationManager.AppSettings("StaticMessage")
    Dim lbl As New Label
    lbl.ID = "SubclassModule_Label1"
    lbl.BackColor = Color.White
    lbl.ForeColor = Color.Red
    lbl.Text = String.Format( _
        "<div style='margin:0;width:100%;font-size:20pt'>{0}</div>", msg)
    pageHandler.Controls.AddAt(0, lbl)
End Sub
Private Sub AddLinkButton(ByVal pageHandler As Page)
    Dim controlName As String = "Main"
    Dim ctl As Control = FindControlRecursive(pageHandler, controlName)
    If ctl Is Nothing Then Return
    Dim msg As String = ConfigurationManager.AppSettings("LinkMessage")
    Dim link As New LinkButton
    link.ID = "SubclassModule_LinkButton1"
    link.ToolTip = "Click to hide the 'Create account' button"
    link.BackColor = Color.Blue
    link.ForeColor = Color.Yellow
    link.Text = String.Format( _
        "<hr><div style='width:100%;font-size:30pt'>{0}</div><hr>", msg)
    AddHandler link.Click, AddressOf SubclassModule_LinkButton1_Click
    ctl.Controls.AddAt(0, link)
End Sub
Private Sub SubclassModule_LinkButton1_Click( _
    ByVal sender As Object, ByVal e As EventArgs)
    Dim pageHandler As Page = TryCast(_app.Context.Handler, Page)
    If pageHandler Is Nothing Then Return
    Dim controlName As String = "Image1"
    Dim ctl As WebControl = TryCast( _
        FindControlRecursive(pageHandler, controlName), WebControl)
    If ctl Is Nothing Then Return
    ctl.Visible = False
End Sub
The Controls property is an instance of the ControlCollection class and features two methods for adding new control instances: Add and AddAt. Add appends new controls to the collection; AddAt is more flexible, allowing you to specify a 0-based index for the desired position. The label text can contain any HTML markup and can be read (all or in part) from any external resources, including databases, XML documents, and the web.config file.
A label or literal control is fine for showing dynamic content, but it's no help if you want to generate and handle postback events. For that, let's add a link button. If the link button is intended to take the user to a new page, then there's no significant difference with the scenario I just considered. But if the link button has to originate a postback, run some code on the server, and then update the page, you need to reconsider the solution.
As obvious as it may seem, a control added during the pre-rendering stage is not yet part of the control tree when the postback event is processed. The ASP.NET page runtime figures out the target of the postback and raises the event before the pre-rendering stage. The sender of the postback is written in the HTTP request.
The ASP.NET runtime needs to find a match between the sender name and an existing control in order to fire the postback event. For this reason, the link button must be created earlier than the PreRender event. A good time for this to occur is the Load event. Imagine the injected control tree contains some input fields, such as checkboxes or text boxes. The state of these controls must be updated with data posted from the client. For the update to take place automatically, these controls must be created no later than the Load event. In this case, the ASP.NET runtime guarantees that the viewstate is restored correctly and any relevant posted data is applied.
In the Load event, you add a new LinkButton control and add a handler for its Click event:
AddHandler link.Click, AddressOf SubclassModule_LinkButton1_Click
When the user clicks on the link, the page will post back and ASP.NET will have all the information it needs to resolve the target of the postback correctly. As a result, the specified event handler is invoked.
So what can you do from within the handler? Almost anything or, more precisely, anything you know how to do. But without source markup and the codebehind class environment, simple tasks can prove challenging. Suppose you want to change the status of another control in the page. For instance, say you want to let the user click on the newly added link button and disable another button in the page. Admittedly, this functionality doesn't make a lot of sense when put in these terms. However, taken individually, these represent the tasks you might want to accomplish on hooked and subclassed ASP.NET pages.
The question, then, is how would you retrieve the page object from within the handler? The page object is required for gaining access to the overall control tree and locating the control to work with. A reference to the current HTTP handler (that is, the Page object) is stored in the Handler property of the HttpContext object. You don't normally resort to this sort of trick to get the page reference in the codebehind, but from an HTTP module, this trick is essential:
Sub SubclassModule_LinkButton1_Click( _
    ByVal sender As Object, ByVal e As EventArgs)
    ' TryCast returns Nothing if the current handler is not a page
    Dim pageHandler As Page = TryCast(_app.Context.Handler, Page)
    If pageHandler Is Nothing Then Return
    ...
End Sub
You use the page object to start another recursive search to locate the control of choice. For example, say you want to disable the "create account" button shown in Figure 5. A quick analysis of the trace output and the HTML source of the page reveals that there's no real button behind this. The Personal Web site uses a number of real-world techniques to build sites. It deeply leverages the layout capabilities of cascading style sheets and uses a lot of images to provide a more appealing UI. Thus, the create account button is expressed as follows:
<a href="register.aspx">
    <asp:image id="Image1" runat="server" />
</a>
In the Controls collection, you'll find a reference to the Image control, but nothing specifically for the <a> tag as it lacks the runat="server" attribute. This is an inherent limitation to the subclassing techniques I'm using here. Subclassing is not like direct programming and should only be used when necessary. Subclassing an unaware page doesn't let you reproduce everything you can do normally in ASP.NET. In ASP.NET, any sequence of markup text not decorated with the runat attribute is treated like plain text and emitted verbatim. Furthermore, contiguous text is grouped in a single LiteralControl, making it even harder for subclassers to find any sort of reference to a particular piece of markup and its server counterpart. One thing you can try is to figure out the ID of the HTML tag and inject some script code that does the work upon loading of the page. Unfortunately, in this case, the <a> tag lacks even the ID attribute, making it virtually impossible to program against on both the server and the client. Disabling the image is possible, but this doesn't stop clicking. A possible alternative is to hide the control:
Dim controlName As String = "Image1"
' TryCast returns Nothing if the control is missing or not a WebControl
Dim ctl As WebControl = TryCast( _
    FindControlRecursive(pageHandler, controlName), WebControl)
If ctl Is Nothing Then Return
ctl.Visible = False
Another technique to manipulate the output of an existing page is to capture the response before it is sent to the browser. From an HTTP module, you hook up the PostRequestHandlerExecute event and assign a custom stream to the Filter property of the HttpResponse object. After this, any text being written to the response stream will pass through your own stream giving you a chance to preview the HTML source and make any required changes, such as modifying DOM and script. A real-world scenario where this technique is useful is changing the URL of a hyperlink inserted as a simple <a> tag in the source page.
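To make the response-filter idea more concrete, here is a rough sketch of a module that installs a custom stream, plus the stream itself. All names here (FilterModule, LinkRewriteFilter, newregister.aspx) are hypothetical and chosen only for this illustration. The sketch assumes the default buffered response, collects everything written to it, and rewrites one href when the stream is flushed.

```vb
Imports System
Imports System.IO
Imports System.Text
Imports System.Web

Namespace Samples
    ' Hypothetical module: installs the filter after the page has run
    Public Class FilterModule : Implements IHttpModule
        Public Sub Init(ByVal context As HttpApplication) _
            Implements IHttpModule.Init
            AddHandler context.PostRequestHandlerExecute, AddressOf OnPostExecute
        End Sub
        Public Sub Dispose() Implements IHttpModule.Dispose
        End Sub
        Private Sub OnPostExecute(ByVal source As Object, ByVal e As EventArgs)
            Dim response As HttpResponse = DirectCast(source, HttpApplication).Response
            If response.ContentType = "text/html" Then
                ' Wrap the existing filter chain rather than replacing it
                response.Filter = New LinkRewriteFilter( _
                    response.Filter, response.ContentEncoding)
            End If
        End Sub
    End Class

    ' Hypothetical filter: buffers the output, rewrites one href on Flush
    Public Class LinkRewriteFilter : Inherits Stream
        Private ReadOnly _inner As Stream
        Private ReadOnly _encoding As Encoding
        Private ReadOnly _pending As New MemoryStream()

        Public Sub New(ByVal inner As Stream, ByVal enc As Encoding)
            _inner = inner
            _encoding = enc
        End Sub
        Public Overrides Sub Write(ByVal buffer() As Byte, _
            ByVal offset As Integer, ByVal count As Integer)
            ' Collect chunks; a tag may be split across two writes
            _pending.Write(buffer, offset, count)
        End Sub
        Public Overrides Sub Flush()
            If _pending.Length > 0 Then
                Dim html As String = _encoding.GetString(_pending.ToArray())
                html = html.Replace("href=""register.aspx""", _
                                    "href=""newregister.aspx""")
                Dim bytes() As Byte = _encoding.GetBytes(html)
                _inner.Write(bytes, 0, bytes.Length)
                _pending.SetLength(0)
            End If
            _inner.Flush()
        End Sub
        ' The members below only satisfy the abstract Stream contract
        Public Overrides ReadOnly Property CanRead() As Boolean
            Get
                Return False
            End Get
        End Property
        Public Overrides ReadOnly Property CanSeek() As Boolean
            Get
                Return False
            End Get
        End Property
        Public Overrides ReadOnly Property CanWrite() As Boolean
            Get
                Return True
            End Get
        End Property
        Public Overrides ReadOnly Property Length() As Long
            Get
                Throw New NotSupportedException()
            End Get
        End Property
        Public Overrides Property Position() As Long
            Get
                Throw New NotSupportedException()
            End Get
            Set(ByVal value As Long)
                Throw New NotSupportedException()
            End Set
        End Property
        Public Overrides Function Read(ByVal buffer() As Byte, _
            ByVal offset As Integer, ByVal count As Integer) As Integer
            Throw New NotSupportedException()
        End Function
        Public Overrides Function Seek(ByVal offset As Long, _
            ByVal origin As SeekOrigin) As Long
            Throw New NotSupportedException()
        End Function
        Public Overrides Sub SetLength(ByVal value As Long)
            Throw New NotSupportedException()
        End Sub
    End Class
End Namespace
```

A plain string replacement is the simplest possible transformation. If the page calls Response.Flush itself, the text you're after could still be split across two flushes, so treat this as a starting point rather than a finished component.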
Page Override with HTTP Handlers
Subclassing is a good technique for entering limited changes to an existing page without touching the source code. If you can replace a given page with a new page (one with the same name but different content) then, by all means, that is the technique you should use. Subclassing is different from replacing and should be used when you have no simpler alternatives.
To completely change the output of a page without touching the source code, you can redirect the page to a different HTTP handler. The change occurs in the web.config file, like so:
<httpHandlers>
    <add verb="*"
         path="register.aspx"
         type="Samples.Modules.RegisterAspxHandler, Subclass"
         validate="false" />
</httpHandlers>
Any attempt to access register.aspx will be handled by the specified HTTP handler rather than the original page. Needless to say, the HTTP handler is responsible for any output that is shown to the end user. An HTTP handler is a class that implements the IHttpHandler interface. All of the final output is generated inside the ProcessRequest method:
Public Sub ProcessRequest(ByVal context As HttpContext) _
    Implements IHttpHandler.ProcessRequest
    context.Response.ContentType = "text/plain"
    context.Response.Write("Hello World")
End Sub
With such a handler registered in the web.config file, the output of the register.aspx page is a simple "Hello World" message.
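For completeness, here is a sketch of the whole handler class that the web.config entry points to. The namespace and class name match the type attribute in the registration; the IsReusable value is my choice for this illustration. The IHttpHandler interface requires both members.

```vb
Imports System.Web

Namespace Samples.Modules
    Public Class RegisterAspxHandler : Implements IHttpHandler
        ' True lets ASP.NET reuse one instance across requests;
        ' that's safe here because the handler keeps no state.
        Public ReadOnly Property IsReusable() As Boolean _
            Implements IHttpHandler.IsReusable
            Get
                Return True
            End Get
        End Property

        ' Produces the entire response for register.aspx
        Public Sub ProcessRequest(ByVal context As HttpContext) _
            Implements IHttpHandler.ProcessRequest
            context.Response.ContentType = "text/plain"
            context.Response.Write("Hello World")
        End Sub
    End Class
End Namespace
```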
An HTTP handler represents the lowest-level tool to serve any ASP.NET requests. Once set up, the HTTP handler code is the only endpoint that executes for the request. For page requests, therefore, a custom HTTP handler means no viewstate, no postback, and no lifecycle events. If the request has to go through the regular lifecycle, you should consider the SetRenderMethodDelegate method on the Page class. The method designates a callback that is invoked to render the page contents to the browser. The method is actually defined on the Control class and is available for server controls to render their content into parent controls. Here's an example:
Private Sub OnPreRenderComplete( _
    ByVal sender As Object, ByVal e As EventArgs)
    Dim pageHandler As Page = TryCast(sender, Page)
    If Not pageHandler Is Nothing Then
        Dim method As New RenderMethod(AddressOf RenderPageContents)
        pageHandler.SetRenderMethodDelegate(method)
    End If
End Sub
Private Sub RenderPageContents( _
    ByVal output As HtmlTextWriter, _
    ByVal container As Control)
    ...
End Sub
The method has been designed for very specific uses by the framework and is not intended for public use. Yet, it is an option to consider if all you need to do is override the page rendering without affecting the page lifecycle. Keep in mind, however, that this technique could cause trouble if the render method delegate was previously set to something else.
URL Rewriting
URL rewriting is a good technique for programmatically redirecting a request to a different URL. Basically, URL rewriting consists of changing the URL of the request early in the application request lifecycle (for example, in BeginRequest). To rewrite the URL, you use the RewritePath method of the HttpContext object:
Dim context As HttpContext = HttpContext.Current
context.RewritePath("newpage.aspx")
The effect is the same as with a classic HTTP 302 redirect, except no new physical request is made by the browser. By embedding the preceding code snippet in an HTTP module (or by simply editing the global.asax file), you can check the target of the current request and decide whether to redirect it to another page and different content.
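Here is a minimal sketch of such a module. It handles BeginRequest, compares the application-relative path of the request against a page to intercept, and rewrites the path on a match. The class name RewriteModule and both page names are placeholders I chose for this illustration.

```vb
Imports System
Imports System.Web

Namespace Samples
    ' Hypothetical module that rewrites one page to another
    Public Class RewriteModule : Implements IHttpModule
        Public Sub Init(ByVal context As HttpApplication) _
            Implements IHttpModule.Init
            AddHandler context.BeginRequest, AddressOf OnBeginRequest
        End Sub

        Public Sub Dispose() Implements IHttpModule.Dispose
        End Sub

        Private Sub OnBeginRequest( _
            ByVal source As Object, ByVal e As EventArgs)
            Dim context As HttpContext = _
                DirectCast(source, HttpApplication).Context
            ' Application-relative path of the request, such as "~/register.aspx"
            Dim path As String = _
                context.Request.AppRelativeCurrentExecutionFilePath
            If String.Equals(path, "~/register.aspx", _
                StringComparison.OrdinalIgnoreCase) Then
                ' Serve different content; the browser's address bar
                ' keeps showing the original URL
                context.RewritePath("~/newpage.aspx")
            End If
        End Sub
    End Class
End Namespace
```

Like any HTTP module, this one has to be listed in the httpModules section of the web.config file before it takes effect.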
ASP.NET 2.0 supports the <urlMappings> section for a purely declarative and unconditional URL rewriting.
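As a rough sketch, a mapping goes inside the system.web section of the web.config file. The target page newregister.aspx is a placeholder for whatever replacement page you deploy.

```xml
<system.web>
  <urlMappings enabled="true">
    <!-- Requests for register.aspx are served by newregister.aspx -->
    <add url="~/register.aspx" mappedUrl="~/newregister.aspx" />
  </urlMappings>
</system.web>
```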
Summary
Like any other type of application, a Web site is made up of source code, whether codebehind compiled code, markup, or script. The source code base is responsible for the user interface and the behavior of the site. If, at some point, you need to change the user interface and behavior, the easiest approach is editing the source code. However, if this is not possible, you can try the techniques described in this article to achieve your goals.
If you compare Figure 5 and Figure 6, you'll see what an external binary component added to the site can do. All the changes you observe have been obtained by simply adding a component (an HTTP module) and completely ignoring the source files. Keep in mind that making runtime changes introduces a per-request overhead, so you should aim to make the site as flexible and configurable as possible and try to anticipate potential changes.
Send your questions and comments for Dino to cutting@microsoft.com.
Dino Esposito is a mentor at Solid Quality Learning and the author of Programming Microsoft ASP.NET 2.0 (Microsoft Press, 2005). Based in Italy, Dino is a frequent speaker at industry events worldwide. Get in touch with Dino at cutting@microsoft.com and join the blog at weblogs.asp.net/despos.