Immutability in C# Part Four: An Immutable Queue
An immutable queue is a bit trickier than an immutable stack, but we’re tough, we can handle it.
First off, the interface is straightforward; enqueuing or dequeuing results in a new queue:
public interface IQueue<T> : IEnumerable<T>
{
bool IsEmpty { get; }
T Peek();
IQueue<T> Enqueue(T value);
IQueue<T> Dequeue();
}
But how ever are we to implement this? It’s not immediately obvious how to make an immutable first-in-first-out list.
The first insight we need is that a stack is essentially a “backwards” queue. Because I think it will come in handy later, I’m going to declare an extension method right now which takes a stack and reverses it:
public static IStack<T> Reverse<T>(this IStack<T> stack)
{
IStack<T> r = Stack<T>.Empty;
for (IStack<T> f = stack; !f.IsEmpty; f = f.Pop())
r = r.Push(f.Peek());
return r;
}
This is an O(n) operation for a stack of size n.
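As a quick sanity check — and assuming the IStack<T>/Stack<T> implementation from the immutable stack post earlier in this series — reversing a stack hands its elements back in insertion order, which is exactly the order a queue should dequeue them in:

```csharp
// Sketch only; assumes IStack<T>/Stack<T> from the earlier post.
IStack<int> s = Stack<int>.Empty.Push(1).Push(2).Push(3);
// s enumerates 3, 2, 1 -- last in, first out...
IStack<int> r = s.Reverse();
// ...but r enumerates 1, 2, 3 -- first in, first out.
// s itself is unchanged; Reverse builds an entirely new stack.
```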
Now that we’ve got that we can make a first cut at our queue implementation. Let’s say that a queue is backed by a stack such that every time we need to get at the “dequeue end”, we’ll just reverse the stack:
sealed class Queue<T> : IQueue<T> {
private static readonly IQueue<T> empty = new Queue<T>(Stack<T>.Empty);
public static IQueue<T> Empty { get { return empty; } }
private readonly IStack<T> backwards;
private Queue(IStack<T> backwards) { this.backwards = backwards; }
public T Peek() { return backwards.Reverse().Peek(); }
public IQueue<T> Enqueue(T t) { return new Queue<T>(backwards.Push(t)); }
public IQueue<T> Dequeue() { return new Queue<T>(backwards.Reverse().Pop().Reverse()); }
public bool IsEmpty { get { return backwards.IsEmpty; } }
public IEnumerator<T> GetEnumerator() { return backwards.Reverse().GetEnumerator(); }
IEnumerator IEnumerable.GetEnumerator() {return this.GetEnumerator();}
}
“Surely this is incredibly inefficient!” I hear you say. Yes, most certainly it is! Imagine enqueuing a thousand elements and then dequeuing them all. The thousand enqueues are each cheap, but the first dequeue requires us to entirely reverse a stack of 1000 items and then reverse another stack of 999 items. The next dequeue requires reversing stacks of 999 and 998 items. Dequeuing a single item costs O(n) in the size of the queue! The total cost for all the dequeues is the reversal of 2000 stacks with an average size of about 500, so that’s about a million operations to dequeue the whole thing.
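That million-operation figure is easy to verify. Here is a throwaway sketch — purely illustrative arithmetic, not part of the queue — that tallies the items reversed by those thousand dequeues:

```csharp
// In the naive queue, dequeuing when i items remain reverses the
// whole backwards stack (i items) and then the popped result (i - 1).
long total = 0;
for (int i = 1000; i >= 1; i -= 1)
    total += i + (i - 1);   // 2i - 1 items reversed per dequeue
Console.WriteLine(total);   // 1000000: draining n items costs O(n^2)
```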
When we reversed the stack to dequeue it the first time, we had a stack that was in the right order for the rest of the dequeues. Why didn’t we keep that thing around? Only because the “forwards” stack that resulted is not in the right order for enqueuing new items. So let’s keep two stacks around, one in “forwards” order, ready to be dequeued, and one in “backwards” order, ready to be enqueued. If we go to dequeue and find that the forward stack is empty, only then will we reverse the backwards stack. (And let's make the empty queue a singleton, like we did with the empty stack.)
public sealed class Queue<T> : IQueue<T>
{
private sealed class EmptyQueue : IQueue<T>
{
public bool IsEmpty { get { return true; } }
public T Peek() { throw new InvalidOperationException("empty queue"); }
public IQueue<T> Enqueue(T value)
{
return new Queue<T>(Stack<T>.Empty.Push(value), Stack<T>.Empty);
}
public IQueue<T> Dequeue() { throw new InvalidOperationException("empty queue"); }
public IEnumerator<T> GetEnumerator() { yield break; }
IEnumerator IEnumerable.GetEnumerator() { return this.GetEnumerator(); }
}
private static readonly IQueue<T> empty = new EmptyQueue();
public static IQueue<T> Empty { get { return empty; } }
public bool IsEmpty { get { return false; } }
private readonly IStack<T> backwards;
private readonly IStack<T> forwards;
private Queue(IStack<T> f, IStack<T> b)
{
forwards = f;
backwards = b;
}
public T Peek() { return forwards.Peek(); }
public IQueue<T> Enqueue(T value)
{
return new Queue<T>(forwards, backwards.Push(value));
}
public IQueue<T> Dequeue()
{
IStack<T> f = forwards.Pop();
if (!f.IsEmpty)
return new Queue<T>(f, backwards);
else if (backwards.IsEmpty)
return Queue<T>.Empty;
else
return new Queue<T>(backwards.Reverse(), Stack<T>.Empty);
}
public IEnumerator<T> GetEnumerator()
{
foreach (T t in forwards) yield return t;
foreach (T t in backwards.Reverse()) yield return t;
}
IEnumerator IEnumerable.GetEnumerator() {return this.GetEnumerator(); }
}
Isn’t this just as inefficient? No. Reason it through: every element (except the first one) is pushed on the backwards stack once, popped off the backwards stack once, pushed on the forwards stack once and popped off the forwards stack once. If you enqueue a thousand items and then dequeue two, yes, the second dequeue will be expensive as the backwards list is reversed, but the 998 dequeues following it will be cheap. That’s unlike our earlier implementation, where every dequeue was potentially expensive. Dequeuing is amortized O(1): a single dequeue can cost O(n), but that one reversal pays for the n cheap dequeues that follow it.
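To make the behaviour concrete, here is a small usage sketch, assuming the Queue<T> implementation above:

```csharp
// Build up a queue; each Enqueue returns a new queue, and every
// intermediate queue remains a valid, unchanged value.
IQueue<int> q = Queue<int>.Empty;
for (int i = 1; i <= 5; i += 1)
    q = q.Enqueue(i);

// Drain it: the very first Dequeue reverses the backwards stack,
// and every subsequent one is a cheap Pop of the forwards stack.
while (!q.IsEmpty)
{
    Console.Write("{0} ", q.Peek()); // 1 2 3 4 5 -- first in, first out
    q = q.Dequeue();
}
```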
Next time: OK, so we can do stacks and queues. Can we make an immutable binary search tree with guaranteed good search performance? But of course we can.
Comments
Anonymous
December 10, 2007
I cannot see (I am new to all this) how an immutable Queue will help us with ten threads queueing stuff and ten threads dequeueing stuff without a need for synchronization.
Anonymous
December 10, 2007
I did wonder how you would do this. Having seen your solution it brings railroad tracks to mind. Shunt the coaches into a railroad siding. When you need to "dequeue" them reverse them into another siding before dequeuing the first coach.
Anonymous
December 10, 2007
Well, start by worrying about two threads, not ten threads. There are no problems involving ten threads that are not also problems involving two threads, so keep it simple. If your goal is to have one queue which is shared by a producing thread and a consuming thread, then you are better off modeling the problem like that: a single threadsafe queue that can be stuck into a shared location. Immutable data structures do not make that scenario easier. (Rather, they move the problem from the queue provider's plate onto the queue user's plate; the queue user must then synchronize access to the variable shared across threads.) However, if your goal is to have one thread build up data structures and then hand them off to another thread to be consumed later, then immutable data structures are your friends. Once they are created, you don't need to worry about whether they are being accessed on multiple threads at a time because they are read-only structures. As many threads as you want can access them at once.
Anonymous
December 10, 2007
If we are passing the data structure for 'read-only' consumption, why do we care that it was a Queue, or a Stack (i.e. had some behavior)? It had some meaning for the creator of the data, but for the consumer it is just some readonly vector (or sequence/stream).
Anonymous
December 10, 2007
Maybe. Or maybe that thread intends to do some further transformation of it and hand it off to a third thread.
Anonymous
December 10, 2007
Your solution is fine for a single thread, but has a weakness with respect to a multithreaded situation, namely that if you enqueue a thousand items and then pass that one queue off to a thousand threads which each dequeue twice, all those threads will incur considerable expense. In "Purely Functional Data Structures", Okasaki discusses solutions to this problem.
Anonymous
December 10, 2007
Right. I deliberately glossed over that one. If we were really buff then we'd do something like memoize the result of the dequeue so that the cost was only borne once -- but then of course you have the problem of implementing a threadsafe memoizer! There is no free lunch.
Anonymous
December 10, 2007
There ain't no free lunch, but some lunches are cheaper than others -- and if memoization is part of your infrastructure, then someone else is paying. I guess that's why Okasaki uses Haskell.
Anonymous
December 10, 2007
Another great post. Thanks! Are readers permitted to use the source code?
Anonymous
December 11, 2007
You're welcome, and do whatever you want with the code.
Anonymous
December 11, 2007
These immutable data structures are nice, but I don't really understand the link between these and a hypothetical immutability-related new language feature. I'm sure you have something in mind when writing these articles about immutable data structures. I guess I'm too impatient.
Anonymous
December 11, 2007
I don't understand it either. I would like to give some examples of the kind of coding style that works today as a starting point for answering the question "how can we make this easier and more powerful?" You don't know what sorts of problems even need solving until you try it a few times. For instance, we've seen this pattern:
private sealed class EmptyThing : IThing { /*...*/ }
private static readonly IThing empty = new EmptyThing();
public static IThing Empty { get { return empty; } }
And any time I see a pattern like that, I wonder if the language ought to be making that easier. But it depends on how you slice it. One way would be to attack the singleton:
private singleton class EmptyThing : IThing { /*...*/ }
public static IThing Empty
{
get { return new EmptyThing(); } // singleton ctor always returns same thing
}
one way would be to attack the property initializer:
public static IThing Empty { get; } = new EmptyThing();
etc.
Anonymous
December 11, 2007
This presentation (actually a handout) gives a nice graphical representation of how functional data structures work: www.cis.upenn.edu/~bcpierce/courses/advprog/lectures/lec6.ppt
Anonymous
December 11, 2007
Alright, I understand your point.
Anonymous
December 12, 2007
I can smell my cortex burning. :)
Anonymous
December 12, 2007
Jomo Fisher--Reading one of my favorite blogs this morning--Eric Lippert's Fabulous Adventures in Coding--I
Anonymous
December 12, 2007
You argue for the efficiency of this solution, but you only speak of speed, not of memory. This solution assumes that you are willing to shoulder the significant extra memory burden of keeping multiple copies of the queue for each queue object. Of course, if you are using immutable objects, then memory concerns are probably not at the top of your priority list, and .NET in general does not have memory efficiency as a primary design goal. However, whereas the immutable stack does not take any more memory than the mutable stack (and can in fact take less when you have multiple instances derived from each other), the immutable queue will always take significantly more memory than its mutable counterpart. I can see the lack of transparency into these kinds of implementation details leading to significant problems down the road.
Anonymous
December 12, 2007
I'm not following your train of thought here. You agree that an immutable stack takes no more memory than a mutable stack, and sometimes far less when multiple instances share state. An immutable queue is built out of two immutable stacks, so why shouldn't an immutable queue gain the same sharing benefit? I do not see why an immutable queue would take up significantly more memory than a mutable queue. I agree that an immutable queue spends more time allocating memory than a mutable queue, but the garbage collector will throw away the old state if it is no longer alive. Why should the working set of an immutable queue be any larger than the working set of a mutable queue?
Anonymous
December 13, 2007
"An immutable queue is built out of two immutable stacks..." And therefore uses up roughly twice the memory, because it has to keep both the forwards and backwards versions in memory at the same time, whereas with a normal queue the backwards version doesn't even need to exist. Is that not obvious?
Anonymous
December 13, 2007
I'm still not following you. Can you show me a sequence of enqueues and dequeues which results in a queue which logically has n items, where the total size of the forward and backwards stacks added together is not n?
Anonymous
December 13, 2007
David, look at it this way. Define f as the size of the forwards stack, b as the size of the backwards stack. The memory size of the queue is f + b. If you enqueue you push something onto the backwards stack. So enqueueing goes from memory size f + b to f + (b + 1). If you dequeue, there are three cases. The queue starts with memory size f + b:
- if f >= 2 then the new queue has size (f - 1) + b
- else if f == 1 && b == 0 then the new queue has size 0
- else if f == 1 && b > 0 then the new queue has size b + 0
Which in every case goes from f + b to f + b - 1. So, enqueuing increases the memory size by one, dequeuing decreases the memory size by one. Is that clear?
Anonymous
December 15, 2007
Eric, wouldn't it be simpler to point out that f and b don't contain duplicates of each other's contents? Dr Calvin: "I make the robots appear more human" Det. Spooner: "There wasn't that easier to say?" Dr Calvin: "No, not really" ;)
Anonymous
December 24, 2007
No, the problem is that you are narrowly defining what you mean by "reliable in a multithreaded environment". You are defining 'reliable' to mean 'two threads can enqueue "the same" queue and both enqueued items are present in the resulting queue'. This is a very "procedural programming" way to think about thread safety -- thread safety is PRESERVATION OF ALL SIDE EFFECTS. If that's the kind of reliability you want then you should probably be using a mutable threadsafe queue! If you try to use an immutable queue for this purpose then essentially you will have to synchronize access not to the queue -- the queue is threadsafe -- but to the variable which contains the "canonical" version of the queue. Immutable queues provide a different kind of reliability, namely "two threads can enqueue the same queue at the same time, and the result on each thread is NOT polluted by the result on the other." This is a very functional programming way of thinking about thread safety -- thread safety is IMMUNITY FROM SIDE EFFECTS. Enqueueing X onto queue Q1 should result in the queue which is Q1 plus X, always, independent of any nonsense happening to Q1 on another thread. Once you embrace a world where side effects are to be eliminated, not preserved, things get much easier to reason about.
Anonymous
December 25, 2007
There is a little mess with immutable and lock-free structures in the comments. They are different things. Immutable structures are thread safe utilizing the "don't worry about it" principle, but they don't provide data coherency. Lock-free structures are thread safe by the same means and also coherent.
Anonymous
December 27, 2007
Eric, two features that would make this sort of thing easier would be case classes a la Scala, and pattern matching. That way you could write Haskell-style algebraic data types easily without all that boilerplate.
Anonymous
January 18, 2008
For some reason, there's been a lot of buzz lately around immutability in C#. If you're interested in
Anonymous
January 21, 2008
can someone explain why you wouldn't use a linked list with a front pointer to the next item to be dequeued and a back pointer to the last item queued? D
Anonymous
January 22, 2008
There are three possible cases:
- Singly linked list with links pointing back to front. How are you going to efficiently "back up" the front pointer on dequeue?
- Singly linked list with links pointing front to back. How are you going to enqueue on the back pointer without mutating the current back link? That link is immutable.
- Doubly linked list -- same problem as #2; you have to mutate state.
Linked lists make great mutable queues. The goal of this series is to describe how to make immutable data structures.
Anonymous
February 22, 2008
i know there is some mistake in making student data using linklist please solve this problem . //////////////////////////////////////////////// using System; namespace ConsoleApplication10 { /// Summary description for Class1. class Class1 { static void newdata() { Console.WriteLine("Successfull"); Console.ReadLine(); } static void display() { Console.WriteLine("There is nothing to Display."); Console.ReadLine(); } //static void e_exit() //{ // } [STAThread] static void Main(string[] args) { string input; Console.WriteLine ("Student Record"); Console.WriteLine ("=============="); Console.WriteLine ("a- Enter Data ." ); Console.WriteLine ("b- Display Data ." ); Console.WriteLine ("c- Exit." ); input = Console.ReadLine(); switch(input) { case "a" : newdata();break; case "b" : display();break; // case c : e_exit(); } main(); Console.ReadLine(); } } }
Anonymous
June 03, 2012
Eric, since the backward stack doesn't change in dequeue operations, when we have say forward stack {1,2,3}, backward stack {4,3,2,1} after an enqueue of 4, the dequeues would be {2,3} and {4,3,2,1}, then {3} and {4,3,2,1}, and finally {} and {4,3,2,1}, at which point we reverse the backward stack, leaving {1,2,3,4} and {} as the state. Now when you dequeue you will get 1 instead of 4?
Anonymous
September 25, 2012
I don't think this is O(1). Assume q = 1,1,1,1,1,1,...,1,1,1 (10000 total), then for(int i=0;i<10000;++i){q.Enqueue(1).Dequeue();} is O(n) each time.