Immutability in C# Part 10: A double-ended queue
Based on the comments, the implementation of a single-ended queue as two stacks was somewhat mind-blowing for a number of readers. People, you ain't seen nothing yet.
Before we get into the actual bits and bytes of the solution, think for a bit about how you might implement an immutable queue which could act like either a stack or a queue at any time. You can think of a stack as "it goes on the left end, it comes off the left end", and a queue as "it goes on the left end, it comes off the right end". Now we want "it goes on and comes off either end". For short, we'll call a double-ended queue a "deque" (pronounced "deck"), and give our immutable deque this interface:
public interface IDeque<T>
{
    T PeekLeft();
    T PeekRight();
    IDeque<T> EnqueueLeft(T value);
    IDeque<T> EnqueueRight(T value);
    IDeque<T> DequeueLeft();
    IDeque<T> DequeueRight();
    bool IsEmpty { get; }
}
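To make the shape of this interface concrete, here is a minimal sketch of how a caller might use it. ArrayDeque<T> and DequeDemo are hypothetical names that do not appear in this series. ArrayDeque<T> is a deliberately naive placeholder that copies an array on every operation, so it costs O(n) per operation and is emphatically not the implementation this article is working towards; it exists only so that the calling code has something to run against.

```csharp
using System;
using System.Collections.Generic;

public interface IDeque<T>
{
    T PeekLeft();
    T PeekRight();
    IDeque<T> EnqueueLeft(T value);
    IDeque<T> EnqueueRight(T value);
    IDeque<T> DequeueLeft();
    IDeque<T> DequeueRight();
    bool IsEmpty { get; }
}

// Hypothetical placeholder: copies the whole array on every operation (O(n)).
// It only exists to exercise the interface.
public sealed class ArrayDeque<T> : IDeque<T>
{
    public static readonly IDeque<T> Empty = new ArrayDeque<T>(new T[0]);
    private readonly T[] items;   // items[0] is the left end
    private ArrayDeque(T[] items) { this.items = items; }

    public bool IsEmpty { get { return items.Length == 0; } }
    public T PeekLeft() { CheckNotEmpty(); return items[0]; }
    public T PeekRight() { CheckNotEmpty(); return items[items.Length - 1]; }

    public IDeque<T> EnqueueLeft(T value)
    {
        T[] copy = new T[items.Length + 1];
        copy[0] = value;
        Array.Copy(items, 0, copy, 1, items.Length);
        return new ArrayDeque<T>(copy);
    }
    public IDeque<T> EnqueueRight(T value)
    {
        T[] copy = new T[items.Length + 1];
        Array.Copy(items, copy, items.Length);
        copy[items.Length] = value;
        return new ArrayDeque<T>(copy);
    }
    public IDeque<T> DequeueLeft()
    {
        CheckNotEmpty();
        T[] copy = new T[items.Length - 1];
        Array.Copy(items, 1, copy, 0, copy.Length);
        return new ArrayDeque<T>(copy);
    }
    public IDeque<T> DequeueRight()
    {
        CheckNotEmpty();
        T[] copy = new T[items.Length - 1];
        Array.Copy(items, copy, copy.Length);
        return new ArrayDeque<T>(copy);
    }
    private void CheckNotEmpty()
    {
        if (items.Length == 0) throw new InvalidOperationException("empty deque");
    }
}

public static class DequeDemo
{
    // Walks any IDeque<T> from left to right: peek to examine, dequeue to move on.
    // The deque passed in is never modified; each step produces a new deque.
    public static IEnumerable<T> LeftToRight<T>(IDeque<T> deque)
    {
        for (; !deque.IsEmpty; deque = deque.DequeueLeft())
            yield return deque.PeekLeft();
    }
}
```

With this in place, ArrayDeque<int>.Empty.EnqueueLeft(2).EnqueueLeft(1).EnqueueRight(3) enumerates left to right as 1, 2, 3, and dequeuing from it returns a new deque while the original still reports 3 on its right end.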
Attempt #1
We built a single-ended queue out of two stacks. Can we pull a similar trick here? How about we have the "left stack" and the "right stack". Enqueuing on the left pushes on the left stack, enqueuing on the right pushes on the right stack, and so on.
Unfortunately, this has some problems. What if you are dequeuing on the right and you run out of items on the right-hand stack? Well, no problem, we'll pull the same trick as before -- reverse the left stack and swap it with the right stack.
The trouble with that is, suppose the left stack is { 1000, 999, ..., 3, 2, 1 } and the right stack is empty. Someone dequeues the deque on the right. We reverse the left stack, swap the two stacks, and pop the new right stack. Now we have an empty left-hand stack and { 2, 3, 4, ..., 1000 } on the right-hand stack. It took 1000 steps to do this. Now someone tries to dequeue on the left. We reverse the right stack, swap, and pop, and now we have { 999, 998, ..., 3, 2 }. That took 999 steps. If we keep on dequeuing, alternating between the right and the left, we end up doing on average five hundred pushes per step. That's terrible performance. Clearly this is an O(n²) algorithm.
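The cost described above can be checked with a small experiment. The following is a sketch, not code from this series: ImmStack<T>, NaiveDeque<T>, PushCounter and NaiveDequeDemo are hypothetical names, and the deque implements only the operations needed for the experiment rather than the full IDeque<T> interface. It counts the pushes performed while reversing stacks.

```csharp
using System;

// Hypothetical helper: a minimal immutable stack, just enough for this sketch.
public sealed class ImmStack<T>
{
    public static readonly ImmStack<T> Empty = new ImmStack<T>(default(T), null);
    private readonly T head;
    private readonly ImmStack<T> tail;   // null only for the empty stack
    private ImmStack(T head, ImmStack<T> tail) { this.head = head; this.tail = tail; }
    public bool IsEmpty { get { return tail == null; } }
    public ImmStack<T> Push(T value) { return new ImmStack<T>(value, this); }
    public T Peek()
    {
        if (IsEmpty) throw new InvalidOperationException("empty stack");
        return head;
    }
    public ImmStack<T> Pop()
    {
        if (IsEmpty) throw new InvalidOperationException("empty stack");
        return tail;
    }
}

// Counts the pushes spent on reversals (hypothetical instrumentation).
public static class PushCounter { public static long Count; }

// Attempt #1: two stacks, reversing the whole other stack when one runs dry.
public sealed class NaiveDeque<T>
{
    public static readonly NaiveDeque<T> Empty =
        new NaiveDeque<T>(ImmStack<T>.Empty, ImmStack<T>.Empty);
    private readonly ImmStack<T> left;    // top of stack = leftmost item
    private readonly ImmStack<T> right;   // top of stack = rightmost item
    private NaiveDeque(ImmStack<T> left, ImmStack<T> right)
    {
        this.left = left;
        this.right = right;
    }
    public bool IsEmpty { get { return left.IsEmpty && right.IsEmpty; } }
    public NaiveDeque<T> EnqueueLeft(T value)
    {
        return new NaiveDeque<T>(left.Push(value), right);
    }
    public NaiveDeque<T> DequeueLeft()
    {
        if (!left.IsEmpty) return new NaiveDeque<T>(left.Pop(), right);
        // Left stack is empty: reverse the entire right stack and use it as the left stack.
        return new NaiveDeque<T>(Reverse(right).Pop(), ImmStack<T>.Empty);
    }
    public NaiveDeque<T> DequeueRight()
    {
        if (!right.IsEmpty) return new NaiveDeque<T>(left, right.Pop());
        // Right stack is empty: reverse the entire left stack and use it as the right stack.
        return new NaiveDeque<T>(ImmStack<T>.Empty, Reverse(left).Pop());
    }
    private static ImmStack<T> Reverse(ImmStack<T> stack)
    {
        ImmStack<T> result = ImmStack<T>.Empty;
        for (; !stack.IsEmpty; stack = stack.Pop())
        {
            result = result.Push(stack.Peek());
            PushCounter.Count++;
        }
        return result;
    }
}

public static class NaiveDequeDemo
{
    // Fills a deque with n items on the left, then drains it by alternating
    // right and left dequeues. Returns the number of pushes spent on reversals.
    public static long AlternatingDrainCost(int n)
    {
        NaiveDeque<int> deque = NaiveDeque<int>.Empty;
        for (int i = 1; i <= n; i++) deque = deque.EnqueueLeft(i);
        PushCounter.Count = 0;
        bool fromRight = true;
        while (!deque.IsEmpty)
        {
            deque = fromRight ? deque.DequeueRight() : deque.DequeueLeft();
            fromRight = !fromRight;
        }
        return PushCounter.Count;
    }
}
```

For n = 1000, NaiveDequeDemo.AlternatingDrainCost returns 500500, which is n(n+1)/2 pushes, or roughly five hundred per dequeue, in line with the estimate above.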
Attempt #2
Our attempt to model this as a pair of stacks seems to be failing. Let's take a step back and see if we can come up with a recursively defined data structure which makes it more apparent that there is cheap access to each end.
The standard recursive definition of a stack is "a stack is either empty, or an item (the head) followed by a stack (the tail)". It seems like we ought to be able to say "a deque is either empty, or an item (the left) followed by a deque (the middle) followed by an item (the right)".
Perhaps you have already seen the problem with this definition; a deque by this definition always has an even number of elements! But we can fix that easily enough. A deque is:
1) empty, or
2) a single item, or
3) a left item followed by a middle deque followed by a right item.
Awesome. Let's implement it.
// WARNING: THIS IMPLEMENTATION IS AWFUL. DO NOT USE THIS CODE.
using System;

public sealed class Deque<T> : IDeque<T>
{
    // Case 1: the empty deque.
    private sealed class EmptyDeque : IDeque<T>
    {
        public bool IsEmpty { get { return true; } }
        public IDeque<T> EnqueueLeft(T value) { return new SingleDeque(value); }
        public IDeque<T> EnqueueRight(T value) { return new SingleDeque(value); }
        public IDeque<T> DequeueLeft() { throw new Exception("empty deque"); }
        public IDeque<T> DequeueRight() { throw new Exception("empty deque"); }
        public T PeekLeft() { throw new Exception("empty deque"); }
        public T PeekRight() { throw new Exception("empty deque"); }
    }

    // Case 2: a deque holding a single item.
    private sealed class SingleDeque : IDeque<T>
    {
        public SingleDeque(T t) { item = t; }
        private readonly T item;
        public bool IsEmpty { get { return false; } }
        public IDeque<T> EnqueueLeft(T value) { return new Deque<T>(value, Empty, item); }
        public IDeque<T> EnqueueRight(T value) { return new Deque<T>(item, Empty, value); }
        public IDeque<T> DequeueLeft() { return Empty; }
        public IDeque<T> DequeueRight() { return Empty; }
        public T PeekLeft() { return item; }
        public T PeekRight() { return item; }
    }

    private static readonly IDeque<T> empty = new EmptyDeque();
    public static IDeque<T> Empty { get { return empty; } }

    // Case 3: a left item followed by a middle deque followed by a right item.
    public bool IsEmpty { get { return false; } }
    private Deque(T left, IDeque<T> middle, T right)
    {
        this.left = left;
        this.middle = middle;
        this.right = right;
    }
    private readonly T left;
    private readonly IDeque<T> middle;
    private readonly T right;

    // The old left item moves into the middle deque.
    public IDeque<T> EnqueueLeft(T value)
    {
        return new Deque<T>(value, middle.EnqueueLeft(left), right);
    }
    // The old right item moves into the middle deque.
    public IDeque<T> EnqueueRight(T value)
    {
        return new Deque<T>(left, middle.EnqueueRight(right), value);
    }
    // The middle's leftmost item becomes the new left item.
    public IDeque<T> DequeueLeft()
    {
        if (middle.IsEmpty) return new SingleDeque(right);
        return new Deque<T>(middle.PeekLeft(), middle.DequeueLeft(), right);
    }
    // The middle's rightmost item becomes the new right item.
    public IDeque<T> DequeueRight()
    {
        if (middle.IsEmpty) return new SingleDeque(left);
        return new Deque<T>(left, middle.DequeueRight(), middle.PeekRight());
    }
    public T PeekLeft() { return left; }
    public T PeekRight() { return right; }
}
I seem to have somewhat anticipated my denouement, but this is coding, not mystery novel writing. What is so awful about this implementation? It seems like a perfectly straightforward implementation of the abstract data type. But it turns out to be actually worse than the two-stack implementation we first considered. What are your thoughts on the matter?
Next time, what's wrong with this code and some groundwork for fixing it.
Comments
Anonymous
January 22, 2008
It does leave a little to be desired. Now instead of an O(n²) algorithm, you have a recursive O(n²) algorithm, so we can hammer both the CPU and the stack with it. Nice. ;)
Anonymous
January 22, 2008
The problem here is the recursive chain that is produced when you call EnqueueXXX and DequeueXXX over a nontrivial deque. I'm working on a solution that has, for every general Deque, a cachedDequeLeft and cachedDequeRight, but I'm not sure if it's going to work. Cross your fingers :)
Anonymous
January 22, 2008
Aaron: Yep! It's truly awful. Each enqueue is O(n) in stack consumption, time and number of new nodes allocated, so enqueuing n items in a row is O(n²) in time. Olmo: Sounds interesting!
Anonymous
January 22, 2008
I've been working on it for a while and there is no end. In my implementation Dequeue is FAST, so enqueue has to be defined in terms of Dequeue. The end of the story is that, while you can reuse a lot of these trees (should we call them firs? hehe), for a deque with n elements you need about n^2 of them to represent every single instance of all the possible subintervals. So having a structure that needs so much memory is stupid. Also, inserting an element is still O(n)... So a way to nowhere ... :S
Anonymous
January 22, 2008
I'm thinking that you could just throw together an immutable version of a doubly-linked list. You did say that the solution was tricky, and this one is downright banal, so it probably isn't what you're looking for, but it would work and give O(1) enqueue and dequeue performance. All you'd need is an Element<T> with (read only) Left, Right and Value properties, and you could give the Deque<T> "beginning" and "end" fields of type Element<T>. Throw in a constructor that takes two Element<T> parameters and you've got yourself an immutable Deque. I guess the problem is that it would have to allocate a new Deque<T> for each and every Dequeue operation. The memory performance isn't stellar. Then again, that's exactly what we did for the Queue<T>, so does it make a difference?
Anonymous
January 22, 2008
Actually... never mind, I can see now why that wouldn't work - the Element<T> would have to be mutable. The Deque could look immutable on the outside, but I imagine the point of this exercise is to build the whole thing using completely immutable data structures. So, scratch that, back to square one. :-)
Anonymous
January 22, 2008
Yep, you got it. Doubly-linked lists are always mutable.
Anonymous
January 22, 2008
Going back to attempt #1, if when one stack is empty, you reverse and transfer just half the elements from the other stack, you get amortised O(1) performance, and you're done ;) This is mentioned in Chris Okasaki's book "Purely Functional Data Structures" which describes a lot of interesting immutable data structures. Looking forward to seeing your alternative as well!
Anonymous
January 22, 2008
The comment has been removed
Anonymous
January 22, 2008
Aaron: no, Luke is right. If you are clever about when you rebalance the deque, you can get amortized O(1) performance. You end up having to keep around extra information about how big each queue is, but that's not hard. However, that is in fact not the solution I had in mind. Rather than fixing Attempt #1, I'm going to fix Attempt #2 by being more clever about what exactly goes in the left, middle and right sections.
Anonymous
January 22, 2008
I don't think that's exactly what you meant, and I'm sure I'm missing something, but would return new Deque<T>(left, middle.DequeueRight(), middle.PeekRight()); even work as planned? Wouldn't we end up with the wrong right element after we do this? Did I just say 'wrong right element'? How is it you manage to do this to me almost every time?
Anonymous
January 23, 2008
I was able to make an efficient Deque using 2 "Trimmable Stacks" and keeping Deque Count information, but I am not sure about approach #2. I am thinking it might be possible by making left and right into Stacks and keeping middle a Deque. EnqueueLeft would be: return new Deque<T>(left.Push(value), this, right); I am just not sure if this works for all other operations. Great series by the way. Keep it going!
Anonymous
January 23, 2008
Chris: Why would we end up with the wrong right element? Remember, middle is IMMUTABLE. Dequeuing it does not change the value of middle, it returns a different deque. We can dequeue that thing all we want and its rightmost element is the same as it ever was. Dr. Blaise: Your intuition is good. I'm going to stick with something-on-the-left, something-in-the-middle and something-on-the-right, and those somethings will be more complex than Attempt #2. It's not going to be exactly as you've sketched out though.
Anonymous
January 23, 2008
I'm not sure if I'm on the right track, but the thing I'm noticing is that since a deque implicitly has 3 pieces, it's easier to deal with 2 elements at a time than it is to deal with just 1. What if you made the left and right sides a kind of "buffer" (stacks with a count, I guess), and did the recursive enqueuing only when you had 2 elements to move? For example... if you're enqueuing from the left, and the left buffer has one or two elements, just tack it on. If it has three, then pop all 3 elements, enqueue the last two onto the inner deque, and put the 1st back on the stack, then enqueue the new element normally. There is some recursion, but 50% of the time the depth is zero, 25% of the time it's one level deep, 12.5% of the time it only goes two levels, etc. Once you hit the end, which would be a SingleDeque, then enqueuing two elements at a time is trivial; just move the left to the right, the first element into a new SingleDeque in the middle, and the last element onto the right. I think this is also "amortized" O(1) performance, right? To dequeue off of either side you just reverse this process, same performance. It can end up being lopsided, but I don't think it matters; if you run out of elements on the "requested" side, just start taking them off the other side - since you've got a maximum of 3 elements to pop at any level before you find the "tail", it's still pretty fast. Is this making any sense or have I officially gone off the deep end?
Anonymous
January 23, 2008
Give that man a cigar! The part of the trick that you've not explicitly stated is that the inner deque is a deque of stacks of elements, not of elements. I'll be presenting an implementation of this technique in the next few days.
Anonymous
January 24, 2008
WOW! Truly mind blowing stuff. Am I wrong or will the final version be:
IDeque<T> left;
IDeque<IDeque<T>> middle;
IDeque<T> right;
Anonymous
January 24, 2008
Very close. We're actually going to make a "lite" deque that can store between one and four elements, and then we'll have
Dequelette<T> left
IDeque<Dequelette<T>> middle
Dequelette<T> right
(Since the Dequelette does not meet the IDeque contract, I don't want to say that it does.) That gives us fast access to the stuff at the end points, and logarithmic access to the stuff in the middle -- unlike, say, a binary tree, which gives us fast access to the stuff "in the local middle" and logarithmic access to the stuff at the end points. It's like pulling a binary tree inside out, a bit. But patience! We'll get to it next week.
Anonymous
January 30, 2008
I've convinced myself that this works in C#, but I'm curious as to how the compiler remains happy and sane. The number of concrete types involved seems to be O(log N) in the number of elements, which, while very slow-growing, does not have an upper bound. Are the concrete types really only created as needed at run-time? Would you be unable to implement this same structure using C++ templates? Or have I fundamentally misunderstood something? I'm eager to see the next installment!
Anonymous
February 05, 2008
The comment has been removed
Anonymous
February 09, 2008
Shouldn't the interface be:
public interface IDeque<T>
{
    T PeekLeft();
    T PeekRight();
    IDeque<T> EnqueueLeft(T value);
    IDeque<T> EnqueueRight(T value);
    IDeque<T> DequeueLeft(out T value);
    IDeque<T> DequeueRight(out T value);
    bool IsEmpty { get; }
}
i.e., add the argument "out T value" to the Dequeue operations. Same goes for previously posted data structures ...
Anonymous
February 09, 2008
No. That would be conflating two logically separate operations into one method. One method should do one thing, and do it well. Examining the data structure and producing a new data structure are two entirely different operations, so they should be represented by two methods.
Anonymous
February 10, 2008
I understand your point, though it seems to conflict with the current .NET mutable collections API (not saying one is wrong vs others, just a different way of seeing things I guess).
Anonymous
February 11, 2008
Mutable collections must conflate unrelated operations because mutable collections are impossible to reason about. Suppose for example you want to pop a stack AND know what value you popped. In an immutable stack you can ask "are you empty?" and then peek and then pop. In a mutable stack asking "are you empty?" tells you nothing about whether peeking is safe. Someone could have popped everything off the stack on a different thread. In a mutable stack, if you peek and then pop you have absolutely no idea if the value you peeked is the value that was just popped off. Someone might have done a push after your peek. Mutable structures are "threadsafe" only insofar as they guarantee to behave as though their operations are atomic, not that combinations of those operations are atomic. Therefore, any operations which must be logically atomic in a mutable structure must be conflated together. That's why mutable collections have methods that do everything all at once, rather than cleanly separating different operations into different methods. Immutable structures give you a much stronger thread safety -- they give you the ability to reason logically from past calls to future calls. In an immutable stack, the value you peek is always the value you pop.
Anonymous
February 12, 2008
I've been loving this immutability series! Any hope for a System.Collections.Immutable someday? :-)
Anonymous
February 15, 2008
The comment has been removed
Anonymous
February 15, 2008
That is a reasonable way of thinking about it, yes.