I think I have resolved the issue. The issue was related to the way that poison messages were handled.
My app defines two functions that are triggered by messages arriving on queues: one processes messages on the original queue, and the other processes messages on its '-poison' queue. When a message fails to be processed within the max dequeue count, it is automatically moved to the poison queue (see poison messages section here). If the poison queue does not exist when a message is first sent to it, it is created automatically, which is why I had not explicitly defined the poison queue in my ARM template. However, the scaling rules that are generated automatically for my app include a rule for the '-poison' queue. When the '-poison' queue does not exist (which is the case when the app is first deployed), this seems to break the autoscaling, so the number of replicas never scales down.
The solution was to explicitly define the '-poison' queue in my ARM template as well.
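For anyone wanting to do the same, here is a minimal sketch of what the extra queue resource might look like. It is not copied from my actual template: the parameter names `storageAccountName` and `queueName` are assumptions for illustration, and the `apiVersion` is just one that I believe is valid for this resource type, so adjust all three to match your own template. The poison queue name is the original queue name with `-poison` appended.

```json
{
  "type": "Microsoft.Storage/storageAccounts/queueServices/queues",
  "apiVersion": "2022-09-01",
  "name": "[format('{0}/default/{1}-poison', parameters('storageAccountName'), parameters('queueName'))]",
  "comments": "Illustrative only: storageAccountName and queueName are assumed parameter names. Creates the '-poison' queue up front so the generated scaling rule has a queue to connect to.",
  "dependsOn": [
    "[resourceId('Microsoft.Storage/storageAccounts', parameters('storageAccountName'))]"
  ]
}
```

The `dependsOn` entry only makes sense if the storage account is deployed in the same template; if the account already exists, it can be dropped.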
I think this should still be addressed as a bug in the KEDA scaling - if a scaling rule cannot connect to a queue (in this case because the queue did not exist), surely it should scale down by default, rather than keeping the maximum number of replicas running (and therefore causing additional costs to the customer).