It would have helped to see the actual execution plans, and not only the estimated ones. Often in cases like this there are misestimates, and they are revealed by the actual plans.
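In case it is of any help, the simplest way to get the actual plan in SSMS is to press Ctrl+M ("Include Actual Execution Plan") before you run the query. You can also do it with SET options, something like this:

SET STATISTICS XML ON
-- ... run the slow query here; the actual plan comes back as an extra XML result set ...
SET STATISTICS XML OFF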
It seems, though, that the optimizer expects the condition on table A to return 2525 rows. The temp table, on the other hand, has an estimate of 255 rows, and I guess that estimate comes from a previous execution with the same value for UserID.
For that few rows, the optimizer thinks that a nested loop join + key lookup is the cheapest solution, but for the tenfold number of rows, it thinks it is cheaper to scan the table AD once for a hash join. The optimizer is predisposed against loop joins and key lookups, and I would not have been so surprised if AD had had some 10 million rows, but here we are talking 867 million. Part of the answer may be found in how many rows the optimizer estimates the operation to produce in total: for the join with the temp table it estimates 868000 rows, and for the direct join a whopping 12 million. 12 million key lookups is not fun.
What is the actual number returned?
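If you want to see what the loop-join plan would actually cost with the larger estimate, you can force it with a hint and compare the two actual plans. This is only a sketch; I am guessing at the join column (AID) and the @UserID parameter since the full query was not posted:

DECLARE @UserID int = 12345   -- whatever value you are testing with

-- OPTION (LOOP JOIN) forces the nested-loop plan so you can compare
-- its actual cost and duration against the hash-join plan.
SELECT AD.*
FROM   A
JOIN   AD ON AD.AID = A.AID
WHERE  A.UserID = @UserID
OPTION (LOOP JOIN)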
As for why the estimates are out of whack, it could be that statistics are out of date, and you could try running UPDATE STATISTICS on the tables. And to get the most accurate statistics you could try
UPDATE STATISTICS A IX_A_User WITH FULLSCAN
UPDATE STATISTICS AD IX_A WITH FULLSCAN
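If you first want to check how stale the statistics actually are before running a FULLSCAN on a table that size, something like this should tell you (sys.dm_db_stats_properties requires SQL 2008 R2 SP2 or SQL 2012 SP1 and later, if I recall correctly):

-- When were the statistics on AD last updated, and how many
-- modifications have there been since?
SELECT s.name, sp.last_updated, sp.rows, sp.rows_sampled, sp.modification_counter
FROM   sys.stats s
CROSS  APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
WHERE  s.object_id = OBJECT_ID('dbo.AD')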
But I am not sure how much this is going to help. Also, while the nested loop join may be the best choice for this user id, it may be different for other user ids. Then again, I don't know how you actually use this query. Maybe this particular user is the important one?
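If it turns out that the best plan really is different for different user ids, and the query is parameterised, one option is to add OPTION (RECOMPILE), so that the optimizer looks at the actual value on every execution. Again only a sketch, with the same guessed column names as above:

-- With OPTION (RECOMPILE) a fresh plan is compiled for the actual @UserID
-- on every execution, so a small user can get the loop join and a big user
-- the hash join. The price is a compilation each time the query runs.
SELECT AD.*
FROM   A
JOIN   AD ON AD.AID = A.AID
WHERE  A.UserID = @UserID
OPTION (RECOMPILE)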