I can see two solutions, one decently simple, and one not simple at all.
The first is to take a copy of production. Yes, that is 5 TB, but the storage cost is a lot cheaper than the other solution. In this copy environment, you create a database snapshot. When a tester has completed his or her evil tests, you revert from the database snapshot, so that you go back to a known starting point. Reverting to a snapshot has the side effect of setting the log size to 0.5 MB, which would be really bad for a production database, but assuming that these testers will only make a limited amount of updates, that should not pose a problem in this environment.
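Just as a rough sketch of that cycle, assuming the copy is named TestCopy and has a single data file with the logical name TestCopy_Data (the names and the snapshot file path are placeholders for your environment):

   -- Create the snapshot before the testers start.
   CREATE DATABASE TestCopy_Snap
   ON (NAME = TestCopy_Data,
       FILENAME = 'D:\Snapshots\TestCopy_Data.ss')
   AS SNAPSHOT OF TestCopy;

   -- When the testers are done, revert to get back to the starting point.
   -- All other connections must be closed, and if there is more than one
   -- snapshot of the database, the others must be dropped before you revert.
   USE master;
   ALTER DATABASE TestCopy SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
   RESTORE DATABASE TestCopy FROM DATABASE_SNAPSHOT = 'TestCopy_Snap';
   ALTER DATABASE TestCopy SET MULTI_USER;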
The other solution? As Jeffery says, what you outline is a big challenge. Exactly how big depends on the complexity of your data model. It is certainly easier if you only have 20 tables than if you have 2000.
I did write a specification for a customer of my client a couple of years back. Their production database is around 3 TB, and they wanted smaller databases for their developers, but the data should still be realistic. The idea was to extract data for some accounts and take it from there. In my spec I detailed which tables should be extracted and how. An important part was to find customers that had traded with different types of instruments and had other specific features that were needed for the testing.
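Just to give a flavour of what such an extraction looks like, here is a hypothetical fragment. The tables Accounts and Trades, the column InstrumentType and the temp table #SelectedAccounts are invented for illustration, and I assume the production and dev databases are on the same instance; the real spec has to repeat this pattern for every table, in dependency order:

   -- Pick accounts that cover the features the testers need, for instance
   -- accounts that have traded in several types of instruments.
   CREATE TABLE #SelectedAccounts (AccountID int NOT NULL PRIMARY KEY);

   INSERT INTO #SelectedAccounts (AccountID)
   SELECT AccountID
   FROM   ProdDB.dbo.Trades
   GROUP  BY AccountID
   HAVING COUNT(DISTINCT InstrumentType) >= 3;

   -- Copy the selected accounts and their trades into the smaller dev database.
   -- (With identity columns you would also need explicit column lists and
   -- SET IDENTITY_INSERT.)
   INSERT INTO DevDB.dbo.Accounts
   SELECT A.*
   FROM   ProdDB.dbo.Accounts A
   JOIN   #SelectedAccounts S ON S.AccountID = A.AccountID;

   INSERT INTO DevDB.dbo.Trades
   SELECT T.*
   FROM   ProdDB.dbo.Trades T
   JOIN   #SelectedAccounts S ON S.AccountID = T.AccountID;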
I don't remember how much time I spent on writing that spec, but it was surely more than 40 hours. Mind you, this is a decently complex database with 1800 tables in total, not all of which my spec covered. On the other hand, I have been working with this system for many years, so I have very good knowledge of the tables.
Whether this spec was ever implemented, I don't know, as I am no longer working much with this client. I don't recall my time estimate, but maybe it was 500 hours. Of course, this cannot be translated directly to your site, as your database may be simpler. But it can still give an idea of what the cost might be for your virtual pipeline.