Seeing as this is an opinionated piece, I’m going to start off by saying that I hope your development data doesn’t come from production. Production data is conjured from some known or unknown state of your application. With this in mind, how can you base your future development on unknowns?
<p>With some legacy applications, all you may have to begin with is production data, so what do you do? Well, the first thing you do is write some tests. <a href="http://www.pragmaticprogrammer.com/titles/prj/">Ship It</a> by the <a href="http://www.pragmaticprogrammer.com">Pragmatic Programmers</a> outlines some good starting steps for this situation. From your testing you will generate known inputs and known outputs.</p>
<p>Generating your own test data also allows you to apply some quality assurance to your data. Data in your production application may have been created by some unknown current (or since-patched) bug. There are no guarantees that what you generate will be of high quality, but at least you control the source and aren’t relying on unknowns for the data you will use to build more robust features.</p>
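<p>As a rough sketch of what "known inputs and known outputs" can look like, here is a small Python example. Everything in it is hypothetical and only for illustration: the <code>KNOWN_USERS</code> fixture, the <code>adults_only</code> function standing in for your application code, and the <code>make_users</code> helper. The hand-written fixture gives you a result you can state in advance, and the seeded generator gives you larger data sets that come out the same on every run.</p>

```python
import random

# Hand-written fixture: every value here is chosen on purpose,
# so the expected result of any function under test is known up front.
KNOWN_USERS = [
    {"id": 1, "name": "alice", "age": 17},
    {"id": 2, "name": "bob", "age": 21},
    {"id": 3, "name": "carol", "age": 34},
]


def adults_only(users, minimum_age=18):
    """Hypothetical application code: keep users at or above minimum_age."""
    return [user for user in users if user["age"] >= minimum_age]


def make_users(count, seed=42):
    """Generate `count` user records from a fixed seed.

    A private random.Random instance is used so the output depends only
    on `seed`, not on any other code that touches the global generator.
    """
    rng = random.Random(seed)
    return [
        {"id": i, "name": f"user{i}", "age": rng.randint(10, 90)}
        for i in range(1, count + 1)
    ]


def test_adults_only_with_known_input():
    # Known input, known output.
    names = [user["name"] for user in adults_only(KNOWN_USERS)]
    assert names == ["bob", "carol"]


def test_generated_data_is_repeatable():
    # Same seed, same data: a failure can be reproduced exactly.
    assert make_users(100) == make_users(100)
```

<p>The design choice worth noting is the fixed seed: if a generated record ever exposes a bug, you can regenerate exactly the same record and keep it as a regression case, which is not something a snapshot of production data can promise.</p>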