Having test data is very important. You may have all the test cases required for complete feature testing of the application, but without the test data, those test cases do not mean anything. For example, if you are testing a video application, you need to be able to test the different video formats that the application supports, and there are a large number of such formats (unless you work in the video domain, you would not believe how many formats exist, because so many companies do something with video). For each format, you need the corresponding data: multiple video files in each format. As another example, you could have a video file that is just a collection of images, another without audio, and yet another that combines video and audio. There could be a defect that shows up only when the video file has no audio. We once found a defect like this, and it was a big pain to figure out what the problem was; after that, we added files with and without audio to our testing matrix.
And video files are large. The video test data that we used for complete testing came to more than 50 GB, and it was guarded with great care. Because of the sheer size, we decided not to put this test data in the source safe we used (the people running the source safe were not happy about backing up so much binary data rather than text source files). Instead, we made multiple disk copies of the test data, and it took some effort to ensure that there was a master copy and that all the other copies were synchronized to it.
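One way to check that a disk copy still matches the master copy is to compare checksums file by file. The sketch below is only an illustration, not the process we actually used: the function names (`file_digest`, `build_manifest`, `compare_to_master`) are made up for this example, and it assumes both copies are ordinary directory trees that can be read from the same machine.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large video files never have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_to_master(master_manifest, copy_manifest):
    """Report files that are missing from, extra in, or changed in a copy."""
    master_names = set(master_manifest)
    copy_names = set(copy_manifest)
    return {
        "missing": sorted(master_names - copy_names),
        "extra": sorted(copy_names - master_names),
        "changed": sorted(
            name
            for name in master_names & copy_names
            if master_manifest[name] != copy_manifest[name]
        ),
    }
```

The manifest for the master copy can be built once and saved, so that each copy only needs to be hashed and compared against it. A copy is in sync when all three lists in the report are empty.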
Are you getting an idea of the problem we faced when we added a vendor testing team to this matrix? We had to tell them the focus area of testing. We had to extract the required test cases and make sure they made sense to testers who did not have the same experience as the core team. And then we had to pass the large test data on to the vendor while linking this test data to the test cases. Even though we had a fast connection to the vendor, there were permission requests to process: some of our test data came from vendors who had not given us permission to pass it on, so every time we needed to pass those files on to a vendor, we needed permission (no monetary problem, just the paperwork and time required). And since the test data kept changing as new video cameras entered the market, there were synchronization problems. In one unusual case, we found that a particular set of high-end video files required a very high-end machine, and this information had not been captured properly. So when the vendor tried to use those files, things did not work.
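Much of this pain comes from information that lives only in people's heads: which test cases use a file, whether it may be shared, and what machine it needs. A small catalog that records these facts per file makes the handoff easier to plan. The sketch below is hypothetical: the `DataEntry` fields, the `plan_handoff` function, the file names, and the test case IDs are all invented for illustration, and using minimum RAM as the stand-in for "high-end machine" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class DataEntry:
    path: str                       # location relative to the test data root
    test_cases: tuple               # IDs of the test cases that use this file
    needs_permission: bool = False  # True if the owner must approve sharing
    min_ram_gb: int = 0             # assumed way to record machine requirements


def plan_handoff(entries, vendor_cases, vendor_ram_gb):
    """Sort the files a vendor needs into ready, blocked, and too demanding."""
    plan = {"send": [], "needs_permission": [], "machine_too_small": []}
    wanted = set(vendor_cases)
    for entry in entries:
        if not wanted.intersection(entry.test_cases):
            continue  # not needed for this vendor's focus area
        if entry.min_ram_gb > vendor_ram_gb:
            plan["machine_too_small"].append(entry.path)
        elif entry.needs_permission:
            plan["needs_permission"].append(entry.path)
        else:
            plan["send"].append(entry.path)
    return plan


# Hypothetical catalog entries, for illustration only.
EXAMPLE_ENTRIES = [
    DataEntry("clip_a.avi", ("TC-101", "TC-102")),
    DataEntry("clip_b.mov", ("TC-102",), needs_permission=True),
    DataEntry("clip_c.mxf", ("TC-200",), min_ram_gb=32),
    DataEntry("clip_d.mp4", ("TC-300",)),
]

if __name__ == "__main__":
    print(plan_handoff(EXAMPLE_ENTRIES, ["TC-102", "TC-200"], vendor_ram_gb=16))
```

Files that land in `needs_permission` tell you which paperwork to start early, and files in `machine_too_small` flag a hardware gap before the vendor runs into it.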
The challenges may differ depending on the type of test data, but you need a strategy for passing the required test data on to the vendor.