Web Research Collections – Web Track
The University of Glasgow has taken over distribution of the WT2g, WT10g, .GOV and .GOV2 Web Research Collections from CSIRO (Commonwealth Scientific and Industrial Research Organisation). CSIRO previously distributed these collections to organizations and individuals engaged in research and development of natural language processing, information retrieval or document understanding systems, strictly for research purposes. The collections have been used in the TREC Web and Terabyte tracks.
In addition, as part of the TREC Blogs track, the University of Glasgow is currently distributing the Blogs06 test collection.
These collections are useful if you are experimenting with information retrieval systems in a Web or blog context, or if you are interested in the design and evaluation of large-scale information retrieval systems. Queries and relevance assessments for these collections are available from the TREC Web page, so you can use them to tune or evaluate your system or approach.
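As a rough illustration of evaluating a system against relevance assessments, the sketch below computes mean average precision in Python. It assumes the usual TREC text layouts: qrels lines of the form `topic iteration docno relevance` and run lines of the form `topic Q0 docno rank score tag`. The function names, the document identifiers and the sample data are all hypothetical, and the tie-breaking and topic-averaging rules are simplifications, so scores may differ slightly from those reported by the standard trec_eval tool.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Collect relevant docnos per topic from 'topic iteration docno relevance' lines."""
    relevant = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic, _iteration, docno, rel = parts
        if int(rel) > 0:
            relevant[topic].add(docno)
    return relevant


def parse_run(lines):
    """Build a ranked docno list per topic from 'topic Q0 docno rank score tag' lines."""
    scored = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        topic, _q0, docno, _rank, score, _tag = parts
        scored[topic].append((float(score), docno))
    # Assumption: rank by descending score; ties keep their input order.
    return {
        topic: [docno for _, docno in sorted(pairs, key=lambda p: -p[0])]
        for topic, pairs in scored.items()
    }


def average_precision(ranked, relevant):
    """Average of precision values at the rank of each retrieved relevant document."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, docno in enumerate(ranked, start=1):
        if docno in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(qrels, run):
    """Assumption: average over topics with at least one relevant document;
    topics missing from the run contribute zero."""
    topics = [t for t, docs in qrels.items() if docs]
    if not topics:
        return 0.0
    return sum(average_precision(run.get(t, []), qrels[t]) for t in topics) / len(topics)


# Hypothetical sample data, not taken from any real collection.
QRELS_LINES = [
    "501 0 docA 1",
    "501 0 docB 0",
    "501 0 docC 1",
]
RUN_LINES = [
    "501 Q0 docB 1 3.0 demo",
    "501 Q0 docA 2 2.0 demo",
    "501 Q0 docC 3 1.0 demo",
]

map_score = mean_average_precision(parse_qrels(QRELS_LINES), parse_run(RUN_LINES))
print(f"MAP = {map_score:.4f}")  # MAP = 0.5833
```

In the sample, the two relevant documents are retrieved at ranks 2 and 3, giving precisions of 1/2 and 2/3 and an average precision of 7/12. For reportable results, the standard trec_eval tool remains the safer choice.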