Generating summary documents for a variable-quality PDF document collection
Hughes, Jacob and Brailsford, David F. and Bagley, Steven R. and Adams, Clive E. (2014) Generating summary documents for a variable-quality PDF document collection. In: ACM Symposium on Document Engineering (DocEng '14), 16-19 Sept 2014, Fort Collins, Colorado, USA.
Official URL: http://dx.doi.org/10.1145/2644866.2644892
The Cochrane Schizophrenia Group’s Register of studies details all aspects of the effects of treating people with schizophrenia. It has been gathered over the last 20 years and consists of around 20,000 documents, overwhelmingly in PDF. Document collections of this sort – on a given theme but gathered from a wide range of sources – will generally have huge variability in the quality of the PDF, particularly with respect to the key property of text searchability.