allow import of METS-ALTO XML files, and perhaps simple .txt files.
My goal in using Viewshare is to download all of the OCR text from Chronicling America through its API and import that data into a search tool.
If Viewshare and Chronicling America could combine their efforts, I believe we could make OCR data more accessible to our users.
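For anyone wanting to try the download step, here is a minimal sketch of how the requests might be built. It assumes the public Chronicling America URL layout, where each page has a plain-text OCR file at a path ending in `ocr.txt` and page search accepts `andtext` and `format=json`. The function names and the sample LCCN and date are only illustrative, and the site's endpoints may change, so treat this as a starting point rather than a tested client.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base address of the public Chronicling America site.
BASE_URL = "https://chroniclingamerica.loc.gov"


def ocr_text_url(lccn, date, edition, sequence):
    """Build the assumed URL of the plain-text OCR for one newspaper page.

    lccn: newspaper title identifier; date: issue date as YYYY-MM-DD;
    edition and sequence: edition number and page number within the issue.
    """
    return f"{BASE_URL}/lccn/{lccn}/{date}/ed-{edition}/seq-{sequence}/ocr.txt"


def search_url(terms, page=1):
    """Build an assumed page-search URL that asks for JSON results."""
    query = urlencode({"andtext": terms, "format": "json", "page": page})
    return f"{BASE_URL}/search/pages/results/?{query}"


def fetch_ocr_text(lccn, date, edition, sequence):
    """Download the OCR text for one page (performs a network request)."""
    with urlopen(ocr_text_url(lccn, date, edition, sequence)) as response:
        return response.read().decode("utf-8")
```

The text returned by `fetch_ocr_text` could be saved as a simple .txt file per page, which is the form I would hope to import.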
-
Admintrow (Admin, Viewshare) commented
Thanks for the request. We are always interested in knowing more about what other kinds of data Viewshare users would like to load.
One issue I would note here is that Viewshare does not scale particularly well. Anything past a few thousand items becomes difficult to work with. Beyond that, it isn't very good at handling very large amounts of text attached to individual items, so loading the OCR data for something like 100 pages would likely be very slow.