Data Week has drawn to a close all too fast, and we now have our prototype online!
As readers will see, we have uploaded a small corpus of works into our text-matching tool. Clicking on the title of a work brings you to a page that displays the entire text and highlights the passages that match text in other uploaded works. In addition to highlighting matches, the tool displays the matching passages side by side and links to the full text of each work that shares text with the one under study.
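For readers curious about what this kind of matching can look like in code, here is a minimal sketch in Python. It is not the code behind our prototype: it assumes the standard library's `difflib.SequenceMatcher` as the comparison method, and the function name `shared_passages` and the `min_words` threshold are hypothetical names chosen for illustration.

```python
from difflib import SequenceMatcher


def shared_passages(text_a, text_b, min_words=5):
    """Return runs of at least `min_words` consecutive words found in both texts.

    Illustrative only: `shared_passages` and `min_words` are hypothetical names,
    and difflib is an assumed stand-in for whatever matcher a real tool uses.
    """
    # Compare word by word, ignoring case, so matches follow word boundaries.
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()

    # autojunk=False turns off difflib's heuristic that ignores very common items.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)

    passages = []
    for block in matcher.get_matching_blocks():
        # Each block is a run of identical words; the final block has size 0.
        if block.size >= min_words:
            passages.append(" ".join(words_a[block.a:block.a + block.size]))
    return passages


work_one = "the patient should take a spoonful of the tonic every morning before breakfast"
work_two = "doctors agree the patient should take a spoonful of the tonic at night"
print(shared_passages(work_one, work_two))
# prints ['the patient should take a spoonful of the tonic']
```

A sketch like this only finds exact word-for-word overlaps. Variant spellings, OCR errors, and punctuation differences would all need extra handling in a tool meant for digitized historical texts.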
If this experience has taught us one thing, it’s that we needed more time to create a tool that can be used with so many different digitized texts, for so many different purposes. It was fun to see what could be done in a week, but the compressed timeline gave us little wiggle room for reflecting on how we wanted the tool to work, both in terms of how the code was written and in terms of the interface design, and even less wiggle room for experimentation.
If we were given, say, another week or two, we would love to improve the tool’s matching capabilities and make the interface more elegant and intuitive. We would also love to upload a much larger corpus to the tool and play around with it, and perhaps figure out whether (and if so, how) we could integrate the tool into an existing platform for viewing the Medical Heritage Library, so it would be easier for everyone to use.