At the end of 5 days, perhaps inevitably, we are just getting our teeth into the meat of the issues! As we planned yesterday, Alex has run a number of searches within our existing subset of pages that contain the word “inhaler.” Although not without their own problems, the results have enabled Olivia and David to investigate what more refined searches might look like. What would a search looking for pages containing “asthma” + “inhaler” look like? What about finding only illustrations that contain those terms? Would you be able to look for a brand name such as “Potter’s Asthma Cure” or “Ventolin” within these results? The resulting timeline visualisations show how advanced searches and filters might work, and show masses of potential to prompt further interesting research questions.
David has also created a timeline that shows, through density of coloured dots plotting the results, the frequency of results occurring in each issue of the journal.
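As a rough illustration of the idea behind that density timeline (not David's actual code), the counting step could be sketched as below. The issue dates are made-up sample data, and the function name `hits_per_issue` is hypothetical.

```python
from collections import Counter
from datetime import date


def hits_per_issue(hit_dates):
    """Count how many matching pages fall in each issue, keyed by issue date."""
    return Counter(hit_dates)


# Made-up sample data: one entry per matching page, recording its issue date.
sample_hits = [
    date(1972, 3, 4),
    date(1972, 3, 4),
    date(1972, 3, 11),
    date(1980, 1, 5),
]

counts = hits_per_issue(sample_hits)

# Crude text version of the timeline: one dot per matching page in each issue.
for issue_date in sorted(counts):
    print(issue_date.isoformat(), "●" * counts[issue_date])
```

The more dots an issue collects, the denser that point on the timeline appears.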
Spending time today experimenting with more of this user interface confirms that our ambitions are justifiable, but not currently attainable. Hopefully, if the time and resources are found in the future to spend on the back end search functionality of The Chemist and Druggist, it will be a much more accessible and useful resource for a wide range of historians. So, we haven’t reached our holy grail this week, but it has been really enjoyable trying.
Confession: we spent 3 minutes today watching a stop motion version of Paddington Bear dancing to ‘Singin’ in the Rain.’ https://www.youtube.com/watch?v=kHg6QjhvsCM Light relief? Yes, but it came about from a discussion of issues facing our project – honest.
Bear with me (excuse the pun) while I explain: ‘Singin’ in the Rain’ shows the impact of movies going from silent to sound in the late 1920s, a major technological development which we now take entirely for granted. Our parallel thought was that today we take entirely for granted our ability to perform complex searches across massive amounts of data. The digitised material exists, and we expect to be able to find what we want from it. However, the root of our problems on the penultimate day of Data Week is that just because The Chemist and Druggist is fully digitised does not automatically mean that it is fully searchable. Far from it.
So having grappled with searches for four days, we have decided to concentrate fully on the front end on our final day. Our approach is to pretend that the complete search functionality is running perfectly, and play with what we would be able to present to a researcher. So we’re faking this with a series of searches carried out on the full run of C&D: firstly to create a subset containing the word “inhaler”, then subsequent searches to split these results into adverts and/or articles, another search looking for the brand name “Ventolin”, and a final look at presenting results by the decade in which they appear. It isn’t feasible to carry these out in ‘real time’ at the moment, so we’re probably going to produce an animation that pretends we can. A stop motion animation if you like – thank you Paddington!
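For the curious, the chain of searches described above can be sketched in a few lines of Python. This is only a toy version under stated assumptions: the `Page` record, its `kind` labels and the four sample pages are invented for illustration, and plain substring matching stands in for whatever search the real back end would eventually use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """Hypothetical record for one digitised page."""
    issue_year: int
    kind: str  # assumed labels: "advert" or "article"
    text: str


def containing(pages, term):
    """Return the pages whose text contains the term, ignoring case."""
    needle = term.lower()
    return [p for p in pages if needle in p.text.lower()]


def by_decade(pages):
    """Group pages by the decade of their issue year, earliest decade first."""
    decades = {}
    for p in pages:
        decades.setdefault(p.issue_year // 10 * 10, []).append(p)
    return dict(sorted(decades.items()))


# Invented sample pages standing in for the full run of the journal.
pages = [
    Page(1895, "advert", "Potter's Asthma Cure: inhaler and powder"),
    Page(1972, "advert", "Ventolin inhaler for asthma relief"),
    Page(1978, "article", "Prescribing the Ventolin inhaler"),
    Page(1931, "article", "Notes on ballroom floor polish"),
]

inhaler = containing(pages, "inhaler")                 # step 1: the subset
adverts = [p for p in inhaler if p.kind == "advert"]   # step 2: split by type
ventolin = containing(inhaler, "Ventolin")             # step 3: brand name
decade_counts = {d: len(ps) for d, ps in by_decade(inhaler).items()}  # step 4
```

Each step narrows or regroups the previous result rather than going back to the full data set, which is the behaviour our animation pretends is happening in real time.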
Written by Briony Hudson
Up to this point, our project – to develop ways of searching the enormous digital resource that is The Chemist and Druggist journal to present meaningful and visually stimulating results to researchers – has been a three-pronged attack: how to carry out a comprehensive search across the whole data set (that doesn’t take 3 hours to run), spearheaded by David; how to present the resulting adverts and images in a coherent and stimulating way, spearheaded by Olivia; and how to interpret the results to answer research questions and set them in historical context, spearheaded by Briony. As the technical side of the work focusses on improved ways of running searches, David jokingly described what we’ve been up to as “artisan production” and he has a point – the manual, labour- and time-intensive nature of what we’ve been able to achieve up to this point is clearly very restrictive. But we feel optimistic that we’re edging closer to a better way to bring the mass of material into focus.
Using our best existing search results, Olivia has created a timeline which displays each page that contains the search term “asthma”. For the first time, we can see our results visually, which is an exciting development. However, it also made it clear that the original search had not been as comprehensive as we had thought – back to the drawing board to refine the process. The visualisation has also thrown up a major challenge in the sheer quantity of material. The sight of pages cascading into the structure as they load is certainly impressive, but it raises lots of questions about how best the presentation can ultimately be de-cluttered to allow effective use of the material. Our approach has been to organise the results by individual year and then issue date, which has resulted in some obvious patterning of similar adverts run over consecutive issues and prompted interesting ideas for research questions about subtle design changes and updated content. However, visually the mass of pages will need much more work, which inevitably means a return to the problems surrounding searching such an enormous data set, in order to filter the results so that they are useful.
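The ordering we describe, by year and then by issue date, is simple to express in code. The sketch below is an illustration only, not Olivia's implementation: each result is assumed to be an (issue date, page number) pair, and the sample values are invented.

```python
from datetime import date
from itertools import groupby

# Invented sample results: (issue date, page number) for each matching page.
results = [
    (date(1901, 6, 15), 12),
    (date(1900, 1, 6), 40),
    (date(1901, 6, 8), 3),
    (date(1900, 1, 6), 7),
]

# Tuples sort by issue date first, then by page number within an issue.
ordered = sorted(results)

# Bucket the sorted results by year; groupby relies on the input being sorted.
timeline = {
    year: list(group)
    for year, group in groupby(ordered, key=lambda result: result[0].year)
}
```

With the pages laid out this way, adverts repeated across consecutive issues end up next to each other, which is where the patterning mentioned above becomes visible.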
Our initial aim was that a timeline approach would allow clear presentation of trends over time, but we also want to be able to carve up the results further to provide answers to research questions that are more refined than the overarching “what medicines were advertised to treat asthma in the Chemist and Druggist between 1859 and 2010?” Exploring themes and filters to allow exploration of questions such as “what additional products were advertised by the same companies that made asthma remedies?” or “are there common active ingredients in the medicines across time?” would obviously be enormously helpful.
And we also want to allow users of the resource to take advantage of glimpses of other research avenues that they might grasp, so the context is all important. Hence, for example, our decision that showing a cropped asthma advert is not as valuable as keeping it situated within its full journal page with its neighbouring adverts. If the end result for a researcher was that they were distracted by an adjacent advert for, say, ballroom floor polish (!) and pursued this through the journal, this would be an equally satisfying result for our project. While grappling with technical and visualisation issues, our motivation is still to make the richness of the journal’s content more accessible.
Hear from David, talking to Tom Crane (THE most knowledgeable person regarding the digital collections) about where the Chemist and Druggist team have got to in their ability to search across the journal:
and from Olivia and Briony about thoughts on end result visualisations:
Day 1 of #WellcomeDataWeek and as a pharmacy historian with limited experience of digital resources, I’m immersed in discussions about search terms, codes and interfaces.
Our group is working on approaches to try and unlock all of the fabulous adverts in The Chemist and Druggist https://archive.org/details/chemistanddruggist. As a test case, we’re looking at using the digitised journal as a resource to investigate treatments used for asthma over the run of the periodical from 1859 – 2010. There are many basic challenges to overcome – luckily for me, squarely at the feet of the technical members of our team – such as setting up effective searches for illustrations that feature the word “asthma”, searching successfully across the whole run of issues for a single keyword, and working out how best to display the resulting images online. Although some of these might seem elementary, for a digitised journal of around 6,500 issues comprising around 500,000 pages, these are not straightforward tasks. But once the solutions are found, we’re all getting excited about the possibilities.
Our aim for the asthma pilot is to create a timeline approach to displaying and interpreting the adverts, and I’m beginning to work on the interpretation and milestone drug developments to provide context to the search results. And of course, we hope that if we’re successful with this project, we can roll out the approach to other diseases, places or themes. The major problem this afternoon is simply that there are so many great adverts that we’re coming across that it’s very easy to get side-tracked!
Written by: Briony Hudson, freelance pharmaceutical historian interested in The Chemist and Druggist. Working alongside Olivia Vane & David McCormick.