Operation War Diary has been running for over two years now. Together, we have placed hundreds of thousands of tags, made similar numbers of comments, and followed the journeys of hundreds of units through the conflict on the Western Front.
And, like all things, we have evolved over that time.
When we began, we followed in the footsteps of other great crowd-sourced digital humanities projects like Old Weather. But the content we are dealing with at Operation War Diary is unique in its depth, breadth and richness. It meant we had to make certain assumptions when we started out.
Mainly, these concerned what should and shouldn’t be tagged, which in turn was based on what we thought the data we would produce might look like and how it would be used. In part, we were led by the transcription mantra, which is that only what is there should be written down. However, tagging is a very different activity to transcription, with a quite different set of applications.
Under our initial guidance, volunteers tagged only what was explicitly mentioned on a diary page, and we also told them not to tag certain everyday activities for units like ammunition columns, mobile veterinary sections and engineers – the movements, collections and checking of infrastructure which might be considered the bread and butter of the units in question.
In part, this was to make the process less onerous for our taggers. We have 1.5 million pages to get through, after all! But, as I said before, it was also partly because we hadn’t quite left the transcription mindset behind.
However, we now have our first real use of Operation War Diary data to refer to, courtesy of Professor Richard Grayson, and it makes for very interesting reading. If you haven’t read the article already, you can find it here.
To some extent, the quality and richness of the data which can be used to support studies like this is limited by what was included in a diary in the first place – some are much sparser than others. However, by following the transcription-oriented method of only tagging what we can see, are we also unnecessarily reducing the coverage of the data we produce?
What about the case of a unit which we know to be in the line, because the author tells us so on one day, but over the course of the next four or five days’ worth of entries, that fact isn’t explicitly mentioned again? Very often, it’s clear that the unit is still in the line, but that information is then lost because there’s nothing for us to drop a tag on.
Or the Mobile Veterinary Section who spend a week travelling from place to place, picking up sick horses to take back to the depot? Again, under our starting assumptions, that detail would also have been lost, because we felt it wasn’t necessary to tag activities we already knew certain units spent much of their time doing.
That’s fine from the standpoint of our knowledge and common-sense understanding of these units and the functions they carried out during the war. But if we shift the perspective to one of providing evidence, quantitative facts which we can use to illustrate our understanding, then by not tagging certain things we know to be true, we aren’t realising the full potential of Operation War Diary.
Of course, there’s a line between inferring what to fill the blanks with and making things up. But as our understanding of the project evolves, so too does the knowledge and experience of our long-term taggers. They may have started off knowing very little about the war diaries, but they have now read and tagged hundreds, if not thousands, of pages, and are very well placed to see patterns in the information and extrapolate from what is written down to what is only implied.
That will mean making judgement calls at times, but the Talk forums provide a great environment for testing out any inferences before we press the ‘Finish’ button. The whole concept of Operation War Diary is that it is built on consensus, so why not extend that to these situations too?
There are practical issues to overcome – where to place a tag for an inferred activity, for example, or which tag to use. For the former, I would suggest dropping inferred tags close to the date to which they should be linked – our clustering algorithm will then group them together and ensure the information is recorded in the way it was intended. For the latter, we may have to resort more frequently to the unsatisfactory ‘Other’ option for activities which do not fit neatly into the standard list, but that at least will still allow us to build up a comprehensive timeline for each unit and will clearly indicate what they were not doing, even if we can’t provide specifics beyond that.
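To illustrate why placing an inferred tag close to its date matters, here is a minimal sketch of position-based grouping. It is not the project's actual clustering algorithm, which isn't described here; the `Tag` class, the `group_by_nearest_date` function and the sample page are all hypothetical, and the sketch simply assumes each tag is attached to whichever date tag sits nearest to it vertically on the page.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Tag:
    kind: str   # hypothetical tag types, e.g. "date" or "activity"
    value: str
    y: float    # vertical position on the page image, in pixels


def group_by_nearest_date(tags):
    """Attach each non-date tag to the date tag closest to it vertically."""
    dates = [t for t in tags if t.kind == "date"]
    if not dates:
        return {}
    groups = defaultdict(list)
    for tag in tags:
        if tag.kind == "date":
            groups[tag.value]  # make sure dates with no tags still appear
            continue
        nearest = min(dates, key=lambda d: abs(d.y - tag.y))
        groups[nearest.value].append(tag.value)
    return dict(groups)


# A made-up page: the second day's "in the line" tag is inferred, and has
# been dropped next to the date it belongs to.
page = [
    Tag("date", "1916-07-01", 100),
    Tag("activity", "In the line", 110),
    Tag("date", "1916-07-02", 300),
    Tag("activity", "In the line (inferred)", 305),
    Tag("activity", "Other", 320),
]
timeline = group_by_nearest_date(page)
```

Under that assumption, an inferred tag dropped halfway between two dates could end up on either day, which is why keeping it close to the right date is the safer habit.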
With our first published use of Operation War Diary’s data, I believe we now have a clear and compelling case for tagging as much information as we can as accurately as we can. And that is the beauty of Operation War Diary – we can evolve and improve what we do and, in doing so, can tell the stories of the Western Front in the most effective way we know how.
Recently, Steve Hirschorn at the National Archives has been looking at ways to visualise the information being generated by Operation War Diary Citizen Historians. As part of this work, he has taken maps found in the pages of the war diaries and, using the known coordinates of certain features contained within them, has fitted them to current satellite images using Google Earth. This process is known as georectifying and can help us assess how much the landscapes described in the war diaries have changed in the 100 years since they were written.
Over to Steve, who can tell you exactly how he’s done this…
Thanks to the efforts of volunteers using the #map hashtag in the Talk forum, it’s been easy for me to find content that can be georeferenced. Once a map has been geo-rectified, it becomes possible to use a GPS device to find the exact co-ordinates of anything that is documented on it, such as trench locations and routes, and machine gun emplacements.
Four KMZ (Google Earth/Google Maps) files are linked below. If you have Google Earth (free download available), you can use these files to view First World War maps overlaid on a recent satellite image. The KMZ files can also be imported into Google Maps, but the functionality via the website is more limited compared to the full Google Earth client.
After you have downloaded a KMZ file and opened it in Google Earth, you’ll see the war diary maps overlaid on the satellite view, but completely opaque. To change this, right-click an individual map item in the left-hand menu and select Properties; in the dialog box that appears, you can adjust the transparency anywhere from 0% to 100%. I’ve tried to identify evidence in the current-day satellite images and street view photos of trench locations, but haven’t had much luck so far.
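For anyone who prefers to inspect the files outside Google Earth: a KMZ file is a zip archive containing a KML document, and in KML an overlay's `<color>` value is written as `aabbggrr` hex, where the first two digits are the alpha (opacity) channel. The sketch below reads that value for each `GroundOverlay`. It assumes the maps are stored as `GroundOverlay` elements, which has not been checked against Steve's files, and the sample KML and the `overlay_opacities` function are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def local_name(element):
    """Strip the XML namespace so the code works across KML versions."""
    return element.tag.rsplit("}", 1)[-1]


def overlay_opacities(kmz_bytes):
    """Map each GroundOverlay name in a KMZ archive to its opacity (0-100%)."""
    result = {}
    with zipfile.ZipFile(io.BytesIO(kmz_bytes)) as archive:
        kml_names = [n for n in archive.namelist() if n.lower().endswith(".kml")]
        for kml_name in kml_names:
            root = ET.fromstring(archive.read(kml_name))
            for element in root.iter():
                if local_name(element) != "GroundOverlay":
                    continue
                fields = {local_name(c): (c.text or "").strip() for c in element}
                color = fields.get("color", "ffffffff")  # KML default: opaque
                alpha = int(color[:2], 16)
                name = fields.get("name", "(unnamed)")
                result[name] = round(100 * alpha / 255)
    return result


# A tiny, made-up KMZ built in memory, so the example runs without downloads.
SAMPLE_KML = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <GroundOverlay>
      <name>Hypothetical trench map</name>
      <color>80ffffff</color>
      <Icon><href>files/map.jpg</href></Icon>
    </GroundOverlay>
    <GroundOverlay>
      <name>Hypothetical opaque map</name>
      <Icon><href>files/map2.jpg</href></Icon>
    </GroundOverlay>
  </Document>
</kml>"""

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("doc.kml", SAMPLE_KML)

opacities = overlay_opacities(buffer.getvalue())
```

Note that the value reported here is opacity, so a map shown as fully opaque corresponds to 100.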
I’m amazed how well most of the maps fit the current-day landscape features. The odd road has disappeared here and there, but there are always enough reference points to fit the map, and they usually fit with just a bit of stretching and rotating, both of which can also be done in the free Google Earth client. A couple of them show trench locations (the map of Bullecourt) and machine gun emplacements (Ypres).
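The "stretching and rotating" can be thought of as fitting a simple transformation between positions on the scanned map and positions on the ground. The sketch below is not how Google Earth does it internally; it is a minimal illustration, assuming three known control points and a plain affine fit, of how a pixel position on a scan could be converted to longitude and latitude. The function names and all of the co-ordinates are invented for the example.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("control points must not lie on a straight line")
    solution = []
    for col in range(3):
        replaced = [
            row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(matrix)
        ]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(pixels, coords):
    """Fit lon = a*x + b*y + c and lat = d*x + e*y + f from three points."""
    matrix = [[float(x), float(y), 1.0] for x, y in pixels]
    lon_params = solve3(matrix, [lon for lon, _ in coords])
    lat_params = solve3(matrix, [lat for _, lat in coords])

    def to_lonlat(x, y):
        a, b, c = lon_params
        d, e, f = lat_params
        return (a * x + b * y + c, d * x + e * y + f)

    return to_lonlat


# Invented control points: pixel positions of three features on a scan,
# and the (longitude, latitude) of the same features today.
pixels = [(0, 0), (1000, 0), (0, 800)]
coords = [(2.90, 50.20), (2.95, 50.20), (2.90, 50.17)]

to_lonlat = fit_affine(pixels, coords)
lon, lat = to_lonlat(500, 400)  # approximately (2.925, 50.185)
```

With more than three control points, a least-squares fit would average out small errors in any single point, which matters for hand-drawn maps where not every feature is placed precisely.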
If you missed the link the first time, Google Earth can be downloaded here: http://www.google.co.uk/intl/en_uk/earth/
Steve’s map files can be downloaded using the following links:
- http://zooniverse-demo.s3.amazonaws.com/diaries_data/images/WO-95-1662-1_Bullecourt.kmz One of the better ones, a good fit to current-day road layout and showing the locations of trenches
- http://zooniverse-demo.s3.amazonaws.com/diaries_data/images/WO-95-1601-2_Ypres.kmz Another good fit, this time showing machine gun emplacements
- http://zooniverse-demo.s3.amazonaws.com/diaries_data/images/WO-95-1415-1_Hooge.kmz A map showing German trenches and the British Front Line
Lastly, as this is a new area of exploration for us here at Operation War Diary, and because none of us are experts on it, we have some questions which we hope you might be able to help us with:
- Is KMZ the best, most open format for sharing geo-rectified maps? Is there a better format?
- A bit of researching on the Internet suggests that there are ways of loading KMZ files onto a SatNav. Again, are KMZ files the best way of supporting this?
- Are there any web-based applications that enable geo-rectification of maps, and also provide a method of sharing geo-rectified maps?
- Do you have any geo-rectified maps you’ve created that you’d like to share?
- Are there any other ideas for things that we can do with the maps?
Post your answers in the comments here, or get involved on our forums at: http://talk.operationwardiary.org/#/boards