How’d it go?: Checking in from Northwestern University – HTRC Digging Deeper, Reaching Further

This fall we are piloting the workshop curriculum at all five partner institutions. In this post, librarian Geoffrey Morse shares his thoughts after leading his first DDRF workshop at Northwestern University. Feedback from all of the partners along with the assessment forms collected from attendees will inform development of our next iteration of workshop materials for spring 2017. You can read more about the modular DDRF curriculum in this update.

Q. Was there anything that worked particularly well in the workshop?

A. The workshop attendees seemed to follow along and replicate the results in the HTRC portions of the workshop without much difficulty. All of the attendees were able to create accounts, create worksets, and run algorithms. While all were familiar with the HathiTrust Digital Library, not all were aware of the HTRC prior to the workshop.

Q. Was there anything that surprised you during the workshop?

A. The range of participants in the workshop was broader than might have initially been expected. Librarians and library staff from all areas of the library were interested in learning more about this topic. Library liaisons to academic departments attended, as did technical services staff, including catalogers and acquisitions staff, and staff members from Repositories and Digital Curation.

Q. If you could change one thing that you did in the workshop to make the sessions more effective, what would it be?

A. No single change stood out more than the others, so we have listed three interrelated changes below:

One change we might consider is, after the introduction, taking a single research question and following it through the entire workshop, from identifying and acquiring texts, to evaluating methods and tools, to running the analysis. Along the way we could still introduce other topics, methods, and tools, but this deeper dive into a single question, running like a spine through the sessions, might give the workshop a greater sense of continuity.
Another change we might consider would be to start with a hands-on exercise closer to the beginning of the workshop and then move into the text mining background. The text mining background that we started with was important for establishing context, but the first part of the workshop was mostly lecture until we created the workset in HTRC. Another change we might consider is moving the text scraping exercise to the first day of the workshop, before the HTRC portions of the curriculum. One comment from attendees was that moving from HTRC to the text scraping exercise with PythonAnywhere, and then back to HTRC later in the workshop, was confusing for novices in this area; others, however, liked the order of the workshop.
Finally, providing a very brief introduction to scripting in Python—something that shows what’s going on under the hood of the scripts we are running—might be beneficial to the workshop participants.

Q. What tips would you give to somebody else teaching similar workshops?

A. Having all of the files necessary for the hands-on PythonAnywhere portion of the workshop was helpful, as was having people register for PythonAnywhere accounts prior to the workshop. Before starting that portion, it is well worth taking ten minutes to make sure everyone has the correct files and a PythonAnywhere account set up.

Q. How would you encourage a fellow librarian to play a more active role in supporting data-driven research?

A. Hands-on experience in workshop environments such as the HTRC workshops is a good way to become familiar with both the tools and the concepts involved in data-driven research. To support this or any kind of research, librarians need a basic understanding of the concepts behind the research and of the tools necessary for doing it. While one need not be an expert in all aspects of text mining to provide support, one does need a basic foundation of knowledge and skill.