Resources and Readings

Find a bibliography and further resources here!

Bibliography

MODULE 1

Hearst, M. (2003). What is text mining. SIMS, UC Berkeley. http://people.ischool.berkeley.edu/~hearst/text-mining.html

Jockers, M. L., & Mimno, D. (2013). Significant themes in 19th-century literature. Poetics, 41(6), 750-769. http://dx.doi.org/10.1016/j.poetic.2013.08.005

Juola, P. (2013, July 16). Rowling and “Galbraith”: an authorial analysis. Language Log. Retrieved January 25, 2017, from http://languagelog.ldc.upenn.edu/nll/?p=5315

Moretti, F. (2013). Distant reading. Verso Books.

Underwood, T., & Sellers, J. (2012). The emergence of literary diction. Journal of Digital Humanities, 1(2), 1-2. http://journalofdigitalhumanities.org/1-2/the-emergence-of-literary-diction-by-ted-underwood-and-jordan-sellers/

MODULE 2.1

Padilla, T. (2015). Kludging: Web to TXT. Retrieved August 16, 2017, from http://www.thomaspadilla.org/2015/08/03/kludge/

MODULE 2.2

Collections as Data National Forum. (2017, March 3). The Santa Barbara Statement on Collections as Data. Retrieved August 16, 2017, from https://collectionsasdata.github.io/statement/

MODULE 3

Denny, M. J. and Spirling, A. (2017). Text Preprocessing for Unsupervised Learning: Why It Matters, When It Misleads, and What to Do about It. https://ssrn.com/abstract=2849145

National Endowment for the Humanities. (2017). Data Management Plans for NEH Office of Digital Humanities Proposals and Awards. Retrieved October 1, 2017, from https://www.neh.gov/files/grants/data_management_plans_2018.pdf

Rawson, K., & Muñoz, T. (2016). Against Cleaning. Retrieved August 16, 2017, from http://curatingmenus.org/articles/against-cleaning/

Rockwell, G. (2003). What is Text Analysis, Really? Literary and Linguistic Computing, 18(2), 209–219. https://doi.org/10.1093/llc/18.2.209

MODULE 4.1

Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77-84. http://dx.doi.org/10.1145/2133806.2133826

MODULE 4.2

Underwood, T., & Sellers, J. (2012). The emergence of literary diction. Journal of Digital Humanities, 1(2), 1-2. http://journalofdigitalhumanities.org/1-2/the-emergence-of-literary-diction-by-ted-underwood-and-jordan-sellers/

MODULE 5

Chuang, J. (2011). Text Visualization. November 2011. Retrieved January 25, 2017, from http://hci.stanford.edu/courses/cs448b/f11/lectures/CS448B-20111117-Text.pdf

Palmer, K., Polley, T., & Pollock, C. (n.d.). Chronicling Hoosier. Retrieved August 16, 2017, from http://centerfordigschol.github.io/chroniclinghoosier/map1.html

Rosky Legal Education Blog. (2011, July 15). Martin Luther King, Jr.’s “I have a dream” speech as a word tree. Retrieved August 16, 2017, from http://roskylegaled.com/blog/post/martin-luther-king-jr-s-i-have-a-dream-speech-as-a/

Schmidt, B. (2017, May 16). A brief visual history of MARC cataloging at the Library of Congress. Retrieved August 16, 2017, from http://sappingattention.blogspot.com/2017/05/a-brief-visual-history-of-marc.html

Schmidt, B. (n.d.). API Philosophy | Bookworm. Retrieved August 16, 2017, from https://bookworm-project.github.io/Docs/api_philosophy.html

Theguardian.com. (2013, February 12). The state of our union is … dumber: How the linguistic standard of the presidential address has declined. Retrieved August 16, 2017, from https://www.theguardian.com/world/interactive/2013/feb/12/state-of-the-union-reading-level

Underwood, T., & Bamman, D. (2016, December 28). The Gender Balance of Fiction, 1800-2007 | The Stone and the Shell. Retrieved August 16, 2017, from https://tedunderwood.com/2016/12/28/the-gender-balance-of-fiction-1800-2007/

Underwood, T. (2012, November 11). Visualizing topic models. | The Stone and the Shell. Retrieved August 16, 2017, from https://tedunderwood.com/2012/11/11/visualizing-topic-models/

Wattenberg, M., & Viégas, F. B. (2008). The word tree, an interactive visual concordance. IEEE Transactions on Visualization and Computer Graphics, 14(6). https://doi.org/10.1109/TVCG.2008.172


Further Reading and Resources

SUPPORTING DIGITAL SCHOLARSHIP

Auckland, M. (2012). Re-skilling for research: An investigation into the role and skills of subject and liaison librarians required to effectively support the evolving information needs of researchers. RLUK Report, available at: http://www.rluk.ac.uk/files/RLUK%20Re-skilling.pdf

Ayers, E. L. (2013). Does digital scholarship have a future? Educause Review, 48(4), 24-34. https://er.educause.edu/articles/2013/8/does-digital-scholarship-have-a-future

Babeu, A. (2011). “Rome Wasn’t Digitized in a Day”: Building a Cyberinfrastructure for Digital Classics. Washington, DC: Council on Library and Information Resources. Retrieved October 3, 2017 from https://www.ianus-fdz.de/attachments/339/Babeu_Rome-Wasnt-Digitized-in-a-Day_2011.pdf

Bryson, T., Posner, M., Pierre, A. S., & Varner, S. (2011). SPEC kit 326: Digital humanities. Washington, DC: Association of Research Libraries. Retrieved October 3, 2017 from http://publications.arl.org/Digital-Humanities-SPEC-Kit-326/

Johnson, L., Adams Becker, S., Estrada, V. & Freeman, A. (2015). NMC Horizon Report: 2015 Library Edition. Austin, TX: The New Media Consortium. Retrieved October 3, 2017 from https://www.learntechlib.org/p/151822/.

Lippincott, J., & Goldenberg-Hart, D. (2014). Digital scholarship centers: Trends & good practice (CNI workshop report). https://www.cni.org/wp-content/uploads/2014/11/CNI-Digitial-Schol.-Centers-report-2014.web_.pdf

Maron, N. L. (2015). The digital humanities are alive and well and blooming: Now what? Educause Review, 50(5), 28-38. https://er.educause.edu/articles/2015/8/the-digital-humanities-are-alive-and-well-and-blooming-now-what

McDonald, D., McNicoll, I., Weir, G., Reimer, T., Redfearn, J., Jacobs, N., & Bruce, R. (2012). The Value and Benefits of Text Mining. JISC Digital Infrastructure. Retrieved from http://www.jisc.ac.uk/media/documents/publications/reports/2012/value-text-mining.pdf

Palmer, C. L., & Neumann, L. J. (2002). The information work of interdisciplinary humanities scholars: Exploration and translation. The Library Quarterly, 72(1), 85-117. https://doi.org/10.1086/603337

Searle, S. (2015). Using scenarios in introductory research data management workshops for library staff. D-Lib Magazine, 21(11/12). http://www.dlib.org/dlib/november15/searle/11searle.html

Sukovic, S. (2011). E-Texts in research projects in the humanities. In Advances in Librarianship (Vol. 33, pp. 131–202). Emerald Group Publishing Limited. https://doi.org/10.1108/S0065-2830(2011)0000033009

Sula, C. A. (2013). Digital humanities and libraries: A conceptual model. Journal of Library Administration, 53(1), 10-26. http://dx.doi.org/10.1080/01930826.2013.756680

Toms, E. G., & O’Brien, H. L. (2008). Understanding the information and communication technology needs of the e-humanist. Journal of Documentation, 64(1), 102-130. https://doi.org/10.1108/00220410810844178

Vinopal, J., & McCormick, M. (2013). Supporting digital scholarship in research libraries: Scalability and sustainability. Journal of Library Administration, 53(1), 27-42. http://dx.doi.org/10.1080/01930826.2013.756689

Walters, T., & Skinner, K. (2011). New Roles for New Times: Digital Curation for Preservation. Washington, DC: Association of Research Libraries. Retrieved October 3, 2017 from http://files.eric.ed.gov/fulltext/ED527702.pdf.

Zorich, D. (2012). Transitioning to a Digital World: Art History, Its Research Centers, and Digital Scholarship. A Report to the Samuel H. Kress Foundation and the Roy Rosenzweig Center for History and New Media, George Mason University. Retrieved October 3, 2017 from http://www.kressfoundation.org/uploadedFiles/Sponsored_Research/Research/Zorich_TransitioningDigitalWorld.pdf

HT, THE HTDL, AND THE HTRC

Downie, J. S., Furlough, M., McDonald, R. H., Namachchivaya, B., Plale, B. A., & Unsworth, J. (2016). The HathiTrust Research Center: Exploring the full-text frontier. Educause Review, 51(3), 50-51. http://er.educause.edu/~/media/files/articles/2016/5/erm1638.pdf

HTRC “About” page: https://www.hathitrust.org/htrc_about

HathiTrust Research Center Documentation: https://wiki.htrc.illinois.edu/display/COM/HathiTrust+Research+Center+Documentation

Jett, J., et al. (2016). The HathiTrust Research Center Workset Ontology: A Descriptive Framework for Non-Consumptive Research Collections. Journal of Open Humanities Data, 2, e1. http://doi.org/10.5334/johd.3

York, J., & Schottlaender, B. E. (2014). The Universal Library Is Us: Library Work at Scale in HathiTrust. Educause Review, 49(3), 48-49. http://er.educause.edu/articles/2014/5/the-universal-library-is-us-library-work-at-scale-in-hathitrust

OTHER TEXT ANALYSIS EXAMPLES

Digging For Nuggets Of Wisdom – The New York Times. October 16, 2003. Retrieved January 25, 2017, from http://www.nytimes.com/2003/10/16/technology/digging-for-nuggets-of-wisdom.html

Lancashire, I., & Hirst, G. (2009). Vocabulary changes in Agatha Christie’s mysteries as an indication of dementia: A case study. In 19th Annual Rotman Research Institute Conference, Cognitive Aging: Research and Practice, 8-10. ftp://ftp.cs.toronto.edu/pub/gh/Lancashire+Hirst-extabs-2009.pdf

BASH COMMANDS

Introduction to Bash: http://programminghistorian.org/lessons/intro-to-bash

Curl and wget: https://daniel.haxx.se/docs/curl-vs-wget.html

PYTHON

Official Python FAQ: https://docs.python.org/3/faq/index.html

List of Python beginner’s guides for non-programmers: https://wiki.python.org/moin/BeginnersGuide/NonProgrammers

DATA VISUALIZATION

General:

Moretti, F. (2005). Graphs, maps, trees: abstract models for a literary history. Verso.

Steele, J., & Iliinsky, N. (2010). Beautiful visualization: looking at data through the eyes of experts. O’Reilly Media, Inc.

Yau, N. (2011). Visualize this: The FlowingData guide to design, visualization, and statistics. Indianapolis, IN: Wiley Pub.

The Data Visualization Catalogue developed by Severino Ribecca: http://www.datavizcatalogue.com/index.html

Introduction to Data Visualization: Visualization Types: http://guides.library.duke.edu/datavis/vis_types

Culturomics:

Lieberman, E., Michel, J. B., Jackson, J., Tang, T., & Nowak, M. A. (2007). Quantifying the evolutionary dynamics of language. Nature, 449(7163), 713-716.

Michel, J. B., Shen, Y. K., Aiden, A. P., Veres, A., Gray, M. K., Pickett, J. P., … & Pinker, S. (2011). Quantitative analysis of culture using millions of digitized books. Science, 331(6014), 176-182.

Tag clouds:

Waldner, M., Schrammel, J., Klein, M., Kristjánsdóttir, K., Unger, D., & Tscheligi, M. (2013, May). FacetClouds: exploring tag clouds for multi-dimensional data. In Proceedings of Graphics Interface 2013 (pp. 17-24). Canadian Information Processing Society.

Data visualization examples:

Visualising Data: http://www.visualisingdata.com

FlowingData: http://flowingdata.com

Information is Beautiful: http://www.informationisbeautiful.net/visualizations

Text Visualization Browser: http://textvis.lnu.se

DATA CURATION AND MANAGEMENT

DH Curation Guide: http://guide.dhcuration.org

Digital Curation Centre: http://www.dcc.ac.uk/

Research Data Alliance: https://www.rd-alliance.org

Research Data and Preservation symposium (RDAP) 2011 Summer Humanities Data Curation Summit, Muñoz and Renear, “Issues in Humanities Data Curation” discussion paper: http://cirssweb.lis.illinois.edu/paloalto/whitepaper/premeeting/

ACLS, Our Cultural Commonwealth, The report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences (2006): http://www.acls.org/cyberinfrastructure/ourculturalcommonwealth.pdf 

Data life cycle: http://data.library.virginia.edu/data-management/lifecycle/

Examples of Data Management Plans from previous successful NEH grant applications: https://www.neh.gov/divisions/odh/grant-news/data-management-plans-successful-grant-applications-2011-2014-now-available

DATA COLLECTIONS AND TOOLS

Collecting data:

Text and data mining at MIT (a guide for MIT affiliates on rights and restrictions for using licensed resources for text and data mining): https://libraries.mit.edu/scholarly/publishing/text-and-data-mining-at-mit/ 

JSTOR Data for Research: http://dfr.jstor.org

DocSouth Data: http://docsouth.unc.edu/docsouthdata/

Folger Digital Texts: http://www.folgerdigitaltexts.org/download/

Internet Archive: https://archive.org/index.php

Twitter API Overview: https://dev.twitter.com/overview/api

Ethical use of social media data:

Research Ethics for Students & Teachers: Social Media in the Classroom (a handout created by the Digital Alchemists and collaborators and produced by The Center for Solutions to Online Violence (CSOV)): http://femtechnet.org/wp-content/uploads/2016/06/Research-Ethics-For-Students-Teachers_Social-Media-in-the-Classroom_DA-CSOV_2016-1.pdf

Bailey, M. (2015). #transform(ing)DH Writing and Research: An Autoethnography of Digital Humanities and Feminist Ethics. DHQ: Digital Humanities Quarterly, 9(2). http://www.digitalhumanities.org/dhq/vol/9/2/000209/000209.html

Preparing data:

OpenRefine: http://openrefine.org

Analyzing data:

Voyant: http://voyant-tools.org

Lexos: http://lexos.wheatoncollege.edu/upload

AntConc: http://www.laurenceanthony.net/software/antconc/

Weka: http://www.cs.waikato.ac.nz/ml/weka/

MALLET: http://mallet.cs.umass.edu

HTRC Algorithm: https://analytics.hathitrust.org/statisticalalgorithms

Visualizing data:

Voyant: http://voyant-tools.org

ArcGIS Online/StoryMaps: https://storymaps.arcgis.com/en/

Google Ngram Viewer: https://books.google.com/ngrams

HathiTrust+Bookworm: https://bookworm.htrc.illinois.edu/develop/ 

Tableau: https://www.tableau.com

Gephi: https://gephi.org

NodeXL: http://www.smrfoundation.org/nodexl/

DH Press: http://dhpress.org

Managing and sharing data:

Figshare: https://figshare.com

GitHub: https://github.com

Jupyter Notebook: http://jupyter.org

Journal of Open Humanities Data: http://openhumanitiesdata.metajnl.com

PROJECTS AND INITIATIVES SIMILAR TO DDRF

Data Carpentry: http://www.datacarpentry.org

Rochester DH Institute for Mid-Career Librarians: http://humanities.lib.rochester.edu/institute/

U Mass Data Management Lessons: http://library.umassmed.edu/necdmc/index

Software Carpentry: http://software-carpentry.org/

Library Carpentry: https://github.com/data-lessons

DataCamp: https://www.datacamp.com/courses