{"id":139,"date":"2013-07-07T23:38:15","date_gmt":"2013-07-07T23:38:15","guid":{"rendered":"http:\/\/aliciapeaker.wordpress.com\/?p=139"},"modified":"2013-07-07T23:38:15","modified_gmt":"2013-07-07T23:38:15","slug":"6-things-to-remember-when-building-a-dataset-from-scratch","status":"publish","type":"post","link":"https:\/\/aliciapeaker.org\/?p=139","title":{"rendered":"6 Things to Consider when Building a Dataset from Scratch"},"content":{"rendered":"

This post is about my experiences at the Digital Humanities Summer Institute<\/a> (DHSI), held at the University of Victoria, June 5-10, 2013.<\/i><\/p>\n

Last month I attended my first DHSI on UVic\u2019s idyllic campus. In just five days I became proficient in the notoriously difficult ArcGIS, moved my research several big steps forward, and met some wonderful folks working on amazing projects.<\/p>\n

The course I was enrolled in, \u201cGeographical Information Systems (GIS) in the Digital Humanities,\u201d asked attendees to bring their own data. Being a DH newbie, I had no idea what kind of data was out there or how to find it. I exchanged a few emails with my university’s newly hired GIS expert and then met with her to discuss where to find data on my topic. She had some excellent recommendations for historical context data (UK Met Office<\/a> for weather, Botanical Society of the British Isles<\/a> for botany, and Natural England<\/a> for wildlife populations, to name a few). But when I described my project in detail, she looked at me and said, \u201cWell, it sounds like we\u2019re going to need to build your dataset from scratch.\u201d I spent the next month trying to figure out what a dataset actually was and how to build one for my research on Edith Holden\u2019s naturalist field books<\/a>.<\/p>\n

Here\u2019s what I learned in the process:<\/p>\n

    \n
  1. Shape your research questions first<\/b>. They may change; in fact, they\u2019re almost certain to change once you begin your research. But having a robust research question or set of questions before you begin will help temper the bewilderment that often comes with starting a dataset from scratch. I find Sonja Foss and William Waters\u2019s list in Destination Dissertation<\/a><\/i> (p. 41) particularly useful in forming research questions \u2013 and so have my students.<\/li>\n
  2. Start simple<\/b>. When I opened my Excel spreadsheet and began creating columns for all the pieces of data I thought I wanted, I ended up with fifteen columns and a near-infinite horizontal-scrolling problem. It was also taking me far too long to complete the dataset. So I simplified by narrowing my fields down to five manageable columns: source, date, lat\/long, type, and label. As you can see, these are relatively basic kinds of information that generalize easily to other texts and projects.<\/li>\n
  3. Move beyond your doubts<\/b>. Cutting back the number of columns was terrifying. What if I missed something important? What if my data wasn\u2019t robust enough? What if it didn\u2019t show me anything new? Would digital humanists scoff at my work? But then I realized that those questions, with a few substitutions, were pretty much the same doubts I had experienced when putting together my dissertation prospectus. They\u2019re the same doubts that plague most graduate students and likely many early-career academics as well. Some of these doubts are twisted versions of legitimate questions academics should ask themselves about their research, but they are rarely useful and almost always harmful in the initial stages of a project. Shelve them for now and return to them (in their un-twisted forms) when you have a better sense of your project.<\/li>\n
  4. Failure can be fruitful<\/b>. Perhaps even more fruitful in this kind of digital work than in \u201ctraditional\u201d research. If your data doesn\u2019t show you what you expect, it will likely show you something unexpected. In other words, the potential for new insights is even greater when working through failures \u2013 whether they be failures in the data design, in the data mining, or in the data presentation.<\/li>\n
  5. Don\u2019t be afraid to ask for help. <\/b>At whatever stage of the research you need it. I might have stayed stuck worrying that my data design wasn\u2019t elaborate enough to constitute important research if I hadn\u2019t asked a DH faculty member at my university what a dataset actually looked like. He shared a dataset from a project he had completed, and it was reassuringly elegant in its simplicity: an Excel spreadsheet with six columns of basic data (e.g. year of publication, place of publication, type of publication) that became a very useful model for my own project.<\/li>\n
  6. Savor the rewards. <\/b>Maybe it\u2019s the type-A in me coming out of the closet after years of unstructured academic research, but the gratification of seeing a spreadsheet full of beautiful data that I had collected felt even better than the satisfaction I experience after filling a page with writing. It\u2019s even more rewarding when that data becomes something \u2013 whether that be evidence for your argument, items in a digital archive, or, in my case, a multi-layered map of Edith Holden\u2019s field books.<\/li>\n<\/ol>\n

    Of course, you don\u2019t always have to build a dataset from scratch. There\u2019s more data out there than you probably expect. Start with government agencies and societies organized around your topic or data need (e.g. The Royal Astronomical Society<\/a>). In a larger project or with the right resources, you may be able to work with a developer or computer science experts to build an algorithm to collect digital data for you.<\/p>\n

    But for most of us, building a dataset often comes down to manually inputting columns and columns of data, gleaned from digitized or undigitized texts, into spreadsheets. While the work is often monotonous, the results can be invigorating. Here\u2019s a static snapshot of the dynamic map my dataset has helped me create and a little foretaste of a future post:<\/p>\n

    \"Screenshot<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"

    This post is about my experiences at the Digital Humanities Summer Institute (DHSI) held at the University of Victoria, June 5-10 2013. \u00a0 Last month I attended my first DHSI on UVic\u2019s idyllic campus. In just five days I became proficient in the notoriously difficult ArcGIS; moved my research several, big steps forward; and met some wonderful folks working on amazing projects. The course I was enrolled in, \u201cGeographical Information Systems (GIS) in the Digital… 6 Things to Consider when Building a Dataset from Scratch<\/span><\/p>\n","protected":false},"author":1,"featured_media":167,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[10,11,13,18,21,24],"_links":{"self":[{"href":"https:\/\/aliciapeaker.org\/index.php?rest_route=\/wp\/v2\/posts\/139"}],"collection":[{"href":"https:\/\/aliciapeaker.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aliciapeaker.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aliciapeaker.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/aliciapeaker.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=139"}],"version-history":[{"count":0,"href":"https:\/\/aliciapeaker.org\/index.php?rest_route=\/wp\/v2\/posts\/139\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/aliciapeaker.org\/index.php?rest_route=\/wp\/v2\/media\/167"}],"wp:attachment":[{"href":"https:\/\/aliciapeaker.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=139"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aliciapeaker.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=139"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aliciapeaker.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=139"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}