JSON is a semi-structured format for encoding data and is popular for data sharing and interchange; as such, it is considered a good alternative to XML. The materials in this course will cover all the core JSON syntax and data structures, as well as:
– structured data as a concept
– core data structuring approaches
– the differences between XML and JSON
– when to use XML, when to use JSON
Robert Chavez holds a PhD in Classical Studies from Indiana University. From 1994-1999 he worked in the Library Electronic Text Resource Service at Indiana University Bloomington as an electronic text specialist. From 1999-2007 Robert worked at Tufts University at the Perseus Project and the Digital Collections and Archives as a programmer, digital humanist, and institutional repository program manager. He currently works for the New England Journal of Medicine as Content Applications Architect.
Course Structure
This is an online class that is taught asynchronously, meaning that participants do the work on their own time as their schedules allow. The class does not meet together at any particular times, although the instructor may set up optional synchronous chat sessions. Instruction includes readings and assignments in one-week segments. Class participation is in an online forum environment.
Web search engines such as Google, Bing, and Yahoo are integral to making information more discoverable on the open web. How can you expose data about your organization, its services, people, collections, and other information in a way that is meaningful to these search engines?
In this 90-minute session, learn how to leverage Schema.org and semantic markup to achieve enhanced discovery of information on the open web. The session will provide an introduction to both Schema.org and the JSON-LD data format. Topics include an in-depth look at the Schema.org vocabulary, a brief overview of semantic markup with a focus on JSON-LD, and use cases for these technologies. By the end of the session, you will have an opportunity to apply these technologies through a structured exercise. The session will conclude with resources and guidance for next steps.
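As a rough illustration of what Schema.org data in JSON-LD can look like, the sketch below builds a small description of a library and wraps it in the script tag used to embed JSON-LD in a web page. The organization name, URL, and address are hypothetical placeholders, and `to_jsonld_script` is a helper invented for this example, not part of any webinar material.

```python
import json

# Hypothetical organization details, used only for illustration.
library = {
    "@context": "https://schema.org",
    "@type": "Library",
    "name": "Example University Library",
    "url": "https://library.example.edu",
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Exampletown",
        "addressRegion": "NC",
    },
}


def to_jsonld_script(data: dict) -> str:
    """Serialize a JSON-LD document and wrap it in an HTML script tag."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


snippet = to_jsonld_script(library)
print(snippet)
```

The printed snippet would typically be placed in the HTML of the page that describes the organization, so that search engines reading the page can pick up the structured description.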
Learning Outcomes
Participants will leave this webinar with tools for increasing the discoverability of information on the open web.
This program will include presentation slides, bibliographic references to resources cited in the slides, and hands-on exercise material. The exercise material will include instructions, template records for attendees to practice applying Schema.org and JSON-LD, and example records as reference material.
Who Should Attend
Librarians and other professionals interested in increasing discovery of their organization’s information and collections on the open web. General knowledge of metadata concepts and standards is encouraged. Familiarity with the concept of data formats (XML, JSON, MARC, etc.) would be helpful.
Jacob Shelby is the Metadata Technologies Librarian at North Carolina State University (NCSU) Libraries, where he performs metadata activities that support library information services and collections. He has collaborated on endeavors to enhance the discovery of library services and collections on the open web, including exposing NCSU Libraries digital special collections data as Schema.org data. In addition to these endeavors, Jacob has taught workshops at NCSU Libraries on Schema.org and semantic markup.
Eaton, M. E. (2017). Seeing Library Data: A Prototype Data Visualization Application for Librarians. Journal of Web Librarianship, 11(1), 69–78. Retrieved from http://academicworks.cuny.edu/kb_pubs
Visualization can increase the power of data by showing the “patterns, trends and exceptions.”
Librarians can benefit when they visually leverage data in support of library projects.
Nathan Yau suggests that exploratory learning is a significant benefit of data visualization initiatives (2013). We can learn about our libraries by tinkering with data. In addition, handling data can also challenge librarians to improve their technical skills. Visualization projects allow librarians to not only learn about their libraries, but to also learn programming and data science skills.
The classic voice on data visualization theory is Edward Tufte. In Envisioning Information, Tufte unequivocally advocates for multi-dimensionality in visualizations. He praises some incredibly complex paper-based visualizations (1990). This discussion suggests that the principles of data visualization are strongly contested. Although Yau’s even-handed approach and Cairo’s willingness to find common ground are laudable, their positions are neither authoritative nor the only approach to data visualization.
SeeCollections is a web application that visualizes the library’s holdings of books and e-books according to certain facets and keywords. Users can visualize whatever topics they want by selecting keywords and facets that interest them.
Technologies: the Primo X-Services API, JSON, and Flask, a very flexible Python web micro-framework. In addition to creating the visualization, SeeCollections also makes this data available on the web. JavaScript is the front-end technology that ultimately presents data to the SeeCollections user. JavaScript is a cornerstone of contemporary web development; a great deal of today’s interactive web content relies upon it. Many popular code libraries have been written for JavaScript. This project draws upon jQuery, Bootstrap and d3.js.
To give SeeCollections a unified visual theme, I have used Bootstrap. Bootstrap is most commonly used to make webpages responsive to different devices.
D3.js facilitates the binding of data to the content of a web page, which allows manipulation of the web content based on the underlying data.
JSON is often used in place of XML. It is a lightweight data-interchange format, often used with AJAX to send data back and forth between client and server without a page refresh.
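To make the comparison with XML concrete, here is a minimal sketch that parses the same small record from both formats using only the Python standard library. The record itself (a name and an age) is made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record in both formats.
xml_text = "<person><name>brad</name><age>35</age></person>"
json_text = '{"name": "brad", "age": 35}'

# XML: every value comes back as text, so the number is converted by hand.
root = ET.fromstring(xml_text)
from_xml = {"name": root.findtext("name"), "age": int(root.findtext("age"))}

# JSON: the parser returns native types (str, int, ...) directly.
from_json = json.loads(json_text)

print(from_xml == from_json)
print(len(json_text), "characters of JSON vs", len(xml_text), "characters of XML")
```

For a record this simple the JSON form is shorter and needs no manual type conversion, which is part of why it is described as lightweight. XML still has features JSON lacks, such as attributes and mixed content, so this is a comparison for one simple case rather than a general verdict.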
Data types:
number: no difference between integers and floats
string: a sequence of Unicode characters in double quotes
Boolean: true or false
array: ordered list of zero or more values
Object: unordered collection of key/value pairs
Null: empty value
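The six data types listed above can be seen in a single document. The sketch below parses a made-up record with Python's standard `json` module and prints the Python type each value is mapped to. Note that JSON itself has only one number type; splitting numbers into `int` and `float` is a decision made by the Python parser.

```python
import json

# A made-up document that uses each of the six JSON data types.
text = """
{
  "count": 3,
  "price": 9.5,
  "title": "Sample record",
  "available": true,
  "tags": ["visualization", 1990],
  "publisher": {"name": "Example Press"},
  "notes": null
}
"""

record = json.loads(text)

# Show how each JSON value is represented after parsing.
for key, value in record.items():
    print(f"{key}: {value!r} -> {type(value).__name__}")
```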
JSON Syntax Rules:
uses key/value pairs, e.g. {"name": "brad"}
uses double quotes around keys and string values
each value must be one of the JSON data types
file extension is “.json”
MIME type is “application/json”
strings: text enclosed in double quotation marks (single quotes are not valid JSON)
numbers: integer or decimal, positive or negative
booleans: true or false, no quotation marks
null: means “nothing,” no quotation marks
arrays are lists in square brackets, comma separated, can mix data types
objects are JSON dictionaries in curly brackets, keys and values are separated by a colon, pairs are separated by commas. keys must be strings; values can be any data type
nesting: arrays and objects can be placed inside each other
can put arrays inside objects, and objects inside arrays
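These syntax rules can be checked with a parser. The following sketch uses Python's standard `json` module to test a few small documents and then reads a value out of a nested structure. The helper `is_valid_json` and the sample data are invented for this example.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON with Python's json module."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json('{"name": "brad"}'))  # double quotes, colon separator
print(is_valid_json("{'name': 'brad'}"))  # single quotes are rejected
print(is_valid_json('{"name"; "brad"}'))  # semicolon instead of a colon
print(is_valid_json('{"active": True}'))  # booleans must be lowercase

# Nesting: an object holding an array of objects (hypothetical data).
doc = json.loads(
    '{"library": {"branches": ['
    '{"city": "Boston", "open": true}, '
    '{"city": "Raleigh", "open": false}]}}'
)
print(doc["library"]["branches"][1]["city"])
```

Only the first of the four test documents is valid. The nested lookup walks from the outer object, into the array, and then into the second object in that array.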
complications: multiple metadata formats, but variations of Dublin Core.
Solr is not a relational database, so managing separate partners’ records in a single Solr index, and relating them to one another, was an issue.
Gretchen Gueguen
Data Services Coordinator from DPLA
metadata mapping
aggregates data from libraries, archives, museums etc
Content hubs and services hubs (so LRS at SCSU)
Metadata is the basis of the work of DPLA. We rely on a growing network of hubs that aggregate metadata from partners, then we, in turn, aggregate the hubs’ metadata into the DPLA datastore. As we continue to grow our hub network, we have found that the practical matter of how to aggregate partner metadata and deal with quality control over the resulting aggregated set has become our biggest challenge. If your organization is interested in becoming a part of the DPLA network, or if you are interested in how the DPLA works with metadata, we will be hosting a webinar on January 22nd, at 2pm Eastern, about our workflows and our future development in this area. The webinar will examine the aggregation best practices at two of our DPLA Service Hubs, as the basis of a conversation about metadata aggregation practices among our Hubs. In addition, DPLA has been working on some new tools for metadata aggregation and quality control that we’d like to share. We’ll preview some of our plans and hope to get feedback on future directions.
Speakers: Lisa Gregory and Stephanie Williams of the North Carolina Digital Heritage Center; Heather Gilbert and Tyler Mobley of the South Carolina Digital Library; Gretchen Gueguen of DPLA.