counting how many times students use electronic library resources or visit in person, and comparing that to how well the students do in their classes and how likely they are to stay in school and earn a degree. And many library leaders are finding a strong correlation, meaning that students who consume more library materials tend to be more successful academically.
carefully tracking how library use compares to other metrics, and it has made changes as a result—like moving the tutoring center and the writing lab into the library. Those moves were designed not only to lure more people into the stacks, but to make seeking help more socially-acceptable for students who might have been hesitant.
a partnership between the library, which knows what electronic materials students use, and the technology office, which manages other campus data such as usage of the course-management system. The university is doing a study to see whether library usage there also equates to student success.
Submissions are invited for the IOLUG Spring 2019 Conference, to be held May 10th in Indianapolis, IN. Submissions are welcomed from all types of libraries and on topics related to the theme of data in libraries.
Libraries and librarians work with data every day, with a variety of applications – circulation, gate counts, reference questions, and so on. The mass collection of user data has made headlines many times in the past few years. Analytics and privacy have, understandably, become important issues both globally and locally. In addition to being aware of the data ecosystem in which we work, libraries can play a pivotal role in educating user communities about data and all of its implications, both favorable and unfavorable.
The Conference Planning Committee is seeking proposals on topics related to data in libraries, including but not limited to:
Using tools/resources to find and leverage data to solve problems and expand knowledge,
Data policies and procedures,
Harvesting, organizing, and presenting data,
Data-driven decision making,
Data in collection development,
Using data to measure outcomes, not just uses,
Using data to better reach and serve your communities,
It’s working, generally, but it does require a good bit of config tweaking to get it to accurately count. It also needs (I’ve discovered) a certain amount of distance between the door and the camera. We have very low ceilings at our doorways and a single person can span the entire frame which appears to confuse the software. All that being said, it’s very much worth looking into.
Analytics out of the box aren’t great. The only built in report is only the current day’s numbers, but it’s pretty easy to export data. We have Libinsight from Springshare and I’m working on pumping that data into their system. It is tricky because the system basically records two things: a timestamp, and a positive or negative integer depending on whether or not the traffic was going in or out. By default, no generic analytics system seems to understand that well enough to display it the way I’d like, so I may have to create some custom reports using d3.js or similar.
I’m using the Pi 2 and standard camera module from Adafruit. I’d be happy to answer questions.
Mark Sandford Systems Librarian Assistant Professor in the Libraries Colgate University Libraries
“The challenge is to make data discoverable, usable, assessable, intelligible, and interpretable, and do so for extended periods of time…To restate the premise of this book, the value of data lies in their use. Unless stakeholders can agree on what to keep and why, and invest in the invisible work necessary to sustain knowledge infrastructures, big data and little data alike will become no data.”
he premise that data are not natural objects with their own essence, Borgman rather explores the different values assigned to them, as well as their many variations according to place, time, and the context in which they are collected. It is specifically through six “provocations” that she offers a deep engagement with different aspects of the knowledge industry. These include the reproducibility, sharing, and reuse of data; the transmission and publication of knowledge; the stability of scholarly knowledge, despite its increasing proliferation of forms and modes; the very porosity of the borders between different areas of knowledge; the costs, benefits, risks, and responsibilities related to knowledge infrastructure; and finally, investment in the sustainable acquisition and exploitation of data for scientific research.
beyond the six provocations, there is a larger question concerning the legitimacy, continuity, and durability of all scientific research—hence the urgent need for further reflection, initiated eloquently by Borgman, on the fact that “despite the media hyperbole, having the right data is usually better than having more data”
o Data management (Pages xviii-xix)
o Data definition (4-5 and 18-29)
p. 5 big data and little data are only awkwardly analogous to big science and little science. Modern science, or big science inDerek J. de Solla Price (https://en.wikipedia.org/wiki/Big_Science) is characterized by international, collaborative efforts and by the invisible colleges of researchers who know each other and who exchange information on a formal and informal basis. Little science is the three hundred years of independent, smaller-scale work to develop theory and method for understanding research problems. Little science is typified by heterogeneous methods, heterogeneous data and by local control and analysis.
o Provocations (13-15)
o Digital data collections (21-26)
o Knowledge infrastructures (32-35)
o Open access to research (39-42)
o Open technologies (45-47)
o Metadata (65-70 and 79-80)
o Common resources in astronomy (71-76)
o Ethics (77-79)
o Research Methods and data practices, and, Sensor-networked science and technology (84-85 and 106-113)
o Knowledge infrastructures (94-100)
o COMPLETE survey (102-106)
o Internet surveys (128-143)
o Internet survey (128-143)
o Twitter (130-133, 138-141, and 157-158(
o Pisa Clark/CLAROS project (179-185)
o Collecting Data, Analyzing Data, and Publishing Findings (181-184)
o Buddhist studies 186-200)
o Data citation (241-268)
o Negotiating authorship credit (253-256)
o Personal names (258-261)
o Citation metrics (266-209)
o Access to data (279-283)
Applications for the 2018 Institute will be accepted between December 1, 2017 and January 27, 2018. Scholars accepted to the program will be notified in early March 2018.
Learning to Harness Big Data in an Academic Library
Research on Big Data per se, as well as on the importance and organization of the process of Big Data collection and analysis, is well underway. The complexity of the process comprising “Big Data,” however, deprives organizations of ubiquitous “blue print.” The planning, structuring, administration and execution of the process of adopting Big Data in an organization, being that a corporate one or an educational one, remains an elusive one. No less elusive is the adoption of the Big Data practices among libraries themselves. Seeking the commonalities and differences in the adoption of Big Data practices among libraries may be a suitable start to help libraries transition to the adoption of Big Data and restructuring organizational and daily activities based on Big Data decisions. Introduction to the problem. Limitations
The redefinition of humanities scholarship has received major attention in higher education. The advent of digital humanities challenges aspects of academic librarianship. Data literacy is a critical need for digital humanities in academia. The March 2016 Library Juice Academy Webinar led by John Russel exemplifies the efforts to help librarians become versed in obtaining programming skills, and respectively, handling data. Those are first steps on a rather long path of building a robust infrastructure to collect, analyze, and interpret data intelligently, so it can be utilized to restructure daily and strategic activities. Since the phenomenon of Big Data is young, there is a lack of blueprints on the organization of such infrastructure. A collection and sharing of best practices is an efficient approach to establishing a feasible plan for setting a library infrastructure for collection, analysis, and implementation of Big Data.
Limitations. This research can only organize the results from the responses of librarians and research into how libraries present themselves to the world in this arena. It may be able to make some rudimentary recommendations. However, based on each library’s specific goals and tasks, further research and work will be needed.
Big Data is becoming an omnipresent term. It is widespread among different disciplines in academia (De Mauro, Greco, & Grimaldi, 2016). This leads to “inconsistency in meanings and necessity for formal definitions” (De Mauro et al, 2016, p. 122). Similarly, to De Mauro et al (2016), Hashem, Yaqoob, Anuar, Mokhtar, Gani and Ullah Khan (2015) seek standardization of definitions. The main connected “themes” of this phenomenon must be identified and the connections to Library Science must be sought. A prerequisite for a comprehensive definition is the identification of Big Data methods. Bughin, Chui, Manyika (2011), Chen et al. (2012) and De Mauro et al (2015) single out the methods to complete the process of building a comprehensive definition.
In conjunction with identifying the methods, volume, velocity, and variety, as defined by Laney (2001), are the three properties of Big Data accepted across the literature. Daniel (2015) defines three stages in big data: collection, analysis, and visualization. According to Daniel, (2015), Big Data in higher education “connotes the interpretation of a wide range of administrative and operational data” (p. 910) and according to Hilbert (2013), as cited in Daniel (2015), Big Data “delivers a cost-effective prospect to improve decision making” (p. 911).
The importance of understanding the process of Big Data analytics is well understood in academic libraries. An example of such “administrative and operational” use for cost-effective improvement of decision making are the Finch & Flenner (2016) and Eaton (2017) case studies of the use of data visualization to assess an academic library collection and restructure the acquisition process. Sugimoto, Ding & Thelwall (2012) call for the discussion of Big Data for libraries. According to the 2017 NMC Horizon Report “Big Data has become a major focus of academic and research libraries due to the rapid evolution of data mining technologies and the proliferation of data sources like mobile devices and social media” (Adams, Becker, et al., 2017, p. 38).
Power (2014) elaborates on the complexity of Big Data in regard to decision-making and offers ideas for organizations on building a system to deal with Big Data. As explained by Boyd and Crawford (2012) and cited in De Mauro et al (2016), there is a danger of a new digital divide among organizations with different access and ability to process data. Moreover, Big Data impacts current organizational entities in their ability to reconsider their structure and organization. The complexity of institutions’ performance under the impact of Big Data is further complicated by the change of human behavior, because, arguably, Big Data affects human behavior itself (Schroeder, 2014).
De Mauro et al (2015) touch on the impact of Dig Data on libraries. The reorganization of academic libraries considering Big Data and the handling of Big Data by libraries is in a close conjunction with the reorganization of the entire campus and the handling of Big Data by the educational institution. In additional to the disruption posed by the Big Data phenomenon, higher education is facing global changes of economic, technological, social, and educational character. Daniel (2015) uses a chart to illustrate the complexity of these global trends. Parallel to the Big Data developments in America and Asia, the European Union is offering access to an EU open data portal (https://data.europa.eu/euodp/home ). Moreover, the Association of European Research Libraries expects under the H2020 program to increase “the digitization of cultural heritage, digital preservation, research data sharing, open access policies and the interoperability of research infrastructures” (Reilly, 2013).
The challenges posed by Big Data to human and social behavior (Schroeder, 2014) are no less significant to the impact of Big Data on learning. Cohen, Dolan, Dunlap, Hellerstein, & Welton (2009) propose a road map for “more conservative organizations” (p. 1492) to overcome their reservations and/or inability to handle Big Data and adopt a practical approach to the complexity of Big Data. Two Chinese researchers assert deep learning as the “set of machine learning techniques that learn multiple levels of representation in deep architectures (Chen & Lin, 2014, p. 515). Deep learning requires “new ways of thinking and transformative solutions (Chen & Lin, 2014, p. 523). Another pair of researchers from China present a broad overview of the various societal, business and administrative applications of Big Data, including a detailed account and definitions of the processes and tools accompanying Big Data analytics. The American counterparts of these Chinese researchers are of the same opinion when it comes to “think about the core principles and concepts that underline the techniques, and also the systematic thinking” (Provost and Fawcett, 2013, p. 58). De Mauro, Greco, and Grimaldi (2016), similarly to Provost and Fawcett (2013) draw attention to the urgent necessity to train new types of specialists to work with such data. As early as 2012, Davenport and Patil (2012), as cited in Mauro et al (2016), envisioned hybrid specialists able to manage both technological knowledge and academic research. Similarly, Provost and Fawcett (2013) mention the efforts of “academic institutions scrambling to put together programs to train data scientists” (p. 51). Further, Asomoah, Sharda, Zadeh & Kalgotra (2017) share a specific plan on the design and delivery of a big data analytics course. At the same time, librarians working with data acknowledge the shortcomings in the profession, since librarians “are practitioners first and generally do not view usability as a primary job responsibility, usually lack the depth of research skills needed to carry out a fully valid” data-based research (Emanuel, 2013, p. 207).
Borgman (2015) devotes an entire book to data and scholarly research and goes beyond the already well-established facts regarding the importance of Big Data, the implications of Big Data and the technical, societal, and educational impact and complications posed by Big Data. Borgman elucidates the importance of knowledge infrastructure and the necessity to understand the importance and complexity of building such infrastructure, in order to be able to take advantage of Big Data. In a similar fashion, a team of Chinese scholars draws attention to the complexity of data mining and Big Data and the necessity to approach the issue in an organized fashion (Wu, Xhu, Wu, Ding, 2014).
Bruns (2013) shifts the conversation from the “macro” architecture of Big Data, as focused by Borgman (2015) and Wu et al (2014) and ponders over the influx and unprecedented opportunities for humanities in academia with the advent of Big Data. Does the seemingly ubiquitous omnipresence of Big Data mean for humanities a “railroading” into “scientificity”? How will research and publishing change with the advent of Big Data across academic disciplines?
Reyes (2015) shares her “skinny” approach to Big Data in education. She presents a comprehensive structure for educational institutions to shift “traditional” analytics to “learner-centered” analytics (p. 75) and identifies the participants in the Big Data process in the organization. The model is applicable for library use.
Being a new and unchartered territory, Big Data and Big Data analytics can pose ethical issues. Willis (2013) focusses on Big Data application in education, namely the ethical questions for higher education administrators and the expectations of Big Data analytics to predict students’ success. Daries, Reich, Waldo, Young, and Whittinghill (2014) discuss rather similar issues regarding the balance between data and student privacy regulations. The privacy issues accompanying data are also discussed by Tene and Polonetsky, (2013).
Privacy issues are habitually connected to security and surveillance issues. Andrejevic and Gates (2014) point out in a decision making “generated by data mining, the focus is not on particular individuals but on aggregate outcomes” (p. 195). Van Dijck (2014) goes into further details regarding the perils posed by metadata and data to the society, in particular to the privacy of citizens. Bail (2014) addresses the same issue regarding the impact of Big Data on societal issues, but underlines the leading roles of cultural sociologists and their theories for the correct application of Big Data.
Library organizations have been traditional proponents of core democratic values such as protection of privacy and elucidation of related ethical questions (Miltenoff & Hauptman, 2005). In recent books about Big Data and libraries, ethical issues are important part of the discussion (Weiss, 2018). Library blogs also discuss these issues (Harper & Oltmann, 2017). An academic library’s role is to educate its patrons about those values. Sugimoto et al (2012) reflect on the need for discussion about Big Data in Library and Information Science. They clearly draw attention to the library “tradition of organizing, managing, retrieving, collecting, describing, and preserving information” (p.1) as well as library and information science being “a historically interdisciplinary and collaborative field, absorbing the knowledge of multiple domains and bringing the tools, techniques, and theories” (p. 1). Sugimoto et al (2012) sought a wide discussion among the library profession regarding the implications of Big Data on the profession, no differently from the activities in other fields (e.g., Wixom, Ariyachandra, Douglas, Goul, Gupta, Iyer, Kulkami, Mooney, Phillips-Wren, Turetken, 2014). A current Andrew Mellon Foundation grant for Visualizing Digital Scholarship in Libraries seeks an opportunity to view “both macro and micro perspectives, multi-user collaboration and real-time data interaction, and a limitless number of visualization possibilities – critical capabilities for rapidly understanding today’s large data sets (Hwangbo, 2014).
The importance of the library with its traditional roles, as described by Sugimoto et al (2012) may continue, considering the Big Data platform proposed by Wu, Wu, Khabsa, Williams, Chen, Huang, Tuarob, Choudhury, Ororbia, Mitra, & Giles (2014). Such platforms will continue to emerge and be improved, with librarians as the ultimate drivers of such platforms and as the mediators between the patrons and the data generated by such platforms.
Every library needs to find its place in the large organization and in society in regard to this very new and very powerful phenomenon called Big Data. Libraries might not have the trained staff to become a leader in the process of organizing and building the complex mechanism of this new knowledge architecture, but librarians must educate and train themselves to be worthy participants in this new establishment.
The study will be cleared by the SCSU IRB.
The survey will collect responses from library population and it readiness to use and use of Big Data. Send survey URL to (academic?) libraries around the world.
Data will be processed through SPSS. Open ended results will be processed manually. The preliminary research design presupposes a mixed method approach.
The study will include the use of closed-ended survey response questions and open-ended questions. The first part of the study (close ended, quantitative questions) will be completed online through online survey. Participants will be asked to complete the survey using a link they receive through e-mail.
Mixed methods research was defined by Johnson and Onwuegbuzie (2004) as “the class of research where the researcher mixes or combines quantitative and qualitative research techniques, methods, approaches, concepts, or language into a single study” (Johnson & Onwuegbuzie, 2004 , p. 17). Quantitative and qualitative methods can be combined, if used to complement each other because the methods can measure different aspects of the research questions (Sale, Lohfeld, & Brazil, 2002).
Online survey of 10-15 question, with 3-5 demographic and the rest regarding the use of tools.
1-2 open-ended questions at the end of the survey to probe for follow-up mixed method approach (an opportunity for qualitative study)
data analysis techniques: survey results will be exported to SPSS and analyzed accordingly. The final survey design will determine the appropriate statistical approach.
Complete literature review and identify areas of interest – two months
Prepare and test instrument (survey) – month
IRB and other details – month
Generate a list of potential libraries to distribute survey – month
Contact libraries. Follow up and contact again, if necessary (low turnaround) – month
Collect, analyze data – two months
Write out data findings – month
Complete manuscript – month
Proofreading and other details – month
Significance of the work
While it has been widely acknowledged that Big Data (and its handling) is changing higher education (https://blog.stcloudstate.edu/ims?s=big+data) as well as academic libraries (https://blog.stcloudstate.edu/ims/2016/03/29/analytics-in-education/), it remains nebulous how Big Data is handled in the academic library and, respectively, how it is related to the handling of Big Data on campus. Moreover, the visualization of Big Data between units on campus remains in progress, along with any policymaking based on the analysis of such data (hence the need for comprehensive visualization).
This research will aim to gain an understanding on: a. how librarians are handling Big Data; b. how are they relating their Big Data output to the campus output of Big Data and c. how librarians in particular and campus administration in general are tuning their practices based on the analysis.
Based on the survey returns (if there is a statistically significant return), this research might consider juxtaposing the practices from academic libraries, to practices from special libraries (especially corporate libraries), public and school libraries.
Adams Becker, S., Cummins M, Davis, A., Freeman, A., Giesinger Hall, C., Ananthanarayanan, V., … Wolfson, N. (2017). NMC Horizon Report: 2017 Library Edition.
Andrejevic, M., & Gates, K. (2014). Big Data Surveillance: Introduction. Surveillance & Society, 12(2), 185–196.
Asamoah, D. A., Sharda, R., Hassan Zadeh, A., & Kalgotra, P. (2017). Preparing a Data Scientist: A Pedagogic Experience in Designing a Big Data Analytics Course. Decision Sciences Journal of Innovative Education, 15(2), 161–190. https://doi.org/10.1111/dsji.12125
Cohen, J., Dolan, B., Dunlap, M., Hellerstein, J. M., & Welton, C. (2009). MAD Skills: New Analysis Practices for Big Data. Proc. VLDB Endow., 2(2), 1481–1492. https://doi.org/10.14778/1687553.1687576
Daniel, B. (2015). Big Data and analytics in higher education: Opportunities and challenges. British Journal of Educational Technology, 46(5), 904–920. https://doi.org/10.1111/bjet.12230
Daries, J. P., Reich, J., Waldo, J., Young, E. M., Whittinghill, J., Ho, A. D., … Chuang, I. (2014). Privacy, Anonymity, and Big Data in the Social Sciences. Commun. ACM, 57(9), 56–63. https://doi.org/10.1145/2643132
De Mauro, A., Greco, M., & Grimaldi, M. (2015). What is big data? A consensual definition and a review of key research topics. AIP Conference Proceedings, 1644(1), 97–104. https://doi.org/10.1063/1.4907823
Emanuel, J. (2013). Usability testing in libraries: methods, limitations, and implications. OCLC Systems & Services: International Digital Library Perspectives, 29(4), 204–217. https://doi.org/10.1108/OCLC-02-2013-0009
Hashem, I. A. T., Yaqoob, I., Anuar, N. B., Mokhtar, S., Gani, A., & Ullah Khan, S. (2015). The rise of “big data” on cloud computing: Review and open research issues. Information Systems, 47(Supplement C), 98–115. https://doi.org/10.1016/j.is.2014.07.006
Philip Chen, C. L., & Zhang, C.-Y. (2014). Data-intensive applications, challenges, techniques and technologies: A survey on Big Data. Information Sciences, 275(Supplement C), 314–347. https://doi.org/10.1016/j.ins.2014.01.015
Sugimoto, C. R., Ding, Y., & Thelwall, M. (2012). Library and information science in the big data era: Funding, projects, and future [a panel proposal]. Proceedings of the American Society for Information Science and Technology, 49(1), 1–3. https://doi.org/10.1002/meet.14504901187
Tene, O., & Polonetsky, J. (2012). Big Data for All: Privacy and User Control in the Age of Analytics. Northwestern Journal of Technology and Intellectual Property, 11, [xxvii]-274.
van Dijck, J. (2014). Datafication, dataism and dataveillance: Big Data between scientific paradigm and ideology. Surveillance & Society; Newcastle upon Tyne, 12(2), 197–208.
Waller, M. A., & Fawcett, S. E. (2013). Data Science, Predictive Analytics, and Big Data: A Revolution That Will Transform Supply Chain Design and Management. Journal of Business Logistics, 34(2), 77–84. https://doi.org/10.1111/jbl.12010
Wu, Z., Wu, J., Khabsa, M., Williams, K., Chen, H. H., Huang, W., … Giles, C. L. (2014). Towards building a scholarly big data platform: Challenges, lessons and opportunities. In IEEE/ACM Joint Conference on Digital Libraries (pp. 117–126). https://doi.org/10.1109/JCDL.2014.6970157
Eaton, M. E. (2017). Seeing Seeing Library Data: A Prototype Data Visualization Application for Librarians. Journal of Web Librarianship, 11(1), 69–78. Retrieved from http://academicworks.cuny.edu/kb_pubs
Visualization can increase the power of data, by showing the “patterns, trends and exceptions”
Librarians can benefit when they visually leverage data in support of library projects.
Nathan Yau suggests that exploratory learning is a significant benefit of data visualization initiatives (2013). We can learn about our libraries by tinkering with data. In addition, handling data can also challenge librarians to improve their technical skills. Visualization projects allow librarians to not only learn about their libraries, but to also learn programming and data science skills.
The classic voice on data visualization theory is Edward Tufte. In Envisioning Information, Tufte unequivocally advocates for multi-dimensionality in visualizations. He praises some incredibly complex paper-based visualizations (1990). This discussion suggests that the principles of data visualization are strongly contested. Although Yau’s even-handed approach and Cairo’s willingness to find common ground are laudable, their positions are not authoritative or the only approach to data visualization.
a web application that visualizes the library’s holdings of books and e-books according to certain facets and keywords. Users can visualize whatever topics they want, by selecting keywords and facets that interest them.
To give SeeCollections a unified visual theme, I have used Bootstrap. Bootstrap is most commonly used to make webpages responsive to different devices
D3.js facilitates the binding of data to the content of a web page, which allows manipulation of the web content based on the underlying data.