the type of data: Wikipedia. the dangers of learning from Wikipedia. how individuals can organize to mitigate some of these dangers. Wikidata, algorithms.
IBM Watson, an AI system, uses algorithms to make sense of Wikipedia.
YouTube videos of conspiracy theories are debunked by using Wikipedia.
semantic relatedness, Word2Vec
how do the algorithms work: they take a large body of unstructured text and pick out specific words
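my note: a minimal sketch of the idea of semantic relatedness learned from unstructured text. this is not Word2Vec itself (which trains a neural model); it only counts which words appear near each other and compares the counts with cosine similarity. the toy corpus, the window size and the function names are my assumptions, not from the talk.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """For each word, count the words that appear within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up toy corpus standing in for a large body of unstructured text.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the dog",
    "the library lends books",
]
vectors = cooccurrence_vectors(corpus)
cat_dog = cosine(vectors["cat"], vectors["dog"])
cat_books = cosine(vectors["cat"], vectors["books"])
print(cat_dog, cat_books)
```

words that share contexts ("cat" and "dog") score higher than words that do not ("cat" and "books"); whatever bias sits in the text ends up in the scores as well.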
lots of AI learns about the world from Wikipedia. the neutral point of view policy: Wikipedia asks editors to present viewpoints as proportionally as possible. Wikipedia biases: 1. gender bias (only 20-30% are women).
conceptnet. debias along different demographic dimensions.
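my note: one generic way to debias word vectors along a demographic dimension is to project that dimension out of each vector. the sketch below shows only that projection step; I am not claiming it is the exact procedure ConceptNet uses, and the 3-number vectors are invented for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(u, v))


def remove_direction(vec, direction):
    """Subtract from `vec` its projection onto `direction`."""
    scale = dot(vec, direction) / dot(direction, direction)
    return [x - scale * d for x, d in zip(vec, direction)]


# Invented toy embeddings; real ones have hundreds of dimensions.
embeddings = {
    "he": [1.0, 0.2, 0.0],
    "she": [-1.0, 0.2, 0.0],
    "nurse": [-0.6, 0.5, 0.7],
}

# Assumed demographic dimension: the difference between two paired words.
gender_direction = [a - b for a, b in zip(embeddings["he"], embeddings["she"])]

debiased_nurse = remove_direction(embeddings["nurse"], gender_direction)
print(debiased_nurse)
```

after the projection the vector for "nurse" has no component left along the chosen direction; the other components stay untouched.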
citations analysis gives also an idea about biases. localness of sources cited in spatial articles. structural biases.
geolocation on Twitter by County. predicting the people living in urban areas. FB wants to push more local news.
danger (biases) #3. Wikipedia search results vs. Wikipedia knowledge panel.
collective action against tech: Reddit, boycott for FB and Instagram.
data labor: what primary resources these companies have: posts, images, reviews etc.
boycott, data strike (data not being available for algorithms in the future). GDPR in EU – all historical data is like the CA Consumer Privacy Act. One can do data strike without data boycott. general vs homogeneous (group with shared identity) boycott.
the Wikipedia spam policy obstructs new editors, and that hits communities such as women.
how to access at different levels. methods and methodological concerns. ethical concerns, legal concerns.
tweetdeck for advanced Twitter searches. quoting, likes is relevant, but not enough, sometimes screenshot
social listening platforms: Crimson Hexagon, Parse.ly, Sysomos – not yet academic platforms; tools to set up queries and visualization, but the algorithm, the data samples etc. are difficult to determine. open source tools (Urbana): Social Media Macroscope: SMILE (Social Media Intelligence and Learning Environment) to collect data from Twitter and Reddit; within the platform they can query Twitter, create trend analysis, sentiment analysis. Voxgov (subscription service: analyzing political social media)
graduate-level and faculty research: accessing large-scale SM data through web scraping & APIs. Twitter APIs. JSON, Python etc. Gnip Firehose API ($); Web Scraper Chrome plugin (easy tool; Python and R created); Twint (Twitter scraper)
Facepager (open source) if not Python or R coder. structure and download the data sets.
TAGS archiving Google Sheets, uses the Twitter API. anything older than 7 days is not available, so harvest every week.
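my note: because each harvest only reaches back about a week, consecutive weekly harvests overlap and have to be merged without duplicates. a minimal sketch, assuming each harvested row is a dict with a unique "id" field; the field names and sample rows are hypothetical and not the actual TAGS column layout.

```python
def merge_harvest(archive, new_rows):
    """Add rows to the archive, keyed by id; return how many were new."""
    added = 0
    for row in new_rows:
        if row["id"] not in archive:
            archive[row["id"]] = row
            added += 1
    return added


# Hypothetical weekly exports; the tweet with id "2" shows up in both weeks.
week1 = [
    {"id": "1", "text": "first tweet"},
    {"id": "2", "text": "second tweet"},
]
week2 = [
    {"id": "2", "text": "second tweet"},
    {"id": "3", "text": "third tweet"},
]

archive = {}
added_week1 = merge_harvest(archive, week1)
added_week2 = merge_harvest(archive, week2)
print(added_week1, added_week2, len(archive))
```

keying on the id keeps the archive stable even when the harvest runs more often than once a week.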
Social Feed Manager (GW University) – Justin Littman with Stanford. install on a server, but it allows much more.
legal concerns: copyright (public info, but not beyond copyrighted). the fair use argument is strong, but one cannot publish the data; one can analyze it under fair use. contracts supersede copyright (terms of service/use). licensed data through the library.
methods: sampling concerns. Tufekci, 2014: questions for SM. SM data is a good set for SM, but for other fields? not according to her. hashtag studies: self-selection bias. Twitter as a model organism: over-represented data in academic studies.
methodological concerns: scope of access – lack of historical data. mechanics of platform and context: retweets are not necessarily endorsements.
ethical concerns. public info – IRB no informed consent. the right to be forgotten. anonymized data is often still traceable.
table discussion: digital humanities, journalism interested, but too narrow. tools are still difficult to find and operate. context of the visuals. how to spread around variety of majors and classes. controversial events more likely to be deleted.
takedowns, lies and corrosion: what is a librarian to do: trolls, takedown,
development kit circulation. familiarity with the Oculus Rift resulted in lesser reservation. Downturn also.
An experience station. clean up free apps.
question: spherical video, video 360.
safety issues: policies? instructional perspective: curating, WI people: user testing. touch controllers more intuitive than Xbox controller. retail Oculus Rift
app Sketchfab. 3modelviewer. obj or stl file. Medium, Tilt Brush.
College of Liberal Arts at the U has their VR, 3D print set up.
Penn State (Paul, librarian, kinesiology, anatomy programs), Information Science and Technology. immersive experiences lab for video 360.
Califa: part of it is xrlibraries. libraries equal education. content provider LifeLiqe: STEM library of AR and VR objects. https://www.lifeliqe.com/
how data is produced, collected and analyzed. make accessible all kind of data and info
ask good q/s and find good answers, share findings in meaningful ways. this is where digital literacy overshadows information literacy, and this is the fact that the SCSU library does not understand; besides teaching students how to find and evaluate data, I also teach them how to communicate effectively using electronic tools.
connecting people tools and resources and making it easier for everybody. building collaborative, open and interdisciplinary
robust data and computational literacies. developing workshops, projects and events to practice new skills. to position the library as the interdisciplinary nexus
what are data: definition. items of information, facts, traces of content and form. higher level, conceptual discussion about data in terms of social effects: metadata capturing information about the world, social, political and economic changes. move away from the mystic conceptions about data. nothing objective about data.
the emergence of IoT – digital meets physical. cyber-physical systems. smart objects driven by industry. proliferation of sensors and devices – smart devices.
what does privacy look like? what is net neutrality with IoT? the library must restructure: collaborate across institutions about collections of data in open and participatory ways. put IoT in the hands of people to make and break things (she is a makerspace aficionado)
make and break things hackathons – use cheap devices such as Arduino and Pi.
data literacy programs with higher level conception exploration; libraries empower the campus in data collection. data science norms, store and share data to existing repositories and even catalogs. commercial services to store and connect data, but very restrictive and this is why libraries must be involved.
linked data and dark data
linked data – draw connections around online data; most of the data are locked. linked data uses metadata to link related information in ways computers can understand.
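my note: a minimal sketch of how linked data lets a computer follow connections. statements are stored as subject, predicate, object triples, and a query hops from one record to a related one. the "ex:" identifiers and the sample records are invented; "dc:creator" and "foaf:name" are used only as plain labels here, without any RDF library.

```python
# Each statement links a subject to an object through a named relation.
triples = [
    ("ex:report42", "dc:title", "Campus Sensor Data 2019"),
    ("ex:report42", "dc:creator", "ex:person7"),
    ("ex:person7", "foaf:name", "Example Author"),
    ("ex:person7", "ex:memberOf", "ex:libraryLab"),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def creator_names(triples, work):
    """Follow work -> creator -> name, i.e. two linked statements."""
    names = []
    for creator in objects(triples, work, "dc:creator"):
        names.extend(objects(triples, creator, "foaf:name"))
    return names


names = creator_names(triples, "ex:report42")
print(names)
```

the name is never stored on the report itself; it is found by following the link to the person record, which is the point of linking the metadata.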
libraries take advantage of linked data. linked data is an opportunity for semantics, natural language processing etc. if hidden data is relevant to our communities, it is a library responsibility to provide it. community data practitioners
massive data, which cannot be analyzed by relational processing. data that does not yield significant findings might still be valuable for researchers: one person's trash is another person's treasure. preserving data and providing access to info. collaborate with researchers across disciplines and assist in deciding what is worth keeping, what to discard and how to study it.
rich learning experience: working with linked and dark data enables a fresh perspective and learning how to work with data architecture. data literacy programming.
In the age of Big Data, there is an abundance of free or cheap data sources available to libraries about their users’ behavior across the many components that make up their web presence. Data from vendors, data from Google Analytics or other third-party tracking software, and data from user testing are all things libraries have access to at little or no cost. However, just as many students can become overloaded when they do not know how to navigate the many information sources available to them, many libraries can become overloaded by the continuous stream of data pouring in from these sources. This session will aim to help librarians understand 1) what sorts of data their library already has (or easily could have) access to about how their users use their various web tools, 2) what that data can and cannot tell them, and 3) how to use the datasets they are collecting in a holistic manner to help them make design decisions. The presentation will feature examples from the presenters’ own experience of incorporating user data in decisions related to the design of the Bethel University Libraries’ web presence.
lack of fear, changing the mindset.
deep collaboration both within and cross-consortia
don’t rely on vendor solutions. changing mindset
development = oppty (versus development as “work”)
private higher education is PALNI
3d virtual picture of disastrous areas. unlock the digital information to be digitally accessible to all people who might be interested.
they opened the maps of Katmandu for the local community and they were coming up with the strategies to recover. democracy in action
I can’t stop thinking that the keynote speaker’s efforts are a mere follow-up of what Naomi Klein explains in her Shock Doctrine: http://www.naomiklein.org/shock-doctrine: the government of a country seeks reasons to destroy another country or area, and then NGOs from the same country go to remedy the disasters
A question from a librarian from the U about the use of drones. My note: why did the SCSU library have to give up its drone?
Douglas County Library model. too resource intensive to continue
Marmot Library Network
ILS (integrated library system) – shared with other counties, same server for the entire consortium. they have a programmer; VuFind, open source, discovery layer; he customized the community VuFind into VuFind Plus. instead of using the ILS public access catalogue, they are using the VuFind interface
Califa Enki. public library – single access collection. they purchase ebooks from the publisher and they are also using the VuFind interface, but not integrated with the library catalogs. Kansas public library went from OverDrive to VuFind. the CA State Library is funding this effort for the time being.
Harper Collins is too cumbersome and the reason to avoid working with them.
security issues. some of the material sent over ftp and immediately moved to sftp
decisions – use of internal resources only; if not – Amazon
programmer used for the pilot. contracted programmers. lack of the ability to see the large picture. eventually hired a full time person, instead of outsourcing. RDA compliant MARC.
ONIX, spreadsheet MARC.
Decision about who to start with : public or academic.
attempt to keep pricing down –
own agreement with the customers, separate from the agreement with the Publisher
current development: web-based online reading, shared-consortial collections and SIP2 authentication