Searching for "data"

General Data Protection Regulation

A techie’s rough guide to GDPR

https://www.cennydd.com/writing/a-techies-rough-guide-to-gdpr

A major global change in data protection law is about to hit the tech industry, thanks to the EU’s General Data Protection Regulation (GDPR). GDPR affects any company, wherever it is in the world, that handles data about European citizens. It becomes law on 25 May 2018 and, since that date precedes Brexit, it covers UK citizens too. It’s no surprise the EU has chosen to tighten the data protection belt: Europe has long opposed the tech industry’s expansionist tendencies, particularly through antitrust suits, and is perhaps the only regulatory body with the inclination and power to challenge Silicon Valley in the coming years.

So, no more harvesting data for unplanned analytics, future experimentation, or unspecified research. Teams must have specific uses for specific data.

bots, big data and the future

Computational Propaganda: Bots, Targeting And The Future

February 9, 2018, 11:37 AM ET

https://www.npr.org/sections/13.7/2018/02/09/584514805/computational-propaganda-yeah-that-s-a-thing-now

Combine the superfast calculational capacities of Big Compute with the oceans of specific personal information comprising Big Data — and the fertile ground for computational propaganda emerges. That’s how the small AI programs called bots can be unleashed into cyberspace to target and deliver misinformation exactly to the people who will be most vulnerable to it. These messages can be refined over and over again based on how well they perform (again in terms of clicks, likes and so on). Worst of all, all this can be done semiautonomously, allowing the targeted propaganda (like fake news stories or faked images) to spread like viruses through communities most vulnerable to their misinformation.

According to Bolsover and Howard, viewing computational propaganda only from a technical perspective would be a grave mistake. As they explain, seeing it just in terms of variables and algorithms “plays into the hands of those who create it, the platforms that serve it, and the firms that profit from it.”

Computational propaganda is a new thing. People just invented it. And they did so by realizing possibilities emerging from the intersection of new technologies (Big Compute, Big Data) and new behaviors those technologies allowed (social media). But the emphasis on behavior can’t be lost.

People are not machines. We do things for a whole lot of reasons, including emotions of loss, anger, fear and longing. To combat computational propaganda’s potentially dangerous effects on democracy in a digital age, we will need to focus on both its how and its why.

++++++++++++++++
more on big data in this IMS blog
http://blog.stcloudstate.edu/ims?s=big+data

more on bots in this IMS blog
http://blog.stcloudstate.edu/ims?s=bot

more on fake news in this IMS blog
http://blog.stcloudstate.edu/ims?s=fake+news

Women in data science

Women in data science conference

Organiser: Fatima Batool (The Alan Turing Institute and WiDS Ambassador)

Date: 6 April 2018

Venue: The Alan Turing Institute



The Stanford Women in Data Science (WiDS) conference is a one-day global conference that will bring data scientists together to share cutting-edge research. The conference aims to inspire and encourage data scientists worldwide and, in particular, to support women in the field.

We will proudly host WiDS at The Alan Turing Institute. The conference will feature eminent female speakers giving technical talks, lunchtime discussions on data science (topics to be announced shortly), a panel discussion and a networking event.

The conference programme and speaker information will soon be available through the conference website. The event will be available worldwide via live streaming, and the conference talks will be broadcast online.

The event will provide great opportunities to connect with potential mentors, collaborators and peers; hear about recent advancements in data science and explore new research dimensions.

Speakers:

Cecilia Lindgren

Mihaela van der Schaar

Jil Matheson

Codina Cotar

Emma McCoy

Cecilia Mascolo

Kathy Whaler

Mariana Damova

We welcome all regardless of gender to join us on Friday 6 April 2018 for an excellent learning experience.

For more information email: events@turing.ac.uk

Analytics Data Integration and Governance

Supporting Analytics through Data Integration and Governance

https://www.educause.edu/focus-areas-and-initiatives/enterprise-and-infrastructure/enterprise-it-program/supporting-analytics-through-data-integration-and-governance

Support analytics initiatives with data integration and governance. The changing landscape of enterprise IT is characterized by an expanding set of services, systems, and sourcing strategies. Data governance, cross-enterprise partnerships, and data integration are key ingredients in supporting higher education’s growing need for reliable information.

Enterprise IT Case Studies

In this set of EDUCAUSE Review case studies, see how Drake University, the University of Tennessee, and the University of Montana improved their analytics initiatives through data integration and governance.
+++++++++
more on analytics in this IMS blog
http://blog.stcloudstate.edu/ims?s=analytics

http://blog.stcloudstate.edu/ims?s=data+governance

data storytelling

3 Reasons Why Data Storytelling Will Be A Top Marketing Trend of 2018

https://martechseries.com/mts-insights/guest-authors/3-reasons-data-storytelling-will-top-marketing-trend-2018/
A study that looked at reader engagement across articles that contained charts and infographics vs. articles that were text-only found that those with graphical storytelling, or what I like to call data storytelling, had up to 34 percent more comments and shares and a 300 percent improvement on the depth of scroll down the page.
Using storytelling techniques to present data not only makes it more visually appealing but also enables easy spotting of key trends, seamless results-tracking, and quick goal-monitoring.

Here are things that can help you build a bridge from your current methods to effective data storytelling:

  • Choose a topic by identifying your target audience, the goal of your visual, what you would like to achieve.
  • Organize your data by thinking about what you want to convey and then get rid of anything that doesn’t help you tell that story.
  • Spend time making your visualization look sharp by keeping it simple, using color and interactivity.

A few bonus tips to make your data visualizations really pop:

  • Don’t use more than two graphs at a time so as not to confuse participants.
  • Stick with one color per graph; making things multicolored will cause data to look jumbled.
  • Give context to your concept. Introduce your idea slowly and tell the story of what you want your data to reveal instead of assuming everyone in the room is on the same page.
  • Try using interactive data storytelling techniques to support your data.
++++++++++++
more on digital storytelling in this IMS blog
http://blog.stcloudstate.edu/ims?s=digital+storytelling

Borgman data

book reviews:
https://bobmorris.biz/big-data-little-data-no-data-a-book-review-by-bob-morris
“The challenge is to make data discoverable, usable, assessable, intelligible, and interpretable, and do so for extended periods of time…To restate the premise of this book, the value of data lies in their use. Unless stakeholders can agree on what to keep and why, and invest in the invisible work necessary to sustain knowledge infrastructures, big data and little data alike will become no data.”
http://www.cjc-online.ca/index.php/journal/article/view/3152/3337
Starting from the premise that data are not natural objects with their own essence, Borgman explores the different values assigned to them, as well as their many variations according to place, time, and the context in which they are collected. It is specifically through six “provocations” that she offers a deep engagement with different aspects of the knowledge industry. These include the reproducibility, sharing, and reuse of data; the transmission and publication of knowledge; the stability of scholarly knowledge, despite its increasing proliferation of forms and modes; the very porosity of the borders between different areas of knowledge; the costs, benefits, risks, and responsibilities related to knowledge infrastructure; and finally, investment in the sustainable acquisition and exploitation of data for scientific research.
Beyond the six provocations, there is a larger question concerning the legitimacy, continuity, and durability of all scientific research; hence the urgent need for further reflection, initiated eloquently by Borgman, on the fact that “despite the media hyperbole, having the right data is usually better than having more data”
o Data management (Pages xviii-xix)
o Data definition (4-5 and 18-29)
p. 5 Big data and little data are only awkwardly analogous to big science and little science. Modern science, or “big science” in Derek J. de Solla Price’s sense (https://en.wikipedia.org/wiki/Big_Science), is characterized by international, collaborative efforts and by the invisible colleges of researchers who know each other and exchange information on a formal and informal basis. Little science is the three hundred years of independent, smaller-scale work to develop theory and method for understanding research problems. Little science is typified by heterogeneous methods, heterogeneous data, and by local control and analysis.
p. 8 The Long Tail
a popular way of characterizing the availability and use of data in research areas or in economic sectors. https://en.wikipedia.org/wiki/Long_tail
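As a rough, hypothetical illustration of the long tail (the numbers below are invented, not from Borgman): if item usage falls off roughly like a Zipf distribution, a small “head” of items accounts for most use, while the vast majority of items sit in a thinly used tail.

```python
# Hypothetical long-tail usage: count for rank r falls off as 1/r.
def zipf_counts(n_items, head_count=1000):
    """Invented Zipf-like usage counts for n_items ranked items."""
    return [head_count // r for r in range(1, n_items + 1)]

counts = zipf_counts(1000)
total = sum(counts)
head = sum(counts[:100])   # top 10% of items
tail = sum(counts[100:])   # remaining 90% of items

# The head (10% of items) accounts for well over half of all use.
print(f"top 10% of items: {100 * head / total:.0f}% of use")
print(f"bottom 90% of items: {100 * tail / total:.0f}% of use")
```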

o Provocations (13-15)
o Digital data collections (21-26)
o Knowledge infrastructures (32-35)
o Open access to research (39-42)
o Open technologies (45-47)
o Metadata (65-70 and 79-80)
o Common resources in astronomy (71-76)
o Ethics (77-79)
o Research Methods and data practices, and, Sensor-networked science and technology (84-85 and 106-113)
o Knowledge infrastructures (94-100)
o COMPLETE survey (102-106)
o Internet surveys (128-143)
o Twitter (130-133, 138-141, and 157-158)
o Pisa Clark/CLAROS project (179-185)
o Collecting Data, Analyzing Data, and Publishing Findings (181-184)
o Buddhist studies (186-200)
o Data citation (241-268)
o Negotiating authorship credit (253-256)
o Personal names (258-261)
o Citation metrics (266-269)
o Access to data (279-283)

++++++++++++++++
more on big data in education in this IMS blog
http://blog.stcloudstate.edu/ims?s=big+data

academic library collection data visualization

Finch, J. F., & Flenner, A. (2016). Using Data Visualization to Examine an Academic Library Collection. College & Research Libraries, 77(6), 765-778.

http://login.libproxy.stcloudstate.edu/login?qurl=http%3a%2f%2fsearch.ebscohost.com%2flogin.aspx%3fdirect%3dtrue%26db%3dllf%26AN%3d119891576%26site%3dehost-live%26scope%3dsite

p. 766
Visualizations of library data have been used to:
  • reveal relationships among subject areas for users
  • illuminate circulation patterns
  • suggest titles for weeding
  • analyze citations and map scholarly communications

Each unit of data analyzed can be described as topical, asking “what”:
  • What is the number of courses offered in each major and minor?
  • What is expended in each subject area?
  • What is the size of the physical collection in each subject area?
  • What is student enrollment in each area?
  • What is the circulation in specific areas for one year?

Libraries, if they are to survive, must rethink their collecting and service strategies in radical and possibly scary ways, and do so sooner rather than later. Anderson predicts that, in the next ten years, the “idea of collection” will be overhauled in favor of “dynamic access to a virtually unlimited flow of information products.”  My note: in essence, the fight between Mark Vargas and the Acquisition/Cataloguing people

The library collection of today is changing, affected by many factors, such as demand-driven acquisitions, access, streaming media, interdisciplinary coursework, ordering enthusiasm, new areas of study, political pressures, vendor changes, and the individual faculty member following a focused line of research.

subject librarians may see opportunities in looking more closely at the relatively unexplored “intersection of circulation, interlibrary loan, and holdings.”

Using Visualizations to Address Library Problems

the difference between graphical representations of environments and knowledge visualization, which generates graphical representations of meaningful relationships among retrieved files or objects.

Exhaustive lists of data visualization tools include:
  • the DIRT Directory (http://dirtdirectory.org/categories/visualization)
  • Kathy Schrock’s educating through infographics (www.schrockguide.net/infographics-as-an-assessment.html)
  • Dataviz list of online tools (www.improving-visualisation.org/case-studies/id=5)

Visualization tools explored for this study include Plotly, Microsoft Excel, the Python programming language, D3.js (a JavaScript library for creating documents based on data), and Tableau Public.

A tutorial by Eugene O’Loughlin (National College of Ireland) is very helpful in composing the charts: https://youtu.be/4FyImh2G7N0.

p. 771 By looking at the data (my note: by visualizing the data), more questions are revealed. The visualizations provide greater comprehension than the two-dimensional “flatland” of the spreadsheets, in which valuable questions and insights are lost in the columns and rows of data.

By looking at data visualized in different combinations, library collection development teams can clearly compare important considerations in collection management: expenditures and purchases, circulation, student enrollment, and course hours. Library staff and administrators can make funding decisions or begin dialog based on data free from political pressure or from the influence of the squeakiest wheel in a department.
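As a minimal sketch of the kind of comparison described above, the snippet below groups expenditure and circulation by subject and computes a cost-per-checkout figure that could then be charted in Plotly, Excel, or D3.js. All subjects and figures here are invented for illustration.

```python
from collections import defaultdict

# Hypothetical per-title records: (subject, expenditure, circulation).
records = [
    ("Biology", 120.0, 14),
    ("Biology",  95.0,  3),
    ("History",  60.0, 22),
    ("History",  75.0,  9),
    ("Nursing", 140.0, 31),
]

spend = defaultdict(float)
circ = defaultdict(int)
for subject, cost, checkouts in records:
    spend[subject] += cost
    circ[subject] += checkouts

# Cost per checkout by subject -- one candidate measure to plot,
# e.g., as a bar chart comparing collection areas.
for subject in sorted(spend):
    print(f"{subject}: ${spend[subject] / circ[subject]:.2f} per checkout")
```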

+++++++++++++++
more on data visualization for the academic library in this IMS blog
http://blog.stcloudstate.edu/ims?s=data+visualization

data visualization for librarians

Eaton, M. E. (2017). Seeing Library Data: A Prototype Data Visualization Application for Librarians. Journal of Web Librarianship, 11(1), 69–78. Retrieved from http://academicworks.cuny.edu/kb_pubs

Visualization can increase the power of data, by showing the “patterns, trends and exceptions”

Librarians can benefit when they visually leverage data in support of library projects.

Nathan Yau suggests that exploratory learning is a significant benefit of data visualization initiatives (2013). We can learn about our libraries by tinkering with data. In addition, handling data can also challenge librarians to improve their technical skills. Visualization projects allow librarians to not only learn about their libraries, but to also learn programming and data science skills.

The classic voice on data visualization theory is Edward Tufte. In Envisioning Information, Tufte unequivocally advocates for multi-dimensionality in visualizations. He praises some incredibly complex paper-based visualizations (1990). This discussion suggests that the principles of data visualization are strongly contested. Although Yau’s even-handed approach and Cairo’s willingness to find common ground are laudable, their positions are not authoritative or the only approach to data visualization.

a web application that visualizes the library’s holdings of books and e-books according to certain facets and keywords. Users can visualize whatever topics they want, by selecting keywords and facets that interest them.

The application uses the Primo X-Services API, JSON, and Flask, a very flexible Python web micro-framework. In addition to creating the visualization, SeeCollections also makes this data available on the web. JavaScript is the front-end technology that ultimately presents data to the SeeCollections user. JavaScript is a cornerstone of contemporary web development; a great deal of today’s interactive web content relies upon it. Many popular code libraries have been written for JavaScript. This project draws upon jQuery, Bootstrap, and d3.js.

To give SeeCollections a unified visual theme, I have used Bootstrap. Bootstrap is most commonly used to make webpages responsive to different devices

D3.js facilitates the binding of data to the content of a web page, which allows manipulation of the web content based on the underlying data.

 

JSON and Structured Data

JSON and Structured Data

https://www.w3schools.com/js/js_json_intro.asp

JSON is a lightweight data-interchange format that has largely replaced XML. It is often used with AJAX (sending data back and forth between client and server without a page refresh).

Data types:
number: no difference between integers and floats
string: sequence of Unicode characters in double quotes “”
boolean: true or false
array: ordered list of zero or more values
object: unordered collection of key/value pairs
null: empty value

JSON Syntax Rules:
uses key/value pairs – {"name": "brad"}
uses double quotes around both key and string value
values must be one of the JSON data types
file extension is “.json”
MIME type is “application/json”
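As a quick sketch of these rules in practice, Python’s standard `json` module can parse and emit JSON text (the values below are invented examples):

```python
import json

# A small JSON document: double quotes around keys and string
# values, a colon between each key and its value.
text = '{"name": "brad", "age": 34, "active": true, "nickname": null}'

data = json.loads(text)  # parse JSON text into a Python dict
print(data["name"])      # -> brad

# Serialize back to JSON text (Python True/None become true/null).
print(json.dumps(data, indent=2))
```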

http://www.json.org/

https://code.google.com/archive/p/json-simple/

https://www.linkedin.com/learning/learn-api-documentation-with-json-and-xml/json-basics

strings: text enclosed in single or double quotation marks
numbers: integer or decimal, positive or negative
booleans: true or false, no quotation marks
null: means “nothing,” no quotation marks

arrays are lists in square brackets, comma separated, can mix data types

objects are JSON dictionaries in curly brackets, keys and values are separated by a colon, pairs are separated by commas. keys and values can be any data type, but string is the most common value for a key

nesting: arrays and objects inside each other
can put arrays inside objects, objects inside arrays, and so on
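The nesting rules above can be sketched with Python’s `json` module (the titles here are just illustrative sample data):

```python
import json

# Nesting: an object that contains an array, whose elements are
# themselves objects.
record = {
    "library": "IMS",
    "holdings": [
        {"title": "Big Data, Little Data, No Data", "copies": 2},
        {"title": "Envisioning Information", "copies": 1},
    ],
}

text = json.dumps(record)  # object -> JSON text
back = json.loads(text)    # JSON text -> object
print(back["holdings"][0]["title"])  # -> Big Data, Little Data, No Data
```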

 

anonymous browsing data

‘Anonymous’ browsing data can be easily exposed, researchers reveal

https://www.theguardian.com/technology/2017/aug/01/data-browsing-habits-brokers

A similar strategy was used in 2008, Dewes said, to deanonymise a set of ratings published by Netflix to help computer scientists improve its recommendation algorithm: by comparing “anonymous” ratings of films with public profiles on IMDB, researchers were able to unmask Netflix users – including one woman, a closeted lesbian, who went on to sue Netflix for the privacy violation.
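A toy sketch of the linkage idea behind such deanonymisation: match “anonymous” records against public profiles by counting overlapping (title, rating) pairs. The names and data below are entirely invented, and this simplification ignores the statistical machinery of the real Netflix study.

```python
# "Anonymous" rating records, keyed by opaque user IDs.
anonymous = {
    "user_123": {("Film A", 5), ("Film B", 2), ("Film C", 4)},
    "user_456": {("Film A", 1), ("Film D", 5)},
}

# Ratings scraped from public profiles (e.g., IMDB-style reviews).
public = {
    "alice": {("Film A", 5), ("Film C", 4)},
    "bob":   {("Film D", 5), ("Film E", 3)},
}

def best_match(anon_ratings, profiles):
    """Return the public identity sharing the most (title, rating) pairs."""
    return max(profiles, key=lambda name: len(anon_ratings & profiles[name]))

for anon_id, ratings in anonymous.items():
    print(anon_id, "->", best_match(ratings, public))
```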

++++++++++++++++
A hacker explains the best way to browse the internet anonymously.
https://www.facebook.com/techinsider/videos/824655787732779/ 

++++++++++++++
more on privacy in this IMS blog
https://blog.stcloudstate.edu/ims?s=privacy
