Finch, J. L., & Flenner, A. (2016). Using Data Visualization to Examine an Academic Library Collection. College & Research Libraries, 77(6), 765-778.
http://login.libproxy.stcloudstate.edu/login?qurl=http%3a%2f%2fsearch.ebscohost.com%2flogin.aspx%3fdirect%3dtrue%26db%3dllf%26AN%3d119891576%26site%3dehost-live%26scope%3dsite
p. 766
Visualizations of library data have been used to:
• reveal relationships among subject areas for users
• illuminate circulation patterns
• suggest titles for weeding
• analyze citations and map scholarly communications
Each unit of data analyzed can be described as topical, asking “what.”
• What is the number of courses offered in each major and minor?
• What is expended in each subject area?
• What is the size of the physical collection in each subject area?
• What is student enrollment in each area?
• What is the circulation in specific areas for one year?
libraries, if they are to survive, must rethink their collecting and service strategies in radical and possibly scary ways and to do so sooner rather than later. Anderson predicts that, in the next ten years, the “idea of collection” will be overhauled in favor of “dynamic access to a virtually unlimited flow of information products.” My note: in essence, the fight between Mark Vargas and the Acquisition/Cataloguing people
The library collection of today is changing, affected by many factors, such as demand-driven acquisitions, access, streaming media, interdisciplinary coursework, ordering enthusiasm, new areas of study, political pressures, vendor changes, and the individual faculty member following a focused line of research.
subject librarians may see opportunities in looking more closely at the relatively unexplored “intersection of circulation, interlibrary loan, and holdings.”
Using Visualizations to Address Library Problems
the difference between graphical representations of environments and knowledge visualization, which generates graphical representations of meaningful relationships among retrieved files or objects.
Exhaustive lists of data visualization tools include:
• the DIRT Directory (http://dirtdirectory.org/categories/visualization)
• Kathy Schrock’s educating through infographics (www.schrockguide.net/infographics-as-an-assessment.html)
• Dataviz list of online tools (www.improving-visualisation.org/case-studies/id=5)
Visualization tools explored for this study include Plotly, Microsoft Excel, the Python programming language, and D3.js, a JavaScript library for creating documents based on data. Tableau Public©
A video by Eugene O’Loughlin, National College of Ireland, is very helpful in composing the charts and is found here: https://youtu.be/4FyImh2G7N0.
p. 771 By looking at the data (my note: by visualizing the data), more questions are revealed. The visualizations provide greater comprehension than the two-dimensional “flatland” of the spreadsheets, in which valuable questions and insights are lost in the columns and rows of data.
By looking at data visualized in different combinations, library collection development teams can clearly compare important considerations in collection management: expenditures and purchases, circulation, student enrollment, and course hours. Library staff and administrators can make funding decisions or begin dialog based on data free from political pressure or from the influence of the squeakiest wheel in a department.
+++++++++++++++
more on data visualization for the academic library in this IMS blog
https://blog.stcloudstate.edu/ims?s=data+visualization
Minnesota State University Moorhead – Software Carpentry Workshop
https://www.eventbrite.com/e/minnesota-state-university-moorhead-software-carpentry-workshop-registration-38516119751
Reservation code: 680510823 Reservation for: Plamen Miltenoff
Hagen Hall – 600 11th St S – Room 207 – Moorhead
pad.software-carpentry.org/2017-10-27-Moorhead
http://www.datacarpentry.org/lessons/
https://software-carpentry.org/lessons/
++++++++++++++++
Friday
Jeff – certified Bash Python, John
http://bit.do/msum_swc
https://ntmoore.github.io/2017-10-27-Moorhead/
what is the shell and what does it do. language close to the computer, fast.
what is “bash” . cd, ls
the shell's job is to be a translator between the user and the binary code: the middleman. several types of shells, with slight differences. one is natively installed on Mac and Unix: bash, the Bourne-again shell
bash commands: cd – change directory; ls – list; ls -F . if it does not work: man ls (manual for ls); the colon in the lower left corner tells you that you can scroll; q to quit; ls -ltr
arguments are colloquially known by different names: options, flags, parameters
cd .. – move up one directory . pwd – print the working directory (where I am) . cd data_shell/ – go down one directory
cd ~ – brings me to my home directory . $HOME (universally defined variable)
the default behavior of cd (without arguments) is to bring me to the home directory.
the core shell commands accept the same options (letters)
$ du -h . gives me the size of the files. ctrl C to stop
$ clear . – clears the entire screen; scroll up to see the previous commands
$ man history . $ history . $ !pwd (reruns the most recent pwd command from the history) . $ history | grep history (piping)
$ cat (and the file name) – prints the file to standard output
$ cat ../
+++++++++++++++
how to edit and delete files
to create a new folder: $ mkdir . – make directory
text editors – nano, vim (UNIX text editors) . $ nano draft.txt . ctrl O (save) ctrl X (exit) .
$ vim . press Esc and in the command line type :wq (write and quit) or just :q
$ mv draft.txt ../data . (move files)
to remove: $ rm thesis/ (a directory needs rm -r) . $ man rm
copy files: $ cp . $ touch . (touches the file, creates it if new)
remove: $ rm . anything with sudo is dangerous . Bash profile: cp -i
* – wild card, truncate . $ ls analyzed (list of the analyzed directory)
stackoverflow web site .
+++++++++++++++++
head command . $ head basilisk.dat (check only the first several lines of a large file)
$ for filename in basilisk.dat unicorn.dat . (making a loop = multiline)
> do (the shell is expecting an action after do)
> head -n 3 $filename . (-n is for the number of lines; 3 displays the first three lines of the file)
> done
for doing repetitive tasks
also
$ for filename in *.dat ; do head -n 3 $filename ; done
$ for filename in *.dat ; do echo $filename ; head -n 3 $filename ; done
$ echo $filename (print statement)
how to loop
$ for filename in *.dat ; do echo $filename ; echo head -n 3 $filename ; done
ctrl C or Command-period on a Mac to get out of the loop
http://swcarpentry.github.io/shell-novice/02-filedir/
also
$ for filename in *.dat
> do
> echo $filename
> head -n 10 $filename | tail -n 20 . (head passes on the first ten lines; tail keeps at most the last twenty lines of what it receives)
> done
$ for filename in *.dat
> do
> echo $filename
> done
$ for filename in *.dat
> do
> cp $filename orig_$filename
> done
$ history > something.else
$ head something.else
+++++++++++++
another function: word count
$ wc *.pdb (protein databank)
$ head cubane.pdb
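if I don’t know how to read the output: $ man wc
the difference between “*” and “?”: * matches any number of characters, ? matches exactly one
$ wc -l *.pdb
$ wc -l *.pdb > lengths.txt
$ cat lengths.txt
$ for fil in *.txt
> do
> wc -l $fil
> done
putting a $ sign in front of the name tells the shell to use the value of the variable, not the literal text.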
++++++++++++
$ nano middle.sh . The entire point of shell is to automate
$ bash (executable) to run the program middle.sh
rwx – rwx – rwx . (owner – group -anybody)
bash middle.sh
$ file middle.sh
$PATH .
$ echo $PATH | tr ":" "\n"
/usr/local/bin
/usr/bin
/bin
/usr/sbin
/sbin
/Applications/VMware Fusion.app/Contents/Public
/usr/local/munki
$ export PATH=$PWD:$PATH
(this is to make sure that the last version of Python is running)
$ ls ~ . (hidden files)
$ ls -a ~
$ touch .bash_profile .bashrc
$ history | grep PATH
19 echo $PATH
44 echo #PATH | tr ":" "\n"
45 echo $PATH | tr ":" "\n"
46 export PATH=$PWD:$PATH
47 echo #PATH | tr ":" "\n"
48 echo #PATH | tr ":" "\n"
55 history | grep PATH
wc -l "$@" | sort -n ("$@" encompasses everything: the script will process every single file in the list of files passed to it)
$ chmod (make it executable)
$ find . -type d . (find only directories, recursively)
$ find . -type f (files, instead of directories)
$ find . -name '*.txt' . (find files by name, don’t forget the single quotes)
$ wc -l $(find . -name '*.txt') – when searching among directories on different levels
$ find . -name '*.txt' | xargs wc -l – same as above; two ways to do one and the same
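For comparison with Saturday's Python session, the find-plus-wc pipeline above could be sketched in Python roughly as follows. This is only a sketch: the function name count_lines and the file names are made up for illustration, and the files live in a throwaway temporary directory.

```python
import tempfile
from pathlib import Path


def count_lines(root):
    """Return {relative path: line count} for every *.txt file under root."""
    counts = {}
    # rglob searches root and all of its subdirectories, like find . -name '*.txt'
    for path in sorted(Path(root).rglob("*.txt")):
        with path.open() as handle:
            counts[path.relative_to(root).as_posix()] = sum(1 for _ in handle)
    return counts


# Build a small throwaway directory tree (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes").mkdir()
    (root / "haiku.txt").write_text("one\ntwo\nthree\n")
    (root / "notes" / "draft.txt").write_text("first\nsecond\n")
    (root / "notes" / "data.csv").write_text("ignored\n")
    line_counts = count_lines(root)

# Print in the same "count name" layout that wc -l uses.
for name, n in line_counts.items():
    print(n, name)
```

The shell version is shorter; the Python version is easier to extend once the counts need further processing.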
+++++++++++++++++++
Saturday
Python
Link to the Python Plotting : https://swcarpentry.github.io/python-novice-gapminder
C and C++. scripting purposes in microbiology (instructor). libraries, packages alongside Python, which can extend its functionality. numpy and scipy (numeric and science python). Python for academic libraries?
leaving python: >>> quit() . python expects both the opening and the closing parenthesis
new terminal needed after installation. anaconda 5.0.1
python 3 is complete redesign, not only an update.
http://swcarpentry.github.io/python-novice-gapminder/setup/
jupyter crashes in safari. open in chrome. spg engine maybe
https://swcarpentry.github.io/python-novice-gapminder/01-run-quit/
to start python in the terminal $ python
>>> variable = 3
>>> variable + 10
several data types.
Jupyter notebooks are stored in JSON format.
command vs edit code. code cell is the gray box. a text cell is plain text
markdown syntax. format working with git and github . search explanation in https://swcarpentry.github.io/python-novice-gapminder/01-run-quit/
HackMD https://hackmd.io/ (use your GitHub account)
PANDOC – translates different data formats. https://pandoc.org/
print is a function
in what cases will I run my data through Python instead of SPSS?
python is a 0 based language. starts counting with 0 – Java, C, P
atom_name = 'helium'
print(atom_name[0]) . string slicing and indexing is tricky
atom_name = 'helium'
print(atom_name[0:6])
vs
atom_name = 'helium'
print(atom_name[7]) . IndexError: the string has no index 7, so Python cannot return anything
the syntax of a Python slice is start : end : step (count by)
string versus list . a string is in quotes, a list has brackets
strings allow me to work not only with values; I can also reverse the string
atom_name = 'helium lithium beryllium'
print(atom_name[::-1])
Atom_name = 'helium'
len(atom_name) . 6 . names are case sensitive
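A short sketch of the indexing and slicing rules noted above, reusing the atom_name example; the other variable names are only for illustration.

```python
atom_name = 'helium'

first_char = atom_name[0]        # indexing starts at 0
first_three = atom_name[0:3]     # start is included, end is excluded
every_other = atom_name[::2]     # a step of 2 takes every second character
reversed_name = atom_name[::-1]  # a negative step walks backwards

print(first_char, first_three, every_other, reversed_name)
print(len(atom_name))

# Indexing past the end raises IndexError; slicing past the end does not.
try:
    atom_name[7]
except IndexError as error:
    print('IndexError:', error)
print(atom_name[0:20])
```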
to clean the memory, restart the kernel
objects in Python have different types. adopt a class, value may have class inherent in its defintion
print(type('42')) . Python tells me that it is a string
print(type(42)) . tells me it is an integer
LaTeX
to combine an integer and a letter: print(str(1) + 'A')
converting a string to an integer: print(1 + int('55')) . both values must be the same type
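The conversions above, put together in one small sketch; the variable names are mine, and the last part shows what happens when the types are left mixed.

```python
print(type('42'))   # quotes make it a string (str)
print(type(42))     # no quotes: an integer (int)

label = str(1) + 'A'    # integer converted to string, then joined
total = 1 + int('55')   # string converted to integer, then added
half = 1 / 2.0          # division gives a float

print(label, total, half)

# Mixing the types without converting raises TypeError.
try:
    1 + '55'
    mixed_ok = True
except TypeError as error:
    mixed_ok = False
    print('TypeError:', error)
```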
translation table. numerical representation of a string
float
print('half is', 1 / 2.0)
built-in functions and help
print is a function, length is a function (len); type, str, int, max, round
Python does not always explain well why the code breaks
ASCII character set – built-in Python conversion
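A minimal sketch of the character-to-number conversion, assuming the built-ins meant here are ord and chr, together with a few of the other built-in functions listed above.

```python
code = ord('A')            # character -> its numeric code
next_letter = chr(code + 1)  # numeric code -> character

# Numerical representation of a whole string, one code per character.
codes = [ord(ch) for ch in 'helium']

largest = max(3, 7, 5)
rounded = round(3.712, 1)

print(code, next_letter, codes, largest, rounded)
```

help(round) in the interpreter prints the documentation for a built-in.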
function “import”
Saturday afternoon
reading .CSV in Python
python is object oriented and I can define the objects
pandas creates its own type of object (which holds the data we model), called a “DataFrame”
a method is a function attached to data that already exists; that is the difference from a stand-alone function
data.info() . is a method – here it is called without any arguments
whereas
data.columns . is an attribute (no parentheses)
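To keep functions, methods and attributes apart, here is a small sketch. It uses a date object from the standard library rather than a DataFrame so that it runs without pandas; the same pattern applies to data.info() (a method, called with parentheses) and data.columns (an attribute, no parentheses).

```python
from datetime import date

workshop_day = date(2017, 10, 27)

year = workshop_day.year            # attribute: stored data, no parentheses
iso_text = workshop_day.isoformat()  # method: a function attached to the object
name_length = len('helium')         # function: stands alone, takes the object as argument

print(year, iso_text, name_length)

# A method is callable; a plain data attribute is not.
print(callable(workshop_day.isoformat), callable(workshop_day.year))
```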
print (data.T) . transpose. not easy in Excel, but very easy in Python
print (data.describe()) .
/Users/plamen_local/anaconda3/lib/python3.6/site-packages/pandas/__init__.py
%matplotlib inline . tells the Jupyter notebook to show plots inline
import pandas
import matplotlib.pyplot as plt
data = pandas.read_csv('/Users/plamen_local/Desktop/data/gapminder_gdp_oceania.csv', index_col='country')
data.loc['Australia'].plot()
plt.xticks(rotation=10)
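A sketch of the same pandas steps that runs without the gapminder file. Everything in the CSV text below is made up for illustration (the country names and numbers are hypothetical, not gapminder values), and io.StringIO stands in for a file on disk. It assumes pandas is installed.

```python
import io

import pandas as pd

# Hypothetical data, NOT real gapminder values.
csv_text = """country,gdpPercap_1952,gdpPercap_1957
Country A,100.0,150.0
Country B,200.0,250.0
"""

# index_col makes the country names the row labels, as in the notes above.
data = pd.read_csv(io.StringIO(csv_text), index_col='country')

print(data.columns)     # attribute: the column labels
print(data.T)           # transpose: rows become columns
print(data.describe())  # summary statistics per column

one_row = data.loc['Country A']  # select a row by its label
print(one_row)
```

In a notebook, one_row.plot() would draw that row the same way data.loc['Australia'].plot() does.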
ggplot2 (for R) is the most well-known plotting library.
xelatex is a PDF engine. reST (reStructuredText) is like Markdown. google what is the best PDF engine with Jupyter
for loops . any computer language will have the concept of a “for” loop. In Python: 1. whenever we create a “for” loop, that line must end with a single colon
2. indentation. any “if” statement inside the “for” loop gets indented
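The two rules can be seen in a minimal sketch; the variable names are only for illustration.

```python
total_even = 0
for number in range(6):      # rule 1: the "for" line ends with a colon
    if number % 2 == 0:      # rule 2: the "if" is indented inside the loop
        total_even += number  # the body of the "if" is indented once more
        print(number, 'is even')

# Back at the left margin: this line runs once, after the loop has finished.
print(total_even)
```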