Software Carpentry Workshop

Minnesota State University Moorhead – Software Carpentry Workshop

https://www.eventbrite.com/e/minnesota-state-university-moorhead-software-carpentry-workshop-registration-38516119751

Reservation code: 680510823  Reservation for: Plamen Miltenoff

Hagen Hall – 600 11th St S – Room 207 – Moorhead

pad.software-carpentry.org/2017-10-27-Moorhead

http://www.datacarpentry.org/lessons/

https://software-carpentry.org/lessons/

++++++++++++++++

Friday

Instructors: Jeff – certified in Bash and Python; John

http://bit.do/msum_swc

https://ntmoore.github.io/2017-10-27-Moorhead/

what is a shell and what does it do? a language close to the computer, fast.

what is “bash”? first commands: cd, ls

the shell's job is to be a translator between the user and the binary code: the middleman. several types of shells, with slight differences. one (bash) is natively installed on Mac and Unix. bash = Bourne-again shell

bash commands: cd – change directory; ls – list; ls -F. if it does not work: man ls (manual for ls); a colon in the lower left corner tells you that you can scroll; q to quit; ls -ltr (long listing, sorted by time, reversed)

“arguments” is colloquially used alongside other names: options, flags, parameters

cd ..  – move up one directory.      pwd – print the working directory (where I am).      cd data_shell/   – go down one directory

cd ~  – brings me to my home directory.        $HOME (universally defined variable)

the default behavior of cd (with no argument) is to go to the home directory.

the core shell commands accept the same options (letters)

$ du -h .     gives me the size of the files. Ctrl+C to stop

$ clear – clears the entire screen; scroll up to go back to previous commands

$ man history;  $ history;  $ !pwd (re-runs the most recent command that starts with pwd);  $ history | grep history (piping)

$ cat (and the file name) – prints the file to standard output

$ cat ../

+++++++++++++++
how to edit and delete files

to create a new folder: $ mkdir – make directory

text editors – nano, vim (UNIX text editors).      $ nano draft.txt – Ctrl+O (save), Ctrl+X (exit).
$ vim – press Esc, then in the command line type :wq (write and quit) or just :q

$ mv draft.txt ../data (move files)

to remove: $ rm thesis/ (a directory needs rm -r);     $ man rm

copy files: $ cp.       $ touch (touches the file, creates it if new)

remove: $ rm.    anything with sudo is dangerous.   Bash profile: cp -i (ask before overwriting)

* – wildcard, truncate.       $ ls analyzed      (lists the analyzed directory)

Stack Overflow web site.

+++++++++++++++++

head command:  $ head basilisk.dat (check only the first several lines of a large file)

$ for filename in basilisk.dat unicorn.dat  (making a loop = multiline)

> do  (the shell is now expecting an action)

> head -n 3 $filename  (-n is for the number of lines; 3 displays the first three lines of the file)

> done

for doing repetitive tasks

also

$ for filename in *.dat ; do head -n 3 $filename ; done

$ for filename in *.dat ; do echo $filename ; head -n 3 $filename ; done

$ echo $filename (print statement)

how to loop

$ for filename in *.dat ; do echo $filename ; echo head -n 3 $filename ; done

Ctrl+C or Apple Cmd+. (Command and period) to get out of the loop

http://swcarpentry.github.io/shell-novice/02-filedir/

also

$ for filename in *.dat

> do

> echo $filename

> head -n 10 $filename | tail -n 20  (head keeps the first ten lines; tail keeps the last twenty lines of that output)

> done

$ for filename in *.dat
> do
> echo $filename
> done

$ for filename in *.dat
> do
> cp $filename orig_$filename
> done

$ history > something.else  (redirects the output into a file)

$ head something.else

+++++++++++++

another function: word count

$ wc *.pdb  (Protein Data Bank files)

$ head cubane.pdb

if I don't know how to read the output: $ man wc

the difference between “*” and “?”: * matches zero or more characters, ? matches exactly one
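
The two wildcards can also be tried out in Python, whose standard fnmatch module follows the same shell-style rules. This is only a sketch: cubane.pdb comes from the notes above, the other file names are made up for illustration.

```python
from fnmatch import fnmatch

# Illustrative file names; only cubane.pdb appears in the notes.
names = ['cubane.pdb', 'ethane.pdb', 'methane.pdb', 'notes.txt']

# "*" matches any number of characters, including none.
star_matches = [name for name in names if fnmatch(name, '*ethane.pdb')]

# "?" matches exactly one character.
question_matches = [name for name in names if fnmatch(name, '?ethane.pdb')]

print(star_matches)
print(question_matches)
```

The "*" pattern picks up both ethane.pdb and methane.pdb, while the "?" pattern only picks up methane.pdb, because it insists on one extra character in front.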

$ wc -l *.pdb

$ wc -l *.pdb > lengths.txt

$ cat lengths.txt

$ for fil in *.txt
> do
> wc -l $fil
> done

by putting a $ sign in front of a name, the shell uses the value of that variable, not the actual text.

++++++++++++

$ nano middle.sh.   The entire point of the shell is to automate

$ bash (the executable) to run the program middle.sh

rwx – rwx – rwx  (owner – group – anybody; r = read, w = write, x = execute)

$ bash middle.sh

$ file middle.sh

$PATH

$ echo $PATH | tr ":" "\n"

/usr/local/bin

/usr/bin

/bin

/usr/sbin

/sbin

/Applications/VMware Fusion.app/Contents/Public

/usr/local/munki
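
For comparison, a rough Python equivalent of the tr command above. It is a sketch, assuming only that a PATH variable may be set; os.pathsep is ":" on Mac and Linux.

```python
import os

# PATH is a single string; os.pathsep is the separator (":" on macOS/Linux).
path_value = os.environ.get("PATH", "")
directories = path_value.split(os.pathsep)

# Print one directory per line, like tr ":" "\n" does in the shell.
for directory in directories:
    print(directory)
```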

$ export PATH=$PWD:$PATH

(this is to make sure that the last version of Python is running)

$ ls ~  (hidden files are not shown)

$ ls -a ~  (shows hidden files too)

$ touch .bash_profile .bashrc

$history | grep PATH

   19   echo $PATH

   44  echo #PATH | tr “:” “\n”

   45   echo $PATH | tr “:” “\n”

   46   export PATH=$PWD:$PATH

   47  echo #PATH | tr “:” “\n”

   48   echo #PATH | tr “:” “\n”

   55  history | grep PATH

 

$ wc -l "$@" | sort -n   ("$@" encompasses everything: it will process every single file in the list of files passed to the script)

 

$ chmod +x middle.sh (make it executable)

 

$ find . -type d  (find only directories, recursively)

$ find . -type f  (files, instead of directories)

$ find . -name '*.txt'  (find files by name; don't forget the single quotes)

$ wc -l $(find . -name '*.txt')  – when searching among directories on different levels

$ find . -name '*.txt' | xargs wc -l    – same as above; two ways to do one and the same
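
A third way, sketched in Python rather than the shell, since Python was the second half of the workshop. The function name count_lines and the throwaway files are made up for illustration; pathlib's rglob does the recursive search that find does above.

```python
import tempfile
from pathlib import Path


def count_lines(root):
    """Return {path: line count} for every .txt file under root, recursively."""
    counts = {}
    for path in sorted(Path(root).rglob("*.txt")):
        if path.is_file():
            with path.open() as handle:
                counts[path] = sum(1 for _ in handle)
    return counts


# Build a small temporary directory tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_text("one\ntwo\n")
    (root / "sub").mkdir()
    (root / "sub" / "deep.txt").write_text("a\nb\nc\n")
    (root / "sub" / "skip.dat").write_text("ignored\n")

    results = {
        path.relative_to(root).as_posix(): lines
        for path, lines in count_lines(root).items()
    }

print(results)
```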

+++++++++++++++++++

Saturday

Python

Link to the Python Plotting : https://swcarpentry.github.io/python-novice-gapminder

C and C++.  scripting purposes in microbiology (instructor).  libraries and packages alongside Python can extend its functionality: numpy and scipy (numeric and scientific Python).  Python for academic libraries?

to leave Python: >>> quit()       Python expects opening and closing parentheses

new terminal needed after installation. Anaconda 5.0.1

Python 3 is a complete redesign, not only an update.

http://swcarpentry.github.io/python-novice-gapminder/setup/

jupyter crashes in safari. open in chrome. spg engine maybe

https://swcarpentry.github.io/python-novice-gapminder/01-run-quit/

to start Python in the terminal: $ python

>>> variable = 3

>>> variable + 10

several data types.

Jupyter notebooks are stored in JSON format.

command mode vs edit mode.  a code cell is the gray box. a text cell is plain text

Markdown syntax. the format works with git and GitHub.  search for the explanation in https://swcarpentry.github.io/python-novice-gapminder/01-run-quit/

HackMD https://hackmd.io/ (use your GitHub account)

Pandoc – converts between different document formats. https://pandoc.org/

print is a function

in what cases will I run my data through Python instead of SPSS?

Python is a 0-based language: it starts counting with 0 – Java, C, P

atom_name = 'helium'
print(atom_name[0])                  string slicing and indexing is tricky

atom_name = 'helium'
print(atom_name[0:6])
vs
atom_name = 'helium'
print(atom_name[7])                IndexError: the string has no index 7 (valid indexes are 0 to 5)
the slicing syntax in Python is        start : end : step
string versus list.   a string is in quotes, a list will have brackets
strings allow me to work not only with values; I can also reverse the string
atom_name = 'helium lithium beryllium'
print(atom_name[::-1])
muillyreb muihtil muileh
atom_name = 'helium'
len(atom_name)                                     6.             names are case sensitive: Atom_name and atom_name are different variables
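
The indexing and slicing notes above, collected into one small sketch that can be pasted into a notebook cell. The variable names other than atom_name are made up for illustration.

```python
atom_name = 'helium'

first_letter = atom_name[0]       # indexing starts at 0
whole_word = atom_name[0:6]       # start is included, end is excluded
every_other = atom_name[::2]      # start : end : step, here with a step of 2
reversed_name = atom_name[::-1]   # a negative step walks backwards

print(first_letter, whole_word, every_other, reversed_name)
print(len(atom_name))

# Asking for a single index past the end raises an error.
try:
    atom_name[7]
    index_error_raised = False
except IndexError as error:
    index_error_raised = True
    print('IndexError:', error)
```

A slice such as atom_name[0:7] does not fail in the same way; Python quietly stops at the end of the string.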
to clean the memory, restart the kernel
objects in Python have different types. adopt a class; a value may have a class inherent in its definition
print(type('42'))    Python tells me that it is a string
print(type(42))    tells me it is an integer
LaTeX
to combine an integer and a letter: print(str(1) + 'A')
converting a string to an integer: print(1 + int('55'))    all the same type
translation table. numerical representation of a string
float
print('half is', 1 / 2.0)
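
A short sketch of the type notes above: checking types, converting between them, and what happens when they are mixed without converting. The variable names are made up for illustration.

```python
print(type('42'))   # a string
print(type(42))     # an integer

label = str(1) + 'A'      # integer converted to a string, then joined
total = 1 + int('55')     # string converted to an integer, then added
half = 1 / 2.0            # division gives a float

print(label, total, half)

# Mixing an integer and a string without converting raises a TypeError.
try:
    1 + 'A'
    type_error_raised = False
except TypeError as error:
    type_error_raised = True
    print('TypeError:', error)
```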
built in functions and help
print is a function, length is a function (len); also type, str, int, max, round
Python does not explain well why the code breaks
ASCII character set – built-in Python conversion
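
A few of the built-in functions listed above in action, plus ord and chr, which I assume are the built-in conversion between characters and their numeric codes that the note refers to. The sample values are made up.

```python
values = [3, 10, 7]

count = len(values)           # how many items
largest = max(values)         # the biggest item
rounded = round(3.14159, 2)   # round to two decimal places

print(count, largest, rounded)

# ord gives the numeric code of a character, chr goes the other way.
code = ord('A')
next_letter = chr(code + 1)
print(code, next_letter)

# help(round) prints the documentation for a function.
```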
libraries – package: https://swcarpentry.github.io/python-novice-gapminder/06-libraries/
the “import” statement
Saturday afternoon
reading .CSV in Python
http://swcarpentry.github.io/python-novice-gapminder/files/python-novice-gapminder-data.zip
**For windows users only: set up git https://swcarpentry.github.io/workshop-template/#git 
Python is object oriented and I can define the objects
pandas creates its own type of object (which we model), called a “DataFrame”
a method is a function attached to data that already exists, and is called with parentheses – the difference from a plain function
data.info() is a method – it does not need any arguments
whereas
data.columns is an attribute (no parentheses)
print(data.T)    transpose.  not easy in Excel, but very easy in Python
print(data.describe())
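
A small sketch of attributes versus methods on a DataFrame. It assumes pandas is installed; the country names and numbers are made-up placeholders, not real gapminder values.

```python
import pandas as pd

# Made-up placeholder data, not real gapminder values.
data = pd.DataFrame(
    {'gdp_1952': [1.0, 2.0], 'gdp_2007': [3.0, 4.0]},
    index=pd.Index(['CountryA', 'CountryB'], name='country'),
)

print(data.columns)      # attribute: no parentheses
data.info()              # method: called with parentheses, prints a summary
print(data.T)            # attribute: the transposed table
print(data.describe())   # method: summary statistics for each column
```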
/Users/plamen_local/anaconda3/lib/python3.6/site-packages/pandas/__init__.py
%matplotlib inline    tells the Jupyter notebook to show plots inside the notebook

import pandas
import matplotlib.pyplot as plt

data = pandas.read_csv('/Users/plamen_local/Desktop/data/gapminder_gdp_oceania.csv', index_col='country')
data.loc['Australia'].plot()
plt.xticks(rotation=10)

ggplot2 (from R) is the most well-known plotting library.

xelatex is a PDF engine.  reST (reStructuredText) is a markup language like Markdown.  google what is the best PDF engine with Jupyter

for loops.  any computer language will have the concept of a “for” loop. In Python: 1. whenever we create a “for” loop, that line must end with a single colon

2. indentation.  any “if” statement in the “for” loop gets indented
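
The two rules above in a minimal sketch. The numbers and variable names are made up for illustration.

```python
total = 0

for number in [1, 2, 3, 4]:       # rule 1: the "for" line ends with a colon
    if number % 2 == 0:           # rule 2: the "if" is indented inside the loop
        total = total + number    # the body of the "if" is indented once more
    print(number, total)          # still inside the loop, outside the "if"

print('sum of even numbers:', total)   # no indentation: runs after the loop
```

Changing the indentation changes which block a line belongs to, so it is part of the meaning of the code and not just layout.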
