Want to learn basic computer programming skills specifically tailored for academia?
Please consider a FREE two-day workshop on either on Python or on R.
Python is a programming language that is simple, easy to learn for beginners and experienced programmers, and emphasizes readability. At the same time, it comes with lots of modules and packages to add to your programs when you need more sophistication. Whether you need to perform data analysis, graphing, or develop a network application, or just want to have a nice calculator that remembers all your formulas and constants, Python can do it with elegance. https://www.python.org/about/
R (RStudio) is a language and environment for statistical computing and graphics. R provides a wide variety of statistical and graphical techniques. R can produce well-designed publication-quality plots, including mathematical symbols and formulae. https://www.r-project.org/about.html
Both software packages are free and operate on MS Windows, MAC/Apple and GNU/Linux OS.
Besides seamless installation on your personal computer, you can access both software in SCSU computer labs or via SCSU AppsAnywhere.
Integers: A signed or unsigned whole number running from -32,768 to 32,768 or from 0 to 65,535 if not signed. Integers are used anytime something needs to be counted.
Long Integer: Any whole number outside the above range. Python doesn’t distinguish between the two though many languages do. Practically, Python’s integers range from −2,147,483,648 to 2,147,483,648 or 0 to 0 to 4,294,967,295. Most of us will be very happy with this many whole numbers to choose from.
Real and Floating Point Numbers: Real numbers are signed or unsigned numbers including decimals. The numbers 2,3,4 are Integers and Real Numbers. The numbers 2.1, 2.9,3.9 are Real Numbers, but not Integers. Real Numbers can include representations of irrational numbers such as pi. Real numbers must be rational, that is a decimal number that terminates after a finite number of decimals. You will sometimes encounter the term Floating Point Numbers. This is a technical term referring to the way that large Real Numbers are represented in a computer. Python hides this detail from you so Real and Floating Point are used intercangeably in this language.
Binary Numbers: And Octal and Hexadecimal. These are numbers used internally by computers. You will run into these values fairly often. For instance, when you see color values in HTML such as “FFFFFF” or “0000FF”,
Hexadecimal and Octal are used because humans can read them without too much trouble and they are compromise between what computers process and what we can read. Any time you see something in Octal or Hexadecimal, you are looking at something that interfaces with the lower levels of a computer. You will most commonly use Hexadecimal numbers when dealing with Unicode character encodings. Python will interpret any number which begins with a leading zero as binary unless formatting commands have been used.
Numbers such as 7i are referred to as complex. They have a real part, the 7, and an imaginary part, i. Chance are you won’t use complex numbers unless you’re working with scientific data.
A String consists of a sequence of characters. The term String refers to how this data type is represented internally. You store text in Strings. Text can by anything, letters, words, sentences, paragraphs, numbers, just about anything.
Lists are close cousins to Strings, though you may never need to think of them that way. A list is just that, a list of things. Lists may contain any number of numbers or any number of strings. List may even contain any number of other lists. Lists are compared to arrays, but they are not the same thing. In most uses, the function the same so the difference, for our purposes, is moot. Strings are like lists in that, internally, the computer works with strings in an identical manner to lists. This is why the operations on Strings are so different from numbers.
The last main data type in the Python programming language is the dictionary. Dictionaries are map types, known in other languages as hashes, and in computer science as Associative Arrays. The best way to think of what the dictionary does is to consider a Library of Congress Call Number(something this audience is familiar with). The call number is what’s called a Key. It connects to a record which contains information about a book. The combination of keys and records, called values, comprises a dictionary. A single key will connect to a discrete group of values such as the items in this record. Dictionaries will be touched on in the next lesson in some detail in the next course. These are fairly advanced data structures and require a solid understanding a programming fundamentals in order to be used properly.
Statements, an Overview
Programs consist of statements. A statement is a unit of executable code. Think of a statement like a sentence. In a nutshell, statements are how you do things in a program. Writing a program consists of breaking down a problem you want to solve into smaller pieces that you can represent as mathematical propositions and then solve. The statement is where this process gets played out. Statements themselves consist of some number of expressions involving data. Let’s see how this works.
An expression would be something like 2+2=4. This expression, however is not a complete statements. Ask Python to evaluate it and you will get the error “SyntaxError: can’t assign to operator”. What’s going on here? Basically we didn’t provide a complete statement. If we want to see the sum of 2+2 we have to write a complete statement that tells the interpreter what to do and what to do it with. The verb here is ‘print’ and the object is ‘2+2’. Ask Python to evaluate ‘print 2+2’ and it will show ‘4’. We could also throw in subject and do something a bit more detailed: ‘Sum=2+2’. In this case we are assigning the value of 2+2 to the variable, Sum. We can then do all sorts of things with Sum. We can print it. We can add other numbers to it, hand it off to a function and so on. For instance, might want to know the root of Sum. In which case we might write something like ‘print sqrt(sum)’ which will display ‘2’.
A shell is essentially a user interface that provides you access to a system’s features. Normally, this means access to an Operating System. In cases like this, the shell provides you access to the Python programming environment.
Anything preceed by a “#” is not interpreted or executed by the programming shell. Comments are used widely to document programs. One school of programming holds that code should be so clear that comments are uncessary.
Operations on Numbers
Expressions are discrete statements in programming that do something. They typically occupy one line of code, though programmers will sometimes squeeze more in. This is generally bad form and can really make your program a mess. Expressions consist of operations and data or rather data and operations on them. So, what can you do with numbers? Here is a concise list of the basic operations for integers and real numbers of all types:
Addition: z= x + y
Subtraction: z = x – y
Multiplication: z = x * y. Here the asterisk serves as the ‘X’ multiplication symbol from grade school.
Division: z = x/y. Division.
Exponents: z = x ** y or xy, x to the y power.
Operations have an order of precedence which follows the algebraic order of precedence. The order can be remembered by the old Algebra mnenomic, Please Excuse My Dear Aunt Sally which is remeinds you that the order of operations is:
Operations on Strings
Strings are strange creatures as I’ve noted before. They have their own operations and the arithmetic operations you saw earlier don’t behave the same way with strings.
Putting Expressions Together to Make Statements
As I noted earlier, all computer languages, and natural languages, possess pragmatics, larger scale structures which reduce ambiguity by providing context. This is a fancy way of saying just as sentences posses rules of syntax to make able to be comprehended, larger documents have similar rules. Computer Programs are no different. Here’s a break down of the structure of programs in Python, in a general sense.
Programs consist of one or more modules.
Modules consist of one or more statements.
Statements consist of one or more expressions.
Expressions create and/or manipulate objects(and variables of all kinds).
Modules and Programs are for the next class in the series, though we will survey these larger structures next lesson. For now, we’ll focus on statements and expressions. Actually, we’ve already started with expressions above. In Python, statements can do three things.
Assign a variable
Change a variable
Take an action
Variable Names and Reserved Words
Now that we’ve seen some variable assignments, let’s talk about best practices. First off, aside from reserved words, variable names can be almost any combination of letters, numbers and punctuation marks. You, however, should never ever, use the following punctuation marks in variable names:
These punctuation marks tends to be operators and characters that have special meanings in most computer languages. The other issue is reserved words. What are “reserved words”? They are words that Python interprets as commands. Pythons reservers the following words.:
True: A special value set aside for boolean values
False: The other special value set aside for boolean vaules
None: The logical equivalent of 0
and: a way of combining logical conditions
as: describes how modules are imported
assert: a way of forcing something to take on a certain value. Used in debugging of large programs
break: breaks out of a loop and goes on with the rest of the program
class: declares a class for object oriented design. For now, just remember not to use this variable name
continue: returns to the top of the loop and keeps on going again
def: declares functions which allow you to modularize your code.
elif: else if, a cotnrol structure we’ll see next lesson
else: as above
except: another control structure
finally: a loop control structure
for: a loop control structure
from: used to import modules
global: a scoping statement
if: a control structure/li>
in: used in for each loops
is: a logical operator
lamda: like def, but weird. It defines a function in a single line. I will not teach this becuase it is icky. If you ever learn Perl you will see this sort of thing a lot and you will hate it, but that’s just my personal opinion.
nonlocal: a scoping command
not: a logical operator
or: another logical operator
pass: does nothing. Used as placeholder
raise: raises an error. This is used to write custom error messages. Your programs may have conditions which would be considered invalid based on our business situation. The interpreter may not consider them errors, but you might not want your user to do something so you ‘raise’ an exception and stop the program.
return: tells a function to return a value
try: this is part of an error testing statement
while: starts a while loop
with: a context manager. This will be covered in the course after the next one in this series
yield: works like return
Variable names should be meaningful. Let’s say I have to track a person’s driver license number. explanatory names like ‘driverLicenseNumber’.
Use case to make your variable names readable. Python is case sensitive, meaning a variable named ‘cat’ is different from named ‘Cat’. If you use more than one word to name variable, start of lower case the change case on the second word. For instance “bigCats = [‘Tiger’,’Lion’,’Cougar’, ‘Desmond’]”. The common practice used by programmers in many settings is that variables start with lowercase and functions(methods and so on) start with upper case. This is called “Camel Case” for its lumpy, the humpy appearance. Now, as it happens, there is something of a religious debate over this. Many Python programmers prefer to keep everything lower case and join words in a name by underscores such as “big_cats”. Use whichever is easiest or looks the nicest to you.
Variable names should be unique. Do not reuse names. This will cause confusion later on.
Python conventions. Python, as with any other programming language, has culture built up around it. That means there are some conventions surrounding variable naming. Two leading underscores, __X, denote system variables which have special meaning to the interpreter. So avoid using this for your own variables. There may be a time and place, but that’s for an advanced prorgramming course. A single underscore _X indicates to other programmers that this a fundamental variable and that they mess with it at their own peril.
Avoid starting variable names with a number. This may or may not return an error. It can also mislead anyone reading your program.
“A foolish consistency is the hobgoblin of little minds”. But not to programming minds. Consistency helps the readability of code a great deal. Once you start a system, stick with it.
Putting together valid statements can be a little hard at first. There’s a grammar to them. Thus far, we’ve mainly been workign with expressions such as “x = x+1”. You can think of expression as nouns. We’ve clearly defined x, but how do we look inside? For that we need to give it a verb, the print command. We would then write “print x”. However we can skip the middle statement and print an expression such as “print x + 1”. The interpreter evaluates this per the order of operations I laid out earlier. However, once that expression is evaluated, it then applies the verb, “print”, to that expression.
Print is a function that comes with the Python distribution. There are many more and you can create your own. We’ll cover that a bit in next lesson. Let’s look at little more at the grammar of a statement. Consider:
x = sin(b)
Assume that b has been defined elsewhere. x is the subject, b is the object and sin is the verb. Python will go to the right side of the equal sign first. It will then go to the inside of the function and evaluate what’s there first. It then evaluates the value of the function and finishes by setting x to that value. What about something like this?
Python evaluates from the inside out according to the rules of operation. Very complex statements can be built up this way.
x = sin(log((x + 3)/(e**2)))
Regardless of what this expression evaluates to (I don’t actually know), Python starts with the innermost parentheses, then works through the value of e squared then adds 3 to x and divides the result by e squared. With that worked out, it takes the logarithm of the result and takessthe sine of that before setting x to the final result.What you cannot do is execute more than one statement on a line. No more than one verb on a line. In this context, a verb is an assignment, or a command acting on an expression
Call up your copy of Think Python or go to the website at http://www.greenteapress.com/thinkpython/html/. Read Chapter 2. This will reiterate much of what I’ve presnted here, but this will help cement the content into you minds. Skip section 2.6 because IPython treats everything as script mode. IPyton provides you with the illusion of interactive, but everything happens asynchronously. This means that any action you type in will not instantaneously resolve as it would if you were running Python interactively on your computer. You will have to use print statements to see the results of your work.
Your assignment consists of the following:
Exercise 1 from Chapter 2 of Think Python. If you type an integer with a leading zero, you might get a confusing error:
<<< zipcode = 02492
SyntaxError: invalid token
Other numbers seem to work, but the results are bizarre:
<<< zipcode = 02132
Can you figure out what is going on? Hint: display the values 01, 010, 0100 and 01000.
Exercise 3 from Chapter 2 of Think Python.Assume that we execute the following assignment statements:
width = 17
height = 12.0
delimiter = ‘.’
For each of the following expressions, write the value of the expression and the type (of the value of the expression).
1 + 2 5
Exercise 4 from Capter 2 of Think Python. Practice using the Python interpreter as a calculator:
1. The volume of a sphere with radius r is 4/3 π r3. What is the volume of a sphere with radius 5? Hint: 392.7 is wrong!
2. Suppose the cover price of a book is $24.95, but bookstores get a 40% discount. Shipping costs $3 for the first copy and 75 cents for each additional copy. What is the total wholesale cost for 60 copies?
3/ If I leave my house at 6:52 am and run 1 mile at an easy pace (8:15 per mile), then 3 miles at tempo (7:12 per mile) and 1 mile at easy pace again, what time do I get home for breakfast?
In your IPython notebook Create a markdown cell and write up your exercise in there. Just copy it from the textbook or from the above write up. Next ceate a code cell and do your work in there. Please, comment your work thoroughly. You cannot provide too many comments. Use print statements to see the outcome of your work.
Our beginning programming course, CNA 267, is now using Python as the programming language. Students learn to work with decision and loop control structures, variables, lists (arrays) and procedures, etc. Python is becoming one of the most widely-accepted languages for business professionals and scientists.
Please inform your students (who need to learn programming) of this course. It is being offered during spring semester, as well as next fall.
the type of data: wikipedia. the dangers of learning from wikipedia. how individuals can organize mitigate some of these dangers. wikidata, algorithms.
IBM Watson is using wikipedia by algorythms making sense, AI system
youtube videos debunked of conspiracy theories by using wikipedia.
semantic relatedness, Word2Vec
how does algorithms work: large body of unstructured text. picks specific words
lots of AI learns about the world from wikipedia. the neutral point of view policy. WIkipedia asks editors present as proportionally as possible. Wikipedia biases: 1. gender bias (only 20-30 % are women).
conceptnet. debias along different demographic dimensions.
citations analysis gives also an idea about biases. localness of sources cited in spatial articles. structural biases.
geolocation on Twitter by County. predicting the people living in urban areas. FB wants to push more local news.
danger (biases) #3. wikipedia search results vs wkipedia knowledge panel.
collective action against tech: Reddit, boycott for FB and Instagram.
data labor: what the primary resources this companies have. posts, images, reviews etc.
boycott, data strike (data not being available for algorithms in the future). GDPR in EU – all historical data is like the CA Consumer Privacy Act. One can do data strike without data boycott. general vs homogeneous (group with shared identity) boycott.
the wikipedia SPAM policy is obstructing new editors and that hit communities such as women.
how to access at different levels. methods and methodological concerns. ethical concerns, legal concerns,
tweetdeck for advanced Twitter searches. quoting, likes is relevant, but not enough, sometimes screenshot
social listening platforms: crimson hexagon, parsely, sysomos – not yet academic platforms, tools to setup queries and visualization, but difficult to algorythm, the data samples etc. open sources tools (Urbana, Social Media microscope: SMILE (social media intelligence and learning environment) to collect data from twitter, reddit and within the platform they can query Twitter. create trend analysis, sentiment analysis, Voxgov (subscription service: analyzing political social media)
graduate level and faculty research: accessing SM large scale data web scraping & APIs Twitter APIs. Jason script, Python etc. Gnip Firehose API ($) ; Web SCraper Chrome plugin (easy tool, Pyhon and R created); Twint (Twitter scraper)
Facepager (open source) if not Python or R coder. structure and download the data sets.
TAGS archiving google sheets, uses twitter API. anything older 7 days not avaialble, so harvest every week.
social feed manager (GWUniversity) – Justin Litman with Stanford. Install on server but allows much more.
legal concerns: copyright (public info, but not beyond copyrighted). fair use argument is strong, but cannot publish the data. can analyize under fair use. contracts supercede copyright (terms of service/use) licensed data through library.
methods: sampling concerns tufekci, 2014 questions for sm. SM data is a good set for SM, but other fields? not according to her. hashtag studies: self selection bias. twitter as a model organism: over-represnted data in academic studies.
methodological concerns: scope of access – lack of historical data. mechanics of platform and contenxt: retweets are not necessarily endorsements.
ethical concerns. public info – IRB no informed consent. the right to be forgotten. anonymized data is often still traceable.
table discussion: digital humanities, journalism interested, but too narrow. tools are still difficult to find an operate. context of the visuals. how to spread around variety of majors and classes. controversial events more likely to be deleted.
takedowns, lies and corrosion: what is a librarian to do: trolls, takedown,
development kit circulation. familiarity with the Oculus Rift resulted in lesser reservation. Downturn also.
An experience station. clean up free apps.
question: spherical video, video 360.
safety issues: policies? instructional perspective: curating,WI people: user testing. touch controllers more intuitive then xbox controller. Retail Oculus Rift
app Scatchfab. 3modelviewer. obj or sdl file. Medium, Tiltbrush.
College of Liberal Arts at the U has their VR, 3D print set up.
Penn State (Paul, librarian, kiniseology, anatomy programs), Information Science and Technology. immersive experiences lab for video 360.
CALIPHA part of it is xrlibraries. libraries equal education. content provider LifeLiqe STEM library of AR and VR objects. https://www.lifeliqe.com/