APIs: What they are and why they matter to digital humanities

URL for this talk: http://bit.ly/dhapiintro

MITH API and the Digital Humanities Workshop (#DHapi and apiworkshop)

Friday, February 25, 2011

Raymond Yee (@rdhyee), Data Unbound LLC and author of Pro Web 2.0 Mashups: Remixing Data and Web Services

What I Hope to Accomplish

In this talk, I hope that you will get a basic sense of:

With a focus on digital humanities, of course.

Please talk to me this weekend and after the workshop.

Basic terms

APIs

APIs are "application programming interfaces".

Note: There are many different types of APIs, including operating system APIs (e.g., the Windows API, Mac OS X APIs, Linux APIs, iOS and Android APIs) and programming language and framework APIs (e.g., the Java library). We'll put most of our focus on web APIs.

From Chapter 2:

A web site’s public API is specifically designed as the official channel of programmatic access to the data and services of the web site. It essentially lets you access and program the web site almost like a local object or database. For a slightly more formal definition of an API, consider the one by John Musser from Programmableweb.com: “a set of functions that one computer program makes available to other programs so they can talk to it directly.”

Mashups

A mashup, in the words of the Wikipedia, is a web site or web application “that seamlessly combines content from more than one source into an integrated experience.”

Why APIs Matter

The value of your data, when it is scattered throughout multiple databases and applications, grows if you can make it all work together. This value increases further when you leverage your information resources with the vast world of data on the Web. APIs can help you integrate data across your organization and beyond.

Why I'm Passionate about APIs and mashups

From the introduction of my book:

How often do you wish that you could make all the different parts of your digital world—your e-mail, your word processor documents, your photos, your search results, your maps, your presentations—work together more seamlessly? After all, it’s all digital and malleable information—shouldn’t it all just fit together?

In fact, below the surface, all the data, web sites, and applications you use could fit together. This book teaches you how to forge those latent connections—to make the Web your own—by remixing information to create your own mashups. A mashup, in the words of the Wikipedia, is a web site or web application “that seamlessly combines content from more than one source into an integrated experience.”

or put more succinctly: I want what @AlienWeedMan has promised....

Personal sidenote: I think that APIs are central to building that ultimate personal information manager, the Memex (see Vannevar Bush's "As We May Think"), or "personal knowledge bases"; see also Stephen Davies' "Still building the memex" (CACM, Feb 2011).

Sample API calls

Geocoding

Let's geocode 1600 Pennsylvania Ave NW Washington DC using geocoder.us (HTML) or geocoder.us (REST), whose output is

<?xml version="1.0"?>
<rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <geo:Point rdf:nodeID="aid34957879">
    <dc:description>1600 Pennsylvania Ave NW, Washington DC 20502</dc:description>
    <geo:long>-77.037684</geo:long>
    <geo:lat>38.898748</geo:lat>
  </geo:Point>
</rdf:RDF>
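
To show what a program would do with this response, here is a minimal Python sketch that pulls the description, latitude, and longitude out of the RDF/XML above. It uses only the standard library and works on the sample response pasted in as a string; the function name parse_geocoder_rdf is my own label, not part of geocoder.us.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the response shown above.
NS = {
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# The sample response from the text, embedded so the sketch runs offline.
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <geo:Point rdf:nodeID="aid34957879">
    <dc:description>1600 Pennsylvania Ave NW, Washington DC 20502</dc:description>
    <geo:long>-77.037684</geo:long>
    <geo:lat>38.898748</geo:lat>
  </geo:Point>
</rdf:RDF>"""


def parse_geocoder_rdf(xml_text):
    """Return (description, latitude, longitude) for the first geo:Point."""
    root = ET.fromstring(xml_text)
    point = root.find("geo:Point", NS)
    if point is None:
        raise ValueError("no geo:Point element in response")
    description = point.findtext("dc:description", namespaces=NS)
    lat = float(point.findtext("geo:lat", namespaces=NS))
    lon = float(point.findtext("geo:long", namespaces=NS))
    return description, lat, lon


if __name__ == "__main__":
    print(parse_geocoder_rdf(SAMPLE))
```

In a real mashup the string would come from an HTTP request to the geocoder rather than from a constant; the parsing step stays the same.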

Flickr API

An example method: flickr.photos.search. Use the flickr.photos.search explorer to look at possible parameters and to formulate specific calls. E.g.,

http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key={api-key}&tags={tags}

Sign up for a key/secret -- my key is e81ef8102a5160154ef4662adcc9046b. That is,

http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=e81ef8102a5160154ef4662adcc9046b&tags=flower
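
The same call can be assembled in code rather than by hand. The following is a small Python sketch, using only the standard library, that builds the URL from the template above. The function name flickr_search_url and the YOUR_API_KEY placeholder are mine; substitute your own key.

```python
from urllib.parse import urlencode

# REST endpoint as it appears in the example URLs above.
FLICKR_REST_ENDPOINT = "http://api.flickr.com/services/rest/"


def flickr_search_url(api_key, tags, **extra):
    """Build a flickr.photos.search request URL.

    Any additional keyword arguments are passed through unchanged as
    query parameters; check the API explorer for which ones are valid.
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "tags": tags,
    }
    params.update(extra)
    # urlencode percent-escapes values, so tags with spaces are safe.
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(flickr_search_url("YOUR_API_KEY", "flower"))
```

Building the query string with urlencode rather than string concatenation means that tags containing spaces or punctuation are escaped correctly.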

See Chapter 6 of my book for a fuller explanation.

See Stats about Flickr photos (Google Spreadsheet) for an example of using the Flickr API inside of Google spreadsheet to tally the number of public photos by license available to the API.

Flickr Services lists some mashups using the Flickr API, including Flickr Sudoku, Flickr Postcards, and fd's Flickr Toys: Do fun stuff with your photos. One of my favorite Flickr mashups is the two-way integration between Picnik and Flickr: a) Flickr's photo-editing functionality uses Picnik's API and b) you can edit Flickr photos in Picnik (via Flickr's API).

Amazon Wishlist + Google Spreadsheet Mashup

In Chapter 17 of my book, I created a mashup of an Amazon Wishlist and Google Spreadsheets. When I returned to examine my code last night, I learned that it no longer worked. Why? First, the Amazon Ecommerce API morphed into the Amazon Product Advertising API; I was puzzled why the API wasn't listed where I expected it to be. Unfortunately, Amazon, in its infinite and inscrutable wisdom, also decided to kill the ListLookup operation, the one call that I depended on to retrieve the content of my Amazon wishlist. (I'm not alone in having broken applications because of this change.)

So what to do now? Interestingly enough, someone just announced a JSON feed service for a given wishlist, for example, Jeff Bezos' wishlist and mine (in JSON). I hope it stays around. How does it work given the demise of the ListLookup operation? My guess is that some sort of screen-scraping is going on.

With the new JSON feed, I rewrote my code to regenerate Raymond Yee's Amazon Wishlist as a Google Spreadsheet. (Code still to be posted.)
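
Until the real code is posted, here is a rough Python sketch of the general idea: read a JSON list of wishlist items and flatten it into CSV rows that a spreadsheet can import. This is not my actual mashup code. The feed shape (a list of objects with "title", "author", and "price" keys) and the sample items are placeholders I made up for illustration; the real feed's field names may well differ.

```python
import csv
import io
import json

# HYPOTHETICAL feed shape and placeholder items, for illustration only.
SAMPLE_FEED = """[
  {"title": "Example Book One", "author": "A. Author", "price": "12.50"},
  {"title": "Example Book Two", "author": "B. Writer", "price": "8.00"}
]"""

# Assumed column names; adjust to match the real feed.
COLUMNS = ["title", "author", "price"]


def wishlist_json_to_csv(feed_text, columns=COLUMNS):
    """Convert a JSON list of item objects into CSV text with a header row."""
    items = json.loads(feed_text)
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for item in items:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({col: item.get(col, "") for col in columns})
    return out.getvalue()


if __name__ == "__main__":
    print(wishlist_json_to_csv(SAMPLE_FEED))
```

Going through CSV keeps the sketch independent of any particular spreadsheet API, which matters given how quickly those APIs change.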

Things to consider: the long-term sustainability of APIs and the business case for creating an API in the first place. Sometimes APIs die, even those from big companies. Another example: I'm not the only person sad about the death of the Google Maps Data API.

Library of Congress SRU

MODS record for Milosz's Collected Poems (see "SRU is Simple" at SRU: Search/Retrieval via URL, Standards, Library of Congress, and ModsFromLibraryOfCongressQuery)
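
Since SRU is, as the name says, search and retrieval via URL, a request is just a URL with a handful of standard parameters (operation, version, query, recordSchema, maximumRecords). Here is a minimal Python sketch that builds such a URL. The base URL and the dc.creator index in the sample query are assumptions on my part; check the Library of Congress SRU documentation for the current endpoint and supported indexes.

```python
from urllib.parse import urlencode

# ASSUMED endpoint; verify against the Library of Congress SRU pages.
LOC_SRU_BASE = "http://z3950.loc.gov:7090/voyager"


def sru_search_url(cql_query, base_url=LOC_SRU_BASE,
                   record_schema="mods", maximum_records=1):
    """Build an SRU 1.1 searchRetrieve URL for the given CQL query."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,
        "recordSchema": record_schema,  # ask for MODS records
        "maximumRecords": str(maximum_records),
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # ASSUMED index name (dc.creator); the server may expect others.
    print(sru_search_url('dc.creator = "milosz"'))
```

Pasting the resulting URL into a browser is a quick way to see the XML that comes back before writing any parsing code.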

How APIs are being used today

ProgrammableWeb lists APIs and mashups that use these APIs.

Supplementary References for my talk

This talk will be similar to one I gave at the Library of Congress: Web 2.0 Mashups: Making the Web Your Own Webcast (Library of Congress).

For a highly readable article on how companies get into creating APIs to enable mashups, read How to Manage Volunteer Software Developers (Inc.). It's the story of how cutting-edge usage of Etsy prompted Etsy to hire Mashery to build an API.

Last year, I attended a Workshop on Application Programming Interfaces for the Digital Humanities (Oct 2009).

Combinatorial Exercises

  1. Pick any two APIs (starting with services you are familiar with and then migrating to random ones) and brainstorm mashups using these APIs.  You can use the list of APIs at ProgrammableWeb and do an advanced search on ProgrammableWeb for a particular combination of APIs. We'll do this in a large group first and then work in smaller groups to have more discussion.
  2. Now we take a problem that you want to solve (a fanciful problem or one that you are wrestling with in your own work) – come up with the APIs that would be useful to help you solve that problem.  Do those APIs exist?  If not, what would it take to make those APIs?

My Dream / Our (?) Dream

Making everything connect seamlessly with minimal and elegant ease: Can this happen? How?

Presenter Bio

Raymond Yee

(phone: 510.984.2330)

(twitter: @rdhyee)

Raymond Yee is President of Data Unbound LLC. He is author of the leading book on web mashups, Pro Web 2.0 Mashups: Remixing Data and Web Services (Apress). He has been a contributing writer for ProgrammableWeb, the web's premier resource for tracking developments in APIs and mashups.

At the UC Berkeley School of Information, he taught Mixing and Remixing Information, a course on using APIs to create mashups. He has co-written three influential reports on how the US government can improve its efforts to make data and services available through APIs.

Raymond served as the Integration Advisor for the Zotero Project (a widely used open source research tool) and managed the Zotero Commons, a collaboration between George Mason University and the Internet Archive.

As the Technology Architect for the Interactive University Project at UC Berkeley, he designed and prototyped software to support learning, teaching, and research, in collaboration with the California Digital Library. As a campus data architect, Raymond led the technical development of a faculty advancement reporting system.

Raymond is an erstwhile tubaist, admirer of J. S. Bach, Presbyterian elder, aspiring essayist, son of industrious Chinese-Canadian restaurateurs, and devoted husband of the incomparable Laura.

My book

Pro Web 2.0 Mashups

I'm including excerpts from my book, Pro Web 2.0 Mashups: Remixing Data and Web Services (Apress, 2008), licensed under CC BY-NC-SA 2.5: Introduction, Chapter 1, Chapter 2 (excerpt), Chapter 11 (excerpt), and Chapter 12.

On my book blog: http://blog.mashupguide.net/ -- you'll find the complete text for my book, licensed under a CC license: http://blog.mashupguide.net/toc/

Pro Web 2.0 Mashups (online HTML version)