Create a Table with IGDB API and R

The Internet Game Database (IGDB) has an API that allows you to access their rich set of video game information. Even better, there is an R package that makes requesting data super easy.

Let’s get started by installing the IGDB package in R (you may need to update your version of R):

devtools::install_github("detroyejr/igdb")
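(This install goes through devtools, so if you don’t have that package yet, run install.packages("devtools") first.)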

Now that you have the package installed, we are almost ready to start requesting data. Before we get there, you will need an API key, which requires you to set up an IGDB account. Fair compromise. Click “GET FREE KEY” on the documentation homepage to sign up: https://igdb.github.io/api/.

And while you’re at it, be sure to read through the documentation so you understand what type of information is available in the API and how to make basic calls. The documentation pages also provide handy examples.

Now that you’re set up, let’s store the API key so we can get access to the data. You can use the following code in R:

Sys.setenv("IGDB_KEY" = "[yourkeygoeshere]")
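You can confirm the key is stored by reading it back, which simply echoes what you saved:

Sys.getenv("IGDB_KEY")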

Now we are ready to play ball. The call below returns video games containing the term “mario” and orders them by release date, oldest to most recent. I’ve decided to bring back the following variables: ID, name of the game, release date, IGDB rating, and some popularity metric they have conceived. We can save it as “mario_games.”

mario_games <- igdb_games(
  search = "mario",
  order = "first_release_date:asc",
  fields = c("id", "name", "first_release_date", "rating", "popularity")
)

One of the nice things about this package is that the data comes back in a nicely formatted data frame, which you can view and confirm with these commands:

View(mario_games)
class(mario_games)

If for some reason yours does not, you should be able to transform it into a data frame for easier analysis using the base R function as.data.frame():

mario_games <- as.data.frame(mario_games)
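One thing to check: first_release_date may come back as a raw Unix timestamp rather than a readable date. If so, a quick base R conversion looks something like this (a sketch that assumes the values are in milliseconds; drop the division by 1000 if yours turn out to be in seconds):

# Convert epoch milliseconds to a Date column
mario_games$release_date <- as.Date(
  as.POSIXct(mario_games$first_release_date / 1000, origin = "1970-01-01", tz = "UTC")
)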

Now you have a table for analysis or visualization at your disposal. Enjoy.
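As a quick example of what you can do now, here’s a minimal base R plot of ratings over time (assuming the rating and first_release_date columns came back populated):

# Rating by release date (x-axis is the raw timestamp here)
plot(mario_games$first_release_date, mario_games$rating,
     xlab = "First release date", ylab = "IGDB rating",
     main = "Mario games: IGDB rating over time")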

How to Collect Twitter Data Using R and the Twitter Search API

Luckily, the hard work is done for us. There is a terrific R package called twitteR that allows you to easily connect to the Twitter Search API. You just need to know a few arguments to properly ask for the data you need.

First, let’s explore what type of data and limitations exist in the Twitter Search API so we know what we have to work with.

Official documentation: https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets

“Returns a collection of relevant Tweets matching a specified query.

Please note that Twitter’s search service and, by extension, the Search API is not meant to be an exhaustive source of Tweets. Not all Tweets will be indexed or made available via the search interface.

To learn how to use Twitter Search effectively, please see the Standard search operators page for a list of available filter operators. Also, see the Working with Timelines page to learn best practices for navigating results by since_id and max_id.”

The first step, of course, is to activate the package you need for this project. If you don’t have it installed already, you’ll need to do that first. I have it installed, so I’ve commented out that part here.

# install.packages("twitteR")
library(twitteR)

We’re getting close, but before we can request data from the Twitter API, we have to provide some credentials to make sure we aren’t doing anything nefarious. To accomplish this, you need four things:

  • consumer_key
  • consumer_secret
  • access_token
  • access_secret

No worries, all of these can be easily found here (you’ll need an active Twitter account): https://apps.twitter.com/. Once you’re logged in, you need to create an “application,” which is essentially just saying you want to work on a project. Go ahead and fill in the details and you should receive the four credentials above.

Now we’ll save each of these strings in this manner (note that you’ll need to swap in your own string where I have ‘abc123’):

consumer_key <- 'abc123'
consumer_secret <- 'abc123'
access_token <- 'abc123'
access_secret <- 'abc123'
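If you’d rather not hard-code secrets in a script, a safer variant is to pull them from environment variables instead (the variable names here are my own convention, not anything the package requires):

consumer_key <- Sys.getenv("TWITTER_CONSUMER_KEY")
consumer_secret <- Sys.getenv("TWITTER_CONSUMER_SECRET")
access_token <- Sys.getenv("TWITTER_ACCESS_TOKEN")
access_secret <- Sys.getenv("TWITTER_ACCESS_SECRET")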

Now, let’s get authorized and begin requesting Twitter data.

setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)
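If you’re running this interactively, you may be asked whether to cache the OAuth token in a local .httr-oauth file; either answer works, so pick whichever suits your setup.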

Below is a simple request for Tweets that you can modify to your liking. In this example, I’m going to save my request as “nebtweets” and I’ll call for the information with searchTwitter, which is part of the twitteR package we installed and activated. I’ve arbitrarily set the number of results I want back to 200, starting in February 2018. It’s important to note that the Twitter Search API does NOT give you full access to Twitter’s data; it’s only an index of recent Tweets, so you may get back warnings if you ask for something that isn’t available.

nebtweets <- searchTwitter("nebrasketball", n=200, lang="en", since = '2018-02-01')

Now we have the Tweets saved, but they're not in a nice, neat data frame. This is easily solved using "twListToDF", which is also part of the twitteR package.

nebtweetsDF <- twListToDF(nebtweets)
View(nebtweetsDF)
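For a first pass at the data, you could tally tweets per day straight from the data frame (this assumes the created column that twListToDF returns, which holds each Tweet’s timestamp):

# Count tweets by calendar day
table(as.Date(nebtweetsDF$created))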

Now you're ready to analyze. Enjoy.

How to Pull API Data Into Excel

Pulling API data into Excel is quite a bit easier than I expected. The most difficult part is understanding how to build the URL you will use to request the data.
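To give you a feel for the shape, most key-based APIs build the request URL from an endpoint path plus your key as a query parameter, along these lines (an illustrative pattern, not a real Sportradar endpoint):

https://api.example.com/v1/games/schedule.json?api_key=[yourkeygoeshere]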

In my case, I have been working with the Sportradar API to analyze Husker football data. The first step for me was to get an API key that allows me to pull data back from Sportradar. Once I had that, it was simply a matter of taking these steps:

  1. Open a new Excel workbook
  2. Click on the Data tab in Excel
  3. Click From the Web
  4. Enter the API URL

It’s as easy as that. Unless you’ve run into an error, you should see your data tables listed in Excel.

If you are interested in using the Sportradar API, check out http://developer.sportradar.us/