Showing posts with label FRED. Show all posts

Tuesday, April 22, 2014

Pretty, Fast D3 charts using Datawrapper

While reading a news article I came across a US state choropleth that piqued my curiosity.

An economic news article on state unemployment rates at The New Republic included a state-level map with a two-color scale and tooltips.  As always when I see a new chart that I like, I look for two things to steal: the data and the method used to make the chart.

In this case I was in luck.  The embedded chart included links for both the method used and the data (great features of Datawrapper).

Datawrapper.de is a set of open source visual analytics tools (mostly JavaScript from what I've seen) integrated into an easy-to-use UI.  For those of us still learning D3.js, this is a great way to build beautiful, interactive charts with the style and capabilities of the D3-style graphics used by many online publications and bloggers.


State-level maps (still in beta at the time of writing) are really easy and fun to create.  You simply start by uploading or pasting in your data; I was able to paste in just two columns, state abbreviation and value.  Using R this might have taken me fifteen or twenty minutes; at Datawrapper it took only five.


I took another stab at this data, trying out one of the line chart templates provided.  Here I wanted to mimic (in substance, not style) one of my favorite data tools, FRED:




Following similar steps to the US state map above, I simply pasted in the time series data from FRED and moved through the Datawrapper wizard.  I selected a line chart, then updated a few options to better mimic FRED (sadly, the iconic recession shading is not available natively, nor are grid lines).  In some ways the result is even more aesthetically pleasing, and it could be a nice, easy addition to a blog post or article.



Monday, January 27, 2014

GeoGraphing with R; Part 3: Animation

To finish out this series I'll show a method I recently used to create animated .GIFs from R plots.

I've found a few methods of doing this, almost all of which use ImageMagick or GraphicsMagick, command-line tools that can convert image files into GIFs, MPEGs, and other formats.

Most of the examples I've found involve calling the "convert" command (a PATH reference to ImageMagick or GraphicsMagick) within a function of the animation package.

A few of these make use of the saveGIF() function available from the animation library.  I like the intuitive nature of this function, but I found that I wasn't able to control the ImageMagick conversion as well as I'd like.  Using ImageMagick directly from the command line along with some settings tweaks gave me more nuanced control of the GIF creation.
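For reference, a minimal saveGIF() call looks something like the sketch below. This is an assumption-laden illustration, not the code behind my charts: the loop, series values, and file name are placeholders, and it requires both the animation package and an ImageMagick install visible on the PATH.

```r
# Sketch only: requires the animation package and ImageMagick on the PATH.
library(animation)

saveGIF({
  # Each iteration of the loop becomes one frame of the GIF
  for (yr in 2008:2012) {
    plot(1:10, (1:10) * yr / 2012, type = "l",
         main = paste("Hypothetical series,", yr))
  }
}, movie.name = "example.gif", interval = 0.8)
```

The interval argument (in seconds) is the in-R equivalent of the delay tuning discussed below, but as noted I found the command-line route gave finer control.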

I followed a process of creating the images first (PNGs created through a controlled loop, see Part 2) then calling ImageMagick's convert function directly from the command line using a command like:

C:\Users\Erich\Documents\Plots\State UNMP>convert -delay 100 *.png "UNMP2012+.gif"


At work I took a slightly different tack and used a different external conversion program, PhotoScape.  The PhotoScape GUI was easy to use, but not as hack-y as ImageMagick.

Finding the right delay is key; I've found a delay of 80 (ImageMagick's -delay is measured in hundredths of a second, so 0.8 seconds per frame) works well for many charts.
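Put together, a fuller version of the convert command might look like this. The -loop 0 flag (which makes the finished GIF repeat indefinitely) is an addition beyond my original command:

```shell
# -delay 80 : 80/100 of a second (0.8 s) per frame
# -loop 0   : loop the finished GIF indefinitely
# *.png     : frames are taken in sorted order, so name them sortably
#             (Map2012-01.png, Map2012-02.png, ...)
convert -delay 80 -loop 0 *.png "UNMP2012+.gif"
```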

Sunday, January 26, 2014

GeoGraphing with R; Part 2: US State Heatmaps

The second geographical chart project I'll show is a classic.  In a national business it's often important to know the economic health of a region given different economic indicator values.  This logic uses a two-color heatmap scheme for intensity-level visual feedback.  This is another project I've developed at work and modified here to use public data.
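As a quick illustration of the two-color idea, base R can build such a scale with colorRampPalette(). The endpoint colors and the helper function here are just an example for exposition, not the palette used in the plotting code below:

```r
# Build a 100-step gradient between two endpoint colors.
pal <- colorRampPalette(c("lightyellow", "darkred"))(100)

# Map an indicator value onto the palette: higher values get darker colors.
value_to_color <- function(x, lo, hi) {
  idx <- 1 + round(99 * (x - lo) / (hi - lo))
  pal[idx]
}

value_to_color(7.5, lo = 3, hi = 12)  # a mid-range unemployment rate
```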

The quantmod financial modeling library is the main tool for this project, and I really like its design.  quantmod features quick access to the most common sources of financial time series data (Google Finance, Yahoo stocks, and FRED), and there are some great built-in functions, a few of which I make use of below.  quantmod also has the added benefit of letting you trick coworkers into thinking you had a Bloomberg terminal installed overnight:


Great looking chart in three commands:
library(quantmod)
getSymbols('GOOG')
lineChart(Cl(GOOG['2011::']))

This project uses data from the St. Louis Federal Reserve's FRED repository.  I've written about my love for this public data before and using it with the quantmod library in R is even more convenient.

To create the heatmaps, I separated the project into two functions; the first creates a standardized data frame consisting of the time series data for each state, the second plots the state level data against a US state boundary map.


I've dubbed the first function "stFRED".  This function loops through each state using the built-in state.abb data (via lapply rather than an explicit for loop).  In each iteration, columns for date (from the quantmod xts index) and state name (from state.abb) are added to a data.frame, creating a single standardized structure.  After every state is processed, ldply is called to combine all sets.


#The stFRED function uses the quantmod library to collect a state-level economic
#series for every US state into a single data frame.
#Each state is queried with auto.assign=FALSE so the result can be returned
#directly instead of being assigned to a separate object per state.
stFRED <- function(econ, begin="", ending=""){
  require(quantmod)
  # The default state abbreviation set drives the loop and the quantmod query
  stDat <- lapply(state.abb, function(.state){
    input <- getSymbols(paste(.state, econ, sep=""), src="FRED", auto.assign=FALSE)
    # The very effective xts subset syntax, using begin and ending ("begin::ending")
    input <- input[paste(begin, '::', ending, sep="")]
    # Converting the xts set to a data frame makes the data easier to manipulate
    # for charting; this step turns the xts index into a date column
    input <- data.frame(econ_month=index(input), coredata(input))
    # Each state's indicator column has a unique name ("VA...", "GA...),
    # so normalize them all to one name here
    colnames(input)[2] <- "ind_value"
    # Keep a state column so the data can be separated again later
    input$St <- .state
    input
  })
  # Combine the per-state data frames with the very helpful ldply function
  require(plyr)
  result <- ldply(stDat, data.frame)
  result
}
 

The second function plots the state data onto a US map.  I borrowed much of the map plotting logic from Oscar Perpinan, which I found via a StackOverflow question.  This function could be used with other data; just note that I have used the column names from the stFRED function for the plotted dataset.


#stFREDPlot creates a US state heatmap based on a state level data frame.
#While any state level set may be used, I have written this function to complement the stFRED function
#which produces a data frame which fits this function well.
stFREDPlot <- function(ds, nm=NULL, ptitle=nm, begin=NULL, ending=NULL) {
# The libraries needed here are needed for the US state boundary mapping feature
  require(maps)
  require(maptools)
  require(mapproj)
  require(sp)
  require(plotrix)
  require(rgeos)
# To provide some default values for the begin and ending variables, I have set these variables to the minimum and maximum dates (full range)
# of the dataset.  This can be used for one or both, allowing partial subsets. 
  if ( is.null(begin) ) { begin<-min(ds$econ_month)}
  if ( is.null(ending) ) { ending<-max(ds$econ_month)}
  subds <- ds[ds$econ_month >= as.Date(begin) & ds$econ_month <= as.Date(ending),]
#The econSnap set is used for quick reference of the unique dates used  
  econSnap <- sort(unique(as.Date(subds$econ_month)))
#The dir.create function is used to create a folder to store the potentially many plot images created.
  if (is.null(nm)) { stop("Please enter a name or chart title") }
  dir <- paste("~/Plots/", nm, "/", sep="")
  dir.create(file.path(dir), showWarnings = FALSE, recursive = TRUE)
#The variable i is used to reference the correct date in the econSnap set.  
  i <- 0
  for (n in econSnap) {
    plot.new()
    i <- i+1
#   Dataset limited to iterated reference date. 
    dataf <- data.frame(subds[subds$econ_month == n,])    
 
#   Much of this plotting logic built from tutorial found here: http://stackoverflow.com/questions/8537727/create-a-heatmap-of-usa-with-state-abbreviations-and-characteristic-frequency-in
#   Credit: StackOverflow user http://stackoverflow.com/users/964866/oscar-perpinan 
    dataf$states <- tolower(state.name[match(dataf$St,  state.abb)])
    mapUSA <- map('state',  fill = TRUE,  plot = FALSE)
    nms <- sapply(strsplit(mapUSA$names,  ':'),  function(x)x[1])
    USApolygons <- map2SpatialPolygons(mapUSA,  IDs = nms,  CRS('+proj=longlat'))
 
    idx <- match(unique(nms),  dataf$states)
    dat2 <- data.frame(value = dataf$ind_value[idx], match(unique(nms),  dataf$states))
    row.names(dat2) <- unique(nms)
 
    USAsp <- SpatialPolygonsDataFrame(USApolygons,  data = dat2)
    s = spplot(USAsp['value'],   col.regions = rainbow(100, start = 4/6, end = 1), main = paste(ptitle, ":  ", format(econSnap[i], format="%B %Y"),sep=""), colorkey=list(space='bottom'))
#   Status feedback given to user representing which date's US chart has been created.    
 print(format(econSnap[i], format="%B %Y"))
#   Plot saved as png.  Format chosen for malleability  in creating gif's and other manipulation
    png(filename=paste(dir, "Map", substr(econSnap[i],1,7), ".png", sep=""))
    print(s)
    dev.off() 
#   Dataset cleanup 
    rm(dataf)
    rm(dat2)
  }
}


As seen in the code, I have included some limited date subsetting functionality, and the resulting plots are saved for each available date.  This presents some possible problems if a very large date range is selected, but this iterative function will come in handy in part three of this series, animation.
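Putting the two functions together might look like the sketch below. The "UR" suffix matches FRED's state unemployment rate series IDs (VAUR, GAUR, and so on), but the variable names and date range here are just an example, and it needs an internet connection to reach FRED:

```r
# Pull state unemployment rates (FRED IDs like "VAUR", "GAUR", ...)
# from 2012 onward, then render one US heatmap PNG per month
# into ~/Plots/UNMP2012/.
state_ur <- stFRED("UR", begin="2012-01-01")
stFREDPlot(state_ur, nm="UNMP2012", ptitle="State Unemployment Rate")
```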



Sunday, January 19, 2014

Favorite Tools: FRED and St. Louis Fed Research Tools

I'd like to use this series as a set of love notes on my favorite data tools.  Some of these I use almost constantly at work, others are personal favorites I have come across.


FRED is a tool I came across a few years ago while reading economics blogs.  The distinctive color of a standard FRED graph (with obligatory recession shading) was something I began to associate with the econ blogger crowd.  It seems this has been noticed by many; Paul Krugman, whose blog was one of the first places I noticed FRED, is quoted as saying, "I think just about everyone doing short-order research — trying to make sense of economic issues in more or less real time — has become a FRED fanatic."

After using these tools at work and home I have come to feel the same way about the tool, even evangelizing its merits to my coworkers and friends.

FRED graphs are distinctive and immediately recognizable


In my work in data analysis at a national bank, I have come to greatly value FRED for two main reasons: it is a singularly well-organized and well-populated database, and it allows immediate reference to data that is often useful in a one-off fashion.  Pulling this data out during a meeting has more than once garnered some recognition of my economic knowledge that might not have otherwise occurred.

The breadth of data available is somewhat astounding.  International data might ordinarily send you all over the web and to a few commercial sites, but FRED has enough to do most high-level macroeconomic survey work.  I find some of the more obscure metrics very interesting, and it's fun to eyeball them for trends.

It's too easy to make weird charts...


After discovering FRED's website I was ecstatic to find that an Excel Add-In had been developed.  I immediately made use of the feature and made sure I spread the news around.  Being able to quickly pull in common economic data while doing simple (or complex) analysis can save a lot of time, and outsourcing the data storage and update costs to FRED is wonderful; cutting down on some user table creation and maintenance I used to own has been a real time saver.

In order to facilitate access to my company's internal economic data hub, I even created my own version of the FRED Excel Add-In, which I named ED.  Using some simple VBA GUI elements (drop-downs, radio buttons, many MsgBoxes...) and an ODBC connection, I was able to mimic the Excel Add-In functionality of FRED.  Adding in some charting code, I was able to mimic the distinctive graphs as well.  Given that the data is proprietary, I don't see any issue in my imitation of FRED, and I view it as a labor of love in tribute to the data tool.

Tying FRED into R was an obvious next step, and I've already begun to make use of this data.  Being able to pull the data down into the R environment makes it even easier to manipulate quickly, without worrying about Excel resources (Autosave, I'm looking at you!) or adding the data to a database structure.  An R programming project I'll detail later, exhibiting geographical plotting, uses similar data; maybe I'll tie FRED in to show off the functionality.

I also happily own the FRED mobile app, which I find entirely too amusing, and has come in handy for wonky discussions, and to prove my data nerdiness to anyone in sight.

If they sold T-shirts, sign me up for two.


The St. Louis Fed includes three other tools: GeoFRED (data mapping), ALFRED (historical economic series), and CASSIDI (a personal favorite of mine, which details US banking industry data).  I believe I'll include love notes on these as well, CASSIDI especially.