Game Changers Guest Post: NPR’s API

NOTE: We asked each of our 2009 Game Changers Awards finalists to write about their projects, what they’ve learned along the way and what’s next. This essay was written by Zach Brand, Chief of Technical Strategy & Operations of NPR, Digital Media.

NPR’s API, what is it?

It is the simple, yet radical idea that world-class content can be made freely available for re-use and presentation by the public. No major American media organization had realized this idea until the launch of the NPR API on July 17th, 2008. In the age of Web 2.0, people want access to information and entertainment faster, better, cheaper… but they also want it on their own terms and they want to mash it up. The NPR API is about making content available in an infinite number of ways, powered by the imagination and inventiveness of the online audience. It is the first full and extensive content-based API produced and made freely available by one of the largest news and media organizations in the country.

Since the introduction of RSS feeds in 1999, people have come to understand the basic idea that they can view content online in more places than where it was originally created. Thanks to RSS feeds, I can quickly scan headlines from NPR, The Washington Post, CNN and BBC all from one location in my RSS reader, or even in any blog or website which has incorporated those particular feeds. However, if I want the full story I always have to click and go to the relevant site. The content in these RSS feeds is essentially friendly marketing trying to get you to visit the isolated island of that media company’s website. “Sure you’re having fun on the ‘Miguel’s green plants blog’, but click on over to our website, because we have at least one story that you’re interested in.” While RSS is really simple syndication, it is also usually really stingy syndication. An RSS feed will usually give you a tease of information, but seldom does it provide the breadth and depth of the story. Previously, content-focused APIs were usually just a way to select RSS feeds.

The NPR API opens up the full digital media content of NPR’s stories and allows people to mash it up and include it in their widget, blog, website or other online presentation. Not limited to a headline and a link, content from the API has the full text as well as audio and photos. Whether you want to build a widget or bring more content onto your website, social network page, or blog, you can now easily do so. Spanning over 13 years of NPR content, from news to music, the API includes all available text for over 250,000 stories in more than 90 topics. Users of the API can filter by these topics or by their own custom keyword, or use selectable categories to narrow their results to twelve different programs, 4,000 music artists, 400 NPR personalities, and over 700 editorial columns. Output is currently available in 7 different formats, including XML, JSON, Atom and JavaScript/HTML widgets.
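To make the filtering and output options described above more concrete, here is a minimal sketch of how a client might assemble a request. The endpoint (api.example.org) and the parameter names (id, searchTerm, output, apiKey) are assumptions made for illustration only; this essay does not specify the actual request format, so consult the API documentation for the real one.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration (not from the essay).
BASE_URL = "https://api.example.org/query"


def build_query_url(topic_id=None, keyword=None, output="JSON",
                    api_key="YOUR_API_KEY"):
    """Build a query URL filtered by topic and/or keyword.

    All parameter names here are assumed, not confirmed by the essay.
    """
    params = {}
    if topic_id is not None:
        params["id"] = topic_id          # assumed name for a topic filter
    if keyword:
        params["searchTerm"] = keyword   # assumed name for a keyword filter
    params["output"] = output            # e.g. XML, JSON, Atom
    params["apiKey"] = api_key           # placeholder credential
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Ask for stories in a (made-up) topic, matching a keyword, as Atom.
    print(build_query_url(topic_id=1007, keyword="jazz", output="Atom"))
```

The same function could be called with a different output value to get the same stories in another of the supported formats, which is the point of separating the content from its presentation.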

The Team behind it:

Within NPR, the technology team that is part of the Digital Media division is the group responsible for writing all code and systems to support NPR.org, NPR podcasts, and all other online efforts including the NPR API.  Compared to other major media companies this team is relatively small, but it has come to be known as one of the most innovative. When looking for ways to improve performance and allow more flexible use of NPR’s online content, this team saw a unique opportunity to build tools that could be used by external parties as well as internal.  Since the NPR API was originally built as a mechanism to power the display of all content on www.npr.org, the team was able to begin work on the API (for internal use) before ever tackling the business and corporate questions of introducing an entirely new distribution model.

Embracing the idea of C.O.P.E. (create once, publish everywhere), the team set out to make tools which could be used to quickly and easily push content into any format: web, mobile, widget, etc. As the team designed the tools, we made sure that the architecture would support opening the API up to a broader audience. By creating an XML abstraction layer in front of the traditional database storage, the team was able to realize better performance, better portability and better re-usability of content. This XML layer became the basic framework of the API. Once this framework was in place and was being used to render all the content on www.npr.org, it was time to tackle the question of how we could make the API public. After working with NPR's leadership, business, legal, and communications departments, it became obvious there was a real opportunity to completely shake up the media landscape, as well as provide real value to the public. NPR fully embraced the idea that free distribution for non-commercial use would greatly serve the public mission by making NPR’s content available in more ways than we would ever be able to build ourselves.
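The "create once, publish everywhere" idea can be sketched in a few lines: a story is stored once in a neutral XML form, and separate renderers turn it into whatever each destination needs. The XML element names and the renderer functions below are hypothetical and chosen only for illustration; they are not NPR's actual schema or code.

```python
import json
import xml.etree.ElementTree as ET
from html import escape

# A made-up story document standing in for the XML abstraction layer.
STORY_XML = """
<story id="123">
  <title>Example headline</title>
  <teaser>A short summary.</teaser>
  <text>
    <paragraph>First paragraph.</paragraph>
    <paragraph>Second &amp; final paragraph.</paragraph>
  </text>
</story>
"""


def parse_story(xml_text):
    """Parse the neutral XML form into a plain dictionary (create once)."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.get("id"),
        "title": root.findtext("title"),
        "teaser": root.findtext("teaser"),
        "paragraphs": [p.text for p in root.findall("text/paragraph")],
    }


def to_json(story):
    """Render for API consumers that want JSON."""
    return json.dumps(story)


def to_html(story):
    """Render for a web page, escaping text for safe HTML output."""
    body = "".join(f"<p>{escape(p)}</p>" for p in story["paragraphs"])
    return f"<article><h1>{escape(story['title'])}</h1>{body}</article>"


def to_mobile_text(story):
    """Render a compact one-line form, e.g. for a small screen or widget."""
    return f"{story['title']}: {story['teaser']}"


if __name__ == "__main__":
    story = parse_story(STORY_XML)  # one source document...
    for render in (to_json, to_html, to_mobile_text):
        print(render(story))        # ...published in several formats
```

Adding a new destination then means writing one more small renderer rather than touching how the story is stored.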

So far the API has been a great success. Beyond the positive feedback we have received directly, we have seen tremendous interest in the API. In just its first month of being available, we received over 1,000,000 requests to the API for content, and the numbers continue to grow. NPR has seen uses ranging from the obvious (e.g., independent member stations providing more comprehensive information on their websites) to widgets and other website integrations, and even code libraries being written. Ultimately, we hope this success will result in greater openness of content online beyond NPR’s. We are in a new age of content. People are likely familiar with the metaphor that consuming content online is like trying to sip from a fire hose. As a radio company we keenly appreciate that 20 years ago a radio listener had a couple dozen stations to choose from on their FM dial; today’s online audience has, at last count, over 100 million websites to choose from. The NPR API embraces a strategy in this new world that’s been dubbed ‘Brand and Release’. We don’t want to limit access to only those folks who come to our website. NPR’s content, both news and entertainment, is best in its class. We want to share it wherever it is applicable, and believe there are many other places it can be put to good use. We look forward to seeing the new and innovative ways NPR content is being used thanks to the NPR API.

Challenges and Next Steps:

There have been many challenges along the way with the NPR API. Many were technical in nature: what platforms to build upon, what caching mechanisms to use, what the basic atomic elements of our content are. In solving them, we think we came up with some pretty smart answers. NPR also wanted to make the API as user-friendly as possible. We knew we had done all right when our CEO, acting as a test case, was able to add content to his blog in about 10 minutes. We also faced some challenges immediately after we launched the API. While we can only provide content that we have rights to, some folks criticized us for not including all public radio content in the API. We did our best to explain the legal and intellectual property limitations, and to explain further that some shows like ‘Marketplace’ and ‘This American Life’ are not owned or produced by NPR. Additionally, there are other shows produced by local stations; these shows, again, are not created or owned by NPR.

This lack of non-NPR content perhaps speaks to the most exciting challenge ahead. As mentioned, local radio stations that air NPR content are not owned or operated by NPR. These local radio stations are completely independent, and play a mix of content they create, content they license from NPR, and content from other providers. In the past year we have spent much time discussing NPR’s ‘Brand and Release’ approach with these stations. Thus far, all the stations I have spoken to have been very supportive of this approach, so much so that many of them suggested we should carry their content in the NPR API. We were able to achieve an early proof of this concept because there were eleven stations already feeding content to NPR for use in the music portion of NPR’s website. As every such provider wanted to be included, you can currently get content from these partners in the API. Recognizing that local stations do not have the resources to build their own APIs, we are now exploring how to solve this next challenge: how can NPR make it so that any public radio content could potentially be served by the API? As a first step, we hope to create the tools for NPR member stations to do this. Beyond the challenge of providing a standard mechanism to load in data, this will add complexity to the data organization and, of course, bring significant data storage challenges.

As stated, ultimately NPR hopes to see other media companies embracing this ‘Brand and Release’ philosophy, and creating their own APIs. NPR believes that by making great content available for non-commercial / non-profit use, the public can be better served while still protecting corporate interests. NPR was very excited to see our friends at www.nytimes.com launch API functionality for movie reviews and other features this past October. We anticipate more examples to follow as our API marks the start of a new online trend. Much as the internet has opened up new forms of community, dialogue, and interaction with Web 2.0, we anticipate that the NPR API has helped pave the way to a web where great content made freely available via APIs will be more accessible, more useful, and more contextually interesting than ever before.
