Downloadable Data / Resources: Kalev H. Leetaru

This page lists several datasets that I receive a lot of requests for and have made available for open research.



     Chicago Tribune: Content Velocity Analysis

This study, conducted for the Center for Research Libraries on behalf of the Library of Congress, was designed to answer key questions about the Chicago Tribune's website over a one-month period from September to October 2010: the volume of new content added, the overall rate of change, the linking structure and ease of traversal for archival crawlers, and considerations for characterizing the site's overall structure, linking, and content. Crawlers archived all 105 gateway pages every 30 minutes, resulting in a total of 136,605 snapshots of the site's content.
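As a rough sanity check on the crawl figures quoted above, the short sketch below works out how many snapshots each gateway page would have received and how long a crawl that implies. It assumes every page was captured the same number of times and that the first snapshot falls at time zero; neither assumption is stated in the study description, and the function name is purely illustrative.

```python
# Figures quoted in the study description.
GATEWAY_PAGES = 105
TOTAL_SNAPSHOTS = 136_605
CRAWL_INTERVAL_MINUTES = 30


def crawl_summary(pages, total_snapshots, interval_minutes):
    """Return (snapshots per page, leftover snapshots, crawl span in days).

    Assumption: snapshots are spread evenly across pages, and n snapshots
    of one page span n - 1 crawl intervals.
    """
    snapshots_per_page, remainder = divmod(total_snapshots, pages)
    span_minutes = (snapshots_per_page - 1) * interval_minutes
    span_days = span_minutes / (60 * 24)
    return snapshots_per_page, remainder, span_days


per_page, remainder, days = crawl_summary(
    GATEWAY_PAGES, TOTAL_SNAPSHOTS, CRAWL_INTERVAL_MINUTES
)
print(f"{per_page} snapshots per page, {remainder} left over, about {days:.1f} days")
```

Under these assumptions the total divides evenly into 1,301 snapshots per page, or roughly 27 days of continuous crawling, which is consistent with the one-month study period.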

This study was part of a larger investigation by the Center for Research Libraries for the Library of Congress into the future of news in the digital era, published in the report Preserving News in the Digital Environment: Mapping the Newspaper Industry in Transition.

Links
Citation
  • Leetaru, Kalev. (2010). Chicago Tribune: Content Velocity Analysis. In Preserving News in the Digital Environment: Mapping the Newspaper Industry in Transition. Alverson, Jessica; Leetaru, Kalev; McCargar, Victoria; Ondracek, Kayla; Simon, James; Reilly, Bernard. (2011). Center for Research Libraries on behalf of the Library of Congress. Data downloaded from http://contentanalysis.ichass.illinois.edu/data/


     Drudge Report

The Drudge Report is one of the founding flag bearers of "new media": a U.S.-based news aggregator founded in the late 1990s that has developed a reputation for breaking tomorrow's news today. The site has become a powerful force in the U.S. media sphere, and its founder was named one of Time Magazine's most influential people in 2006. In existence for more than a decade, the Drudge Report makes an ideal case study for examining the "new media versus old media" argument. How dependent is such a "new media" aggregator on the "old media" it draws from, and how does it find its breaking stories? This study uses a cross-section of analytical techniques to demonstrate how to profile a news Web site, and finds that the Drudge Report relies heavily on wire services and obscure news outlets to find small stories that will break large tomorrow, making it highly dependent on mainstream "old media" sites.

Original Study
Link Data
Link Text
  • Master-List-All-Drudge-Report-Link-Texts.txt. Master list of all link text used for links on the Drudge Report 2002-2008, giving each link text and the number of snapshots in which that text appeared on any link (multiply this count by 2 minutes to estimate the total length of time the link was live). Where the same link text was used for multiple links, or where the underlying link URL changed over time while the link text itself remained the same, the text is counted only once.
  • Master-List-All-Drudge-Report-Link-Words.txt. Master list of all link text used for links on the Drudge Report 2002-2008, broken down by word.
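The snapshot counts in the link text list can be turned into approximate lifetimes by multiplying by the 2-minute figure mentioned above. The sketch below shows one way to do that. The layout of the file is not documented here, so the parser assumes one entry per line with the link text and its snapshot count separated by a tab; that layout, the function names, and the sample headlines are all hypothetical and should be adjusted to the actual file.

```python
from datetime import timedelta

# Each snapshot count is multiplied by 2 minutes, per the file description.
SNAPSHOT_INTERVAL = timedelta(minutes=2)


def link_lifetime(snapshot_count, interval=SNAPSHOT_INTERVAL):
    """Estimate how long a link text was live from its snapshot count."""
    return snapshot_count * interval


def parse_line(line):
    """Split one line into (link text, snapshot count).

    Assumption: the count is the last tab-separated field on the line.
    """
    text, count = line.rstrip("\n").rsplit("\t", 1)
    return text, int(count)


# Hypothetical sample lines standing in for the real file contents.
sample_lines = [
    "HYPOTHETICAL HEADLINE ONE\t45\n",
    "HYPOTHETICAL HEADLINE TWO\t720\n",
]

lifetimes = {
    text: link_lifetime(count)
    for text, count in map(parse_line, sample_lines)
}

for text, lifetime in lifetimes.items():
    print(f"{lifetime}  {text}")
```

To run this against the real file, replace `sample_lines` with an open file handle once the actual delimiter and column order have been confirmed.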
Link Domains
  • Master-List-All-Domains.txt. Master list of all domains linked to by the Drudge Report 2002-2008.
    • Master-Geocoded-Domain-List.txt. Same as above, but with automatically-generated information regarding the physical location of the domain's owner, including an approximate city-centroid latitude/longitude.
    • Master-Geocoded-Domain-List-By-Year.txt. Same as above, but broken down by year with automatically-generated information regarding the physical location of the domain's owner, including an approximate city-centroid latitude/longitude.
Update Timeline
Citation
  • Leetaru, Kalev. (2009). New media vs. old media: A portrait of the Drudge Report 2002-2008. First Monday. Vol. 14, Issue 7. Data downloaded from http://contentanalysis.ichass.illinois.edu/data/


     Soundbite University: 60 Years of University News Coverage

Soundbite University is a large-scale study conducted by Kalev Leetaru and Paul Magelli exploring the broader changes in how higher education has been covered in the national press over the last 60 years. More than 18 million documents comprising the entire run of the New York Times from 1945 to 2005 were examined for all references to United States research universities and compared to spatial, temporal, and a variety of institutional indicators to examine how coverage has changed over this period and the characteristics most commonly associated with elevated national press visibility. One of the most surprising findings is the transition of the research university from a newsmaker to a news commentator, suggesting a need for universities to profoundly change the ways in which they interact with the press, especially as we enter a new era in media.

While individual institutional trends are available as interactive graphs on the project website, we have made the entire set of timelines available here as an Excel spreadsheet to assist in further research.

Links
Citation
  • Leetaru, Kalev & Magelli, Paul. (2010). The Soundbite University: 60 Years of University News Coverage. The American Council on Education's The Presidency. http://www.ichass.illinois.edu/SoundbiteUniversity/. Data downloaded from http://contentanalysis.ichass.illinois.edu/data/