Newsroom staffs have shrunk by one-third over the past decade, even as competition from leaner, more profitable websites has intensified. The remaining reporters are forced to produce more stories with less time to prepare each one.

Reporters increasingly find themselves covering unfamiliar topics, raising the risk of errors, misleading stories and the kind of pack journalism in which everyone interviews the same source regardless of relevance or real expertise.

University PR departments and think tanks also prey on overworked reporters. The most aggressive ones pounce on major news stories, offering reporters interviews with their faculty whether or not those faculty are relevant to the actual story. Harried reporters feel pressured to accept.

My project, Expertpedia, aims to be a newsroom tool that helps reporters quickly locate the most relevant experts on a given topic.


1. Users would enter a search phrase on the site, and Expertpedia would run it through Google Scholar. The author of the top result, the most-cited academic paper, would become the first person on our list.

2. That person’s name would then be rerun through Google Scholar along with the original search terms to call up all the relevant academic papers he/she has written on the topic.

3. The author’s name would then be run through a Google News search to come up with op-eds and analyses he/she may have written in the mainstream media.

4. The site would then trawl university websites to pull up contact information for the author.

5. There would also be space for authors to be rated by Expertpedia users on their ability to explain their ideas well on TV, radio and in print.
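The five steps above could be sketched roughly as follows. Google Scholar has no official API, so the three search callables here are hypothetical stand-ins that a real build would have to implement with scraping or a third-party service:

```python
# Sketch of the Expertpedia pipeline. scholar_search, news_search and
# find_contact are hypothetical stand-ins, not real APIs.

def build_expert_card(query, scholar_search, news_search, find_contact):
    """Assemble an Expertpedia entry for the top-cited author on a topic.

    scholar_search(query) -> list of {"title", "author", "citations"}
    dicts, most-cited first; news_search and find_contact are similar
    placeholders for the news and university-website lookups.
    """
    results = scholar_search(query)               # step 1: most-cited paper
    if not results:
        return None
    author = results[0]["author"]
    papers = scholar_search(f"{author} {query}")  # step 2: author + topic
    op_eds = news_search(author)                  # step 3: mainstream media
    contact = find_contact(author)                # step 4: university sites
    return {
        "name": author,
        "papers": [p["title"] for p in papers],
        "op_eds": op_eds,
        "contact": contact,
        "ratings": [],                            # step 5: user ratings
    }
```

Plugging in mock search backends is enough to exercise the flow end to end before any scraping is written.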

[Mockup 2]

Unlike some other websites that aim to bring experts to journalists, Expertpedia is not a database that needs curating or updating; instead, it operates more like a search engine.

There are some problems here. One issue is timeframe: a paper about engine failures in WWI-era fighter planes might have more citations simply because it has been out longer than newer work. Another is authorship: some scientific papers have more than a dozen authors.
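One common workaround for the timeframe problem, assuming publication years can be scraped alongside citation counts, is to rank by citations per year rather than raw citations. A minimal sketch, with invented papers:

```python
from datetime import date

def citations_per_year(citations, pub_year, today=None):
    """Age-normalize citation counts so older papers don't win by default."""
    current_year = (today or date.today()).year
    age = max(current_year - pub_year, 1)  # at least 1, avoids dividing by zero
    return citations / age

# Invented papers: the WWI-era paper has more total citations, but the
# newer paper has accumulated them faster.
papers = [
    {"title": "Engine failures in WWI fighters", "citations": 300, "year": 1995},
    {"title": "Modern turbofan failure modes", "citations": 90, "year": 2019},
]
ranked = sorted(
    papers,
    key=lambda p: citations_per_year(p["citations"], p["year"], date(2024, 1, 1)),
    reverse=True,
)
```

Under this ranking the recent paper comes out on top despite its lower raw count.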

Using Google News to find op-eds would turn up only recent pieces, not older but still potentially relevant ones.

Also, how would we get contact information for people who have retired from academia? How would the tool handle people with common names, and would bad ratings for one academic accidentally tarnish another? How do we prevent PR departments from gaming the ratings system?

And the initial Google Scholar searches are not always relevant. The fourth entry on a search for plane-crash experts is somehow Mercer Mayer, the author of such children’s books as “Frog, Where Are You?”



Nonetheless, I am confident these hurdles could be overcome in development, leaving reporters with a valuable time-saving tool and the public with better, more relevant information.

Can we use Big Data to improve health reporting?

Our project:

Ali and I built a tool that uses Big Data to help journalists who report on health see how their coverage of a subject (e.g., diabetes) stacks up against what actually kills people and the research dollars invested in it. The idea was that the tool could give journalists a sense of whether they are covering the health issues that actually affect their readers, and whether there are heavily researched topics they might be under-covering.
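At its core, the comparison the tool makes is simple arithmetic, sketched here with made-up counts: a topic's share of a paper's health coverage minus its share of deaths, with negative gaps flagging under-coverage.

```python
def coverage_gaps(articles_by_topic, deaths_by_topic):
    """Per topic: share of health coverage minus share of deaths.
    Negative values flag topics covered less than their toll suggests."""
    total_articles = sum(articles_by_topic.values())
    total_deaths = sum(deaths_by_topic.values())
    return {
        topic: articles_by_topic.get(topic, 0) / total_articles
               - deaths / total_deaths
        for topic, deaths in deaths_by_topic.items()
    }

# Made-up counts: heavy Parkinson's coverage, little on heart disease.
gaps = coverage_gaps(
    {"heart disease": 5, "parkinson's": 40, "suicide": 55},
    {"heart disease": 600_000, "parkinson's": 25_000, "suicide": 45_000},
)
```

With these numbers, heart disease comes out with a large negative gap and Parkinson's a positive one, mirroring the pattern the demo surfaced.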

Using the tool:

As a health journalist, I was interested in trying out the tool to see whether it could reveal missed opportunities in my reporting or ways to improve my coverage.

For this demo, we used New York Times and Wall Street Journal data. I looked at the 2011 visualization for the New York Times, and imagined I was a journalist working at that paper. I found that while heart disease kills a lot of Americans, I was hardly covering it. Meanwhile, I was dedicating a great deal of ink to Parkinson’s disease, even though relatively few Americans are afflicted. Similarly, few die by suicide, yet it received the maximum media attention at my paper, while accidents—which kill many more people—got hardly any coverage. Chronic lower respiratory disease and stroke are both major killers in America that attract a sizable amount of research funding, yet we barely reported on these health subjects.

These gaps between mortality and media attention left me with ideas about how I might be able to diversify and strengthen my health journalism. It allowed me to reflect on where my focus was, and where my blind spots might be. I’d think about doing stories on accidents and heart disease, for example, and looking into possibilities for reporting on COPD, which got no coverage.

Still, this is just a demo with a lot of room for improvement.

Future directions:

1) We want to use DALYs instead of mortality as our population health measure. DALYs, or disability-adjusted life years, are a measure of overall disease burden: the number of healthy life years lost to illness, disability or early death. They cover a wide range of factors, from smoking, diet and pollution to cancer, depression and asthma, so DALYs would encompass a broader scope of health and lifestyle issues than mortality alone.

2) We want to add other media outlets from around the world to the tool, and create a feature that allows users to upload data from their own websites to see how they compare.

3) We need to improve our user interface so that the information is displayed on one screen and there is no need to scroll down.

4) We want to build our database so users can compare media attention over a longer period of time and see how their focus is shifting over the years.

5) We need to make sure we are comparing apples to apples. On almost every measure, the Wall Street Journal appeared to have almost zero coverage next to the New York Times, and we need to explore why.

6) We also included measures health researchers might be interested in, such as the ratio of research investment to mortality in the population. We hope to work with health researchers to ensure our methodology is sufficiently robust, and to invite them to use our tool in their studies.
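Items 5 and 6 both reduce to simple arithmetic once outlet sizes are known. A sketch with invented numbers (the per-outlet article totals are assumptions, not real data):

```python
def coverage_rate(mentions, total_articles, per=1000):
    """Topic mentions per 1,000 published articles, so outlets of
    different corpus sizes can be compared fairly (item 5)."""
    return per * mentions / total_articles

def investment_per_death(research_dollars, deaths):
    """Research investment relative to mortality (item 6)."""
    return research_dollars / deaths

# Invented numbers: raw counts make the second outlet look like it has
# near-zero coverage, but the per-1,000-article rates are identical if
# its corpus in our database is simply much smaller.
nyt_rate = coverage_rate(mentions=400, total_articles=80_000)
wsj_rate = coverage_rate(mentions=50, total_articles=10_000)
```

If the Wall Street Journal's apparent near-zero coverage survives this kind of normalization, the gap is real; if not, it was an artifact of corpus size.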

News Trustee Network

By Uri Blau and Nini Cabaero

Somewhere around the globe a major news story just broke.  Your news crew is thousands of miles away.  It will take ages to get them there and will cost a fortune.

What do you do?

News outlets need someone on the ground.  Someone they can count on.  Someone they can trust.  Now we have a solution for them.

News Trustee Network (NTN)

Expert Sources (seeking coding partner)

With shrinking newsrooms, shrinking budgets and desperate news organizations trying to remain relevant and profitable, journalists are being put under nearly impossible demands. The Oregonian recently insisted that its staffers post three articles a day, plus several comments on the site, AND produce two major projects a quarter.

Reporters under these types of production pressures need to remove as much friction from the reporting process as possible. What I propose is developing a prototype for a usable and trustworthy expert database that journalists can turn to quickly to find relevant scholarly articles, op-eds and university professors with real expertise on a given topic.

This site would need to trawl Google Scholar, giving heavy weight to the most-cited articles, as well as Google News and universities’ own websites, to quickly give reporters the information they need: contact details and bios for the top experts in the field.

Let’s say there is a train crash and a reporter is assigned to look at the safety of the U.S. rail system. Often, within minutes of such an event, a newsroom is bombarded with emails from university flaks trying to push their “experts” into the story. Overworked reporters are very likely to bite, meaning the experts with the most aggressive PR machines, rather than those with the most relevant expertise, will end up being cited.

With my proposed tool, journalists could easily find the most relevant people to interview. They would type “rail safety” into a search bar, and the resource would respond with a list of experts. Under each expert, possibly ranked by Google Scholar citations as a signal of relevance, would be a bio culled from their university website, contact details, and links to their main research in the field as well as any op-eds they have written on the topic.

There is currently a rather poor resource called ProfNet, run by PR Newswire, which is essentially a public relations exercise on behalf of universities seeking as much press attention as possible. Again, the spoils go to the most aggressive PR machine, not the most relevant expert. This new tool would be more trustworthy, and thus more useful in getting the best information to the journalist, and by extension the public, as smoothly as possible.

To try to make this a reality, I’d need to partner with an experienced coder who could help me get a prototype off the ground.




How good is your health journalism? by Ali and Julia

As we have discussed in previous posts, health journalism isn’t always as diverse and accurate as it could be. While measuring quality in journalism is a difficult task, Ali and I wanted to create a tool that could allow journalists to see where there may be space for improvement or missed opportunities in their health coverage.

So we propose a tool that would allow journalists to compare the proportion of coverage a given health subject receives at the top 25 media outlets in the world against 1) the proportion of related research spending in a given year and 2) the relative health impact of that subject on the population.

For example, on our website, a user would select:
1) a news source, such as the New York Times;
2) a health subject, such as “diabetes.”

The website we build would then produce a visualization (probably a bubble chart) of how much coverage the disease garnered at that media source, compared with how much money was invested in diabetes-related research globally and how much that health issue affects quality of life relative to other factors. We could also rate the robustness of a news outlet’s coverage (e.g., giving a poor grade to a source that severely under-covers an important health issue or a subject with a lot of research output).

Ali has access to a database of media stories from the world’s top 25 outlets, and we will run keyword searches in each source to find out how often the top disease-burden factors are mentioned.

To measure health impact, we are using global data from the Institute for Health Metrics and Evaluation (IHME) on relative impact in DALYs, or disability-adjusted life years. A DALY “is a measure of overall disease burden, expressed as the number of years lost due to ill-health, disability or early death,” which gives a sense of which diseases or exposures cause the most harm. DALYs include everything from obesity to cancer and pollution.
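For reference, the standard decomposition behind that definition is DALY = YLL + YLD: years of life lost to early death plus years lived with disability. A minimal sketch with purely illustrative inputs:

```python
def dalys(deaths, avg_years_lost, cases, disability_weight, avg_years_ill):
    """DALYs = YLL + YLD.
    YLL: deaths x average years of life lost per death.
    YLD: cases x disability weight (0 = full health, 1 = death) x
         average years lived with the condition."""
    yll = deaths * avg_years_lost
    yld = cases * disability_weight * avg_years_ill
    return yll + yld

# Illustrative only: 100 deaths losing 10 years each, plus 1,000 people
# living 5 years with a condition weighted 0.2.
burden = dalys(deaths=100, avg_years_lost=10, cases=1000,
               disability_weight=0.2, avg_years_ill=5)
```

This is why DALYs capture non-fatal conditions like asthma or depression that a pure mortality measure misses.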

For the data on health-research investment, we are using global data from IHME, as well.

The data will be visualized as bubbles that let users easily see where there are gaps among coverage, research spending, and public-health impact. The user will get a sense of whether the health coverage at a given paper actually reflects related research and disease burden.

We hope that journalists and editors who cover health could use this tool to find missed opportunities and ideas for how to expand their work.