Showing posts with label databases. Show all posts

Monday, February 08, 2010

Netflix Data: What it tells us about culture

The New York Times has a very interesting set of interactive graphs that provide information about the top 100 movies rented from Netflix in 2009. The data are graphed on maps of the major cities in the United States so that you can see which movies were the most rented in various neighborhoods in these cities.

In some cases the maps show how homogeneous our movie watching is (see the pattern for The Curious Case of Benjamin Button), and in other cases they reveal interesting regional variations (see Last Chance Harvey).

It is also worth reading the comments, which range from outrage that the data are public to curiosity and puzzlement.

Saturday, January 26, 2008

Databases and scholarly communication

The primary way in which scholars have communicated new scientific knowledge and contributed to the advancement of science has been through the publication of research findings. (This ignores theoretical contributions, which matter as well.)

It seems to me that the vast increase in our ability to store large amounts of information affords the opportunity to ask a question: in what other ways might scientific scholarship be advanced, in addition to publishing findings?

One idea that intrigues me is sharing data sets with others and developing interactive data analytic tools to explore these databases. Here is one nice example. The KidsCount Data Center keeps track of over 100 child and family indicators of well-being. At the data center you can select indicators, create comparisons, and examine the data in a variety of other ways. By making the data available in this fashion, other researchers or even the public at large can answer questions using this data. No one is going to make scientific breakthroughs with this data, but all of us can find answers to questions that may be of interest to us: Are the trends in teen pregnancy in my state above the US average? Are children's reading scores improving? What has happened to teen drinking in the US?

With more powerful tools it would be possible to look at correlations, compare whether differences between counties or states were statistically significant, and so forth.
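To make that a little more concrete, here is a rough sketch of the two calculations such a tool might offer: a Pearson correlation between two indicators, and a simple permutation test of whether two groups of counties differ on an indicator. This is only an illustration under my own assumptions. The function names are hypothetical, the numbers in the example are made up and are not KidsCount data, and a real tool would probably use a different or more careful test.

```python
import random
from math import sqrt


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def permutation_p_value(group_a, group_b, trials=10000, seed=0):
    """Estimate how often a random relabeling of the pooled values
    gives a difference in means at least as large as the observed one.
    A small result suggests the observed difference is unlikely to be
    due to chance alone."""
    rng = random.Random(seed)  # fixed seed so results are repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        left = pooled[:len(group_a)]
        right = pooled[len(group_a):]
        # small tolerance guards against floating-point rounding
        if abs(mean(left) - mean(right)) >= observed - 1e-12:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Made-up illustrative values, NOT real indicator data.
    state_a_counties = [12.1, 14.3, 11.8, 13.5, 12.9]
    state_b_counties = [15.2, 16.8, 14.9, 17.1, 15.6]
    print("correlation:", pearson(state_a_counties, state_b_counties))
    print("p-value:", permutation_p_value(state_a_counties, state_b_counties))
```

A permutation test is used here only because it needs nothing beyond the standard library and makes few assumptions about the data; a t-test or a regression model would be equally reasonable choices.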

Imagine if more specialized data collected by scientists were routinely available to other researchers and the general public. Wouldn't this advance science all the more quickly?