Science Online New York (SoNYC) encourages audience participation in the discussion of how science is carried out and communicated online. To celebrate our first birthday, we are handing the mic over to the audience so that anyone who would like to participate will get five minutes to show off their favourite online tool, application or website that makes science online fun. To complement the celebrations, we’re hosting a series of guest posts on Soapbox Science where a range of scientists share details about what’s in their online science toolkits. Why not let us know how they compare to the tools that you use in the comment threads?
Jerry Sheehan serves as the Chief of Staff for the California Institute for Telecommunications and Information Technology (Calit2), a UC San Diego/UC Irvine partnership. In this capacity, Mr. Sheehan has responsibility for strategic planning, metrics, institute governance, and strategic initiatives. During his career Jerry has focused on the intersection of public policy and information technology, with a particular interest in applying academic innovation to “real world” problems. In addition to his executive management responsibilities, Jerry served as Senior Personnel on the National Science Foundation’s GreenLight Project, a major research instrument effort focused on improving computer energy efficiency. Sheehan served as a member of the California Emerging Technology Fund Panel of Experts and, as staff, has supported the work of Governor Schwarzenegger’s California Broadband Task Force and President Clinton’s Information Technology Advisory Committee on Open Source Software for High Performance Computing. He received a Master of Science degree in Political Science from Eastern Illinois University in 1991 and is a member of the American Association for the Advancement of Science (AAAS) and Educause.
The Challenge of the Digitally Amplified Researcher
The California Institute for Telecommunications and Information Technology [Calit2] is a multidisciplinary and multi-campus research institute that unites faculty, students and research professionals at the University of California San Diego and the University of California Irvine with private-sector partners to explore how emerging information technologies and telecommunications can transform arenas vital to California’s economy and citizens’ quality of life. The Institute focuses on the digital transformation of health, energy, the environment and culture by applying disruptive technologies in wireless, photonics, nanotechnology/MEMS and cyberinfrastructure.
Our two-campus community totals about 600 researchers from 24 different academic departments. Working together with academic, public and private sector partners, the Institute has enabled more than $800 million worth of scientific research. Each researcher within the campuses may be thought of as a “digitally amplified researcher.” This term borrows from the concept of “amplified individuals” created by the Institute for the Future [IFTF] and refers to the impact that information technology has in radically increasing the productivity and electronic output of individuals. Digital amplification occurs when researchers use modern decentralized publishing technologies, such as web pages, blogs and social networks, to document their scientific accomplishments. It can also be achieved through the employment of cyber-aware students who are equally, if not more, electronically prolific.
A primary management and collaboration challenge resulting from the amplified researcher is determining the research focus of each individual. The traditional approach to this challenge has been to conduct an annual survey through email or online forms. There are two main weaknesses in this approach: 1) Low response rate. Because these queries are perceived as administrative data-gathering, faculty response rates are very low (~10 percent). 2) Static snapshot. Even when data are available, survey reports are at best a snapshot that reveals interest or capability at one point in time, whereas research interests are dynamic and change with scientific inquiry and funding opportunities. These two weaknesses severely limit any meaningful insight that this approach might provide about an academic research community.
The Research Intelligence Toolset: Using Technology to Help Reflect Back Academic Strengths
To address the challenges presented by the digitally amplified researcher, Calit2 developed the Research Intelligence (RI) project. The project builds an automated faculty research profile for members of our community based on public data using freely available text mining application programming interfaces (APIs).
The system works by first identifying, for each researcher, Web pages within the University domain, associated academic research publications and abstracts of funded grants from federal agencies. Each individual piece of digital data is then sent to three text mining services through their APIs. Currently used services include Calais from Thomson Reuters, the Yahoo! Term Extraction Web Service and a modification of the Keyword Extraction Algorithm (KEA) from the New Zealand Digital Library project. Each service has a slightly different method for determining meaningful terms, so results are compiled across the three, with data stored in a MySQL database per information item, each of which is then associated with a specific researcher. A weight is calculated for each term based on the frequency of the term across the three services and on any additional numerical information the services return.
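The weighting step can be illustrated with a minimal sketch. The exact formula used by the RI system is not given here, so the scheme below is an assumption: a term's weight is the number of services that returned it plus the sum of any numerical scores those services supplied. The function name, the service labels and the sample scores are all hypothetical.

```python
from collections import defaultdict


def combine_term_weights(service_results):
    """Merge keyword results from several text mining services.

    service_results maps a service label to {term: score}, where score is
    a number returned by that service or None if it returned none.
    Assumed weighting (not the RI system's actual formula):
    weight = number of services returning the term + sum of their scores.
    """
    counts = defaultdict(int)
    score_sums = defaultdict(float)
    for terms in service_results.values():
        for term, score in terms.items():
            key = term.strip().lower()  # treat "Grid Computing" and "grid computing" as one term
            counts[key] += 1
            if score is not None:
                score_sums[key] += score
    return {term: counts[term] + score_sums[term] for term in counts}


# Hypothetical output of the three services for one document.
results = {
    "calais": {"Grid Computing": 0.5, "photonics": 0.25},
    "yahoo": {"grid computing": None},
    "kea": {"grid computing": 0.25, "wireless": None},
}
weights = combine_term_weights(results)
```

In this sketch a term found by all three services outranks a term found by only one, which matches the idea of compiling results across services that each judge meaningful terms slightly differently.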
These text-mining results are then fused with authoritative and public information from the University directory (name, contact email, phone number, departmental and research organization, etc.) and are used to populate a standard faculty profile Web template. The centerpiece of this display is a tag cloud of the top 100 keywords for that faculty member. Individual faculty members may use their university login to access the profile and edit the terms. Users are also encouraged to directly upload documents for keyword extraction that are subsequently added to their profile. Since faculty are affiliated with both their departments and programs in the University Directory, it is also possible to aggregate keywords for a given program to get an understanding of potential research strengths.
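The tag cloud and the program-level view described above can be sketched as follows. This is only an illustration under stated assumptions: profiles are represented as plain dictionaries of term weights, program-level strength is taken to be the sum of member weights, and the function names and sample data are hypothetical.

```python
from collections import Counter


def top_keywords(weights, n=100):
    """Return the n highest-weighted (term, weight) pairs for a tag cloud."""
    # Sort by descending weight, then alphabetically so ties are stable.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def aggregate_program(profiles, members):
    """Sum term weights over the faculty members affiliated with a program."""
    total = Counter()
    for name in members:
        total.update(profiles.get(name, {}))  # Counter.update adds the values
    return dict(total)


# Hypothetical profiles keyed by faculty identifier.
profiles = {
    "faculty_a": {"wireless": 3.0, "photonics": 1.0},
    "faculty_b": {"wireless": 2.0, "genomics": 4.0},
}
program_terms = aggregate_program(profiles, ["faculty_a", "faculty_b"])
cloud = top_keywords(program_terms, n=2)
```

Summing weights is one simple choice; counting how many members share a term would instead highlight breadth of interest rather than depth.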
Public Profiles Showing Top 100 Research Strengths of the 951 UC San Diego Faculty Users
Example Faculty Research Strength Page for Larry Smarr
Also critical to the Research Intelligence project is a way of providing a direct benefit to faculty members for creating and maintaining a profile on the system. Using the same system described above, Calit2 retrieves, on a daily basis, a data stream from grants.gov of all new and open federal research funding opportunities. Abstracts for these opportunities are text mined for keywords and then matched to individual faculty research profiles. Faculty members may log in to the RI system to see this customized grant feed, or choose to have summaries emailed to them on a daily, weekly or monthly basis.
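How the matching is scored is not specified, so the following is a minimal sketch under an assumed rule: an opportunity is matched to a faculty member when the summed profile weight of the keywords they share reaches a threshold. The function name, the threshold and the sample data are hypothetical.

```python
def match_opportunity(opportunity_terms, profiles, min_score=1.0):
    """Rank faculty profiles against the keywords of one funding opportunity.

    opportunity_terms: keywords extracted from the opportunity abstract.
    profiles: {faculty identifier: {term: weight}}.
    Assumed score: sum of the profile weights of the shared keywords.
    """
    wanted = {term.lower() for term in opportunity_terms}
    matches = []
    for faculty, weights in profiles.items():
        score = sum(w for term, w in weights.items() if term.lower() in wanted)
        if score >= min_score:
            matches.append((faculty, score))
    # Highest score first; identifier breaks ties.
    return sorted(matches, key=lambda m: (-m[1], m[0]))


# Hypothetical profiles and opportunity keywords.
profiles = {
    "faculty_a": {"wireless": 3.0, "photonics": 1.0},
    "faculty_b": {"genomics": 4.0},
    "faculty_c": {"photonics": 2.0, "wireless": 0.5},
}
feed = match_opportunity(["Photonics", "wireless"], profiles)
```

The ranked list for each opportunity could then be inverted per faculty member to build the customized feed or the emailed summaries.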
The initial prototype in 2007 created 350 automatically generated faculty profiles. Approximately 40 percent of faculty members in the pilot modified the profiles that had been automatically generated for them. Since the pilot, the system has been adopted by the UC San Diego campus as a platform for featuring faculty research and now has 951 faculty profiles. Faculty members have created the majority of the profiles on the site, with the greatest addition occurring during the American Recovery and Reinvestment Act, when researchers were searching for funding and forming new collaborations. Since 2009, the campus Research Intelligence portal has had 66,097 pageviews from more than 18,000 unique visitors. Traffic remains constant, with an average of approximately 100 visits to the site per week.
Overall, Research Intelligence presents a realistic alternative to the traditional annual “survey” of faculty to gain an understanding of their research interests. The RI system is superior in three primary ways:
1) Higher participation by faculty members: Voluntary participation in the system is high, in part due to the daily automated matching of profiles to funding opportunities;
2) Moving beyond a snapshot of research interest: Given that the system is automated, research profiles can be updated by simply rerunning the text mining system against on-line information sources linked to from the faculty profile;
3) Creating more actionable knowledge from the data: Since the RI system is digital and holds more information, updated more frequently, than an annual survey, the resulting data are more useful to administrators and others supporting the research enterprise.
You can follow the online conversation on Twitter with the #ToolTales hashtag and you can read Mary Mangan’s Tool Tale here, Dr Peter Etchells’s Tool Tale here, Alan Cann’s here, Boris Adryan’s here, Anthony Salvagno’s here, Daniel Burgarth and Matt Leifer’s here, Zen Faulkes’s here, Jenn Cable’s here, Mike Biocchi’s here, Susanna Speier’s here, Derek Hennen’s here, Musa Akbari’s here, Benedict Noel’s here, Chris Surridge’s here and Gerd Moe-Behrens’s here.