Database Information: data current through Saturday, January 20, 2018
McGuire Center for Entrepreneurship
The University of Arizona | Eller College of Management
Commercialization Research on Innovation and Entrepreneurship

Patent Data

Indeed, as Mairesse noted in a recent roundtable of some of the leading thinkers on the topic, 'We have exhausted all we can get from our old data sets on R&D, patents, citation counts.'

Tellis et al. 2009 [1]

We need new patent data!

Thousands of articles utilizing patent data have been written in economics, marketing, strategy, entrepreneurship, and law. The main limitation on the use of patent data has been its accessibility. We propose an "open-science" paradigm in which the data is cleansed, normalized, and prepared for academic consumption. Any and all algorithms used will be openly shared and refined through peer scrutiny. The intent is to reduce or eliminate repeated work on the same data problem, so that better research on innovation and entrepreneurship can be performed.

An example

Early in his career as a PhD student in marketing, Monte Shaffer was tackling a very specific research question: for a given list of 216 firms (including IBM and General Electric, the two firms possessing the most patents), find all patents granted to each firm by the USPTO for the years 1996-2008. For each patent [2], identify the number of backward citations it has, the number of forward citations, and subsets of these citations (how many citations of each type occurred in a 5-year window? how many are self-citations [3]?). The solution proposed to Monte was an Excel workflow: (1) go to the USPTO web site, (2) perform a web search for IBM, (3) find all patents for IBM, (4) for each patent, look up all citations (forward and backward), (5) for each citation, determine whether it fits in the temporal window and whether it is a self-citation, (6) paste the results into an Excel table. After a few hand calculations, he concluded it would take him 12 years to complete the task as outlined [4]. So instead, in December 2008, he began writing a PHP class that would harvest the necessary patents from the USPTO web site. He created code that would parse each patent, compare patents, and generate custom reports with nested for loops.
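
The citation counts described above can be sketched in a few lines of Python. This is a minimal illustration, not Monte's PHP code: the `Patent` record, the `citation_metrics` function, the sample patents, and the reading of the 5-year window (measured from the focal patent's grant date, backward for cited patents and forward for citing ones) are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Patent:
    number: str      # hypothetical identifier, not a real USPTO number
    assignee: str    # firm that owns the patent
    granted: date    # grant date


def shift_years(d, years):
    """Move a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def citation_metrics(focal, patents, citations, window_years=5):
    """Count citations for one patent.

    patents:   dict mapping patent number -> Patent
    citations: iterable of (citing_number, cited_number) pairs
    The window definition is an assumption, not taken from the source.
    """
    backward = [patents[cited] for citing, cited in citations if citing == focal.number]
    forward = [patents[citing] for citing, cited in citations if cited == focal.number]
    earliest = shift_years(focal.granted, -window_years)
    latest = shift_years(focal.granted, window_years)
    return {
        "backward": len(backward),
        "forward": len(forward),
        "backward_in_window": sum(p.granted >= earliest for p in backward),
        "forward_in_window": sum(p.granted <= latest for p in forward),
        # Self-citation: citing and cited patents belong to the same firm.
        "backward_self": sum(p.assignee == focal.assignee for p in backward),
        "forward_self": sum(p.assignee == focal.assignee for p in forward),
    }


# Made-up sample data for illustration only.
patents = {
    p.number: p
    for p in [
        Patent("A", "IBM", date(2000, 1, 15)),
        Patent("B", "IBM", date(1997, 6, 1)),
        Patent("C", "GE", date(1990, 3, 1)),
        Patent("D", "GE", date(2003, 5, 5)),
        Patent("E", "IBM", date(2007, 8, 8)),
    ]
}
citations = [("A", "B"), ("A", "C"), ("D", "A"), ("E", "A")]

metrics = citation_metrics(patents["A"], patents, citations)
print(metrics)
```

Each call scans the whole citation list, so running it over every patent of every firm repeats the nested-loop pattern mentioned above. It is workable for a sketch, but slow at the scale of millions of patents.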

Continuous, comprehensive solution

As outlined above, if patent data were organized in relational form, researchers could spend more time on analysis that advances academic thought. We provide the patent data center with this intent. We will maintain the database, update it continuously, and keep improving the quality of the data available. We will also seek your feedback on new data to add (indexes of the patent network, auxiliary data from external sources, etc.).
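
To make the relational idea concrete, here is a small sketch using Python's built-in `sqlite3` module. The two-table schema (`patent`, `citation`), the column names, and the sample rows are hypothetical and do not describe the repository's actual layout. The single query uses correlated subselects to return, for each of one firm's patents granted in 1996-2008, its backward and forward citation counts, the forward citations arriving within 5 years of grant, and the forward self-citations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: grant dates are stored as ISO 'YYYY-MM-DD' text.
conn.executescript("""
CREATE TABLE patent (
    number   TEXT PRIMARY KEY,
    assignee TEXT NOT NULL,
    granted  TEXT NOT NULL
);
CREATE TABLE citation (
    citing TEXT NOT NULL REFERENCES patent(number),
    cited  TEXT NOT NULL REFERENCES patent(number)
);
""")

# Made-up sample data for illustration only.
conn.executemany(
    "INSERT INTO patent VALUES (?, ?, ?)",
    [
        ("P-A", "IBM", "2000-01-15"),
        ("P-B", "IBM", "1997-06-01"),
        ("P-C", "GE", "1990-03-01"),
        ("P-D", "GE", "2003-05-05"),
        ("P-E", "IBM", "2007-08-08"),
    ],
)
conn.executemany(
    "INSERT INTO citation VALUES (?, ?)",
    [("P-A", "P-B"), ("P-A", "P-C"), ("P-D", "P-A"), ("P-E", "P-A")],
)

# One statement; each metric is a correlated subselect on the outer patent p.
QUERY = """
SELECT p.number,
       (SELECT COUNT(*) FROM citation c
         WHERE c.citing = p.number) AS backward,
       (SELECT COUNT(*) FROM citation c
         WHERE c.cited = p.number) AS forward,
       (SELECT COUNT(*) FROM citation c
          JOIN patent q ON q.number = c.citing
         WHERE c.cited = p.number
           AND q.granted <= date(p.granted, '+5 years')) AS forward_5yr,
       (SELECT COUNT(*) FROM citation c
          JOIN patent q ON q.number = c.citing
         WHERE c.cited = p.number
           AND q.assignee = p.assignee) AS forward_self
  FROM patent p
 WHERE p.assignee = ?
   AND p.granted BETWEEN '1996-01-01' AND '2008-12-31'
 ORDER BY p.number
"""

rows = conn.execute(QUERY, ("IBM",)).fetchall()
for row in rows:
    print(row)
```

With indexes on `citation.citing` and `citation.cited`, a query of this shape would let the database do the work that the manual Excel procedure spreads over many steps.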

1. Tellis, Gerard J, Jaideep C Prabhu, Rajesh K Chandy. 2009. Radical Innovation Across Nations: The Preeminence of Corporate Culture. Journal of Marketing 73(1) 3-23.

2. Although there are online resources to search patent data (USPTO, Google Patents, Google Scholar, Delphion, etc.), sophisticated queries needed by academics are not readily accessible. However, if this data were in relational form, could you write a single SQL subselect query to return the appropriate result?

3. A self-citation is a forward or backward citation in which the citing and cited patents belong to the same firm.

4. In retrospect, Monte realized that he had examined 1.7 million patents to perform the task. He has all of the patent data on an external hard drive and has written a parsing algorithm to create a data object for each patent. If only this data could be put into relational form and made available to the research community!

For additional information, please contact us.

hosted by Mergent

powered by Patent Rank
