Database Information
Data current through
Saturday - January 20, 2018
McGuire Center for Entrepreneurship
The University of Arizona | Eller College of Management
Commercialization Research on Innovation and Entrepreneurship

Origins of Patent Data Repository

The patent data center began rather serendipitously. Monte asked his fellow PhD student and office mate, Brian, what he was doing. For his research-assistant work, Brian was copying patent data from the USPTO website into an Excel spreadsheet. Monte probed further and learned that Brian had to do this for all patents (from 1995 to 2008) for some 216 firms (including IBM and General Electric). Monte asked Brian how long he anticipated the task would take. Brian did not know.

A week later, Monte asked Brian for an update on his progress, and Brian shared the results. Doing some simple linear calculations, Monte concluded that it would take Brian about 12.7 years to complete his task. Monte asked Brian to show him where he went on the website, what data he needed to collect from each page, and so on.

That weekend, Monte went home and wrote some code in PHP to harvest and parse the patent data to help Brian out. The next weekend he refined the code further. Over Christmas break, he wrote a report mechanism. To prepare the data for Brian's task, over 1.7 million patents were reviewed, producing a panel of over 60,000 records (firm-patent analysis) that included information such as forward citations and self-citations.

With this beginning, Monte decided to continue harvesting the data. He built a small workstation and attached two 2TB Western Digital hard drives. He continued to improve the parsing algorithm, but each report took a day or longer to run because it had to open every patent-data object to extract the data.

Monte attempted to store these objects in a database, but the computing power he had available made this impractical. He wanted to put the data in relational form for academic consumption, so he approached Len Jessup about doing a postdoc at the University of Arizona to do just that.

For additional information, please contact us.

hosted by Mergent

powered by Patent Rank
