Leveraging Data Science to Capitalize on the Strategic Value of Intellectual Property
James H. Moeller - Moeller Ventures, LLC - https://www.moellerventures.com/
August 2, 2023
Executive Summary
The application of data science to intellectual property analysis has dramatically improved the speed and reduced the difficulty of deriving meaningful business intelligence from IP data sets. In addition, data science methodologies continue to expand the depth of information that can be analyzed while also improving the relevance and applicability of the results. Below are 11 ways that analytics and data science can be applied to capitalize on the strategic value of intellectual property. Many of these are exemplified in patent landscape analytics and business intelligence reports available via Alpha Data IQ.
  1. Identify IP Sector Competitors, Collaborators, and Partners
  2. Find Investors and Acquirers or Investments and Acquisitions
  3. Discover New Venture Capital Investments
  4. Assessing IP Sector Risk
  5. Freedom-to-Operate and White Space Research
  6. Determining the Competitive Nature of the Patenting Environment
  7. Developing Valuation Context on Patents and Portfolios
  8. Collateralize Patents for Business Financing
  9. Refining Internal R&D Processes to More Efficiently Develop Protectable IP
  10. Improving New Patent Application Filings
  11. Integrate IP Analyses with Additional Business Intelligence
Background: The Application of Data Science to Intellectual Property Analysis
Analysis techniques generally start with a scope definition and the collection of relevant patent documents, referred to in this report as the domain collection. This domain collection can be a specific predefined set of patent documents or the query results from a patent database that are derived via a search using keywords or key phrases, specific assignees or company names, a specific company patent portfolio, or even patent documents deemed to be semantically similar based on natural language processing (NLP) analysis.
Analytics and data science processes are then applied to the domain collection to summarize, tabulate, and further analyze the patent documents to produce meaningful business intelligence about the domain collection IP sector. These processes often start with an analytical patent document landscape that tabulates patent document information such as assignees, forward and backward citations, and classification codes. Additional analysis can be applied to assess IP sector risk, estimate valuations, and plot trends of filed applications, granted patents, and IP sector grant rates.
Natural language processing can be applied to derive the semantic similarity of the domain collection documents, which when viewed via a scatter plot density display, can indicate clusters of similar documents as well as areas of sparse concentrations or complete voids. These patent document similarity landscapes can be valuable in assessing the competitive areas of an IP sector and in executing freedom-to-operate and white space analyses.
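As a rough illustration of document similarity scoring, the sketch below compares short texts using TF-IDF weighting and cosine similarity. This is a simple lexical stand-in, not the specific NLP method behind the similarity landscapes described in this report, and the sample texts are hypothetical.

```python
import math
from collections import Counter

# Hypothetical abstracts standing in for domain collection documents.
texts = [
    "battery electrode coating with lithium compound",
    "lithium battery electrode and coating method",
    "wireless antenna array for signal transmission",
]

def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return [w for w in text.lower().split() if w.isalpha()]

def tfidf_vectors(docs_text):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    docs = [Counter(tokenize(t)) for t in docs_text]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in d)
    # Smoothed inverse document frequency.
    idf = {term: math.log((1 + n) / (1 + c)) + 1.0 for term, c in df.items()}
    return [{term: tf * idf[term] for term, tf in d.items()} for d in docs]

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

vectors = tfidf_vectors(texts)
for i in range(len(texts)):
    for j in range(i + 1, len(texts)):
        print(f"doc {i} vs doc {j}: {cosine(vectors[i], vectors[j]):.3f}")
```

In practice, the pairwise scores would feed a dimensionality-reduction step to produce the two-dimensional coordinates for the scatter plot. That step is omitted here.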
Finally, the domain collection analysis can be integrated with external resources to derive additional intelligence and effectively cross-correlate IP analyses with information such as research reports, regulatory filings, third-party standard-essential patent (SEP) databases, and even newer AI large language model systems such as OpenAI’s ChatGPT, Google’s Bard, and others.
1)    Identify IP Sector Competitors, Collaborators, and Partners
One of the most common business intelligence inquiries concerns the identification of other entities in a specific IP sector. This is often motivated from a competitive technological perspective but can be equally meaningful in finding potential collaborators and partners.
In the context of IP landscape analyses, this information comes from the tabulation of assignees across a domain collection. For most entities pursuing patents, formal agreements usually exist whereby employees are required to assign any inventions, created as part of their work, to the parent entity. So, the assignees list of a domain collection consists mainly of the corporations, educational institutions, research organizations, or other entities involved in that IP sector. This can be a very valuable list in simply identifying those entities that are potential competitors, collaborators, and partners.
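The assignee tabulation can be sketched in a few lines. The records, field names, and suffix list below are hypothetical. A real domain collection would come from a patent database export, and assignee name cleanup is usually more involved than this.

```python
import re
from collections import Counter

# Hypothetical records; note the spelling variants of the same assignee.
domain_collection = [
    {"id": "DOC-1", "assignee": "Acme Robotics, Inc."},
    {"id": "DOC-2", "assignee": "ACME ROBOTICS INC"},
    {"id": "DOC-3", "assignee": "Example University"},
    {"id": "DOC-4", "assignee": "Acme Robotics Inc."},
    {"id": "DOC-5", "assignee": "Sample Labs LLC"},
]

# Assumed list of corporate suffixes to ignore when matching names.
_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}

def normalize_assignee(name):
    """Lowercase, strip punctuation, and drop common corporate suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(w for w in words if w not in _SUFFIXES)

def tabulate_assignees(records):
    """Return the full assignee list as (name, document_count), ranked by count."""
    counts = Counter(normalize_assignee(r["assignee"]) for r in records)
    # Sort by descending count, then alphabetically for a stable ranking.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

ranking = tabulate_assignees(domain_collection)
for name, count in ranking:
    print(f"{name}: {count}")
```

The function returns the complete ranked list rather than a top-N cut, so entities with only one or two documents remain visible.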
2)    Find Investors and Acquirers or Investments and Acquisitions
The assignees list can also be instrumental from an investments and acquisitions perspective for businesses of any size. For example, startups can use the assignees list to find larger, more established companies pursuing similar IP that may be potential investors and eventual acquirers. This typically involves an analysis of the more highly ranked companies near the top of an IP sector assignees list.
Larger companies are often interested in potential investments or acquisition targets that may appear near the middle or bottom portions of an IP sector assignees list. This, in turn, requires that the assignees list be presented in its entirety (often numbering in the hundreds or even thousands) and accessible in a searchable, sortable dynamic web-based report or old-school spreadsheet environment. These investments or acquisitions can be motivated by a fundamental interest in identifying and acquiring new technologies or as a result of a business pivot. For example, companies exploring a pivot in a product line, business segment, or high-level corporate strategy may need to license, invest in, or completely acquire new intellectual property to compete successfully. The IP sector assignees list is thus an ideal resource for identifying these opportunities.
3)    Discover New Venture Capital Investments
The venture capital community will also find the assignees list useful for discovering potential new investments. This is particularly true when investing in new ventures that pursue products or markets that are highly dependent on intellectual property.
These new ventures will likely have only a small number of patents or application filings and will thus rank near the bottom of an IP sector assignees list. Again, this emphasizes the need for IP analyses that provide a comprehensive presentation of the full assignees list results, not just the top-ranked or leading assignees.
4)    Assessing IP Sector Risk
Risk assessment can begin with a review of the assignees list but can also integrate a wide variety of additional information covering licensing, litigation, USPTO Patent Trial and Appeal Board (PTAB) activity, and even trade secret practices, copyrights, trademarks, and other sector-specific factors. For example, from the basic analytical landscape, an IP sector assignees list may sometimes indicate the presence of patent-holding companies, sometimes referred to as non-practicing entities (NPEs). These companies often pursue aggressive licensing and litigation business strategies and can represent additional risk in the IP sector.
In addition, another common approach is to analyze litigation activity at the District Court level and the USPTO PTAB. Assignee names identified via the landscape analytics can be used in search queries to identify relevant IP sector litigation activity. District Court and PTAB cases are available via APIs (Application Programming Interfaces) from various third-party and government-provided resources.
5)    Freedom-to-Operate and White Space Research
Freedom-to-operate (FTO) and white space research processes can be significantly improved with the application of natural language processing (NLP) and machine learning (ML) in assessing the similarity of documents in a domain collection. In turn, these processes can provide significant insights to mitigate risks associated with litigation and patent application filings.
The general process involves the application of NLP to enable a two-dimensional scatter plot density display of the similarity of the domain collection documents, also referred to as a patent document similarity landscape in this report. This process can be applied to the entire text of the documents or just the claims when focusing on an FTO analysis. This similarity landscape can be used to quickly identify clusters of patent documents in more highly competitive landscape topic areas as well as sparse concentrations or voids across the landscape. Query capabilities into the scatter plot can be used to explore the analysis for additional topics of interest and potential white spaces of patenting opportunities. Finally, an ML model can be developed to place freeform text on the scatter plot display, showing where descriptions of new inventive ideas may fall on the landscape.
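One simple way to approximate the placement of freeform text is to position it at the similarity-weighted average of its most similar documents. The sketch below assumes the landscape coordinates already exist and uses plain bag-of-words cosine similarity. It is a stand-in for a trained ML model, and all texts and coordinates are hypothetical.

```python
import math
from collections import Counter

# Hypothetical landscape: each document has text and precomputed 2D coordinates.
landscape = [
    {"text": "lithium battery electrode coating", "xy": (1.0, 1.0)},
    {"text": "battery electrode separator material", "xy": (1.5, 1.2)},
    {"text": "wireless antenna signal array", "xy": (8.0, 7.5)},
    {"text": "antenna beamforming signal processing", "xy": (8.4, 7.0)},
]

def bow(text):
    """Bag-of-words term counts for a text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(c * b[t] for t, c in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0

def place_text(text, docs, k=2):
    """Place text at the similarity-weighted centroid of its k most similar docs.

    Returns None when the text shares no terms with any document.
    """
    query = bow(text)
    scored = sorted(
        ((cosine(query, bow(d["text"])), d["xy"]) for d in docs), reverse=True
    )[:k]
    total = sum(score for score, _ in scored)
    if total == 0:
        return None
    x = sum(score * xy[0] for score, xy in scored) / total
    y = sum(score * xy[1] for score, xy in scored) / total
    return (x, y)

print(place_text("new coating for a battery electrode", landscape))
```

A description that lands far from any dense cluster, or that returns no placement at all, may point to a potential white space worth a closer look.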
6)    Determining the Competitive Nature of the Patenting Environment
The competitive nature of the overall patenting environment can be assessed by plotting the trends associated with filed applications, granted patents, and the grant rate. In addition, individual patent documents or groups of documents can be assessed by further leveraging the similarity scatter plot density display.
For example, an increasingly competitive environment would be characterized by an increasing applications-filed trend line, a declining granted patents trend line, and a low grant rate. A less competitive environment would show the opposite: a declining number of applications, an increasing number of grants, and a higher grant rate. The competitive nature of an IP segment can vary significantly depending on the specific domain collection.
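These trend signals can be computed as follows. The yearly counts are hypothetical, and the grant rate here is simplified to total grants divided by total filings over the period, whereas a rigorous grant rate would track the outcomes of individual applications. No threshold for a "low" grant rate is assumed, so the rate is reported alongside the trend classification.

```python
def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den

def grant_rate(filed, granted):
    """Simplified grant rate: total grants divided by total filings."""
    total_filed = sum(filed)
    return sum(granted) / total_filed if total_filed else 0.0

def classify_environment(filed, granted):
    """Label the patenting environment from the two trend directions."""
    filed_trend, granted_trend = slope(filed), slope(granted)
    if filed_trend > 0 and granted_trend < 0:
        return "increasingly competitive"
    if filed_trend < 0 and granted_trend > 0:
        return "less competitive"
    return "mixed"

# Hypothetical yearly counts for a domain collection.
filed_per_year = [100, 120, 150, 190]
granted_per_year = [60, 58, 55, 50]

print(classify_environment(filed_per_year, granted_per_year))
print(f"grant rate: {grant_rate(filed_per_year, granted_per_year):.2f}")
```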
The similarity scatter plot density display can be used to assess individual patent documents or groups of documents by highlighting those specific documents on the scatter plot display for the IP segment. Documents shown in high-density areas are more likely to encounter other patent documents that represent comparable, competitive, or alternative IP. However, documents shown in low-density areas are more likely to be unique.
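A minimal way to quantify whether a highlighted document sits in a high-density or low-density area is to count its neighbors within a fixed radius on the scatter plot. The coordinates and the radius below are hypothetical, and an appropriate radius would depend on the scale of the actual landscape.

```python
import math

# Hypothetical 2D landscape coordinates keyed by document ID.
coords = {
    "A": (1.0, 1.0),
    "B": (1.2, 0.9),
    "C": (0.8, 1.1),
    "D": (1.1, 1.3),
    "E": (9.0, 9.0),
}

def local_density(doc_id, points, radius):
    """Count the other documents within `radius` of the given document."""
    target = points[doc_id]
    return sum(
        1
        for other_id, xy in points.items()
        if other_id != doc_id and math.dist(xy, target) <= radius
    )

for doc_id in ("A", "E"):
    print(doc_id, local_density(doc_id, coords, radius=0.5))
```

A document with many neighbors is more likely to face comparable or competing IP, while one with few or none is more likely to be unique.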
7)    Developing Valuation Context on Patents and Portfolios
Determining patent and portfolio valuations can be a complex and inexact process, as many factors contribute to an appraised or market value. Some of these are analyses already covered in this report, and others pertain to specific situations or market value perceptions. For example, the risk assessment, the FTO analyses, and the competitive assessments of the environment and specific patent documents are all analyses covered previously that can factor into the valuation.
In addition, specific situational aspects can influence the valuation of a patent or portfolio, such as licensing-derived revenue, the inclusion of patents in an industry standard, or evidence of premium pricing resulting from patent protection. More intangible aspects can also factor into the valuation. These can include the perceived value of the patents as a defensive posture against IP litigation from competitors or the perceived value of excluding competitors from a market segment or from implementing a specific product or service functionality. Some companies may even put value on using patents as a promotional tool to exhibit the firm’s technological leadership.
8)    Collateralize Patents for Business Financing
Under certain circumstances, patent portfolios can be leveraged as collateral for business financing. While this is still a small portion of the overall corporate finance market, it’s an increasingly popular option for companies with patent portfolios that have demonstrated value. Justification and due diligence for these types of loans often leverage many of the analyses covered previously, such as assessments of value, litigation, overall risk, FTO, and the competitive environment, as well as a variety of other collateral diligence factors, including clear ownership and title of the IP, and the ability to perfect a security interest.
These types of business loans are often facilitated by or involve specialized organizations with expertise in IP collateralization, as most traditional business financing resources are not set up to execute the required due diligence or deal with the IP assets in case of default. Most IP collateralizations focus on patents, but trademarks, trade secrets, and copyrighted works of art with clear valuation metrics are also collateral candidates.
9)    Refining Internal R&D Processes to More Efficiently Develop Protectable IP
Patent document intelligence can be valuable information to integrate into corporate and academic R&D to improve and accelerate the idea generation process and better develop protectable IP. In general, this integration is facilitated by the broad availability of free, high-quality IP research resources on the Web, such as Google Patents, the USPTO’s patent search capabilities, and the European Patent Office’s worldwide patent search portal Espacenet, as well as a variety of other free and proprietary services. These websites can deliver patent document research capabilities directly to those individuals involved in R&D initiatives.
From a specific IP segment perspective, it’s often advantageous to utilize a domain collection analysis approach, the analytical results, and the similarity landscape to analyze patent documents specific to IP sector-focused R&D initiatives. The analytics domain collection and document similarity landscape can then be explored via keyword and key-phrase queries to find IP that’s useful for refining the R&D focus and for patent drafting intelligence that can improve the chances of a granted patent.
10)  Improving New Patent Application Filings
Research into existing patent documents can be valuable in drafting new patent applications. For example, the analytical patent document landscape for a domain collection can be used to tabulate the IP sector’s top patent and non-patent literature citations. These citations are the most frequently referenced patent documents and non-patent publications across the domain collection, and they are also likely relevant to new patent applications in the IP sector.
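The citation tabulation might look like the sketch below. The record layout and field names are hypothetical, and each reference is counted once per citing document so that repeated citations within a single document do not inflate the ranking.

```python
from collections import Counter

# Hypothetical records with patent and non-patent literature (NPL) citations.
domain_collection = [
    {"id": "DOC-1", "patent_citations": ["REF-A", "REF-B"], "npl_citations": ["Paper X"]},
    {"id": "DOC-2", "patent_citations": ["REF-A", "REF-A"], "npl_citations": ["Paper X", "Paper Y"]},
    {"id": "DOC-3", "patent_citations": ["REF-A", "REF-C"], "npl_citations": []},
]

def top_citations(records, field, n=10):
    """Rank cited references by the number of distinct documents citing them."""
    counts = Counter()
    for record in records:
        # set() ensures each reference counts once per citing document.
        counts.update(set(record.get(field, [])))
    return counts.most_common(n)

print(top_citations(domain_collection, "patent_citations", n=3))
print(top_citations(domain_collection, "npl_citations", n=3))
```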
In addition, the patent document similarity landscape can be explored via keyword and key-phrase queries to identify patent documents and specific claims that may be similar to the inventive ideas of a new patent application. The intelligence gained from an analysis of these documents can then be used to help draft new applications that are more distinctive, avoid the existing patent documents, and have better chances of being approved as granted patents.
When leveraging existing patent document information in the drafting of new applications, it’s important to be aware of a regulatory requirement called the Duty of Disclosure (37 CFR 1.56). This regulation states that everyone involved in filing a patent application (inventors, patent practitioners, etc.) must disclose all information known to them that could be material to the patentability of the invention. So, patent documents encountered during the research process that are materially similar to a new patent filing must be cited as a reference in the new application.
11)  Integrate IP Analyses with Additional Business Intelligence
While the IP analyses described above provide significant strategic value, further value and insights can be realized by integrating additional business intelligence. This includes additional IP-oriented resources such as standard-essential patent (SEP) databases like IPlytics and Questel’s Orbit Intelligence Search SEP, as well as a wide variety of other resources such as non-patent literature from sites like PubMed and Microsoft Academic, regulatory filings from the FDA and FCC, corporate information from Bloomberg or Crunchbase, and financial information from resources like Edgar Online or Financial Modeling Prep. Even AI systems like OpenAI’s ChatGPT and Google’s Bard can be integrated with the results from the analytical patent document landscape. This type of intelligence integration is quickly becoming the norm and will likely shape the future of strategic decision-making, not only for startup commercialization but also for investing, corporate business development, M&A, and venture capital.