Introduction
We're excited to announce the release of our new search engine, SPADE (Search Platform for Art Discovery and Extraction), an algorithm that parses large quantities of patents, efficiently carving new decision paths as it returns a complete set of prior art search results for a target patent. In this blog post, we first discuss the many challenges of searching prior art for patent retrieval tasks, then describe how SPADE addresses them.
Challenges with Prior Art Search
There are many types of patent search and retrieval, such as prior art search, freedom to operate (FTO), infringement search, and invalidity search [1]. Regardless of the motivation behind the search, however, the overall goal is to return all prior art that relates to the claims of the target patent. The claims of the target patent provide exact information on the invention's novelty, obviousness, and industrial applicability [2]. While patent claims are the most important aspect of a patent, "any relevant written evidence may impact novelty or obviousness, [so one] should search full patent specifications and the claims and all available technical and nontechnical publications" [2, 3]. This makes patent search different from searching a large corpus of academic research documents: research papers are much shorter than patents and have a rigid structure that search algorithms can exploit (e.g., a "Problem Formulation" section states exactly what the research in question addresses). In contrast, patents have a variety of structured and unstructured sections with diverse formats, potentially in multiple languages [1].
Patent language can also be very specific. For instance, consider the following subset of claims found in US-8361156-B2:
A spinal fusion implant of non-bone construction positionable within an interbody space between a first vertebra and a second vertebra, said implant comprising:
an upper surface including anti-migration elements to contact said first vertebra when said implant is positioned within the interbody space, a lower surface including anti-migration elements to contact said second vertebra when said implant is positioned within the interbody space, a distal wall, a proximal wall, a first sidewall, and a second sidewall generally opposite from the first sidewall, wherein said distal wall, proximal wall, first sidewall, and second sidewall comprise a radiolucent material.
Here we must decide what our search query should be. The obvious first choice is "spinal fusion implant", but we would lose important information, such as the fact that the implant is of "non-bone construction" or "positionable within an interbody space". If we instead search for the entire first sentence, we add words to the query that might steer the search in the wrong direction. Determining the number of query terms is a well-studied problem in the space [4], with research showing that keyword searching at its core can be imprecise [5]. Patent language is verbose both for legal certainty [1] and to protect patents from being found, with text deliberately using obscure terminology and grammar [5].
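The query-length trade-off described above can be illustrated with a minimal sketch. This is not how SPADE or any production engine searches; it assumes a strict boolean AND match over a toy corpus, and the three documents and the `boolean_and_search` helper are invented for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def boolean_and_search(query, corpus):
    """Return ids of documents that contain every term of the query."""
    terms = tokenize(query)
    return [doc_id for doc_id, text in corpus.items() if terms <= tokenize(text)]

# Toy documents, invented for this example (not real patent text).
corpus = {
    "doc_a": "A spinal fusion implant of non-bone construction with radiolucent "
             "walls positionable within an interbody space",
    "doc_b": "A spinal fusion implant formed from allograft bone",
    "doc_c": "A dental implant with a threaded titanium post",
}

short_query = "spinal fusion implant"
long_query = ("A spinal fusion implant of non-bone construction positionable "
              "within an interbody space between a first vertebra and a second vertebra")

# The short query also matches the bone implant, which the claim excludes.
short_hits = boolean_and_search(short_query, corpus)
# The full sentence demands terms no document contains, so nothing matches.
long_hits = boolean_and_search(long_query, corpus)

print(short_hits)
print(long_hits)
```

Under these assumptions, the short query is too permissive and the full sentence is too strict, which is the tension any query formulation strategy has to resolve.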
Although claim language provides vital information, searching through the full text of a patent can also be helpful in providing context. However, by doing so, the risk of returning documents with little to no relevance increases [5].
Our Approach
To tackle these challenges, SPADE operates similarly to a human expert, optimizing across the steps of the search pipeline and repeatedly finding new query paths that have not yet been explored. Exhausting the available results is a prominent issue for search engines, which often run out of results when trying to match all the search details [6].
SPADE is able to understand the target patent's concepts, allowing it to plan its search strategy with more intelligence than a series of keyword searches. This understanding allows it to optimize its steps dynamically, which means that searches for different patents will follow markedly different search trajectories. We believe this is vital for any state-of-the-art prior art search algorithm, and it reassures users that the approach is tailored to the target patent in question.
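To make the idea of exploring unsearched query paths concrete, here is a minimal sketch of one generic technique, breadth-first query relaxation with a record of queries already issued. It is not SPADE's algorithm, which this post does not detail; the `search` and `explore` functions, the toy corpus, and the term sets are all hypothetical.

```python
from collections import deque

def search(terms, corpus):
    """Boolean AND over a toy corpus: ids of documents containing every term."""
    return {doc_id for doc_id, words in corpus.items() if terms <= words}

def explore(initial_terms, corpus, max_queries=50):
    """Relax a query breadth-first, never issuing the same query twice."""
    start = frozenset(initial_terms)
    frontier = deque([start])
    seen = {start}          # queries already scheduled
    found = set()           # documents returned by any query
    issued = 0
    while frontier and issued < max_queries:
        terms = frontier.popleft()
        issued += 1
        hits = search(terms, corpus)
        found |= hits
        if hits or len(terms) <= 1:
            continue  # this path returned results, or cannot be relaxed further
        # The query ran out of results: drop one term at a time to open new paths.
        for term in terms:
            relaxed = terms - {term}
            if relaxed not in seen:
                seen.add(relaxed)
                frontier.append(relaxed)
    return found, issued

# Toy corpus, invented for this example: each document is a set of terms.
corpus = {
    "doc_a": {"spinal", "implant", "radiolucent"},
    "doc_b": {"spinal", "implant", "titanium"},
    "doc_c": {"dental", "implant"},
}

found, issued = explore({"spinal", "implant", "radiolucent", "threaded"}, corpus)
print(sorted(found), issued)
```

In this toy run, the fully specified query returns nothing, and relaxation eventually reaches every document, including the unrelated dental implant. Relaxation alone therefore trades precision for recall, which is why a result filtering step matters.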

SPADE is also adept at filtering out patents that return a high similarity score with the target patent but do not match its concepts and ideas. This is vital for patent lawyers and examiners, as looking through hundreds of patents that use similar terms in completely different contexts can cost a lot of time. Previous approaches either depend on hyperparameters that are not optimized for each search [7], or rely on similarity search over learned embeddings [8], which struggle to generalize to concepts outside the training set. SPADE's returned search results are not limited by its algorithm, allowing it to dynamically assess the complete set of patents that fit the target patent in question.
A big issue with using AI in patent search is that search engines powered by deep learning are often nothing more than a black box, making it difficult for users to follow the process or understand the search strategy being used [6]. SPADE outputs its current step as it searches, allowing the user to follow along. Future work includes fine-tuning SPADE to the user's preferences, allowing them to combine the power of its search with their own nuanced, expert touch.
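The failure mode described here, where shared terms in a different context produce a high score, can be reproduced with a simple lexical measure. The sketch below uses Jaccard similarity over word sets as a stand-in for a term-based score; it does not represent SPADE's filtering or the methods in [7, 8], and the three text snippets are invented for illustration.

```python
import re

def tokens(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two texts: shared tokens over all distinct tokens."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)

# Invented snippets, not real patent text.
target = "spinal implant with anti-migration elements contacting the vertebra"
# Shares many words with the target but concerns wildlife tracking.
lookalike = ("bird migration tracking implant with sensing elements "
             "contacting the skin")
# Describes a related spinal device using almost entirely different words.
conceptual = ("intervertebral cage having ridges that resist expulsion "
              "from the disc space")

lookalike_score = jaccard(target, lookalike)
conceptual_score = jaccard(target, conceptual)

print(f"lookalike:  {lookalike_score:.2f}")
print(f"conceptual: {conceptual_score:.2f}")
```

With these snippets, the unrelated document outranks the conceptually related one, so a purely term-based score would surface exactly the kind of result a reviewer then has to discard by hand.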
Integration with Real World Workflows
We've already integrated SPADE into production on Garden's Claims Enhancement tool, which helps companies and firms write updated claims given prior art that has been published after their patented product. We've been getting positive feedback from customers and hope to test on future benchmarks to provide quantitative scores.
Conclusion
Garden SPADE offers state-of-the-art patent search capabilities, allowing users to comprehensively find prior art that covers all aspects of the target patent. Its search strategy derives from the search trajectories of experienced law professionals, with additional optimizations and algorithms on top. Reach out to sales@gardenintel.com if you are interested in trying it out.
References
[1] Amna Ali, Ali Tufail, Liyanage Chandratilak De Silva, and Pg Emeroylariffion Abas. Innovating patent retrieval: A comprehensive review of techniques, trends, and challenges in prior art searches. Applied System Innovation, 7(5), 2024.
[2] David Hunt, Long Nguyen, and Matthew Rodgers. Patent Searching: Tools & Techniques. John Wiley & Sons, Inc., Hoboken, New Jersey, 2007.
[3] Brahmeshwar Mishra and Gunjan Vasant Bonde. Patent Searching, pages 473–503. Springer Nature Singapore, Singapore, 2022.
[4] José Carlos Toucedo and David E. Losada. Formulating good queries for prior art search. In Carol Peters, Giorgio Maria Di Nunzio, Mikko Kurimo, Thomas Mandl, Djamel Mostefa, Anselmo Peñas, and Giovanna Roda (eds.), Multilingual Information Access Evaluation I. Text Retrieval Experiments, pages 418–425, Springer Berlin Heidelberg, 2010.
[5] Nigel S. Clarke. The basics of patent searching. World Patent Information, 54:S4–S10, 2018. Best of Search Matters.
[6] Renukswamy Chikkamath, Deepak Rastogi, Mahesh Maan, and Markus Endres. Is your search query well-formed? A natural query understanding for patent prior art search. World Patent Information, 76:102254, 2024.
[7] Anna Maria Villa and Manuel Wirz. A sequential patent search approach combining semantics and artificial intelligence to identify initial state-of-the-art documents. World Patent Information, 68:102096, 2022.
[8] Konrad Vowinckel and Volker D. Hähnke. Searchformer: Semantic patent embeddings by siamese transformers for prior art search. World Patent Information, 73:102192, 2023.
