Figuring out whether machine learning patents have been infringed is not the impossible task many believe it is, write Haseltine Lake Kempner’s Kimberly Bayliss and Alexandra Roy in this jointly published article.
One of the things applicants think about when applying for patents is whether the invention is detectable – after all, a patent is of limited use if you can’t prove that it has been infringed.
There seems to be a common assumption that machine learning (ML) models cannot be reverse engineered due to their black box nature, and that pursuing “hidden AI” patents therefore has limited value.
To get to the bottom of the detectability question, we sat down with Chris Whittleston, Delivery Lead at Faculty, one of the UK’s well-known AI companies, for an informal conversation about reverse engineering. We asked him what could be detected in different scenarios; this is a summary of what we learned.
If you can get your hands on a software product, say an executable file, then it’s likely you can get a handle on whether a machine learning model is being used
Many ML inventions fall into the category of what is known in Europe as ‘applied AI’. That is, they use well-known ML models in an innovative way.
In this scenario, if an applied AI invention were incorporated into a product, it is highly likely that open source libraries would be used to create and implement the ML model. After all, open source libraries are reliable and well tested, so why reinvent the wheel? Several open source libraries exist and the code can be downloaded for free.
If an open source library were used in a product, it could be detected through decompilation, a reverse engineering technique that recreates high-level, human-readable source code from compiled executable code (which is not human-readable).
While decompilation allows an engineer to reconstruct the functions and variables called by the executable code, the original names and labels for these variables and functions are replaced with arbitrary strings. However, the substitution is consistent throughout the decompiled code: each variable or function name is replaced with the same arbitrary string wherever it appears.
While the names and labels given to the various functions in the decompiled code may be unintelligible, the structure of these functions can be compared to known functions in open source libraries. This allows an engineer to recognize and identify the functions that are called in the decompiled code.
It is therefore probably possible to identify known neural network code structures in this way, such as code used to create the layers of a neural network or code used to perform common machine learning tasks (e.g., backpropagation).
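The cross-referencing idea can be sketched in a few lines. The snippet below is a hypothetical illustration only: the “fingerprints” (sequences of operation categories) and the decompiled function name are invented, and a real reverse engineering tool would derive such fingerprints from the compiled code itself.

```python
# Hypothetical sketch: match a decompiled function against known open source
# library functions by internal structure rather than by name. The
# fingerprints below are invented for illustration.

KNOWN_LIBRARY_FINGERPRINTS = {
    "dense_layer_forward": ("matmul", "add", "activation"),
    "backpropagation_step": ("loss_grad", "matmul", "transpose", "update"),
}

def identify_function(ops):
    """Return the known library function whose structure matches, if any."""
    for name, fingerprint in KNOWN_LIBRARY_FINGERPRINTS.items():
        if tuple(ops) == fingerprint:
            return name
    return None

# A decompiled function with an arbitrary name ("sub_401a2f") but a
# recognisable internal structure:
decompiled_ops = ("matmul", "add", "activation")
print(identify_function(decompiled_ops))  # dense_layer_forward
```

Even though the decompiler has stripped all meaningful names, the structure alone is enough to recognise the neural network building block.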
As for identifying specific variables within the decompiled code, variables can often be traced through the various functions. This trace can provide clues about the data type of the input to a neural network.
Thus, if a patent claim describes a neural network with certain inputs, it may be possible to determine from reverse engineered code that a neural network is present in the code (if a function is identified that corresponds to an open source neural network) and also to determine what input is provided to it.
Outputs are usually harder to identify from the decompiled code, not least because custom code can be used to handle this data. However, assuming that you can successfully trace a particular variable through the various functions applied to it, you may still get enough information to deduce the output data type.
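Tracing a variable through the decompiled functions can be pictured as walking a chain of producers and consumers backwards. The sketch below is purely illustrative: the function and variable names are the kind of arbitrary strings decompilation would produce, and the call chain is invented.

```python
# Hypothetical sketch: trace a variable backwards through decompiled
# functions to find the origin of the data fed into a function identified
# as a neural network. Names and chain are invented for illustration.

# Each entry: (decompiled function, input variable, output variable).
CALL_CHAIN = [
    ("sub_0x1a", "var_3", "var_7"),   # identified as: image loading
    ("sub_0x2b", "var_7", "var_9"),   # identified as: pixel normalisation
    ("sub_0x3c", "var_9", "var_12"),  # identified as: neural network forward pass
]

def trace_input_to(target_function, chain):
    """Walk backwards from target_function to the variable's origin."""
    # Find the variable the target function consumes.
    wanted = next(inp for fn, inp, _ in chain if fn == target_function)
    # Follow producers backwards until no earlier function produces it.
    while True:
        producer = next(((fn, inp) for fn, inp, out in chain if out == wanted), None)
        if producer is None:
            return wanted
        wanted = producer[1]

print(trace_input_to("sub_0x3c", CALL_CHAIN))  # var_3
```

Here the trace shows that the neural network's input ultimately originates from the image-loading function, which is the kind of clue about input data type described above.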
In conclusion, using decompilation it is probably possible to determine whether a claimed function of “using a machine learning model to predict x from y” is being infringed.
What about core AI inventions, where the inventor has invented a new model architecture, or a new method of training a model?
Another type of ML invention is what is called a core AI invention in Europe. These are the ones where the inventor has made a fundamental improvement in the field of ML itself. For example, a new and improved model architecture or a new way to train a neural network.
These inventions are unlikely to be implemented using open source code alone, because by their very nature they represent new ML models, or new processes for training ML models, that have not yet made their way into open source libraries (though they may eventually do so).
However, it is still likely that open source libraries will be used to build the new model – albeit with additional, custom code to implement the new feature. Taking an example of a new neural network with a new and improved layer, the cross-reference technique discussed above can be applied to identify the different functions used to build the new model.
In this scenario, code structures that do not fully match the known open source code structures could indicate new code segments that may have been added to implement a core AI invention. If the functionality of the new code segments can be inferred, infringement of a core AI invention can be demonstrated.
New training methods can be identified in a similar way, particularly through the functions responsible for pre-processing the training data. A commercial product is unlikely to use custom functions for standard data processing techniques; standard libraries would most likely be used to implement processes such as image rotation or convolution. As a result, the particular combination of such techniques used to build a new training method can often be determined from decompiled code.
The presence of ongoing training can also be inferred if a model shows drift. For example, an ML model that undergoes no training (i.e. a static model), when iteratively provided the same input, would be expected to produce the same output for each iteration. On the other hand, if the iterative output drifted away from the initial output, this would indicate that the model is being trained (i.e. it is a non-static model).
Is it viable?
In principle, it appears that given a compiled executable, many commercial implementations of ML algorithms can be reverse engineered to a level that can be used to infer patent infringement. Nevertheless, the time and expense required for such a process is likely to make it unfeasible in many circumstances.
It will also be appreciated that the above solutions apply in circumstances where the executable is available for decompilation. However, many ML models will be made available as services over the cloud, effectively keeping their executable code behind closed doors.
In these circumstances, the only available mechanism for obtaining information is to query the service. In some circumstances, a targeted querying strategy can make it possible to obtain certain details about the underlying model. For example, a dataset can be built up through repeated querying (in the manner of a model extraction attack), and in some circumstances the type of model and architectural details can be determined by targeted querying around a decision boundary.
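The idea of targeted querying around a decision boundary can be illustrated with a minimal sketch. The threshold classifier below is a toy stand-in for a cloud-hosted model; the querier only sees labels, yet can locate the hidden boundary by binary search between two differently labelled inputs.

```python
# Hedged sketch of targeted querying: given only black-box label access,
# binary-search between an input labelled 0 and an input labelled 1 to
# locate the decision boundary. The classifier below is a toy stand-in.

def black_box_classify(x):
    # Hidden decision rule the querier is trying to characterise.
    return 1 if x >= 0.37 else 0

def locate_boundary(classify, lo, hi, tol=1e-6):
    """Binary search for the boundary between lo (label 0) and hi (label 1)."""
    assert classify(lo) == 0 and classify(hi) == 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if classify(mid) == 1:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

boundary = locate_boundary(black_box_classify, 0.0, 1.0)
print(round(boundary, 2))  # 0.37
```

Repeating this probe in many directions is, in essence, how extraction-style attacks build up a picture of a model that is only accessible as a cloud service.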
It’s worth patenting AI
Given the complexity of the ML models currently in use, it is easy to assume that they simply cannot be reverse engineered. But even from our brief discussion, it became clear that broad, sweeping statements like this just aren’t true; given enough time, AI models can be reverse engineered (as is the case with many other programs).
Whether this is commercially viable on a frequent basis is another question. However, even partial decompilation may be enough to justify legal action, especially given that the disclosure phase of lawsuits could be used to obtain more complete information.
Finally, even if the time and cost factors of reverse engineering are off-putting, this should not prevent an applicant from patenting AI, as future developments in AI regulation and standardisation will impact the ability of commercial ML models to stay hidden. For example, the European Union is actively working on proposals to regulate AI. These will likely require making the inner workings of AI products used in certain industries transparent to ensure they comply with European regulatory requirements.
Applicants should therefore think carefully before forgoing patent protection for their AI inventions on the basis of the general assumption that all AI infringements are undetectable.
Previous articles by Haseltine Lake Kempner authors in this series can be found here.