
Adversarial machine learning explained: How attackers disrupt AI and ML systems

As more companies deploy artificial intelligence (AI) and machine learning (ML) projects, securing them becomes more important. A report released by IBM and Morning Consult in May, covering more than 7,500 companies worldwide, found that 35% of companies are already using AI, a 13% increase from last year, while a further 42% are exploring it. However, nearly 20% of companies say they have problems securing data and that this is slowing down AI adoption.

In a Gartner survey conducted last spring, security concerns were among the biggest obstacles to AI adoption, tied for first place with the complexity of integrating AI solutions into existing infrastructure.

According to a paper Microsoft released last spring, 90% of organizations are not prepared to defend against adversarial machine learning. Of the 28 organizations, large and small, covered in the report, 25 did not have the tools they need to secure their ML systems.

Securing AI and machine learning systems poses significant challenges. Some are not unique to AI. For example, AI and ML systems need data, and if that data contains sensitive or proprietary information, it becomes a target for attackers. Other aspects of AI and ML security are new, including defending against adversarial machine learning.

What is adversarial machine learning?

Despite what the name suggests, adversarial machine learning is not a form of machine learning. Rather, it is a set of techniques that adversaries use to attack machine learning systems.

“Adversarial machine learning exploits vulnerabilities and specifics of ML models,” says Alexey Rubtsov, senior research associate at the Global Risk Institute and professor at Toronto Metropolitan University, formerly known as Ryerson. He is the author of a recent paper on adversarial machine learning in financial services.

For example, adversarial machine learning can be used to trick ML-based trading algorithms into making bad trading decisions, make fraudulent transactions harder to detect, provide inaccurate financial advice, and manipulate reports based on sentiment analysis.

Types of adversarial machine learning attacks

According to Rubtsov, adversarial machine learning attacks fall into four main categories: poisoning, evasion, extraction, and inference.

1. Poisoning Attack

In a poisoning attack, an adversary manipulates the training data set, Rubtsov says. “For example, they intentionally distort it and the machine learns the wrong way.” Say your home has an AI-powered security camera. An attacker could walk past your house at 3 a.m. every morning and let their dog run across your lawn, setting off the security system. Eventually, you turn off those 3 a.m. alerts to stop being woken up by the dog. That dog walker is, in effect, supplying training data that says whatever happens at 3 a.m. every night is harmless. Once the system has been trained to ignore anything that happens at 3 a.m., the attacker strikes.
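To make the mechanics concrete, here is a minimal, illustrative sketch of label-flipping poisoning on a toy classifier. The feature names, thresholds, and the scikit-learn model are hypothetical choices for illustration, not something described in the article.

```python
# Minimal sketch of a data-poisoning attack: an attacker who can influence the
# training set flips labels on a targeted slice of the data so the model
# learns to treat that slice as benign. Toy data only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical features for the camera example: [hour_of_day, motion_intensity];
# label 1 means "raise an alert".
hours = rng.integers(0, 24, size=1000)
motion = rng.uniform(0.0, 1.0, size=1000)
X = np.column_stack([hours, motion]).astype(float)
y = (motion > 0.5).astype(int)                 # clean rule: strong motion -> alert

# The attacker's nightly dog walks poison the data: 3 a.m. events get labeled harmless.
y_poisoned = y.copy()
y_poisoned[hours == 3] = 0

clean_model = DecisionTreeClassifier(random_state=0).fit(X, y)
poisoned_model = DecisionTreeClassifier(random_state=0).fit(X, y_poisoned)

# A real 3 a.m. break-in with strong motion: the poisoned model tends to stay silent.
intrusion = np.array([[3.0, 0.95]])
print("clean model alert:   ", clean_model.predict(intrusion)[0])
print("poisoned model alert:", poisoned_model.predict(intrusion)[0])
```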

2. Evasion Attack

In an evasion attack, the model has already been trained, but the attacker subtly alters the input. “An example might be a stop sign that you put a sticker on, and the machine interprets it as a yield sign instead of a stop sign,” Rubtsov says.

In our dog walker example, the thief might put on a dog costume to break into your house. “The evasion attack is like an optical illusion for the machine,” Rubtsov says.
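As a rough illustration of how a small, targeted change to an input can flip a trained model's output, here is a sketch in the spirit of the fast gradient sign method, a standard technique from the adversarial ML literature rather than anything cited in this article. The data and model are toy placeholders.

```python
# Minimal sketch of an evasion attack: after training, nudge an input in the
# direction that most increases the model's loss so it gets misclassified.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))
w_true = rng.normal(size=10)
y = (X @ w_true > 0).astype(int)

model = LogisticRegression().fit(X, y)
w, b = model.coef_[0], model.intercept_[0]

x, y0 = X[0], y[0]
p = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # model's predicted probability of class 1
grad_x = (p - y0) * w                    # gradient of the log-loss w.r.t. the input

# Choose a step just large enough to push the point across the decision
# boundary (for illustration; a fixed small eps is more typical in practice).
eps = 1.2 * abs(x @ w + b) / np.abs(w).sum()
x_adv = x + eps * np.sign(grad_x)

print("original prediction: ", model.predict(x.reshape(1, -1))[0], " true label:", y0)
print("perturbed prediction:", model.predict(x_adv.reshape(1, -1))[0])
```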

3. Extraction Attack

In an extraction attack, the adversary builds a copy of your AI system. “Sometimes you can extract the model by just observing what input you give the model and what output it delivers,” Rubtsov says. “You poke the model and you see the reaction. If you can poke the model often enough, you can teach your own model to behave the same.”

In 2019, for example, a vulnerability in Proofpoint’s email security system generated email headers with an embedded score indicating how likely each message was to be spam. Using these scores, an attacker could build a copycat spam detection engine and craft spam emails that would evade detection.

If a company uses a commercial AI product, the adversary may also be able to get a copy of the model by purchasing it or using a service. For example, there are platforms available to attackers where they can test their malware against antivirus engines.

In the dog walking example, the attacker might grab a pair of binoculars to see what brand of security camera you have and buy the same one to figure out how to get around it.
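A minimal sketch of the query-and-imitate idea Rubtsov describes, sometimes called model stealing: probe a black-box “victim” model with many inputs, record its answers, and train a local surrogate on those pairs. Everything below (data, models, probe budget) is a made-up toy setup.

```python
# Minimal sketch of a model-extraction attack: treat the victim model as a
# black box, query it, and train a surrogate on the (input, answer) pairs.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)

# The victim: a model the attacker cannot inspect, only query.
X_private = rng.normal(size=(2000, 5))
y_private = (X_private[:, 0] + X_private[:, 1] ** 2 > 1).astype(int)
victim = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_private, y_private)

# The attacker: generate probe inputs, collect the victim's answers...
X_probe = rng.normal(size=(5000, 5))
y_probe = victim.predict(X_probe)

# ...and fit a surrogate that imitates the victim's behavior.
surrogate = RandomForestClassifier(n_estimators=50, random_state=1).fit(X_probe, y_probe)

X_test = rng.normal(size=(1000, 5))
agreement = (surrogate.predict(X_test) == victim.predict(X_test)).mean()
print(f"surrogate agrees with victim on {agreement:.0%} of fresh inputs")
```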

4. Inference Attack

In an inference attack, the adversary figures out which training data set was used to train the system and exploits vulnerabilities or biases in that data. “If you can figure out the training data, you can use common sense or advanced techniques to take advantage of that,” Rubtsov says.

For example, in the dog walking scenario, the adversary might stake out the house to learn the normal traffic patterns in the area, notice that a dog walker comes by every morning at 3 a.m., and deduce that the system has learned a bias and now ignores people walking their dogs.
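One well-studied flavor of inference attack is membership inference: guessing whether a particular record was in the training set from how confidently the model handles it. The sketch below is a toy illustration of that idea under assumed data and models, not a technique attributed to Rubtsov.

```python
# Minimal sketch of membership inference: overfit models tend to be far more
# confident on records they were trained on, so confidence leaks membership.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 8))
y = (X.sum(axis=1) + rng.normal(scale=2.0, size=400) > 0).astype(int)

X_train, y_train = X[:200], y[:200]   # members of the training set
X_out = X[200:]                       # non-members drawn from the same distribution

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

def top_confidence(m, data):
    # Highest predicted class probability for each record.
    return m.predict_proba(data).max(axis=1)

# An attacker can threshold this confidence to guess who was in the training set.
member_conf = top_confidence(model, X_train).mean()
outsider_conf = top_confidence(model, X_out).mean()
print(f"avg confidence on training members: {member_conf:.2f}, on outsiders: {outsider_conf:.2f}")
```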

Defending against adversarial machine learning

Rubtsov advises companies to ensure that their training data sets are not biased and that the adversary cannot intentionally damage the data. “Some machine learning models use reinforcement learning and learning on the fly as new data comes in,” he says. “In that case, you have to be careful how you handle new data.”
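One possible way to act on that advice is to screen incoming samples with an outlier detector trained on vetted data before letting them into the retraining pool. The sketch below illustrates the idea with hypothetical data; it is only a partial mitigation, not a complete poisoning defense.

```python
# Screen incoming data with an outlier detector fitted on trusted history
# before it is allowed to influence an online-learning model.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(4)
trusted = rng.normal(loc=0.0, scale=1.0, size=(1000, 4))   # vetted historical data

screen = IsolationForest(random_state=0).fit(trusted)

incoming = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(95, 4)),           # ordinary new samples
    rng.normal(loc=6.0, scale=0.5, size=(5, 4)),            # suspicious outliers
])

keep = screen.predict(incoming) == 1                        # +1 = inlier, -1 = outlier
print(f"accepted {keep.sum()} of {len(incoming)} incoming samples for retraining")
```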

When using a third-party system, Rubtsov advises companies to ask the suppliers how they protect their systems against attacks by adversaries. “Many suppliers have nothing in place,” he says. “They’re not aware of it.”

Most attacks that work against conventional software can also be used against AI, according to Gartner, so many traditional security measures can be used to defend AI systems as well. For example, solutions that protect data from unauthorized access or compromise can also protect training data sets from tampering.

Gartner also recommends that companies take additional steps when they need to protect AI and machine learning systems. First, to protect the integrity of AI models, Gartner recommends that companies adopt trustworthy AI principles and run validation checks on models. Second, to protect the integrity of AI training data, Gartner recommends using data poisoning detection technology.

Known for its industry-standard ATT&CK framework of adversarial tactics and techniques, MITRE collaborated with Microsoft and 11 other organizations to create an attack framework for AI systems called the Adversarial Machine Learning Threat Matrix. It has since been renamed the Adversarial Threat Landscape for Artificial-Intelligence Systems (ATLAS) and covers 12 stages of attacks on ML systems.

Some vendors have started releasing tools to help companies secure their AI systems and defend against adversarial machine learning. In May 2021, Microsoft released Counterfit, an open-source automation tool for security testing of AI systems. “This tool grew out of our own need to assess Microsoft’s AI systems for vulnerabilities,” said Will Pearce, Microsoft’s AI red team lead for Azure Trustworthy ML, in a blog post. “Counterfit started out as a corpus of attack scripts written specifically to target individual AI models, then turned into a generic automation tool to attack multiple AI systems at scale. Today, we routinely use Counterfit as part of our AI red team operations.”

The tool is useful for automating techniques in MITRE’s ATLAS attack framework, Pearce said, but it can also be used during the AI development phase to catch vulnerabilities before they make it into production.

IBM also has an open-source adversarial machine learning defense tool called the Adversarial Robustness Toolbox, now run as a Linux Foundation project. The project supports all popular ML frameworks and includes 39 attack modules that fall into the four main categories of evasion, poisoning, extraction, and inference.
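A short sketch of what using the toolbox can look like: wrap an existing scikit-learn classifier and run one of its evasion attack modules against it. The module paths and parameters shown assume the package layout current at the time of writing (installed as adversarial-robustness-toolbox) and may differ across releases.

```python
# Generate evasion examples against a scikit-learn model with ART.
#   pip install adversarial-robustness-toolbox
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

classifier = SklearnClassifier(model=model)                 # ART wrapper around the model
attack = FastGradientMethod(estimator=classifier, eps=0.5)  # one of ART's evasion modules
X_adv = attack.generate(x=X)                                # perturbed copies of the inputs

clean_acc = (model.predict(X) == y).mean()
adv_acc = (model.predict(X_adv) == y).mean()
print(f"accuracy on clean inputs: {clean_acc:.0%}, on adversarial inputs: {adv_acc:.0%}")
```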

Fighting AI with AI

In the future, attackers could also use machine learning to launch attacks against other ML systems, said Murat Kantarcioglu, a computer science professor at the University of Texas. For example, one new type of AI is the generative adversarial network. These are most often used to create deepfakes, highly realistic photos or videos that can trick people into thinking they are real. Attackers typically use them for online scams, but the same principle can be applied to, say, creating undetectable malware.

“In a generative adversarial network, one part is called the discriminator and the other part is the generator, and they attack each other,” said Kantarcioglu. For example, an antivirus AI might try to determine whether something is malware. A malware-generating AI might try to create malware that the first system cannot catch. By repeatedly pitting the two systems against each other, the end result can be malware that is almost impossible to detect.
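The generator-versus-discriminator loop Kantarcioglu describes can be sketched in a few dozen lines of PyTorch on harmless toy data, here a generator learning to imitate a one-dimensional Gaussian. The architecture and hyperparameters are arbitrary choices for illustration.

```python
# Generic generator-vs-discriminator training loop on toy 1-D data.
import torch
import torch.nn as nn

torch.manual_seed(0)

generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0      # "real" samples drawn from N(3, 0.5)
    fake = generator(torch.randn(64, 8))       # the generator's forgeries

    # Discriminator tries to tell real (label 1) from fake (label 0).
    d_loss = loss_fn(discriminator(real), torch.ones(64, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(64, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator tries to make the discriminator call its forgeries real.
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

print("mean of generated samples:", generator(torch.randn(1000, 8)).mean().item())
```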
