In May 2023, Samsung Electronics prohibited its employees from using generative artificial intelligence (AI) tools like ChatGPT. The ban was issued in an official memo, after discovering that staff had uploaded sensitive code to the platform, which prompted security and privacy concerns for stakeholders, fearing sensitive data leakage. Apple and several Wall Street Banks have also enforced similar bans.
While generative AI contributes to increased efficiency and productivity in businesses, what makes it susceptible to security risks is also its core function: taking the user’s input (prompt) to generate content (response), such as text, codes, images, videos, and audio in different formats. The multiple sources of data, the involvement of third-party systems, and human factors influencing the adoption of generative AI add to the complexity. Failing to properly prepare for and manage security and privacy issues that come with using generative AI may expose businesses to potential legal repercussions.
Safety depends on where data is stored
So, the question becomes, how can businesses use generative AI safely? The answer resides in where the user’s data (prompts and responses) gets stored. The data storage location in turn depends on how the business is using generative AI, of which there are two main methods.
Off-shelf tools: The first method is to use ready-made tools, like OpenAI’s ChatGPT, Microsoft’s Bing Copilot, and Google’s Bard. These are, in fact, nothing but applications with user interfaces that allow them to interact with the base technology that is underneath, namely large language models (LLMs). LLMs are pieces of code that tell machines how to respond to the prompt, enabled by their training on huge amounts of data.
In the case of off-the-shelf tools, data resides in the service provider’s servers—OpenAI’s in the instance of ChatGPT. As a part of the provider’s databases, users have no control over the data they provide to the tool, which can cause great dangers, like sensitive data leakage.
How the service provider treats user data depends on each platform’s end-user license agreement (EULA). Different platforms have different EULAs, and the same platform typically has different ones for its free and premium services. Even the same service may change its terms and conditions as the tool develops. Many platforms have already changed their legal bindings over their short existence.
In-house tools: The second way is to build a private in-house tool, usually by directly deploying one of the LLMs on private servers or less commonly by building an LLM from scratch.
Within this structure, data resides in the organization’s private servers, whether they are on-premises or on the cloud. This means that the business can have far more control over the data processed by its generative AI tool.
Ensuring the security of off-the-shelf tools
Ready-made tools exempt users from the high cost of technology and talent needed to develop their own or outsource the task to a third party. That is why many organizations have no alternative but to use what is on the market, like ChatGPT. The risks of using off-the-shelf generative AI tools can be mitigated by doing the following:
Review the EULAs. In this case, it is crucial to not engage with these tools haphazardly. First, organizations should survey the available options and consider the EULAs of the ones of interest, in addition to their cost and use cases. This includes keeping an eye on the EULAs even after adoption as they are subject to change.
Establish internal policies. When a tool is picked for adoption, businesses need to formulate their own policies on how and when their employees may use it. This includes what sort of tasks can be entrusted to AI and what information or data can be fed into the service provider’s algorithms.
As a rule of thumb, it is advisable not to throw sensitive data and information into others’ servers. Still, it is up to each organization to settle on what constitutes “sensitive data” and what level of risk it is willing to tolerate that can be weighed out by the benefits of the tool adoption.
Ensuring the security of in-house tools
The big corporations that banned the use of third-party services ended up developing their internal generative AI tools instead and incorporated them into their operations. In addition to the significant security advantages, developing in-house tools allows for their fine-tuning and orienting to be domain and task-specific, not to mention gaining full control over their interface user experience.
Check the technical specifications. Developing in-house tools, however, does not absolve organizations from security obligations. Typically, internal tools are built on top of an LLM that is developed by a tech corporation, like Meta AI’s LLaMa, Google’s BERT, or Hugging Face’s BLOOM. Such major models, especially open-source ones, are developed with high-level security and privacy measures, but each has its limitations and strengths.
Therefore, it would still be crucial to first review the adopted model’s technical guide and understand how it works, which would not only lead to better security but also a more accurate estimation of technical requirements.
Initiate a trial period. Even in the case of building the LLM from scratch, and in all cases of AI tool development, it is imperative to test the tool and enhance it both during and after development to ensure safe operation before being rolled out. This includes fortifying the tool against prompt injections, which can be used to manipulate the tool to perform damaging cyber-attacks that include leaking sensitive data even if they reside in internal servers.
Parting words: be wary of hype
While on the surface, the hype surrounding generative AI offers vast possibilities, lurking in the depths of its promise are significant security risks that must not be overlooked. In the case of using ready-made tools, rigorous policies should be formulated to ensure safe usage. And in the case of in-house tool deployment, safety measures must be incorporated into the process to prevent manipulation and misuse. In both cases, the promises of technology must not blind companies to the very real threat to their sensitive and private information.