Are we giving away too much: Should we rein it in with generative AI?
Have you ever been told to keep your mouth shut? That you’re saying something you shouldn’t? Are you the king or queen of TMI?
In a work setting, we all know what we should and shouldn’t say out loud. Especially to random strangers. We know we shouldn’t share customer data or give away the company secrets that provide a competitive edge. Like that internal report about how you’re going to improve productivity among the workforce by giving everyone hoverboards to get to work (fun and efficient). That type of thing.
It seems most people don’t mind sharing that sort of information with an artificial intelligence, though. According to Cisco’s 2024 Data Privacy Benchmark Study, 48% of people were entering non-public information about their company into generative AI (GenAI) apps, and 69% were concerned GenAI could hurt their company’s legal rights and intellectual property (IP).
And yet, we seem to keep doing it.
Most of us have a basic understanding of how LLMs and GenAI are trained - on publicly available data - but until now, much of the conversation about privacy and IP concerns has been about our own personal security and the rights of artists and creators to protect their own work. There has been less talk about the implications of sharing our own work, boring as we might think it is, with a chatbot. We’ve all been encouraged to harness the productivity and efficiency benefits of GenAI, but at what cost?
Are we giving too much away when we use these tools?
How does GenAI work?
At this point, we’re going to assume everyone knows about GenAI. But just to make sure we’re on the same page, here’s a quick explanation of how it works using probably the best-known example: ChatGPT.
According to OpenAI, the developer of ChatGPT, it uses three data sources to train its models: information that’s publicly available on the internet; information from third-party partners; and information that users, human trainers and researchers provide and generate.
On that last point, that means anyone who uses ChatGPT is helping to train the model. According to OpenAI:
“When you share your content with us, it helps our models become more accurate and better at solving your specific problems and it also helps improve their general capabilities and safety. We don’t use your content to market our services or create advertising profiles of you—we use it to make our models more helpful. ChatGPT, for instance, improves by further training on the conversations people have with it, unless you opt out.”
When an LLM is trained, it reviews huge amounts of existing text and learns how words are likely to appear in context with other words. So, when it responds to a user’s prompt, it’s predicting the most likely words to use based on past examples. The more you use ChatGPT, the more you reinforce the likelihood that similar prompts will produce the same responses.
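To make that concrete, here’s a deliberately tiny sketch of the idea in Python: a bigram model that simply counts which word tends to follow which. Real LLMs use neural networks trained on billions of documents rather than word counts, and the names here (train, predict_next, the toy corpus) are ours purely for illustration - but the core mechanic of “predict the most likely next word from past examples” is the same.

```python
from collections import Counter, defaultdict

# Toy "training": count which word follows which in the corpus.
# Real LLMs learn far richer patterns, but the principle is similar.
def train(corpus: str) -> dict:
    words = corpus.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

# Toy "inference": predict the most frequently seen next word.
def predict_next(model: dict, word: str) -> str:
    followers = model.get(word)
    if not followers:
        return "<unknown>"
    return followers.most_common(1)[0][0]

corpus = "the cat sat on the mat . the cat sat on the sofa . the cat ran away"
model = train(corpus)
print(predict_next(model, "cat"))  # "sat" - seen twice, vs "ran" once
```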
What does this mean for you?
Well, if you’re sharing confidential information with ChatGPT and asking specific questions, over time it will learn to predict the correct response. If someone asks a similar question or prompt, there is a chance they could receive the same response as you. Which could include some of the confidential info you shared with the chatbot.
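Sticking with the toy model above (and our made-up hoverboard secret), here’s what that looks like in miniature. This is a simplified illustration of the risk, not a claim about how any specific chatbot behaves - but if a confidential phrase shows up often enough in training data, a simple prompt can pull it back out verbatim:

```python
# Continuing the toy model from above: if confidential text appears
# often enough in the training data, the model can parrot it back.
confidential = "project hoverboard launches in march"
corpus = (confidential + " . ") * 50 + "public chatter about the weather"
model = train(corpus)

# Prompt with just the first word and let the model complete it.
word = "project"
output = [word]
for _ in range(4):
    word = predict_next(model, word)
    output.append(word)

print(" ".join(output))  # "project hoverboard launches in march"
```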
But no-one’s going to ask that specific question, right?
Probably not accidentally. But someone might, on purpose. Like a hacker, or a competitor…
Don’t worry, it’s not a highly likely scenario - we don’t want to scaremonger. But the chance is there. And with ChatGPT-4o, the terms of use allow OpenAI to “share your personal information with affiliates, vendors, service providers and law enforcement”, so you can’t be sure exactly where your data will end up.
We also asked ChatGPT itself, by the way, and its response felt as clear as any politician’s… so we’ll leave it up to you whether you want to trust it with confidential data.

What does this mean for GenAI in social listening tools?
As we know, GenAI is being incorporated into all kinds of tech, including social listening platforms. So, should we also be worried about sharing potentially confidential information with, or typing revealing prompts into, these tools?
We spoke to a few of the vendors who have incorporated GenAI to understand how they’re storing and working with user inputs. In general, because data privacy is such a big issue - one that companies can get in a lot of trouble over - it’s high on the agenda when developing their tools.
For example, Runic, an AI-native agentic research and marketing platform built almost entirely out of LLMs, provides each customer with a dedicated data silo that includes all relevant data, prompts and any models associated with that customer. ViralMoment also stores prompts and data inputs securely. They aren’t shared with other clients, but they are available to their customer success team for troubleshooting purposes.
When it comes to training their models, ViralMoment never uses user data. They explain, “Our AI system is designed to process and analyse data to generate insights, but it does not retain or use this data for further model training. This ensures that user inputs remain confidential and are not used to improve the model in a way that could inadvertently expose sensitive information.” With Runic, each customer has their own set of models and any fine-tuning and training is customer-specific. They explain that “models used for one customer will never be used for another, and all data inputs are solely for that customer’s purposes.”
YouScan have also integrated GenAI through Insights Copilot. They use OpenAI as their LLM provider, meaning user queries are processed by OpenAI’s models. Anonymised user inputs are analysed and stored to improve the product by identifying common use cases, but the data isn’t used to train AI models, and OpenAI doesn’t retain API data after processing.
But are you allowed to use the results?
The other less talked about issue is who owns the outputs of GenAI, the IP. And when it is talked about, it’s usually in relation to creative outputs such as art, poetry etc. However, many people use GenAI to analyse data, compile reports or create social content. So who owns that? The company developing the AI? The user writing the prompts? The AI itself!?
After a bit of digging around, it seems like a complex issue that would need a whole other article. But when it comes to social listening platforms, the decision about who owns the IP of the output sits with the tech provider, so make sure to read the small print. In the case of the three vendors we talked to, all confirmed that the outputs generated by their GenAI models belong to the customer.
So, is GenAI ‘safe’ to use?
It’s fair to say that, depending on the GenAI tool you’re using, there is a risk that data could be exposed to either the LLM itself or to other less-than-desirable people. And while that risk might be small, it’s definitely something you can, and probably should, avoid.
When you’re using generic GenAI apps, even mainstream ones like ChatGPT, which may or may not be using your prompts and inputs to train their models, the safest thing to do is not to share any confidential or personal information with them. That way, there’s no chance of the wrong person getting hold of your company secrets.
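If you want a belt-and-braces habit, one option is to scrub obvious secrets out of a prompt before it ever leaves your machine. Here’s a minimal sketch in Python - the patterns and the redact_prompt function are our own illustrative assumptions, not a complete data-loss-prevention tool, so adapt the rules to whatever counts as sensitive where you work:

```python
import re

# Illustrative patterns only - extend these to match what your
# organisation actually considers confidential.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "api_key": re.compile(r"\b(?:sk|pk)-[A-Za-z0-9]{16,}\b"),
    "internal_doc": re.compile(r"\bCONFIDENTIAL[-_ ]\w+\b", re.IGNORECASE),
}

def redact_prompt(prompt: str) -> str:
    """Replace anything matching a sensitive pattern before sending."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED {label.upper()}]", prompt)
    return prompt

print(redact_prompt("Summarise CONFIDENTIAL-Q3 and email jo@acme.com the result."))
# Summarise [REDACTED INTERNAL_DOC] and email [REDACTED EMAIL] the result.
```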
When it comes to specialist tools, such as social listening platforms that have integrated GenAI, then it’s important to understand how they collect, store and use your data. Most will include this information on their website or in their terms of service, but if you’re unsure, just ask them. That’s what we did. That way you can make an informed decision based on how confident you are that they can protect your data from other customers and from outside threats.