The Privacy Paradox
A novelist sits alone at midnight, sharing the first pages of an unfinished manuscript with an AI. In the soft glow of the screen, words flow like whispered confessions – dreams, doubts, half-formed ideas seeking shape in the digital dark. The AI responds with thoughtful precision, each suggestion feeling like a trusted mentor leaning in close, understanding not just the words, but the spaces between them.
But who else is in the room?
We’ve grown comfortable with AI as our digital confidant, treating it like a tool we can pick up and set aside. But modern AI isn’t passive. It’s a student who never forgets a lesson. Every creative spark we share, every vulnerable question we ask, every rough draft we explore becomes part of its evolution. That data, those midnight confessions, don’t always stay between you and your digital companion.
The old internet warning – “If you’re not paying for the product, you are the product” – has deeper resonance in the AI era. Because today, the product doesn’t just observe you. It understands you.
How AI Uses Your Data: What You Should Know
Most people imagine their AI conversations dissolving into digital mist when they close their browser, an ephemeral moment that vanishes like breath on glass. The reality is more complex and much more permanent. Some platforms preserve your words like ancient scribes, archiving each interaction. Others treat your conversations as teachings, using them to shape future responses. Many do both, making each interaction part of the record and part of the lesson.
When you speak to an AI, you’re either talking to a meticulous librarian who catalogs your every word or a student who is learning from every interaction. Often, you’re speaking to both at once.
The patterns matter as much as the words. Even when an AI doesn’t keep your exact phrasing, it tracks your thinking: when you use it, which features you prefer, how you structure your thoughts. These digital breadcrumbs might seem insignificant alone, but together they create a detailed map of how your mind works.
Let’s dive into some of the most popular Large Language Models (LLMs) available today and compare what data they collect, how they use it, and how you can protect yourself.
The Open Harvesters: Free AI and Your Data
If an AI tool is free, you are not just a user; you’re the raw material. Some companies admit this outright. Others hide it behind dense policy language. But the reality is simple: free AI is not free. You are paying with your thoughts, patterns, and creative ideas.
Meta’s AI Ecosystem: The Ultimate Data Machine
Meta doesn’t just collect data; it’s founded on it.
Facebook, Instagram, and WhatsApp create a vast network of user interactions, conversations, and behavioral patterns. With Meta AI and Llama, the company has added a new layer to that system, transforming everyday interactions into training data.
What Meta Collects:
- Conversations with Meta’s AI models.
- Public posts, captions, and engagement data.
- Behavioral metadata, including usage patterns.
How It Uses Your Data:
- Meta AI is trained on your content.
- Your behavior influences Meta’s ad and engagement algorithms.
- No meaningful opt-out exists.
How to Protect Yourself: If you care about privacy, avoid Meta’s AI models.
DeepSeek: The Hidden Giant
DeepSeek has quickly become a major force in AI, positioning itself as an open-source alternative to proprietary models. But “open” does not mean private.
What DeepSeek Collects:
- All user interactions.
- Data scraped from publicly available sources.
- Data aggregated from external services.
How It Uses Your Data:
- DeepSeek stores user interactions and may use them for retraining.
- The model is trained on various external data sources.
- User data is stored in regions lacking privacy protections.
How to Protect Yourself: Assume anything you type into DeepSeek is being stored and can be linked directly to you.
The Premium Promise: Paid Services and Privacy Claims
Paying for an AI tool doesn’t automatically protect your data. Some subscription-based companies still collect interactions for model improvements, while others offer real privacy safeguards. Knowing which is which matters more than the price tag.
Microsoft Copilot: A Tale of Two Privacy Policies
Microsoft’s Copilot runs on OpenAI’s models, but the version you use determines how private it is.
What Microsoft Collects:
- Free-tier interactions are stored and analyzed.
- Microsoft 365 Copilot users receive stronger assurances that their data isn’t used for AI training.
- Microsoft tracks metadata across Windows, Office, and Edge.
How It Uses Your Data:
- Free-tier interactions can be used to improve Microsoft’s AI models.
- Paid versions of Copilot (like Microsoft 365 Copilot) do not learn from user interactions.
- Microsoft’s broader ecosystem tracks user behavior across its platforms.
How to Protect Yourself: Use paid Copilot versions for better privacy controls and review Windows telemetry settings.
OpenAI’s ChatGPT: Choose Your Privacy Level
OpenAI provides more transparency than most AI providers, but privacy depends on the tier you use.
What OpenAI Collects:
- The conversations of free-tier users are stored and used for training.
- ChatGPT Plus subscribers can opt out of training through data controls, but metadata is still logged.
- Enterprise and API users receive the highest level of privacy, with no training on user data.
How to Protect Yourself: Turn off “Improve the model for everyone” in ChatGPT’s data control settings.
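For developers who want the API-tier privacy described above, the most direct route is to call the model through OpenAI’s API rather than the consumer chat app. Below is a minimal sketch using the official openai Python SDK; the model name and prompt are illustrative assumptions, and the API key is assumed to live in the OPENAI_API_KEY environment variable.

```python
# Minimal sketch: sending a prompt through OpenAI's API tier, which the
# section above notes is not used to train models on your data.
# Assumes the official `openai` Python SDK (pip install openai) and an
# API key stored in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name; substitute your own
    messages=[
        {"role": "user", "content": "Suggest a tighter opening line for my draft."},
    ],
)

print(response.choices[0].message.content)
```

The same request typed into the free chat interface would fall under the consumer data policies described above.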
Google Gemini: The Cost of Integration
Google is one of the largest data collectors in the world, and its AI models, such as Gemini, reflect that fact.
What Google Collects:
- Interactions of free-tier Gemini users are stored and analyzed.
- Gemini Advanced users receive stronger privacy protections but remain within Google’s data ecosystem.
- Behavioral metadata is collected across Gmail, Docs, and Search.
How It Uses Your Data:
- Interactions of free-tier Gemini users are used for training.
- Data from enterprise users is logged for analysis.
- Cross-service tracking means your AI interactions can be linked to your activity across other Google products.
How to Protect Yourself: Use Gemini Advanced for stronger privacy controls and review Google privacy settings to reduce tracking.
Anthropic’s Claude: The Privacy-First Alternative
Anthropic positions Claude as a privacy-conscious AI model, with stronger safeguards than most competitors.
What Anthropic Collects:
- Data from free users is stored for analysis.
- Claude Pro and API users receive the highest level of privacy, with no training on user interactions.
- Minimal metadata logging compared to competitors.
How It Uses Your Data:
- Free users’ conversations are reviewed by Anthropic’s internal teams for analysis.
- Claude Pro and API users are assured that their conversations are not used for model improvements.
How to Protect Yourself: Use Claude Pro or API plans for better privacy.
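As with OpenAI, the API route is the strongest privacy option Anthropic offers. Here is a comparable minimal sketch using the official anthropic Python SDK; again, the model alias and prompt are illustrative assumptions, and the API key is assumed to be in the ANTHROPIC_API_KEY environment variable.

```python
# Minimal sketch: sending a prompt through Anthropic's API, which the section
# above describes as not being used for model improvements.
# Assumes the official `anthropic` Python SDK (pip install anthropic) and an
# API key stored in the ANTHROPIC_API_KEY environment variable.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-5-sonnet-latest",  # illustrative model alias
    max_tokens=500,
    messages=[
        {"role": "user", "content": "Give feedback on this paragraph."},
    ],
)

print(message.content[0].text)
```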
The Privacy Toolkit: Protecting Your Digital Self
In the AI era, privacy is something you build. While no AI provider makes it effortless, you can take steps to protect your data.
Start With Settings. Most AI platforms have privacy controls, but they are often buried in menus. Adjust them:
- OpenAI: Turn off “Improve the model for everyone” under Data Controls.
- Google: Disable cross-service tracking.
- Microsoft: Use paid versions for stronger privacy.
The Golden Rule of Free AI: Treat every interaction like a conversation in a crowded room. Someone is always listening. Never share:
- Financial details
- Personal information
- Trade secrets
- Unpublished creative work
Paid Plans Offer More Privacy, But Not Total Protection:
- ChatGPT Plus lets you opt out of training on your conversations but still logs metadata.
- Gemini Advanced limits AI training but remains connected to Google’s ecosystem.
- Enterprise and API accounts offer the highest level of data protection.
Making Informed Choices: The Price of Digital Intimacy
In quiet moments when we share our thoughts with AI – our creative sparks, professional insights, half-formed dreams – we’re not just using a tool. We’re participating in a new type of relationship with technology, one that promises understanding but demands vulnerability in return. And like any meaningful relationship, it requires us to think carefully about what, when, and why we share.
AI privacy isn’t a simple yes or no; it’s a spectrum of choices with trade-offs.
- If convenience is your priority, treat AI like a thoughtful but indiscreet friend. Adjust your privacy settings, be mindful of what you share, and accept that your words might echo beyond your intended audience.
- If security is your compass, invest in paid plans. Enterprise plans and API access offer the digital equivalent of soundproof rooms – spaces where your conversations stay truly yours.
- If transparency guides you, support companies that prioritize privacy. The AI market will ultimately reflect our demands.
AI privacy isn’t about hiding, but choosing, consciously and carefully, what parts of ourselves to share with a technology that never forgets. In these choices, we are not just protecting our data. We’re shaping the future of human-AI interaction, one conversation at a time.
AI’s power lies both in what it can do and in what we allow it to do.
If you’d like to take this discussion further but don’t know where to turn, we can help. Reach out to Doyon Technology Group today at connect@doyontechgroup.com to get the conversation started with an AI expert.
––––––
About the Author
Greg Starling serves as the Head of Emerging Technology for Doyon Technology Group. He has been a thought leader for the past twenty years, focusing on technology trends, and has contributed to published articles in Forbes, Wired, Inc., Mashable, and Entrepreneur magazines. He holds multiple patents and has been twice named as Innovator of the Year by the Journal Record. Greg also runs one of the largest AI information communities worldwide.
Doyon Technology Group (DTG), a subsidiary of Doyon, Limited, was established in 2023 in Anchorage, Alaska to manage the Doyon portfolio of technology companies: Arctic Information Technology (Arctic IT®), Arctic IT Government Solutions, and designDATA. DTG companies offer a variety of technology services including managed services, cybersecurity, and professional software implementations and support for cloud business applications.