Blog
Practical guides for teams handling personal data in AI pipelines, vendor exports, and test environments.
How to Strip PII Before Sending Data to ChatGPT
We were three weeks into building an internal support tool when someone asked: "Wait, are we sending customer emails to OpenAI?" We were. Every single ticket. Here's how we fixed it with one API call.
Anonymise CSV Data for GDPR Compliance (Without Losing Your Mind)
Last month a colleague in finance asked: "I need to send this spreadsheet to the new analytics vendor. It's got customer names and emails in it. Do I need to do anything with that?" Yes. Yes you do.
Your Staging Database Is Full of Real Customer Data (And Everyone Knows It)
Nobody talks about this, but almost every engineering team has done it: copied the production database into staging so the test environment has "realistic" data. Here's how to keep the realism without the risk.
We Failed a Client Security Questionnaire Because of Our Staging Database
We hit question 47 of an enterprise security questionnaire: "Do non-production environments contain real customer data?" The honest answer was yes, 38,000 customers. Here's what we did before the SOC 2 audit.
The EU AI Act Starts Enforcement in August 2026. Here's What That Means for Your LLM Pipeline.
Most LLM applications won't be classified as high-risk AI. But GDPR already imposes data minimisation and DPA obligations on AI pipelines, and the Act's enforcement deadline is making compliance teams look harder at data flows they've been overlooking.
We Accidentally Logged 6,000 Email Addresses to Datadog
A debug logging statement stayed in production for three months. 6,000 customers' email addresses ended up in a third-party logging platform. Here's how we stopped it happening again.
Our Analytics Vendor Asked for Customer Data. We Almost Sent It Unredacted.
Our marketing ops lead pulled a 47,000-row customer export and was about to attach it to a vendor reply. Here's why deleting the name column isn't enough, and what we do instead.
How to Build an Internal AI Tool Without Your Compliance Team Blocking It
The prototype gets blocked. The team is frustrated. Compliance isn't wrong. The fix isn't legal; it's architectural. Strip PII before the LLM call, and the compliance conversation goes very differently.
Stop Putting Real Customer Emails in Your CI Pipeline
500 real customers committed as test fixtures. Their names, emails, and phone numbers now in Git history, CI logs, and artefact storage. This isn't a theoretical concern; it's a data breach.
Your RAG Pipeline Is Leaking Customer Data Into Vector Embeddings
When you embed a document chunk containing a customer's name and address, that personal data lives in your vector store. Here's the GDPR problem hiding in your RAG system and how to fix it before ingestion.
I Reviewed 50 Companies' AI Privacy Policies. Most of Them Aren't Telling the Whole Story.
Of the 50, 43 used a third-party LLM API. Only 11 mentioned it in their privacy policy, and just 3 described any PII sanitisation. The rest rely on vague assurances and enterprise terms that don't say what they imply.
The GDPR Fine Calculator: What a Spreadsheet Leak Actually Costs
The ICO fines companies you've never heard of. A realistic cost breakdown for a mid-market spreadsheet incident: direct fine, legal costs, breach notifications, customer churn. The total might surprise you.
A Checklist for AI Features That Won't Get Blocked by Legal
After watching three AI feature projects get killed by compliance reviews, I started keeping notes on what ships and what stalls. The difference is always the same: data handling addressed before someone asked.