Data & Privacy

How we handle your data with security, transparency, and respect for your requirements.

When you work with AI systems, especially ones trained on or connected to your proprietary data, trust is essential. You need to know exactly how your data is handled, where it's stored, who has access to it, and what safeguards are in place.

This page explains our approach to data handling and privacy. We design systems with security and privacy as foundational requirements, not afterthoughts. Our default is to minimize data handling and maximize your control.

Core Principles

1. Minimize Raw Data Handling

Wherever possible, we avoid storing or handling your raw proprietary documents and data. Instead, we work with derived artifacts—embeddings, summaries, metadata, and structured extracts—that preserve utility while reducing exposure.

For example, when building a RAG system, we can create embeddings from your documents and store those embeddings in your infrastructure while the original documents remain in your existing document management system. The retrieval step searches the embeddings, not the full documents.
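To make the separation concrete, here is a minimal sketch of an index that keeps only vectors and document IDs. It is an illustration under stated assumptions, not a description of any production system: `toy_embed` is a hypothetical stand-in for a real embedding model (a hashed bag-of-words), and `IndexEntry`, `build_index`, and `search` are names invented for this example.

```python
import hashlib
import math
from dataclasses import dataclass

DIM = 256  # toy embedding size, chosen only for illustration


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        bucket = int(hashlib.sha256(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


@dataclass
class IndexEntry:
    doc_id: str          # pointer back into the client's own document system
    vector: list[float]  # derived artifact; the raw text is not stored here


def build_index(documents: dict[str, str]) -> list[IndexEntry]:
    """Embed each document and keep only its ID and vector."""
    return [IndexEntry(doc_id, toy_embed(text)) for doc_id, text in documents.items()]


def search(index: list[IndexEntry], query: str, top_k: int = 1) -> list[str]:
    """Return the IDs of the entries most similar to the query (cosine similarity)."""
    q = toy_embed(query)
    ranked = sorted(index, key=lambda e: -sum(a * b for a, b in zip(q, e.vector)))
    return [entry.doc_id for entry in ranked[:top_k]]
```

The search returns document IDs only. Fetching the matching text, if it is needed, would go through the client's own document system and its existing access controls.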

2. Client-Side and Transient Processing

When we do need to process raw data—for fine-tuning models, creating embeddings, or performing one-time data transformations—we design the workflow so that data is either processed entirely within your environment, or processed on our side transiently and then deleted.

We do not create permanent copies of your proprietary data on our infrastructure unless you explicitly choose that architecture and approve it. When transient processing is needed, we use secure transfer protocols, encrypted storage, and automated deletion policies.
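As a rough sketch of the transient pattern, the example below stages raw bytes in a scratch directory, derives a result, and lets the directory be removed automatically when processing ends. It assumes Python's standard `tempfile` module and a hypothetical `process_transiently` helper. It does not show the secure transfer or encrypted storage mentioned above, and deleting a file this way does not securely wipe the underlying disk.

```python
import tempfile
from pathlib import Path


def process_transiently(raw_bytes: bytes) -> dict:
    """Stage raw data in a scratch directory, derive a result, then drop the scratch copy."""
    with tempfile.TemporaryDirectory(prefix="transient-") as scratch:
        staged = Path(scratch) / "input.bin"
        staged.write_bytes(raw_bytes)

        # Derive only what is needed (here, a word count) from the staged copy.
        text = staged.read_bytes().decode("utf-8", errors="replace")
        result = {"word_count": len(text.split()), "scratch_path": str(staged)}

    # The scratch directory and everything in it are removed when the block exits.
    return result
```

The `scratch_path` field is returned only so a caller can verify that the staged file no longer exists after the call.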

3. No Cross-Client Data Use

We never use your data to train models for other clients. Your data is yours. If we fine-tune a model using your proprietary information, that model is exclusively for your use. If we build a RAG system on your knowledge base, no other client has access to those embeddings or retrieved content.

This is non-negotiable. We do not aggregate or reuse client data, even in anonymized form, for our own purposes or for other client projects.
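One way to picture this isolation is an index partitioned by client, where every lookup is scoped to the caller's own partition. The sketch below is illustrative only; `TenantScopedIndex` and its methods are hypothetical names, and a real deployment would more likely rely on separate databases, accounts, or infrastructure per client.

```python
class TenantScopedIndex:
    """Toy embedding store partitioned by client ID."""

    def __init__(self) -> None:
        self._by_client: dict[str, dict[str, list[float]]] = {}

    def put(self, client_id: str, doc_id: str, vector: list[float]) -> None:
        """Store a vector inside the given client's partition."""
        self._by_client.setdefault(client_id, {})[doc_id] = vector

    def get(self, client_id: str, doc_id: str) -> list[float]:
        """Look up a vector; only the caller's own partition is ever searched."""
        try:
            return self._by_client[client_id][doc_id]
        except KeyError:
            raise KeyError(f"{doc_id!r} not found for client {client_id!r}") from None
```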

4. Infrastructure Choice and Control

We are infrastructure-agnostic. If your requirements dictate that all processing and storage must happen within your own cloud account, on-premises, or in a specific geographic region, we design for that. You control where data lives and where computation happens.

For many clients, this means models and embeddings run entirely in their AWS, Azure, or GCP environment, with no data leaving their control. For others, on-prem or dedicated hardware makes sense. We adapt to your constraints, not the other way around.

5. Minimal Logging and Telemetry

Application logs and telemetry are necessary for debugging and monitoring, but they can inadvertently capture sensitive data. We design logging to avoid capturing raw content wherever possible. When logs are necessary, we anonymize or redact sensitive fields and enforce strict retention policies.

For systems hosted on your infrastructure, you control logging configuration and retention. For systems we host, we default to minimal logging and will work with you to define what is acceptable.
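Redacting sensitive fields before they reach a log can be sketched with a filter from Python's standard `logging` module. This is a minimal example, not an exhaustive one: the two patterns (email addresses and US Social Security numbers) and the names `redact` and `RedactingFilter` are assumptions made for illustration, and a real system would need patterns matched to its own data.

```python
import logging
import re

# Illustrative patterns only; real deployments need patterns for their own data.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(message: str) -> str:
    """Replace email addresses and SSN-shaped numbers with placeholders."""
    message = EMAIL.sub("[REDACTED_EMAIL]", message)
    return SSN.sub("[REDACTED_SSN]", message)


class RedactingFilter(logging.Filter):
    """Rewrite each log record so handlers only ever see the redacted message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # getMessage() merges args into the message, so redaction covers both.
        record.msg = redact(record.getMessage())
        record.args = None
        return True  # keep the record, now redacted
```

Attaching the filter to a handler, for example with `handler.addFilter(RedactingFilter())`, applies it to every record that handler emits.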

Compliance & Regulatory Considerations

Many of our clients operate in regulated industries—legal, healthcare, financial services—where data handling is not just a preference but a legal requirement. We take these constraints seriously and design systems that meet your regulatory obligations.

HIPAA Compliance (Healthcare)

For healthcare organizations, we can design systems where Protected Health Information (PHI) never leaves your HIPAA-compliant infrastructure. Models and embeddings run on your servers or in a cloud environment covered by a Business Associate Agreement (BAA). We will sign BAAs where required and appropriate.

Attorney-Client Privilege (Legal)

For law firms and legal departments, we ensure that privileged communications and case materials are processed and stored in ways that preserve privilege. This typically means on-premises or private cloud infrastructure with strict access controls and audit logging.

Financial Regulations (FINRA, SEC, etc.)

For financial services firms, we help navigate data residency requirements, audit trails, and recordkeeping obligations. We design systems that log what's required while protecting client confidentiality and meeting regulatory retention and deletion policies.

GDPR and Data Privacy Laws

For organizations subject to GDPR or similar privacy frameworks, we support data minimization, purpose limitation, access controls, and the ability to fulfill data subject requests (access, deletion, portability). Systems can be designed to run entirely within specific geographic regions if required.
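The three data subject requests named above (access, deletion, portability) can be sketched as operations on a store keyed by subject. The example is a simplified in-memory illustration; `SubjectDataStore` and its methods are hypothetical, and a real system would also have to cover backups, logs, and derived artifacts such as embeddings.

```python
import json


class SubjectDataStore:
    """Toy store that keeps records grouped by data subject."""

    def __init__(self) -> None:
        self._records: dict[str, list[dict]] = {}

    def add(self, subject_id: str, record: dict) -> None:
        self._records.setdefault(subject_id, []).append(record)

    def access(self, subject_id: str) -> list[dict]:
        """Access request: return a copy of everything held about the subject."""
        return [dict(r) for r in self._records.get(subject_id, [])]

    def export(self, subject_id: str) -> str:
        """Portability request: return the subject's records as JSON."""
        return json.dumps(self.access(subject_id), sort_keys=True)

    def delete(self, subject_id: str) -> int:
        """Deletion request: remove the subject's records and report how many were removed."""
        return len(self._records.pop(subject_id, []))
```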

Security Practices

Beyond privacy, we follow security best practices in all systems we build:

Encryption

Data in transit is encrypted using TLS. Data at rest is encrypted using industry-standard methods (AES-256 or equivalent). Encryption keys are managed securely and, where possible, controlled by you.
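For the in-transit half, a client-side TLS configuration can be sketched with Python's standard `ssl` module. The TLS 1.2 floor and the `make_client_tls_context` helper are assumptions for this example; the appropriate minimum version depends on your requirements. Encryption at rest is not shown because it typically relies on a cloud key management service or a third-party library.

```python
import ssl


def make_client_tls_context() -> ssl.SSLContext:
    """Build a client TLS context that verifies certificates and refuses old protocol versions."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Assumed floor for this sketch: reject anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```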

Access Controls

We implement least-privilege access controls. Only authorized personnel have access to your systems and data, and access is logged and auditable. We support SSO and multi-factor authentication where applicable.
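A least-privilege check with an audit trail can be sketched as follows. The roles, actions, and the `AccessController` class are hypothetical and chosen only to illustrate the idea: each role is granted an explicit set of actions, anything not granted is denied, and every decision is recorded.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical roles and actions; each role gets only what it needs.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "engineer": {"read", "deploy"},
    "admin": {"read", "deploy", "manage_keys"},
}


@dataclass
class AccessController:
    audit_log: list[dict] = field(default_factory=list)

    def check(self, user: str, role: str, action: str) -> bool:
        """Allow the action only if the role explicitly grants it, and log the decision."""
        allowed = action in ROLE_PERMISSIONS.get(role, set())  # unknown roles get nothing
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "action": action,
            "allowed": allowed,
        })
        return allowed
```

Denied attempts are logged alongside granted ones, so the audit trail also shows who tried to do something they were not permitted to do.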

Monitoring & Alerts

Production systems include monitoring for anomalous behavior, unauthorized access attempts, and performance degradation. Alerts are configured to notify appropriate parties promptly.

Secure Development

We follow secure coding practices, perform code reviews, monitor dependencies for known vulnerabilities, and test systems before deployment. We can work within your security review and approval processes.

Website Analytics

We use Plausible Analytics to understand how visitors use this website. Plausible is a privacy-friendly analytics platform that aligns with our commitment to respecting your data.

What This Means for You

→ Cookieless: Plausible does not use cookies or store any personal data in your browser.
→ Aggregated data only: We collect aggregated metrics (page views, visitor counts, referrers) to understand overall site usage and improve our content.
→ No individual profiling: We do not create profiles of individual visitors or track you across sessions or websites.
→ No third-party sharing: Your browsing behavior is not shared with advertisers or other third parties.

This approach allows us to make data-informed decisions about the site without compromising your privacy. If you'd like to learn more about Plausible's data practices, see their data policy.

Transparency & Questions

Every organization has different privacy and security requirements. What's appropriate for one client may not be sufficient for another. We don't believe in one-size-fits-all policies.

During discovery, we ask detailed questions about your requirements, constraints, and risk tolerance. We explain tradeoffs clearly—for example, how different architecture choices affect cost, performance, and security. Our goal is to help you make informed decisions, not to dictate a single approach.

If you have specific questions about how we would handle your data, or if you need to review technical architecture diagrams, data flow documentation, or security certifications as part of your evaluation, we're happy to provide those. Privacy and security are too important to be vague about.

Have Questions About Data Handling?

Let's discuss your specific privacy and security requirements in detail.

Schedule a Confidential Conversation

Or reach us via email: hello@fifthsuit.ai