27 changes: 27 additions & 0 deletions .github/workflows/lint.yml
@@ -0,0 +1,27 @@
name: Lint
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  lint:
    name: Defaults & Misspelling
    runs-on: ubuntu-latest

    steps:

      - name: Setup Go
        uses: actions/setup-go@v6
        with:
          go-version: "1.25"

      - name: Check out code
        uses: actions/checkout@v4

      - name: Lint
        uses: golangci/golangci-lint-action@v9
        with:
          version: v2.12
          args: --enable misspell
32 changes: 32 additions & 0 deletions .github/workflows/module.yml
@@ -0,0 +1,32 @@
name: Go Module

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest
    strategy:
      matrix:
        go-version: ["1.25", "1.26"]

    steps:
      - name: Setup Go
        uses: actions/setup-go@v6
        with:
          go-version: ${{ matrix.go-version }}

      - name: Check out code
        uses: actions/checkout@v4

      - name: Install Dependencies
        run: go mod download
        env:
          GOPROXY: https://proxy.golang.org,direct

      - name: Run Unit Tests
        run: make test
25 changes: 25 additions & 0 deletions .github/workflows/security.yml
@@ -0,0 +1,25 @@
name: Go Security Checker
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  gosec:
    name: Inspect for security problems
    runs-on: ubuntu-latest

    steps:
      - name: Setup Go
        uses: actions/setup-go@v6
        with:
          go-version: "1.25"

      - name: Check out code
        uses: actions/checkout@v4

      - name: Run Gosec scanner
        uses: securego/gosec@master
        with:
          args: -exclude-dir=privacytest ./...
18 changes: 6 additions & 12 deletions Makefile
@@ -1,19 +1,13 @@
-# DOCKER_NETWORK = lambda-local
+.PHONY: lint security test test/cov_html test/cov_total bench bench/profile doc

-# DYNAMODB_PORT = 8070
-# DYNAMODB_VOLUME = dynamodb-local-v2.0
-
-# KMS_PORT = 8090
-
-# export DYNAMODB_ENDPOINT = http://localhost:$(DYNAMODB_PORT)
-# export KMS_ENDPOINT = http://localhost:$(KMS_PORT)
-
-.PHONY: lint
 lint:
 	golangci-lint run --enable misspell

+security:
+	gosec -exclude-dir=privacytest ./...
+
 test:
-	packages=`go list ./... | grep -v privacytest`; \
+	packages=$$(go list ./... | grep -v privacytest); \
 	go test -race -cover $$packages -coverprofile coverage.out -covermode atomic

 test/cov_html:
@@ -29,4 +23,4 @@ bench/profile:
 	go tool pprof -alloc_objects mem.prof

 doc:
-	godoc -http=:6060
+	godoc -http=:6060
159 changes: 159 additions & 0 deletions README.md
@@ -0,0 +1,159 @@
privacy-engine
============
[![Go Module](https://github.com/ln80/privacy-engine/actions/workflows/module.yml/badge.svg)](https://github.com/ln80/privacy-engine/actions/workflows/module.yml)
[![GoDoc](https://godoc.org/github.com/ln80/privacy-engine?status.svg)](https://godoc.org/github.com/ln80/privacy-engine)

A Go library for field-level encryption, crypto-shredding, and tokenization of sensitive data in structs. Built on top of [struct-sensitive](https://github.com/ln80/struct-sensitive), it uses struct tags to identify PII fields and applies AES-256-GCM encryption per data subject.

Designed for immutable stores (event logs, audit trails) where you can't delete records but need to comply with data erasure requirements (GDPR Article 17) via cryptographic erasure.

## Installation

```bash
go get github.com/ln80/privacy-engine
```

## Features

- **Field-level encryption** using AES-256-GCM with per-subject data encryption keys (DEK)
- **Crypto-shredding** with graceful mode: disable keys first, recover within a grace period, then hard-delete
- **Streaming encryption** for large payloads with chunk-based authenticated encryption
- **Tokenization** to replace sensitive identifiers with opaque surrogate tokens
- **Key derivation** via HKDF-SHA256 for purpose-scoped keys from a subject's DEK
- **Multi-tenancy** with namespace isolation and a Factory for managing Protector instances
- **Pluggable backends** via `KeyEngine`, `Encryptor`, and `TokenEngine` interfaces
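
As a rough illustration of the key-derivation feature, the sketch below derives a purpose-scoped key from a subject's DEK with HKDF-SHA256, written against the standard library only. The salt string and the length-prefixed `info` layout follow `DeriveSubjectKey` in `protector.go`; the function names `hkdfSHA256` and `deriveSubjectKey` and the 32-byte output length are assumptions made for this example, not the library's API.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/binary"
	"errors"
	"fmt"
)

// hkdfSHA256 is a small HKDF (RFC 5869) extract-then-expand helper.
// It is a hypothetical stand-in for a real HKDF package.
func hkdfSHA256(secret, salt, info []byte, n int) ([]byte, error) {
	if n <= 0 || n > 255*sha256.Size {
		return nil, errors.New("invalid output length")
	}
	// Extract: PRK = HMAC(salt, secret)
	ext := hmac.New(sha256.New, salt)
	ext.Write(secret)
	prk := ext.Sum(nil)

	// Expand: T(i) = HMAC(PRK, T(i-1) || info || i)
	var out, prev []byte
	for counter := byte(1); len(out) < n; counter++ {
		exp := hmac.New(sha256.New, prk)
		exp.Write(prev)
		exp.Write(info)
		exp.Write([]byte{counter})
		prev = exp.Sum(nil)
		out = append(out, prev...)
	}
	return out[:n], nil
}

// deriveSubjectKey builds info = uint16(len(subID)) || subID || purpose,
// so that ("ab", "c") and ("a", "bc") cannot collide.
// The 32-byte output length is an assumption for this sketch.
func deriveSubjectKey(dek []byte, subID, purpose string) ([]byte, error) {
	if subID == "" {
		return nil, errors.New("empty subject id")
	}
	if len(subID) > 1<<16-1 {
		return nil, errors.New("subject id too long")
	}
	info := make([]byte, 2+len(subID)+len(purpose))
	binary.BigEndian.PutUint16(info, uint16(len(subID)))
	copy(info[2:], subID)
	copy(info[2+len(subID):], purpose)
	return hkdfSHA256(dek, []byte("privacy-engine-v1"), info, 32)
}

func main() {
	dek := make([]byte, 32) // all-zero key, for demonstration only
	search, _ := deriveSubjectKey(dek, "user-123", "search-index")
	export, _ := deriveSubjectKey(dek, "user-123", "export")
	fmt.Printf("search: %x\nexport: %x\n", search, export)
}
```

Because every derived key depends on the subject's DEK, forgetting the DEK makes all purpose-scoped keys unrecoverable as well.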

## Quick Start

```go
import (
	"context"

	"github.com/ln80/privacy-engine"
	"github.com/ln80/privacy-engine/memory"
)

type User struct {
	ID      string `pii:"subjectID"`
	Email   string `pii:"data,replace=redacted"`
	Country string
}

func main() {
	ctx := context.Background()
	protector := privacy.NewProtector("my-namespace", memory.NewKeyEngine())

	user := User{ID: "user-123", Email: "alice@example.com", Country: "BE"}

	// Encrypt PII fields in-place
	_ = protector.Encrypt(ctx, &user)
	// user.Email is now: "ENC..dXNlci0xMjM=.Base64CipherText..."
	// user.Country is unchanged

	// Decrypt back
	_ = protector.Decrypt(ctx, &user)
	// user.Email is "alice@example.com" again

	// Crypto-shred: forget the subject's key
	_ = protector.Encrypt(ctx, &user)
	_ = protector.Forget(ctx, "user-123")
	_ = protector.Decrypt(ctx, &user)
	// user.Email is now "redacted" (from the replace tag option)
}
```

## Struct Tags

Fields are tagged using `pii`, `sensitive`, or `sens` (interchangeable):

| Tag | Purpose | Example |
|-----|---------|---------|
| `subjectID` | Identifies the data subject (one per struct) | `pii:"subjectID"` |
| `data` | Marks a field as sensitive | `pii:"data"` |
| `data,replace=X` | Replacement value when the subject is forgotten | `pii:"data,replace=deleted"` |
| `dive` | Recurse into nested structs | `pii:"dive"` |

## Architecture

```
┌──────────────┐
│  Protector   │  ← Main API: Encrypt, Decrypt, Forget, Recover, Tokenize
└──────┬───────┘
 ┌─────┴─────┐  ┌─────────────┐  ┌─────────────┐
 │ KeyEngine │  │  Encryptor  │  │ TokenEngine │
 │(keys CRUD)│  │(AES-256-GCM)│  │(value↔token)│
 └───────────┘  └─────────────┘  └─────────────┘
```

- **KeyEngine** manages encryption key lifecycle (create, get, disable, re-enable, delete). Implementations: in-memory (for tests), DynamoDB + KMS (production, see [privacy-engine.elastic](https://github.com/ln80/privacy-engine.elastic)).
- **Encryptor** handles the actual encryption. Default: AES-256-GCM with random nonces and namespace-bound AAD.
- **TokenEngine** manages value-to-token mappings for pseudonymization. Optional.
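
To make "random nonces and namespace-bound AAD" concrete, here is a minimal standard-library sketch of AES-256-GCM where the namespace is passed as additional authenticated data. The helpers `seal` and `open` and the nonce-prefixed layout are assumptions for illustration; the built-in Encryptor may frame its output differently.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD; a 32-byte key selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext under a fresh random nonce and binds the
// namespace as AAD. The nonce is prepended to the returned bytes
// (an assumed layout for this sketch).
func seal(key []byte, namespace string, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, []byte(namespace)), nil
}

// open reverses seal. It fails if the key, the ciphertext, or the
// namespace differs from what was used when sealing.
func open(key []byte, namespace string, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, []byte(namespace))
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, err := seal(key, "tenant-a", []byte("alice@example.com"))
	if err != nil {
		panic(err)
	}
	plain, err := open(key, "tenant-a", sealed)
	fmt.Println(string(plain), err)

	// The same bytes presented under another namespace fail authentication.
	_, err = open(key, "tenant-b", sealed)
	fmt.Println("cross-namespace open failed:", err != nil)
}
```

Binding the namespace as AAD means a ciphertext copied from one tenant to another is rejected even if both tenants somehow shared a key.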

## Tokenization

Replace sensitive identifiers with opaque tokens early in the pipeline:

```go
tokens, _ := protector.Tokenize(ctx, privacy.TokenDataSlice("alice@example.com"), privacy.WithPrefix("sub_"))
surrogateID := tokens.Get("alice@example.com").Token
// Use surrogateID downstream instead of the real email
```

## Streaming Encryption

For large payloads (files, attachments):

```go
encReader, _ := protector.EncryptStream(ctx, "user-123", plaintextReader)
// encReader emits authenticated ciphertext in 4MB chunks

decReader, _ := protector.DecryptStream(ctx, "user-123", encReader)
// decReader emits the original plaintext
```
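
One way chunk-based authenticated encryption can work is to give every chunk its own nonce, derived by XOR-ing a chunk counter into the tail of a random base nonce, as `deriveNonce` in `aes/util.go` does. The sketch below is an illustration under that assumption: `sealChunks` and `openChunks` are hypothetical helpers, the chunks are tiny byte slices rather than 4MB reads, and end-of-stream truncation is not handled.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// deriveNonce XORs the big-endian counter into the last 8 bytes of a
// copy of base. base must be at least 8 bytes long.
func deriveNonce(base []byte, counter uint64) []byte {
	nonce := make([]byte, len(base))
	copy(nonce, base)
	for i := 0; i < 8; i++ {
		nonce[len(nonce)-1-i] ^= byte(counter >> (8 * i))
	}
	return nonce
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealChunks encrypts each chunk under the nonce for its index.
func sealChunks(aead cipher.AEAD, base []byte, chunks [][]byte) [][]byte {
	out := make([][]byte, len(chunks))
	for i, c := range chunks {
		out[i] = aead.Seal(nil, deriveNonce(base, uint64(i)), c, nil)
	}
	return out
}

// openChunks decrypts chunks in order; a reordered or modified chunk
// fails authentication because its nonce no longer matches its index.
func openChunks(aead cipher.AEAD, base []byte, sealed [][]byte) ([][]byte, error) {
	out := make([][]byte, len(sealed))
	for i, c := range sealed {
		p, err := aead.Open(nil, deriveNonce(base, uint64(i)), c, nil)
		if err != nil {
			return nil, fmt.Errorf("chunk %d: %w", i, err)
		}
		out[i] = p
	}
	return out, nil
}

func main() {
	aead, err := newGCM(make([]byte, 32)) // all-zero key, for demonstration only
	if err != nil {
		panic(err)
	}
	base := make([]byte, aead.NonceSize())
	if _, err := rand.Read(base); err != nil {
		panic(err)
	}

	sealed := sealChunks(aead, base, [][]byte{[]byte("first"), []byte("second")})
	plain, err := openChunks(aead, base, sealed)
	fmt.Println(len(plain), err)

	// Swapping two chunks is detected.
	sealed[0], sealed[1] = sealed[1], sealed[0]
	_, err = openChunks(aead, base, sealed)
	fmt.Println("reordered stream rejected:", err != nil)
}
```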

## Factory & Monitoring

For multi-tenant applications, use the Factory to manage one Protector per namespace:

```go
factory := privacy.NewFactory(func(namespace string) privacy.Protector {
	return privacy.NewProtector(namespace, keyEngine, opts...)
})

// Periodically clears key caches and evicts idle protectors
factory.Monitor(ctx)

protector, clearFn := factory.Instance("tenant-abc")
defer clearFn()
```

## Configuration

`NewProtector` accepts functional options:

| Option | Default | Description |
|--------|---------|-------------|
| `CacheEnabled` | `true` | Wrap engines with in-memory TTL cache |
| `CacheTTL` | `20s` | Cache time-to-live |
| `GracefulMode` | `true` | Disable keys before deleting (allows recovery) |
| `Encryptor` | AES-256-GCM | Encryption algorithm |
| `TokenEngine` | `nil` | Token engine for pseudonymization |

## Production Backend

See [privacy-engine.elastic](https://github.com/ln80/privacy-engine.elastic) for a serverless implementation using AWS DynamoDB (key/token storage) and KMS (master key management), deployable via SAM.

## Wire Format

Encrypted field values are stored as:

```
ENC.<version>.<base64(subjectID)>.<base64(ciphertext)>
```

The `ENC.` prefix is used for idempotency detection (fields already encrypted are not re-encrypted). Legacy `<pii:...` format is accepted during decryption for backward compatibility.
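
As a rough sketch of how a consumer might inspect this format, the snippet below splits an encrypted value into its parts. `parseEnvelope`, `isEncrypted`, and the `envelope` type are hypothetical names, not part of the library, and standard padded base64 is assumed because the sample values in this README contain `+` and `=`.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// envelope holds the decoded parts of an ENC-prefixed value.
type envelope struct {
	Version    string
	SubjectID  string
	Ciphertext []byte
}

// isEncrypted reports whether a value already carries the ENC. prefix,
// which is how re-encryption of an encrypted field can be skipped.
func isEncrypted(s string) bool {
	return strings.HasPrefix(s, "ENC.")
}

// parseEnvelope splits ENC.<version>.<base64(subjectID)>.<base64(ciphertext)>.
// Splitting on "." is safe because "." is not in the base64 alphabet.
func parseEnvelope(s string) (envelope, error) {
	parts := strings.SplitN(s, ".", 4)
	if len(parts) != 4 || parts[0] != "ENC" {
		return envelope{}, errors.New("not an ENC envelope")
	}
	sub, err := base64.StdEncoding.DecodeString(parts[2])
	if err != nil {
		return envelope{}, fmt.Errorf("subject id: %w", err)
	}
	ct, err := base64.StdEncoding.DecodeString(parts[3])
	if err != nil {
		return envelope{}, fmt.Errorf("ciphertext: %w", err)
	}
	return envelope{Version: parts[1], SubjectID: string(sub), Ciphertext: ct}, nil
}

func main() {
	// "aGVsbG8=" is base64("hello"), used here as placeholder ciphertext.
	e, err := parseEnvelope("ENC..dXNlci0xMjM=.aGVsbG8=")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q %q %q\n", e.Version, e.SubjectID, e.Ciphertext)
	// prints: "" "user-123" "hello"
}
```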

## License

MIT
6 changes: 3 additions & 3 deletions aes/256_gcm.go
@@ -113,7 +113,7 @@ func (e *aes256gcm) EncryptStream(namespace string, key core.Key, r io.Reader) (
 	pr, pw := io.Pipe()

 	go func() {
-		defer pw.Close()
+		defer pw.CloseWithError(nil) //nolint:errcheck // CloseWithError always returns nil

 		// Write header
 		header := make([]byte, 1+len(baseNonce))
@@ -140,7 +140,7 @@ func (e *aes256gcm) EncryptStream(namespace string, key core.Key, r io.Reader) (
 			chunkIndex++

 			var lenBuf [4]byte
-			binary.BigEndian.PutUint32(lenBuf[:], uint32(len(ciphertext)))
+			binary.BigEndian.PutUint32(lenBuf[:], uint32(len(ciphertext))) // #nosec G115 -- ciphertext bounded by 4MB chunk size

 			if _, err := pw.Write(lenBuf[:]); err != nil {
 				pw.CloseWithError(err)
@@ -196,7 +196,7 @@ func (e *aes256gcm) DecryptStream(namespace string, key core.Key, r io.Reader) (
 	pr, pw := io.Pipe()

 	go func() {
-		defer pw.Close()
+		defer pw.CloseWithError(nil) //nolint:errcheck // CloseWithError always returns nil

 		var chunkIndex uint64

2 changes: 1 addition & 1 deletion aes/util.go
@@ -29,7 +29,7 @@ func deriveNonce(base []byte, counter uint64) []byte {
 	copy(nonce, base)

 	for i := 0; i < 8; i++ {
-		nonce[len(nonce)-1-i] ^= byte(counter >> (8 * i))
+		nonce[len(nonce)-1-i] ^= byte(counter >> (8 * i)) // #nosec G115 -- intentional byte extraction
 	}

 	return nonce
6 changes: 3 additions & 3 deletions example_test.go
@@ -28,7 +28,7 @@ func Example() {
 	newProtector := func(namespace string) privacy.Protector {
 		return privacy.NewProtector(namespace, memory.NewKeyEngine(), func(pc *privacy.ProtectorConfig) {
 			// Token engine is optional.
-			// if not provided, the protector service will panic when trying to Tokenize/Detokenize sensitive data
+			// If not provided, tokenization methods return core.ErrTokenEngineNotConfigured.
 			pc.TokenEngine = memory.NewTokenEngine()

 			// If cache is enabled then the service will decorates engines
@@ -88,8 +88,8 @@ func Example() {

 	// Encrypted Output ex:
 	// Profile{
-	// Email: "<pii::NDQ1ZDRhYTMtNWUwNS00MDcxLWEwNzAtMDlhMTM5MTFkM2Ex:7Q61HTCUT+XZtzzGp3HsVoHk6o74kwdEHqY46kB4eflXnRwswgHRVlApRg7mp4bNH5zSppV2u40=",
-	// Fullname: "<pii::NDQ1ZDRhYTMtNWUwNS00MDcxLWEwNzAtMDlhMTM5MTFkM2Ex:mNyZmcvUHTAKTMC+uY6f77bJ3sZ5+NYBZwWKj8zZ0sA4j8mOPz8188sV",
+	// Email: "ENC..NDQ1ZDRhYTMtNWUwNS00MDcxLWEwNzAtMDlhMTM5MTFkM2Ex.7Q61HTCUT+XZtzzGp3HsVoHk6o74kwdEHqY46kB4eflXnRwswgHRVlApRg7mp4bNH5zSppV2u40=",
+	// Fullname: "ENC..NDQ1ZDRhYTMtNWUwNS00MDcxLWEwNzAtMDlhMTM5MTFkM2Ex.mNyZmcvUHTAKTMC+uY6f77bJ3sZ5+NYBZwWKj8zZ0sA4j8mOPz8188sV",
 	// Role: "Teacher",
 	// }

5 changes: 4 additions & 1 deletion protector.go
@@ -378,6 +378,9 @@ func (p *protector) DeriveSubjectKey(ctx context.Context, subID, purpose string)
 	if subID == "" {
 		return nil, errors.New("empty subject id")
 	}
+	if len(subID) > 1<<16-1 {
+		return nil, errors.New("subject id too long")
+	}

 	keys, err := p.KeyEngine.GetKeys(ctx, p.namespace, []string{subID})
 	if err != nil {
@@ -390,7 +393,7 @@ func (p *protector) DeriveSubjectKey(ctx context.Context, subID, purpose string)
 	}

 	info := make([]byte, 2+len(subID)+len(purpose))
-	binary.BigEndian.PutUint16(info, uint16(len(subID)))
+	binary.BigEndian.PutUint16(info, uint16(len(subID))) // #nosec G115 -- length validated above
 	copy(info[2:], subID)
 	copy(info[2+len(subID):], purpose)
 	r := hkdf.New(sha256.New, parentKey, []byte("privacy-engine-v1"), info)