Data Masking and Obfuscation

This project is a data masking and obfuscation application. It runs as a proxy service between a client and its backend service and masks the configured data that passes through it. It is deployed as a Docker container.

Configuration

Since the masking application is a Spring Boot application, it is configured through an application-<profile>.yml file in the src/main/resources directory.

Basic Configuration

The configuration has the following main directives:

  • R2DBC database connection
  • Logging
  • Masking Configuration

The masking properties are the most important ones. The R2DBC database connection lets the masking application keep track of masked data mappings, so that it can map masked URI parameters from frontend requests back to their original values.

spring:
  r2dbc:
    url: r2dbc:postgresql://db:5432/masking_db
    username: masking
    password: "${MASKING_DB_PASSWORD}"

logging:
  level:
    ch.fhnw.datamasking: INFO

masking:
  http-proxy:
    authentication-token-parameter-name: token
    frontend:
      auth-token: "${MASKING_FRONTEND_TOKEN}"
    backend:
      token: "${MASKING_BACKEND_TOKEN}"
      base-url: "http://${BACKEND_URL}:${BACKEND_PORT}"
  pepper: "${MASKING_PEPPER:pepper}"
  pseudolists:
    - /nachnamen.txt
    - /vornamen.txt
  paths:
    - path: /Studierende
      properties:
        - property-path: studierendeId
          function: FORMAT
          type: LONG
          config: "########"
          persistence:
            entity-name: student-id
        - property-path: nachname
          function: SUBSTITUTE
          type: STRING
          config: /nachnamen.txt
        - property-path: bild
          function: REPLACE
          type: STRING
          config: "*****"
        - property-path: emailAdmin
          function: FORMAT
          type: STRING
          config: 'aaaaaaaaaaa@aaaaaa.com'
        - property-path: studienjahrgangAnmeldungen.studierendeId
          function: FORMAT
          type: LONG
          config: "########"
          persistence:
            entity-name: student-id

Masking directive

The masking directive holds the main application configuration. The following table describes its sub-directives. Leading dots indicate the nesting level:

| Directive | Description |
| --- | --- |
| http-proxy | Block for the HTTP proxy related configuration |
| .authentication-token-parameter-name | Name of the GET parameter that carries the token |
| .frontend | Block for the frontend configuration |
| ..auth-token | The token that the frontend sends to the masking application. It authenticates the frontend with the masking application. |
| .backend | Block for the backend configuration |
| ..token | The token that the masking application sends to the backend. It authenticates the masking application with the backend. |
| ..base-url | The base URL of the backend service |
| pepper | The pepper that is used for the masking algorithm |
| pseudolists | The list of pseudonym lists that are used for substitution-based masking |
| paths | The list of paths that are proxied to the backend and require masking |
| .path | The path that requires masking. Supports path parameters, see Path Parameters Configuration. |
| .properties | The list of properties that require masking |
| ..property-path | The path to the property that requires masking. Supports nested properties, see Property Path Configuration. |
| ..function | The masking function that is applied to the property. One of REPLACE, FORMAT, SUBSTITUTE. |
| ..type | The type of the property. One of STRING, LONG, DOUBLE. |
| ..config | The configuration for the masking function, see Masking Function Configuration. |
| ..persistence | The persistence configuration for the masked data, see Persistence Configuration. |

Persistence Configuration

Masked values are stored in the database only if the persistence configuration is set. The entity-name and the original value together form a composite key. The entity-name also links the masked property to a path parameter. If multiple paths return the same entity's property, use the same entity name for all of them. Persisted properties are assumed to be identifiers, so collision detection is in place for them.
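
As an illustration, the following fragment sketches two paths that return the same student identifier and therefore share one entity name. The /Anmeldungen path is hypothetical and only serves as an example; the student-id entity name and the FORMAT settings are taken from the basic configuration above.

```yaml
masking:
  paths:
    - path: /Studierende
      properties:
        - property-path: studierendeId
          function: FORMAT
          type: LONG
          config: "########"
          persistence:
            entity-name: student-id
    # Hypothetical second endpoint that also returns student IDs
    - path: /Anmeldungen
      properties:
        - property-path: studierendeId
          function: FORMAT
          type: LONG
          config: "########"
          persistence:
            # Same entity name as above, so both paths share one mapping
            entity-name: student-id
```

With a shared entity name, both paths should resolve a given original ID to the same masked value.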

Path Parameters Configuration

The path configuration supports path parameters. Each path parameter is a path segment of the form {<data-type>:<entity-name?>}, where the entity name is optional. Permitted data types are long and string. If the entity name is omitted, the parameter's data type is validated, but no unmasking is performed.

Full path example: /Studierende/{long:student-id}/Subpath/{string}

In this example, the first path parameter must be a long and is linked to the entity name student-id. The second path parameter can be any string.

See the HttpPathMatcherService for details on how path parameters are matched.
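
A path entry with a parameter might look like the following sketch. It assumes that masked studierendeId values are persisted under the entity name student-id, as in the basic configuration above; the properties are reused from that example.

```yaml
masking:
  paths:
    # The {long:student-id} segment must be a long. Because it names the
    # student-id entity, a masked ID in the request URI can be mapped back
    # to its original value.
    - path: /Studierende/{long:student-id}
      properties:
        - property-path: studierendeId
          function: FORMAT
          type: LONG
          config: "########"
          persistence:
            entity-name: student-id
        - property-path: nachname
          function: SUBSTITUTE
          type: STRING
          config: /nachnamen.txt
```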

Property Path Configuration

The property path consists of attribute names joined by dots. It supports properties nested in objects and arrays. Arrays are stepped through "transparently", meaning that the masking configuration is applied to all elements of the array.

Given the following JSON object:

{
  "top": [
    {
      "nested": {
        "name": "Joel"
      }
    },
    {
      "nested": {
        "name": "Mike"
      }
    }
  ]
}

The property path for all names would be top.nested.name.
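
In the masking configuration, this property path could be used as in the sketch below. The /example path is a placeholder, and REPLACE with a fixed string is chosen only for illustration.

```yaml
masking:
  paths:
    - path: /example   # placeholder path
      properties:
        # Applies to the name of every element in the "top" array
        - property-path: top.nested.name
          function: REPLACE
          type: STRING
          config: "*****"
```

With this configuration, both "Joel" and "Mike" would be replaced by "*****".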

Masking Function Configuration

The meaning of the config value depends on the masking function. Currently, the value is always a string.

SUBSTITUTE: The configuration is the path to the pseudonym list on the classpath. The list must also be declared in the pseudolists directive.

REPLACE: The configuration is the replacement string.

FORMAT: The configuration is a format string. The following characters have a special meaning:

  • #: A digit
  • a: A lowercase letter
  • A: An uppercase letter

All other characters are treated as literals. Special characters can be escaped using a backslash (\).
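
The fragment below sketches a few format strings under the rules above, written as entries for the properties list of a path. The property names are hypothetical, and the sample values in the comments only illustrate the shape of a masked value, not actual output. The pattern that contains a backslash uses a single-quoted YAML string, because YAML does not interpret backslashes inside single quotes.

```yaml
properties:
  # Eight digits, e.g. 48201937
  - property-path: matrikelnummer   # hypothetical property
    function: FORMAT
    type: LONG
    config: "########"
  # Two uppercase letters, a literal hyphen, four digits, e.g. QX-5830
  - property-path: kursCode         # hypothetical property
    function: FORMAT
    type: STRING
    config: 'AA-####'
  # "a" is a placeholder, so the "a" in "Raum" is escaped to keep it literal;
  # this should yield "Raum " followed by three digits, e.g. Raum 417
  - property-path: raum             # hypothetical property
    function: FORMAT
    type: STRING
    config: 'R\aum ###'
```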

Building the application

After cloning this repository, you can build the application with the following Maven commands:

# Package into JAR
mvn package

# Build (non-native) Docker image
mvn spring-boot:build-image

# Build native (ahead-of-time compiled) Docker image
# You can use (append) the following arguments to customize the build (also for non-native images):
# "-Dspring-boot.build-image.imageName=$IMAGE_NAME_TAG" - set the image name
# "-Dspring-boot.build-image.publish=true"              - publish the image to a registry
# "-Dspring-boot.build-image.createdDate=now"           - set the image creation date (by default it is the epoch, for reproducible builds, but that doesn't play nicely with GitLab)
# "-Dspring.profiles.active=prd"                        - set the active profile(s)
# PLEASE NOTE: You need to activate your target profile to build the native image, because that influences the available classes at runtime
mvn -Pnative spring-boot:build-image
