The developer setup guide for developing Redbox on your own machine.
- Install Python
- Install Poetry
- Install Project Dependencies with Poetry
- Setup VSCode
- Setup Environment Variables
- Running the Project Locally
- Git Workflows
- LLM Evaluation
- Iconography
To ensure everyone uses the same Python version, follow one of the two options below (asdf or pyenv), depending on your preference or existing setup.
Note: If you already have pyenv set up and would like to switch to asdf, ensure you've commented out all pyenv initialization lines in your ~/.zshrc or ~/.bashrc. You may also need to restart your terminal and remove any existing pyenv venvs (check with which python and poetry env info; the base should point to the asdf-installed Python).
Installation instructions here
asdf plugin add python
From the project root:
asdf install python
This installs and sets the local Python version for the project.
Because asdf uses shims, Poetry needs to be told explicitly which Python to use. From the project root and each individual app, run:
poetry env use $(asdf which python)
Installation instructions here
Restart your terminal, or run source ~/.zshrc or source ~/.bashrc (whichever matches your shell).
Check the project's .tool-versions or pyproject.toml. Then, from the project root, run:
pyenv install $(awk '/^python / {print $2}' .tool-versions)
pyenv local $(awk '/^python / {print $2}' .tool-versions)
This sets the local version in the project repository.
Poetry will automatically detect the pyenv-managed Python version.
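To make clear what the awk expression above selects, here is a minimal Python sketch of the same lookup. The function name and the sample file contents are hypothetical and not part of the Redbox repository; unlike awk, it returns only the first matching line.

```python
from typing import Optional


def python_version_from_tool_versions(text: str) -> Optional[str]:
    """Return the version from the first line starting with 'python ', or None."""
    for line in text.splitlines():
        parts = line.split()
        # Mirrors awk '/^python / {print $2}': match the prefix, take field 2.
        if line.startswith("python ") and len(parts) >= 2:
            return parts[1]
    return None


# Hypothetical .tool-versions contents, for illustration only.
sample = "nodejs 20.11.0\npython 3.12.7\n"
print(python_version_from_tool_versions(sample))
```

If this returns None for your real .tool-versions, the pyenv commands above would receive an empty argument, so it is worth checking the file first.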
Poetry is not installed automatically when installing Python with asdf or pyenv, so you must install it once on your host machine.
- We recommend using Poetry's official installer:
curl -sSL https://install.python-poetry.org | python3 -
- After installation, ensure the Poetry binary is on your PATH. You may want to add the following to the end of your ~/.zshrc or ~/.bashrc (depending on which shell you use). This means the binary will be available whenever you open a new terminal.
export PATH="$HOME/.local/bin:$PATH"
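If Poetry still cannot be found after editing your shell profile, a quick check along these lines may help. This is a sketch only: the helper name dir_on_path is made up for illustration, and it assumes the installer placed the binary in ~/.local/bin as implied by the export line above.

```python
import os
import shutil
from pathlib import Path


def dir_on_path(directory: str, path_value: str) -> bool:
    """Return True if `directory` is one of the entries in a PATH-style string."""
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [
        os.path.normpath(os.path.expanduser(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    ]
    return target in entries


if __name__ == "__main__":
    local_bin = str(Path.home() / ".local" / "bin")
    print("~/.local/bin on PATH:", dir_on_path(local_bin, os.environ.get("PATH", "")))
    # shutil.which returns the executable's full path, or None if it is not found.
    print("poetry found at:", shutil.which("poetry"))
```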
Currently, we use Poetry to manage our Python packages. There are four pyproject.toml files:
- redbox - core AI package
- django-app - django webserver and background worker
- root - Integration tests, QA, and docs
- notebooks - Jupyter notebooks
Once Python has been configured and installed using either pyenv or asdf, and Poetry has been installed, run the following from each application's root directory (django_app, redbox, notebooks):
poetry install
Run these to confirm:
python --version
# Should output the correct Python version
poetry run python --version
# Should also output the correct Python version
# From each application root:
poetry env info
# Should show correct path to virtualenv using that Python version
VSCode is the IDE of choice. The .vscode/ directory is used for defining project-wide VSCode IDE settings.
Ensure your Python interpreter is set to the root venv Python binary (should be ./venv/bin/python or ./.venv/bin/python).
Once the correct interpreter is selected, it should display the pyproject.toml name, e.g. Python 3.12.7 (redbox-root-py3.12). Also, any new terminal opened in VSCode will automatically activate that environment (i.e. source venv/bin/activate or source .venv/bin/activate).
To make use of the VSCode Workspaces setup, open the workspace file .vscode/redbox.code-workspace. This will open the relevant services as roots in a single workspace. The recommended way to use this is:
- Create a venv in each of the main service directories (redbox, django-app). This should be in a directory called venv.
- Configure each workspace directory to use its own venv Python interpreter. NB: You may need to enter these manually when prompted, as ./venv/bin/python.
The tests should then all load separately and use their own env.
The devcontainer is not currently supported for project-wide dependency setup, so it is generally recommended to do development on your host machine.
We use .env files to populate the environment variables for local development. When you clone the repository, the .env.example file comes pre-populated.
To run the project:
cp .env.example .env
cp .aws/credentials.example .aws/credentials
Then set the relevant environment variables.
Typically this involves setting the following variables in .aws/credentials (after running cp .aws/credentials.example .aws/credentials):
- AWS_ACCESS_KEY
- AWS_SECRET_ACCESS_KEY
- AWS_SESSION_TOKEN
- AWS_CREDENTIAL_EXPIRATION - default 30
It is best to leave hostnames out of the .env file. These are then set manually by VSCode tasks or pulled from a deployment .env such as .env.test or .env.integration.
Redbox can use different backends for chat and embeddings; which ones are used is controlled by env vars. The default is currently Bedrock for both chat and embeddings, but other providers can be used (and pointed to their relevant compliant local service). The relevant env vars for overriding to use Bedrock's Titan model for embeddings are:
- EMBEDDING_BACKEND - usually amazon.titan-embed-text-v2:0
.env and .aws/credentials are in .gitignore and should not be committed to git.
How to run the project locally. This includes setting up AWS credentials.
To view all the build commands, check the Makefile that can be found here.
The project currently consists of multiple docker images needed to run the project in its entirety. If you only need a subsection of the project running, for example if you're only editing the django app, you can run a subset of the images. The images currently in the project are:
- elasticsearch
- minio
- db
- django-app
- worker
To build the images needed to run the project, use this command:
make build
or
docker compose build
Once those images have built, you can run them using:
make run
or
docker compose up
Some parts of the project can be run independently for development, for example the Django application, which can be run with:
docker compose up django-app
Sometimes, previous Docker runs may have used up too much memory. This needs to be flushed before running Docker again. You can use the following commands:
docker system prune --all --force
DOCKER_DEFAULT_PLATFORM=linux/amd64 docker compose build
DOCKER_DEFAULT_PLATFORM=linux/amd64 docker compose up
# The DOCKER_DEFAULT_PLATFORM=linux/amd64 prefix is only needed on certain macOS machines. You can omit it by adding the variable to your .envrc file.
We recommend installing direnv to avoid having to specify DOCKER_DEFAULT_PLATFORM for each docker command. To install:
brew install direnv
Then add the following to your ~/.zshrc or the equivalent file for your shell, as seen here.
eval "$(direnv hook zsh)"
plugins=(git direnv)
For any other commands available, check the Makefile here.
You can also choose to run the project with the VSCode Python Debugger, allowing you to create breakpoints in the code for programmatic inspection.
Warning
Please be aware this debugging implementation is relatively new and has some nuances due to deviations from the docker build configuration.
- Select a Python file to view the selected interpreter
- Set it to django_app/venv/bin/python or django_app/.venv/bin/python
- Interpreter should display as (redbox-app-py3.12)
- Go to the Run and Debug tab on the left side of the VSCode window
- Go to the green play button dropdown and select Full Stack Dev (Frontend + Django)
- Click the play button - this should spin up the dependency containers, build the frontend, and then run the main app with the Python debugger
- Open the command palette - CMD + Shift + P
- Select Tasks: Run Task
- Run Django: QCluster
Tests are split into different commands based on the application they are for. Each application has a separate make command to run its tests:
For the django app:
make test-django
For the core AI:
make test-redbox
For integration tests:
make test-integration
We'll need to create a superuser to log in to the Django admin page. To do this, run the following steps:
- Log into Redbox at http://localhost:8080/sign-in
- Run make superuser in your terminal
- Use the email you log into DBT services with (if you're not sure, navigate to http://localhost:8080/admin).
Once the app is up and running, head to http://localhost:8080/admin/redbox_core/chatllmbackend/
Create a new chat llm backend with the following:
Name:
# Example:
anthropic.claude-3-sonnet-20240229-v1:0
This may change over time. To get the correct ID, head to Amazon Bedrock in the AWS console > Foundation Models > Model catalog > Claude 3 Sonnet > Model ID.
Provider:
Bedrock
Is default:
True
Enabled:
True
Save and head to http://localhost:8080/admin/redbox_core/aisettings/
Ensure the default settings use the chat backend you just created and hit save again.
Chat and document uploads should now work as expected.
To receive responses from the LLM you will need access to the Redbox AWS account (see another member of the team about requesting access).
To configure your AWS profile, run the following command or manually update your ~/.aws/config file with assistance from another team member.
aws configure sso
Note: If using a non-default profile name (e.g. redbox), please make sure you create an .envrc file with the AWS_PROFILE value set. See .envrc.example
Once access has been provided and credentials configured, run the aws-login script in the project root and follow the instructions on-screen to connect.
./aws-login.sh
Once authenticated, you should have a .aws directory within both the project root and the notebooks app, each with a populated credentials file. This directory is listed in .gitignore and should NOT be committed.
Note: This script should be run periodically (daily) as the credentials expire relatively quickly.
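As a reminder aid, something like the following could flag when the credentials file has not been refreshed recently. It is a rough sketch, not part of the project: the 24-hour threshold is an assumption based on the "daily" guidance above, the file's modification time is only a proxy for when aws-login.sh was last run, and the real expiry may be shorter.

```python
import time
from pathlib import Path

# Assumption: treat credentials older than 24 hours as stale.
MAX_AGE_SECONDS = 24 * 60 * 60


def is_stale(modified_at: float, now: float, max_age: float = MAX_AGE_SECONDS) -> bool:
    """Return True if more than `max_age` seconds separate the two timestamps."""
    return (now - modified_at) > max_age


if __name__ == "__main__":
    credentials = Path(".aws/credentials")
    if not credentials.exists():
        print("No .aws/credentials file found; run ./aws-login.sh")
    elif is_stale(credentials.stat().st_mtime, time.time()):
        print("Credentials look stale; consider re-running ./aws-login.sh")
    else:
        print("Credentials were refreshed within the last day")
```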
There are a number of notebooks available, in various states of working! The Redbox core app can be created and run in a notebook, allowing easy experimentation without the Django side. agent_experiments.ipynb currently shows this best.
In order to run notebooks in VSCode, you will need to use the virtualenv created by Poetry within the notebooks directory. If this does not appear as an option, you may need to add the notebooks directory path to your VSCode Python settings:
- Open VSCode settings: [cmd + ,]
- Search: python.venvFolders
- Add the path to ./redbox/notebooks
You may also want to add the path for the other apps in order to select the correct interpreter during development.
Some notebooks may require specific environment variables to run. For non-sensitive variables that apply to all notebooks, add them to .env.notebook and override the root .env at the top of your notebook like so:
dotenv .env
dotenv -o ./.env.notebook
For sensitive environment variables, please create a separate notebooks/.env file within the notebooks directory and add them there. You can then override the .env and .env.notebook in the same way.
dotenv .env
dotenv -o ./.env.notebook
dotenv -o ./.env
The workflows for using Git.
Consistent branch names help maintain a clean and predictable workflow. CI will fail if your branch does not follow conventions. Use the following prefixes:
- feature/<name> - New features or enhancements
- chore/<name> - Maintenance tasks that don't affect functionality
- bugfix/<name> - Non-critical bug fixes
- hotfix/<name> - Urgent fixes for production issues
- dependabot/<name> - Automated dependency updates
- security/<name> - Security-related changes
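To check a branch name locally before pushing, a small validator along these lines can help. This is a sketch under assumptions: the exact rule CI enforces is not documented here, so the pattern below simply requires one of the listed prefixes followed by a non-empty name, and the function name is hypothetical.

```python
import re

# Prefixes taken from the branch naming convention in this guide.
ALLOWED_PREFIXES = ("feature", "chore", "bugfix", "hotfix", "dependabot", "security")

# Assumption: <name> is any non-empty string; the real CI rule may be stricter.
BRANCH_PATTERN = re.compile(r"^(?:" + "|".join(ALLOWED_PREFIXES) + r")/.+$")


def is_valid_branch_name(branch: str) -> bool:
    """Return True if the branch starts with an allowed prefix and has a name."""
    return BRANCH_PATTERN.match(branch) is not None


# Illustrative branch names only.
for candidate in ("feature/add-search", "fix/typo", "hotfix/"):
    print(candidate, is_valid_branch_name(candidate))
```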
- Download and install pre-commit to benefit from pre-commit hooks
pip install pre-commit
pre-commit install
Notebooks with some standard methods to evaluate the LLM can be found in the notebooks/ directory.
You may want to evaluate using versioned datasets in conjunction with a snapshot of the pre-embedded vector store.
We use elasticsearch-dump to save and load bulk data from the vector store.
Install Node and npm (Node package manager) if you don't already have them. We recommend using nvm (Node version manager) to do this.
If you're familiar with Node or use it regularly, we recommend following your own processes or the tools' documentation. We endeavour to provide a quickstart here which will install nvm, Node, npm and elasticsearch-dump globally. Note that installing globally is generally not good practice.
To install nvm:
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
Restart your terminal.
Install Node.
nvm install --lts
nvm use --lts
Verify installation.
node --version
Install elasticsearch-dump globally.
npm install elasticdump -g
The default index we want is redbox-data-chunk.
Dump these to data/elastic-dumps/ for saving or sharing.
elasticdump \
--input=http://localhost:9200/redbox-data-chunk \
--output=./data/elastic-dumps/redbox-data-chunk.json \
--type=data
If you've been provided with a dump from the vector store, add it to data/elastic-dumps/. The below assumes the existence of redbox-data-chunk.json in that directory.
Consider dumping your existing indices if you don't want to have to re-embed data you're working on.
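Before loading a dump you have been given, it can be worth sanity-checking the file. The sketch below assumes the dump is line-delimited JSON with one document per line, each carrying an "_index" field; that matches how elasticdump usually writes data, but treat it as an assumption. The function name is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarise_dump(path: str) -> Counter:
    """Count documents per index in a line-delimited JSON dump file."""
    counts: Counter = Counter()
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # skip blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as error:
                raise ValueError(f"line {line_number} is not valid JSON") from error
            # Assumption: each record is an object with its index under "_index".
            index = record.get("_index", "<unknown>") if isinstance(record, dict) else "<unknown>"
            counts[index] += 1
    return counts


if __name__ == "__main__":
    dump_path = Path("data/elastic-dumps/redbox-data-chunk.json")
    if dump_path.exists():
        print(summarise_dump(str(dump_path)))
    else:
        print(f"{dump_path} not found")
```

An empty result or a ValueError suggests the file is truncated or not in the expected format, in which case loading it is unlikely to work.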
Start the Elasticsearch service.
docker compose up -d elasticsearch
Load data from your JSONs, or your own file.
elasticdump \
--input=./data/elastic-dumps/redbox-data-chunk.json \
--output=http://localhost:9200/redbox-data-chunk \
--type=data
If you're using this index in the frontend, you may want to upload the raw files to MinIO, though that's out of scope for this guide.
We currently use Google icons. When adding new icons, ensure the following customizations are made:
Weight: 300
Grade: 0
Optical Size: 24px
Style:
- Material Symbols (new)
- Rounded