Start GUAC with PostgreSQL using Docker Compose
If you’d prefer, you can set up GUAC with Kubernetes using the experimental Helm charts provided by Kusari. These Helm charts are hosted in a third-party repo and may not be synchronized with the GUAC repo.
This tutorial walks you through deploying a full, persistent GUAC deployment with a PostgreSQL database backend using Docker Compose.
Prerequisites
Optional - Verify images and binaries
Step 1: Download GUAC
- Download the GUAC CLI guaccollect binary for your machine’s OS and architecture from the latest GUAC release if you have not already done so. For example:
  - Linux x86_64: guaccollect-linux-amd64
  - macOS x86_64: guaccollect-darwin-amd64
  - Windows x86_64: guaccollect-windows-amd64.exe
- Rename the binary to guaccollect, mark it executable if necessary, and add it to your shell’s path.
- Download the compose yaml from the latest GUAC release.
- Optional: If you want test data to use, download and unzip GUAC’s test data.
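The platform-to-binary mapping in the first step can be scripted. The following is a minimal sketch, not part of GUAC: the function name guac_binary_name is hypothetical, it covers only the three x86_64 assets listed above, and it assumes uname -s reports Linux, Darwin, or a MINGW/MSYS/CYGWIN string on Windows shells.

```shell
#!/bin/sh
# Hypothetical helper: map `uname -s` and `uname -m` values to the release
# asset names listed above. Only the three x86_64 examples are covered.
guac_binary_name() {
  os="$1"
  arch="$2"
  case "$arch" in
    x86_64|amd64) ;;
    *) echo "unsupported architecture: $arch" >&2; return 1 ;;
  esac
  case "$os" in
    Linux)  echo "guaccollect-linux-amd64" ;;
    Darwin) echo "guaccollect-darwin-amd64" ;;
    MINGW*|MSYS*|CYGWIN*) echo "guaccollect-windows-amd64.exe" ;;
    *) echo "unsupported OS: $os" >&2; return 1 ;;
  esac
}

# Usage on Linux or macOS, after downloading the asset into the current
# directory:
#   asset=$(guac_binary_name "$(uname -s)" "$(uname -m)")
#   mv "$asset" guaccollect
#   chmod +x guaccollect
#   export PATH="$PWD:$PATH"
```

Taking the OS and architecture as arguments, rather than calling uname inside the function, keeps the mapping easy to check on any machine.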
Step 2: Start the GUAC server
- From the directory where you downloaded guac-postgres-compose.yaml, run:
docker compose -f guac-postgres-compose.yaml up
- Verify that GUAC is running:
docker compose ls
You should see:
NAME     STATUS      CONFIG FILES
dirname  running(9)  /files/dirname/guac-postgres-compose.yaml
If you don’t see the above, run docker compose down and try starting GUAC again. Because Docker Compose caches the containers it uses, a leftover unclean state can cause issues.
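The verification step can also be checked from a script. This is a sketch under assumptions: the helper name compose_project_running is hypothetical, and it assumes the default table output of docker compose ls, with a header row and the status in the second column.

```shell
#!/bin/sh
# Hypothetical helper: read `docker compose ls` output on stdin and succeed
# if any project row has a status beginning with "running" and mentions the
# given compose file in its row.
compose_project_running() {
  awk -v file="$1" '
    NR > 1 && $2 ~ /^running/ && index($0, file) { found = 1 }
    END { exit found ? 0 : 1 }
  '
}

# Usage (requires Docker and a started deployment):
#   if docker compose ls | compose_project_running guac-postgres-compose.yaml; then
#     echo "GUAC is running"
#   else
#     echo "GUAC is not running; try: docker compose down, then start it again"
#   fi
```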
GUAC Ports
Port Number | GUAC Component | Note |
---|---|---|
8080 | GraphQL server | To see the GraphQL playground, visit http://localhost:8080. |
2782 | Collector Subscriber | This service is notified whenever you run a collector, such as guaccollect files below. Subscribers can then collect more data on any ingested packages. |
4222 | Nats | Ingestion pubsub endpoint |
8081 | REST server | GUAC endpoint for simplified REST queries. |
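If you script against the deployment, it can help to keep the port numbers from the table in one place. A minimal sketch follows; the function name guac_port and the component keys are hypothetical labels chosen for illustration, not names used by GUAC.

```shell
#!/bin/sh
# Hypothetical lookup of the ports from the table above.
guac_port() {
  case "$1" in
    graphql)              echo 8080 ;;
    collector-subscriber) echo 2782 ;;
    nats)                 echo 4222 ;;
    rest)                 echo 8081 ;;
    *) echo "unknown component: $1" >&2; return 1 ;;
  esac
}

# Example: build the GraphQL query URL used in Step 4.
echo "http://localhost:$(guac_port graphql)/query"
```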
GUAC Volume Mounts
Two directories are created in the same directory as the compose file:
- blobstore: Temporary storage for documents that are queued for the ingestor.
- postgres-data: Contains the PostgreSQL database files.
Step 3: Start Ingesting Data
Before ingesting data, make sure the blobstore directory is writable by your local user. Because it was created by the Ingestor Docker container, it is owned by a different user ID.
sudo chmod a+w ./blobstore
You can run the guaccollect files ingestion command to load data into your GUAC deployment. For example, the command below ingests the sample guac-data documents, but you may ingest any documents you wish instead.
guaccollect files --service-poll=false --blob-addr=file://./blobstore?no_tmp_dir=true ./guac-data-main/docs
This command takes all documents under the ./guac-data-main/docs directory and ingests them into GUAC by placing messages on the Nats pubsub queue and placing the documents in the ./blobstore directory for the ingestor to pick up.
Switch back to the compose window and you will soon see the Ingestor performing the parsing and GraphQL mutations that add the documents to GUAC. The deps.dev collector and OSV certifier also recognize the new packages and look up dependency and vulnerability information for them.
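The writability check and the ingestion command can be combined in a small wrapper. This is a sketch, not part of GUAC: the function name ingest_docs and the DRY_RUN variable are hypothetical, while the guaccollect flags are exactly those from the command above.

```shell
#!/bin/sh
# Hypothetical wrapper around the ingestion command shown above.
# Arguments: <docs directory> [blobstore directory, default ./blobstore]
# Set DRY_RUN=1 to print the command instead of running it.
ingest_docs() {
  docs_dir="$1"
  blobstore="${2:-./blobstore}"
  if [ ! -d "$docs_dir" ]; then
    echo "no such directory: $docs_dir" >&2
    return 1
  fi
  if [ ! -w "$blobstore" ]; then
    echo "$blobstore is not writable; try: sudo chmod a+w $blobstore" >&2
    return 1
  fi
  # Build the same command line as in the tutorial step.
  set -- guaccollect files --service-poll=false \
    "--blob-addr=file://${blobstore}?no_tmp_dir=true" "$docs_dir"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Usage:
#   DRY_RUN=1 ingest_docs ./guac-data-main/docs   # show the command only
#   ingest_docs ./guac-data-main/docs             # run the ingestion
```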
Step 4: Check that everything is ingesting and running
Run:
curl 'http://localhost:8080/query' -s -X POST -H 'content-type: application/json' \
--data '{
"query": "{ packages(pkgSpec: {}) { type } }"
}' | jq
You should see the types of all the packages ingested:
{
"data": {
"packages": [
{
"type": "oci"
},
...
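To run other queries against the same endpoint, the curl call can be wrapped in a function. The sketch below assumes the endpoint and header from the command above; the helper names graphql_payload and guac_query are hypothetical, and the escaping handles only backslashes and double quotes in single-line queries.

```shell
#!/bin/sh
# Hypothetical helper: wrap a single-line GraphQL query string in the JSON
# request body, escaping backslashes and double quotes.
graphql_payload() {
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"query": "%s"}\n' "$escaped"
}

# Hypothetical helper: POST a query to the GraphQL server from this tutorial.
guac_query() {
  curl -s -X POST -H 'content-type: application/json' \
    --data "$(graphql_payload "$1")" 'http://localhost:8080/query'
}

# Usage (requires the running deployment and jq): list each distinct
# package type once.
#   guac_query '{ packages(pkgSpec: {}) { type } }' \
#     | jq -r '.data.packages[].type' | sort -u
```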
What is running?
Congratulations, you are now running a full GUAC deployment! Taking a look at guac-postgres-compose.yaml, we can see what is actually running:
- PostgreSQL: Serves as the persistent data store for all the GUAC data.
- GraphQL Server: Serves GUAC GraphQL queries and stores the data in the PostgreSQL backend.
- Collector-Subscriber: Helps communicate to the collectors when additional information is needed.
- Deps.dev Collector: Gathers further information from Deps.dev for supported packages.
- OSV Certifier: Gathers OSV vulnerability information from osv.dev about packages.
- Ingestor: Retrieves ingestion messages from Nats, parses the documents, and mutates the GraphQL graph to add the data to GUAC.
- Nats: Serves as the pubsub to accept ingestion requests which the Ingestor pulls from.
- OCI Collector: Collects additional metadata from OCI registries about any container references found in documents.
- GUAC REST Server: Serves simplified query endpoints.
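The list above names nine components, which matches the running(9) status shown in Step 2. As a rough sanity check, you can count the services defined in the compose file. This sketch assumes each component maps to exactly one compose service, which the source does not state; the helper name check_service_count is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read one service name per line on stdin and compare
# the number of non-empty lines with the expected count.
check_service_count() {
  expected="$1"
  actual=$(awk 'NF { n++ } END { print n + 0 }')
  if [ "$actual" -eq "$expected" ]; then
    echo "ok: $actual services"
  else
    echo "expected $expected services, found $actual" >&2
    return 1
  fi
}

# Usage (requires Docker):
#   docker compose -f guac-postgres-compose.yaml config --services \
#     | check_service_count 9
```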
Next steps
This compose configuration is suitable to leave running somewhere accessible from your environment for further GUAC ingestion, discovery, analysis, and evaluation. Explore the types of collectors available under the guaccollect collect command and see what will work for your build, ingestion, and SBOM workflow. These collectors can be run as another service that watches a location for new documents to ingest. If you’re curious about the various GUAC components and what they do, see How GUAC components work together.
You may wish to alter the volume configuration to change the blobstore and postgres-data locations. The blobstore needs to be accessible to any guaccollect commands and supports cloud buckets. The guacone command does not need access to the blobstore and interacts directly with the GraphQL server.