The cache tool helps optimize CI/CD runtime by reusing files that your
project depends on but that are not part of version control. You should typically
use caching to:
- Reuse your project's dependencies so that Semaphore fetches and installs them only when the dependency list changes.
- Propagate a file from one block to the next.
The cache is created on a per-project basis and is available in every pipeline job. All cache keys are scoped per project.
The cache tool uses key-path pairs for managing cached archives. An archive
can be a single file or a directory.
When running jobs in Semaphore's hosted environment, the total cache size is 9.6GB and each archive automatically expires in 30 days. When running jobs in a self-hosted environment, you have full control over the cache size and archive expiration.
The Semaphore caching script will try to recognize your project structure and automatically store or restore dependencies into or from default paths. The Semaphore cache works for the following languages and dependency managers:
- Ruby (bundler) - default cache path: vendor/bundle; requires Gemfile.lock to be present in the repository.
- Node.js (npm, yarn) - default cache path: node_modules; requires package-lock.json or yarn.lock to be present in the repository.
- Python (pip) - default cache path:
- PHP (composer) - default cache path:
  requires composer.lock to be present in the repository.
- Elixir (mix) - default cache path:
- Java (maven) - default cache path:
- nvm - default cache path:
  requires .nvmrc to be present in the repository.
- golang - default cache path:
  requires go.sum to be present in the repository.
Running cache store with no arguments looks up the default paths
used to store dependencies and caches them.
blocks:
  - name: Cache bundle
    task:
      jobs:
        - name: Bundle install and cache
          commands:
            - bundle install --path vendor/bundle
            - cache store
  - name: Use cache
    task:
      prologue:
        commands:
          - cache restore
      jobs:
        - name: Job 1
          commands:
            - echo Use cache 1
        - name: Job 2
          commands:
            - echo Use cache 2
        - name: Job 3
          commands:
            - echo Use cache 3
The output of cache store in a project that has a Gemfile.lock and a package-lock.json will look like this:
$ cache store
==> Detecting project structure and storing it in the cache.
* Detected Gemfile.lock.
* Using default cache path 'vendor/bundle'.
Uploading 'vendor/bundle' with cache key 'gems-your-branch-33a6002a37f59b6f1841636085a22fbc'...
Upload complete.
* Detected package-lock.json.
* Using default cache path 'node_modules'.
Uploading 'node_modules' with cache key 'node-modules-your-branch-d17b3d82f1356d0c91469804e2fc320a'...
Upload complete.
Running cache restore with no arguments looks up cacheable elements
and tries to fetch them from the cache.
$ cache restore
==> Detecting project structure and storing it in the cache.
* Detected Gemfile.lock.
* Fetching 'vendor/bundle' directory with cache keys 'gems-your-branch-33a6002a37f59b6f1841636085a22fbc,gems-master-,gems-your-branch-'.
HIT: gems-your-branch-33a6002a37f59b6f1841636085a22fbc, using key gems-your-branch-33a6002a37f59b6f1841636085a22fbc
Restored: vendor/bundle
* Detected package-lock.json.
* Fetching 'node_modules' directory with cache keys 'node-modules-your-branch-d17b3d82f1356d0c91469804e2fc320a,node-modules-master-,node-modules-your-branch-'.
HIT: node-modules-your-branch-d17b3d82f1356d0c91469804e2fc320a, using key node-modules-your-branch-d17b3d82f1356d0c91469804e2fc320a
Restored: node_modules/
If a third-party tool, such as Bundler, changes the location where it stores dependencies, or your project's dependency location differs from the defaults described in Basic Usage, you might need to specify the key and path manually instead of using a caching shortcut.
cache store key path
Here are a few examples of a cache store key path:
cache store our-gems vendor/bundle
cache store gems-$SEMAPHORE_GIT_BRANCH vendor/bundle
cache store gems-$SEMAPHORE_GIT_BRANCH-revision-$(checksum Gemfile.lock) vendor/bundle
The cache store command archives the file or directory specified by path and
associates it with the given key. Because cache store uses tar, it automatically
removes the leading / from the path. Any further changes to path after the store
command completes are not automatically propagated to the cache. The command
always passes, i.e. exits with a return code of 0.
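The leading-slash behaviour comes from tar itself and can be observed locally. The following is a minimal sketch, not the cache tool's implementation: it assumes GNU or BSD tar is on the PATH, and the helper name archived_names is hypothetical.

```shell
# Hypothetical helper (not part of the cache CLI): archive an absolute
# path with tar and print the member names stored in the archive.
archived_names() {
  src="$1"                                 # absolute path to a file or directory
  archive="$(mktemp)"                      # temporary archive file
  tar -czf "$archive" "$src" 2>/dev/null   # tar strips the leading "/" from member names
  tar -tzf "$archive"                      # list the stored member names
  rm -f "$archive"                         # clean up the temporary archive
}
```

Calling archived_names with an absolute path should print the same path without its leading slash, which is why a restored archive lands relative to the directory the restore is run from.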
If disk space is insufficient, cache store frees space by deleting
the oldest keys by default. You can use the --cleanup-by parameter
to delete the smallest or least recently accessed keys instead:
# Deletes the smallest keys first, if no space is available.
cache store our-gems vendor/bundle --cleanup-by SIZE
# Deletes the least recently accessed keys first, if no space is available.
cache store our-gems vendor/bundle --cleanup-by ACCESS_TIME
Cleaning up keys by access time
Cleaning up keys by access time is only available when using the SFTP backend. Additionally, for performance reasons, the access times on cache keys are only updated once every day, so they may not indicate the latest access times.
Overwriting cache keys
cache store does not overwrite data for an existing key. You need to delete the key first to update the associated information.
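Because an existing key is never overwritten, refreshing an archive takes two steps. A minimal sketch, assuming the cache CLI is available in the job; the function name refresh_cache_key is hypothetical and only wraps the two commands documented here:

```shell
# Hypothetical helper: replace the archive stored under an existing key.
# $1 = cache key, $2 = file or directory to archive
refresh_cache_key() {
  key="$1"
  target="$2"
  cache delete "$key"            # always passes, even if the key is absent
  cache store "$key" "$target"   # stores the current contents under the key
}
```

For example, refresh_cache_key our-gems vendor/bundle would drop the old our-gems archive and upload the current vendor/bundle directory in its place.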
cache restore key [,second-key,...]
Some examples of cache restore keys are:
cache restore our-gems
cache restore gems-$SEMAPHORE_GIT_BRANCH
cache restore gems-$SEMAPHORE_GIT_BRANCH-revision-$(checksum Gemfile.lock),gems-master
These commands restore an archive that partially matches any of the given keys.
In case of a cache hit, the archive is retrieved and made available at its original
path in the job environment.
Each archive is restored in the current path from which the command is called.
In case of a cache miss, the comma-separated fallback takes over and the command looks up the next key. If no archives are restored, the command still exits with 0.
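The fallback list is an ordinary comma-separated string, so it can be assembled before calling cache restore. A minimal sketch that orders keys from most to least specific; the helper name restore_keys and its three-level layout are assumptions for illustration, not something the cache tool requires:

```shell
# Hypothetical helper: print a comma-separated key list, most specific first.
# $1 = key prefix, $2 = branch name, $3 = revision (e.g. a lockfile checksum)
restore_keys() {
  printf '%s-%s-revision-%s,%s-%s,%s-master\n' "$1" "$2" "$3" "$1" "$2" "$1"
}
```

A job could then run cache restore "$(restore_keys gems "$SEMAPHORE_GIT_BRANCH" "$(checksum Gemfile.lock)")", falling back to any archive for the branch and then to one from master when the exact revision is not cached.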
cache has_key key
cache has_key our-gems
cache has_key gems-$SEMAPHORE_GIT_BRANCH
cache has_key gems-$SEMAPHORE_GIT_BRANCH-revision-$(checksum Gemfile.lock)
This command checks if an archive with the provided key exists in the cache. The command passes if a key is found in the cache, otherwise it fails.
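Since this is the one command whose exit status reflects the cache contents, it can guard other steps in a shell conditional. A minimal sketch, assuming the cache CLI is available in the job; the function name store_if_missing is hypothetical:

```shell
# Hypothetical helper: store a path under a key only when the key is absent.
# $1 = cache key, $2 = file or directory to archive
store_if_missing() {
  if cache has_key "$1"; then
    # has_key exited with 0: an archive already exists under this key
    echo "Key $1 already cached, skipping store"
  else
    # has_key exited with a non-zero status: nothing is stored under this key yet
    cache store "$1" "$2"
  fi
}
```

This avoids building and uploading an archive that cache store would not use anyway, because it does not overwrite existing keys.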
The cache list command lists all cache archives for the project. By default, it sorts the keys by the time they were stored. The
--sort-by parameter can be used to sort the keys by other criteria:
# List all keys, sorted by size
cache list --sort-by SIZE
# List all keys, sorted by access time
cache list --sort-by ACCESS_TIME
Sorting by access time
Sorting keys by access time is only available when using the SFTP backend. Additionally, for performance reasons, the access times on cache keys are only updated once every day, so they may not indicate the latest access times.
cache delete key
cache delete our-gems
cache delete gems-$SEMAPHORE_GIT_BRANCH
cache delete gems-$SEMAPHORE_GIT_BRANCH-revision-$(checksum Gemfile.lock)
This will remove an archive with a given key if it is found in the cache. The command always passes.
The cache clear command removes all cached archives for the project. The command always passes.
Note that of all the cache commands, only cache has_key can fail
(exit with a non-zero status).
The libchecksum scripts provide the checksum command, which is
useful for tagging artifacts or generating cache keys. It takes a
single argument, a file path, and outputs an md5 hash value.
$ checksum package.json
3dc6f33834092c93d26b71f9a35e4bb3
The SFTP backend is the default for jobs running in Semaphore's hosted environment. The following environment variables are required and automatically set in every hosted job:
||Controls the storage backend used by the cache CLI. For the hosted environment, it is set to
||The IP address and port number of the cache sftp server (
||The username that will be used to connect to the cache sftp server (
||The path to the SSH key that will be used to connect to the cache sftp server (
For jobs in a self-hosted environment, these environment variables are not automatically set on every job.
AWS S3 backend
The following environment variables are required for the
s3 storage backend to work:
||To use the S3 storage backend, this should be set to
||The S3 bucket name.|
The cache CLI also needs your
~/.aws folder to be properly configured with the appropriate credentials in order to access your AWS S3 bucket. You can follow this guide to set this up.
The following environment variables are required for the
gcs storage backend to work:
||To use the GCS storage backend, this should be set to
||The GCS bucket name.|
The cache CLI also needs your ADC credentials to be properly configured in order to access your GCS bucket. You can read more about ADC credentials here.
cache restore restores an archive with a corrupted archive message
If the cache restore output log includes lines similar to the following, make sure that only one job is creating an archive under the specific cache key:
$ cache restore gems-$SEMAPHORE_GIT_SHA
==> HIT: gems-c964fbeac09ef1fad45c2b10c849a4e6b23763b4, using key gems-c964fbeac09ef1fad45c2b10c849a4e6b23763b4
gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
Restored: vendor/bundle
Cache archives usually get corrupted when cache store is added to the prologue
command sequence, which causes it to run in every job of the related block, so
several jobs write to the same key at once.
To address the issue, structure your Semaphore YAML so that cache store for an
archive is executed in one job and cache restore in the subsequent jobs.
blocks:
  - name: Cache dependencies
    task:
      jobs:
        - name: Cache gems
          commands:
            - checkout
            - cache restore bundle-gems-$(checksum Gemfile.lock)
            - bundle install --deployment --path vendor/bundle
            - cache store bundle-gems-$(checksum Gemfile.lock) vendor/bundle
  - name: Tests
    task:
      prologue:
        commands:
          - checkout
          - cache restore bundle-gems-$(checksum Gemfile.lock)
          - bundle install --deployment --path vendor/bundle
      jobs:
        - name: RSpec 0
          commands:
        - name: RSpec 1
          commands:
        - name: RSpec 2
          commands:
Note: Launch a debugging session to clear corrupted archives for your project
with cache clear or cache delete <key>.