Amazon EKS
Fusion streamlines the deployment of Nextflow pipelines in Kubernetes because it removes the need to configure and maintain a shared file system in your cluster.
Platform Amazon EKS compute environments
Seqera Platform supports Fusion in Amazon EKS compute environments.
See Amazon EKS for Platform instructions to enable Fusion.
Nextflow CLI
The Fusion file system implements a lazy download and upload algorithm that runs in the background to transfer files in parallel between object storage and the container-local temporary directory (/tmp). For optimal performance, set up an SSD volume as the temporary directory.
Several AWS EC2 instance types include one or more NVMe SSD volumes. These volumes must be formatted to be used. See SSD instance storage for details.
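As a rough illustration of the formatting step, the following sketch prepares one NVMe instance store volume on a node. The device name /dev/nvme1n1, the mount point /mnt/fusion-scratch, the XFS file system, and the DRY_RUN switch are all assumptions made for this example; check `lsblk` on your own node for the actual device. How the mounted path is then exposed to task pods as /tmp depends on your cluster setup and is not shown here.

```shell
#!/bin/sh
# Sketch: format and mount one NVMe instance store volume for use as scratch space.
# The device and mount point below are assumptions; run `lsblk` to find the real device.
set -eu

# Print the command when DRY_RUN=1 (the default here), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

prepare_scratch() {
  device="$1"
  mount_point="$2"
  run sudo mkfs -t xfs "$device"        # destroys any data already on the device
  run sudo mkdir -p "$mount_point"
  run sudo mount "$device" "$mount_point"
  run sudo chmod 1777 "$mount_point"    # world-writable with sticky bit, like /tmp
}

prepare_scratch "${DEVICE:-/dev/nvme1n1}" "${MOUNT_POINT:-/mnt/fusion-scratch}"
```

By default the script only prints the commands it would run. Setting DRY_RUN=0 executes them, which erases the target device, so confirm the device name first.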
To use Fusion directly in Nextflow with an Amazon EKS cluster, you must configure a namespace and service account and update your Nextflow configuration.
Kubernetes configuration
You must create a namespace and a service account in your Kubernetes cluster to run the jobs submitted during pipeline execution.
- Create a manifest that includes the following configuration at minimum:

---
apiVersion: v1
kind: Namespace
metadata:
  name: fusion-demo
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: fusion-demo
  name: fusion-sa
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::<YOUR ACCOUNT ID>:role/fusion-demo-role"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: fusion-demo
  name: fusion-role
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/status", "pods/log", "pods/exec"]
    verbs: ["get", "list", "watch", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: fusion-demo
  name: fusion-rolebind
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: fusion-role
subjects:
  - kind: ServiceAccount
    name: fusion-sa

- The AWS IAM role must provide read-write permission to the S3 bucket used as the pipeline work directory:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::<YOUR-BUCKET>"]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:PutObjectTagging",
        "s3:DeleteObject"
      ],
      "Resource": ["arn:aws:s3:::<YOUR-BUCKET>/*"],
      "Effect": "Allow"
    }
  ]
}

Replace <YOUR-BUCKET> with the name of your bucket.

- The role must define a trust relationship similar to this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::<YOUR ACCOUNT ID>:oidc-provider/oidc.eks.<YOUR REGION>.amazonaws.com/id/<YOUR CLUSTER ID>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.<YOUR REGION>.amazonaws.com/id/<YOUR CLUSTER ID>:aud": "sts.amazonaws.com",
          "oidc.eks.<YOUR REGION>.amazonaws.com/id/<YOUR CLUSTER ID>:sub": "system:serviceaccount:fusion-demo:fusion-sa"
        }
      }
    }
  ]
}

Replace <YOUR ACCOUNT ID>, <YOUR REGION>, and <YOUR CLUSTER ID> with the values for your AWS account and EKS cluster.
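The trust relationship repeats the same OIDC provider string three times, so it can be easier to generate it than to edit it by hand. The following is a minimal sketch under stated assumptions: the function name render_trust_policy, the output file trust-policy.json, and the account and cluster IDs in the example call are all hypothetical placeholders, not real values.

```shell
#!/bin/sh
# Sketch: render the IAM trust policy for the Kubernetes service account.
set -eu

# Arguments: AWS account ID, OIDC provider (the issuer URL without "https://"),
# Kubernetes namespace, service account name.
render_trust_policy() {
  account_id="$1"
  oidc_provider="$2"
  namespace="$3"
  service_account="$4"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/${oidc_provider}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${oidc_provider}:aud": "sts.amazonaws.com",
          "${oidc_provider}:sub": "system:serviceaccount:${namespace}:${service_account}"
        }
      }
    }
  ]
}
EOF
}

# Placeholder values for illustration only (not a real account or cluster).
render_trust_policy 111122223333 \
  "oidc.eks.eu-west-2.amazonaws.com/id/EXAMPLECLUSTERID" \
  fusion-demo fusion-sa > trust-policy.json
```

With real values in place, one possible workflow is to read the issuer with `aws eks describe-cluster --name <YOUR CLUSTER NAME> --query "cluster.identity.oidc.issuer" --output text` (dropping the leading https://), create the role with `aws iam create-role --role-name fusion-demo-role --assume-role-policy-document file://trust-policy.json`, and apply the manifest with `kubectl apply -f <YOUR MANIFEST FILE>`. The placeholders <YOUR CLUSTER NAME> and <YOUR MANIFEST FILE> are introduced here for illustration.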
Nextflow configuration
- Add the following to your nextflow.config file:

wave.enabled = true
fusion.enabled = true
process.executor = 'k8s'
k8s.context = '<YOUR K8S CLUSTER CONTEXT>'
k8s.namespace = 'fusion-demo'
k8s.serviceAccount = 'fusion-sa'

Replace <YOUR K8S CLUSTER CONTEXT> with the Kubernetes context in your Kubernetes config.

- Run the pipeline with the usual run command:

nextflow run <YOUR PIPELINE SCRIPT> -w s3://<YOUR-BUCKET>/work

Replace <YOUR PIPELINE SCRIPT> with your pipeline Git repository URI and <YOUR-BUCKET> with your S3 bucket.
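Before launching, it can help to confirm that the configuration file actually contains the settings listed above. The following sketch does a plain text search, so it only recognizes the dot-notation form with the exact spacing used in this guide; the function name check_fusion_config is hypothetical, and a config written with block syntax would be reported as incomplete even if it is valid.

```shell
#!/bin/sh
# Sketch: check that a Nextflow config file contains the Fusion-related settings.
set -eu

check_fusion_config() {
  config_file="$1"
  missing=0
  for setting in \
    "wave.enabled = true" \
    "fusion.enabled = true" \
    "process.executor = 'k8s'" \
    "k8s.namespace" \
    "k8s.serviceAccount"
  do
    # -F matches the setting as a fixed string, not as a regular expression.
    if ! grep -qF "$setting" "$config_file"; then
      echo "missing: $setting" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Only run the check when a config file exists in the current directory.
if [ -f nextflow.config ]; then
  check_fusion_config nextflow.config && echo "nextflow.config looks complete"
fi
```

The function returns a non-zero status when any setting is absent, so it can gate the `nextflow run` command in a launch script.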