This repository was archived by the owner on Mar 30, 2020. It is now read-only.
I am working with a software package that takes the name of an output directory as a parameter and then writes a number of files into that directory. I cannot see a way to declare those files as outputs, because dockerflow appears to ignore the directory prefix in the path passed to outputFile. Here is what the output directory structure looks like:
index/
├── hash.bin
├── header.json
├── indexing.log
├── quasi_index.log
├── rsd.bin
├── sa.bin
├── txpInfo.bin
└── versionInfo.json
Here is the code I am trying to use (with just one of the files above as an example):
static Task salmonIndex = TaskBuilder.named("salmonIndex")
    .inputFile("fasta")
    .outputFile("indexVersion", "index/versionInfo.json")
    .docker("seandavi/salmon")
    .preemptible(true)
    .diskSize("20")
    .memory(14)
    .cpu(2)
    .script("salmon index --index=index --transcripts=${fasta}")
    .build();

static WorkflowArgs workflowArgs = ArgsBuilder.of()
    .input("fasta", "${fasta}")
    .output("indexVersion", "${salmonIndex.indexVersion}")
    .build();
And here is the error I am getting. Note that the gsutil cp fails because the index/ in the path appears to be ignored.
(d81d2f3a5be0ea0c): java.lang.RuntimeException:
com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException:
com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException:
com.google.cloud.dataflow.sdk.util.UserCodeException:
com.google.cloud.genomics.dockerflow.runner.TaskException: Operation
operations/ENbjmKaCKxj6h6-zxaC7zokBIL3n-N_FEioPcHJvZHVjdGlvblF1ZXVl failed. Details: 10:
Failed to delocalize files: failed to copy the following files: "/mnt/data/100346066-versionInfo.json
-> gs://gbseqdata/ockerflow_example/ch1/salmonIndex/index/versionInfo.json (cp failed: gsutil -q -
m cp -L /var/log/google-genomics/out.log /mnt/data/100346066-versionInfo.json
gs://gbseqdata/ockerflow_example/ch1/salmonIndex/index/versionInfo.json, command failed:
CommandException: No URLs matched: /mnt/data/100346066-
versionInfo.json\nCommandException: 1 file/object could not be transferred.\n)" at
com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:162
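To make the mismatch concrete, here is a small sketch of what I think is happening, based only on the paths in the error above. The class and method names are hypothetical and are not part of dockerflow's API. The assumption is that the local copy on /mnt/data is named from the last component of the declared output path, with a numeric prefix added, so the index/ directory never appears on the local side.

```java
import java.nio.file.Paths;

public class OutputPathSketch {
    // Hypothetical helper (not a dockerflow method): keep only the last
    // component of a declared output path, dropping any directory prefix.
    static String baseName(String outputPath) {
        return Paths.get(outputPath).getFileName().toString();
    }

    public static void main(String[] args) {
        // Path declared in outputFile(...) in the task definition.
        String declared = "index/versionInfo.json";

        // Where salmon actually writes the file, relative to its working directory.
        System.out.println("written by the script:     " + declared);

        // What the delocalization step appears to look for (assumption drawn
        // from the error message; the numeric prefix is left as a placeholder).
        System.out.println("looked for on /mnt/data:   <prefix>-" + baseName(declared));
    }
}
```

If that reading is right, the file that gsutil tries to copy is never created, because the script writes into index/ while the copy step expects a flat, prefixed file name.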
I am likely just misunderstanding some pieces here, but I thought I would just go ahead and ask.