Dependency operator <-

The dependency operator provides a simple way to check whether a file needs to be updated (i.e. recalculated) with respect to its inputs. For instance, if we have already processed some files and still have the corresponding results, we can skip the work when the inputs have not changed, much as the "make" command does.

We introduce the dependency operator <- (pronounced 'dep'), which is a "make"-style operator.

The expression out <- in is true if the 'out' file needs to be updated. More formally, the expression is true if the file named by the variable 'out' does not exist, is empty (zero length), or was last modified before 'in'. E.g.:

File test_06.bds

#!/usr/bin/env bds

string inFile  = "in.txt"
string outFile = "out.txt"

# Create 'in.txt' if it doesn't exist
if( !inFile.exists() ) {
    task echo Creating $inFile; echo Hello > $inFile
}

wait

# Create 'out.txt' only if it needs to be updated with respect to 'in.txt'
if( outFile <- inFile ) {
    task echo Creating $outFile; cat $inFile > $outFile
}

When executing for the first time, both tasks are executed and both files ('in.txt' and 'out.txt') are created.

$ ./test_06.bds
Creating in.txt
Creating out.txt

If we execute for a second time, since files have not changed, no task is executed.

$ ./test_06.bds
$

If we now change the contents of 'in.txt' and run the script again, the second task will be executed (because 'out.txt' needs to be updated with respect to 'in.txt'):

# Update 'in.txt'
$ date > in.txt

# Since we updated the input file, the output must be recalculated
$ ./test_06.bds 
Creating out.txt

Summary: If the file 'out.txt' is up to date with respect to 'in.txt', the following condition will be false and the task will not execute:

if( outFile <- inFile ) {
    task echo Creating $outFile; cat $inFile > $outFile
}
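The rule behind out <- in can be written out explicitly. The following is a minimal Python sketch of the rule as described above, not bds's actual implementation. The function name needs_update is hypothetical, and the sketch assumes the input file exists.

```python
import os

def needs_update(out_path, in_path):
    """Return True if out_path should be rebuilt from in_path.

    Hypothetical helper modelling the 'out <- in' rule; assumes in_path exists.
    """
    # A missing output must be created
    if not os.path.exists(out_path):
        return True
    # An empty (zero-length) output is treated like a missing one
    if os.path.getsize(out_path) == 0:
        return True
    # Otherwise the output is stale only if it is older than the input
    return os.path.getmtime(out_path) < os.path.getmtime(in_path)
```

With 'in.txt' and 'out.txt' from the example, needs_update("out.txt", "in.txt") would be true on the first run, false on the second, and true again after 'in.txt' is rewritten.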

Multiple dependencies

In a dependency expression out <- in, either in or out, or both, can be lists of files. The same rules apply: the operator is true if any out file is missing or zero length, or if the earliest modification time among the out files is earlier than the latest modification time among the in files.

For example, with a list of inputs:

in1 := "in1.txt"
in2 := "in2.txt"
out := "out.txt"

if( out <- [in1, in2] ) print("We should update $out\n")

or even:

in1 := "in1.txt"
in2 := "in2.txt"
out1 := "out1.txt"
out2 := "out2.txt"

if( [out1, out2] <- [in1, in2] ) print("We should update $out1 and $out2\n")
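The list form of the rule compares the oldest output against the newest input. Here is a minimal Python sketch of that comparison, under the assumption that both lists are non-empty and that every input file exists. The name needs_update_lists is hypothetical and this is not bds's own code.

```python
import os

def needs_update_lists(outs, ins):
    """Return True if any file in outs should be rebuilt from the files in ins.

    Hypothetical helper; assumes non-empty lists and existing input files.
    """
    # Any missing or zero-length output forces an update
    for o in outs:
        if not os.path.exists(o) or os.path.getsize(o) == 0:
            return True
    # Compare the oldest output with the newest input
    oldest_out = min(os.path.getmtime(o) for o in outs)
    newest_in = max(os.path.getmtime(i) for i in ins)
    return oldest_out < newest_in
```

Using the minimum over outputs and the maximum over inputs means a single stale output, or a single freshly changed input, is enough to trigger the update.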

Using <- in tasks

This construction is so common that we allow for some syntactic sugar.

task( outFile <- inFile ) { 
    sys echo Creating $outFile; cat $inFile > $outFile
}

The above syntax means that the task will only be executed if the dependency condition is true, i.e. if outFile needs to be updated with respect to inFile.

Task dependency detection

Programming task dependencies can be difficult. BDS can help by inferring task dependencies and executing tasks in the correct order.

In this example, we have two tasks:

  • The first task uses an input file 'in.txt' to create an intermediate file 'inter.txt'.
  • The second task uses the intermediate file 'inter.txt' to create the output file 'out.txt'.

The script below does not have a wait statement. Instead, bds automatically infers that the second task depends on the first one, and does not start executing it until the first task has finished. We never tell bds explicitly to wait for the first task.

File test_07.bds

#!/usr/bin/env bds

# We use ':=' for declaration with type inference
inFile       := "in.txt"        
intermediate := "inter.txt"
outFile      := "out.txt"

task( intermediate <- inFile) {
    sys echo Creating $intermediate; cat $inFile > $intermediate; sleep 1 ; echo Done $intermediate
}

task( outFile <- intermediate ) {
    sys echo Creating $outFile; cat $intermediate > $outFile; echo Done $outFile
}

As a side note: We used the := operator to declare variables using type inference. So we can write inFile := "in.txt" instead of string inFile = "in.txt", which not only is shorter to type, but also makes the code look cleaner.

Now let's run the script

# Delete old files (if any)
$ rm *.txt

# Create input file
$ date > in.txt

# Run
$ ./test_07.bds 
Creating inter.txt
Done inter.txt
Creating out.txt
Done out.txt

Note how the second task is executed only after the first one has finished.
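One way to picture this kind of inference: a task depends on another task whenever one of its input files is an output file of that other task. The following Python sketch illustrates the idea under that assumption only. It is not a description of how bds implements dependency detection, and the names execution_order, make_inter and make_out are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def execution_order(tasks):
    """tasks maps a task name to a pair (output files, input files).

    Returns the task names in an order that respects file dependencies.
    """
    # Record which task produces each output file
    producer = {}
    for name, (outs, _ins) in tasks.items():
        for path in outs:
            producer[path] = name
    # A task depends on the producers of its input files (if any)
    graph = {}
    for name, (_outs, ins) in tasks.items():
        graph[name] = {producer[path] for path in ins if path in producer}
    # Topological sort: every task comes after the tasks it depends on
    return list(TopologicalSorter(graph).static_order())

# The two tasks from test_07.bds, deliberately listed out of order
tasks = {
    "make_out": (["out.txt"], ["inter.txt"]),
    "make_inter": (["inter.txt"], ["in.txt"]),
}
print(execution_order(tasks))  # ['make_inter', 'make_out']
```

Because 'inter.txt' is an output of the first task and an input of the second, the second task is ordered after the first, with no explicit wait.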