GNU Parallel Snippet

GNU Parallel is a shell tool for executing jobs in parallel on one or more computers.

Here's a single-machine toy example that has been very useful to me. It reads lines from file.txt and runs at most 10 indexing commands at a time, invoking a bash function once per line.

#!/bin/bash

# Function to process a single line
process_line() {
    local line="$1"
    # Build the command as an array so each argument stays a single word,
    # even if the line contains spaces
    local -a cmd=(
        zoekt-git-index -require_ctags -parallelism=4
        -repo_cache /git-data/zoekt/repos -index /index-storage/zoekt
        -incremental -branches=HEAD,main,master,develop
        -rev-parse-head -allow_missing_branches=true
        "$line"
    )
    echo "Running command: ${cmd[*]}"
    # Execute the command
    "${cmd[@]}"
}

# Export the function so the bash shells started by parallel can see it
export -f process_line

# Set the maximum number of parallel processes
max_parallel=10

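# Process each line in parallel using GNU Parallel:
#   -j    caps the number of simultaneous jobs
#   -k    prints job output in input order
#   ::::  reads the arguments from a file, one per line
parallel -j "$max_parallel" -k process_line :::: file.txt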
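
Before pointing this at a real indexer, it can help to try the same pattern with a harmless stand-in. The following is a minimal sketch, not part of the original script: demo_line, the three repo-* names, and the temporary input file are all hypothetical, and the function only echoes what it would do. It assumes the `parallel` on your PATH is GNU Parallel (some systems ship a different tool under that name), and it skips the run otherwise.

```shell
#!/bin/bash

# Hypothetical stand-in for the real indexing function: it only echoes.
demo_line() {
    echo "would index: $1"
}

# Export the function so the bash shells started by parallel can see it
export -f demo_line

# Hypothetical sample input: three made-up names, one per line
input_file="$(mktemp)"
printf '%s\n' repo-a repo-b repo-c > "$input_file"

# Run the demo only if `parallel` identifies itself as GNU Parallel
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # At most 2 jobs at a time; -k keeps output in input order
    parallel -j 2 -k demo_line :::: "$input_file"
else
    echo "GNU Parallel not found; skipping demo" >&2
fi

# Clean up the temporary input file
rm -f "$input_file"
```

Once the stand-in behaves as expected, swapping the echo for the real command is the only change needed. GNU Parallel also has a --dry-run option that prints the commands it would run without executing them, which is another low-risk way to check the input file.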