Install GNU parallel with Ansible

GNU parallel allows you to multiplex tasks and use more CPU resources to speed up the job at hand. This works if the job can be split into multiple independent tasks that would otherwise be executed serially.

An example: you find files in a directory, and want to compress all of them:

find /path/to/directory -type f -exec bzip2 -9 {} \;

The line above finds all the files and compresses each of them, one after the other. Most modern systems have multiple CPU cores, but this command uses only one of them. GNU parallel solves this by multiplexing the task and starting multiple compress processes. The command changes to:

find /path/to/directory -type f -print0 | parallel -0 --no-run-if-empty bzip2 -9

By default, parallel starts one process per available CPU core. The --jobs option sets a fixed number (for example "8") or adjusts the number relative to the core count: with 8 cores, "-2" starts 6 processes and "+2" starts 10 processes.
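
The arithmetic behind the relative --jobs values can be sketched in a few lines of shell. This is only an illustration: "jobs_for" is a hypothetical helper, not part of GNU parallel, and it covers just the plain, "+N" and "-N" forms described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of GNU parallel): shows how a --jobs value
# maps to a process count for a given number of cores.
# Usage: jobs_for <cores> <jobs-value>
jobs_for() {
    cores=$1
    spec=$2
    case $spec in
        +*) n=$((cores + ${spec#+})) ;;   # "+2" adds to the core count
        -*) n=$((cores - ${spec#-}))      # "-2" subtracts from the core count
            # a result below 1 is raised to 1
            if [ "$n" -lt 1 ]; then n=1; fi ;;
        *)  n=$spec ;;                    # plain number: used as given
    esac
    echo "$n"
}

# Matching invocation of parallel itself (not executed here):
#   find /path/to/directory -type f -print0 | parallel -0 --jobs -2 bzip2 -9
```

For example, "jobs_for 8 -2" prints 6 and "jobs_for 8 +2" prints 10, matching the numbers given above.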

However, when you start "parallel", it nags you to confirm that you will cite "parallel" whenever you use it to process data for an academic article:

Academic tradition requires you to cite works you base your article on.
When using programs that use GNU Parallel to process data for publication
please cite:

  O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
  ;login: The USENIX Magazine, February 2011:42-47.

This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

To silence this citation notice: run 'parallel --citation'.

That's fine if you use it manually, but in a server environment no one will ever see this notice.

By using the following playbook, you agree to follow the citation request quoted above!

First task, install "parallel":

- name: Install parallel
  package:
    name: "{{ item }}"
    state: present
  loop:
    - parallel

The acknowledgement is an empty file in the user's home directory: ~/.parallel/will-cite. The directory must be created first, then the empty file. The following example creates the acknowledgement file for the "root" user. Modify the playbook if it needs to run for other users.

- name: Create /root/.parallel directory
  file:
    path: /root/.parallel
    state: directory
    owner: root
    group: root
    mode: "0755"

- name: Create /root/.parallel/will-cite file
  copy:
    dest: /root/.parallel/will-cite
    content: ""
    force: false
    owner: root
    group: root
    mode: "0644"

That's it. "parallel" will no longer ask for citations, and you have acknowledged the author's request.
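
For a single machine or a quick manual check, the same two steps can be sketched in plain shell. "ack_parallel_citation" is a hypothetical helper name. The sketch takes the home directory as its argument and does not change ownership, so it assumes you run it as the user in question.

```shell
#!/bin/sh
# Hypothetical helper: create the empty will-cite marker below a home directory.
# Usage: ack_parallel_citation /root
ack_parallel_citation() {
    home_dir=$1
    # Step 1: the .parallel directory
    mkdir -p "$home_dir/.parallel" || return 1
    chmod 0755 "$home_dir/.parallel" || return 1
    # Step 2: the empty file; an existing file is left untouched
    if [ ! -e "$home_dir/.parallel/will-cite" ]; then
        : > "$home_dir/.parallel/will-cite" || return 1
        chmod 0644 "$home_dir/.parallel/will-cite" || return 1
    fi
}
```

Calling ack_parallel_citation "$HOME" creates the file for the current user. Running it a second time changes nothing.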
