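When piping together commands that only produce output intermittently we run into buffering: most programs let the C library fully buffer their stdout once it is connected to a pipe (as created by the pipe() system call) rather than a terminal, so output is held back until the buffer fills (also see overview of pipes and FIFOs). This particularly comes into play when stringing together multiple pipes in a row, as there are multiple buffers to pass through.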
For example, in the command below “tail -f” flushes on activity and awk flushes on output, but the grep in the middle is writing to a pipe and so buffers its output. As a result, a quiet access.log will lead to long delays before updates are shown:
tail -f /var/log/apache2/access.log \
  | grep -i bingbot \
  | awk '{ "host " $1 | getline RDNS; print $0, " : ", RDNS }'
In this case grep does offer a command-line option (“--line-buffered”). However, some utilities offer no such option, and those that do each use a different argument, so stdbuf can be used to do this in a common way (it also allows the buffering of the input, output and error streams to be tweaked independently):
tail -f /var/log/apache2/access.log \
  | stdbuf -oL grep -i bingbot \
  | awk '{ "host " $1 | getline RDNS; print $0, " : ", RDNS }'
The above example sets line-buffering just on the output stream, which is sufficient to get small chunks of data flowing and for the output from this sequence to be visible straight away.
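To see the difference without waiting on a real log file, the effect can be reproduced with a small test script. This is only a rough sketch: slow_lines and stamp are made-up helper functions (not part of any tool), it assumes GNU coreutils stdbuf and a date that supports +%s, and the exact timings will vary a little from run to run.

```shell
#!/bin/sh
# Hypothetical helper: emit three matching lines, one per second,
# standing in for a quiet log file.
slow_lines() {
    for i in 1 2 3; do
        echo "bingbot hit $i"
        sleep 1
    done
}

# Hypothetical helper: prefix each line read from stdin with the number
# of whole seconds elapsed since this function started.
stamp() {
    start=$(date +%s)
    while IFS= read -r line; do
        echo "$(( $(date +%s) - $start ))s: $line"
    done
}

# grep writes to a pipe here, so its output is normally held back
# until grep exits and all three lines arrive together.
echo "default buffering:"
slow_lines | grep bingbot | stamp

# With line-buffered output each line should be passed on as it is produced.
echo "with stdbuf -oL:"
slow_lines | stdbuf -oL grep bingbot | stamp
```

In the first run all three lines should carry roughly the same timestamp, while in the second they should be spread about a second apart. Besides L, the -o option also takes 0 (unbuffered) or a size such as 4K, and -i and -e apply the same settings to stdin and stderr (line buffering is not valid for stdin).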
For a deeper dive into what is going on, the following article on pixelbeat.org explains this nicely, with some handy diagrams.