Pretty Printing Apache Pig outputs

Nov 22, 2023

If you’re into big-data analytics, you have probably come across Pig Latin scripts. Pig Latin is the scripting language of Apache Pig, which runs on top of Hadoop and has several useful applications.

However, the output you get when you DUMP a relation looks ugly and incomprehensible, like this. You can certainly write the results to CSV files and work on them there. But if you’re something of a perfectionist, or like to analyze outputs then and there, you’re in the right place!

[Image: Pig output on macOS]

I made a fix for exactly this: a simple Python script that intercepts the Pig output and pretty-prints it for you.
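To give a rough idea of what "intercept and pretty-print" can mean, here is a minimal sketch. It is not the code from the repository. The helper names (split_top_level, parse_dump, format_table) are made up for illustration, and it assumes that every result row is printed on its own line as a parenthesised tuple such as (1,alice,30), while everything else is log noise.

```python
def split_top_level(body):
    """Split a tuple body on commas that are not nested inside (), {} or []."""
    fields, depth, current = [], 0, []
    for ch in body:
        if ch in "({[":
            depth += 1
        elif ch in ")}]":
            depth -= 1
        if ch == "," and depth == 0:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


def parse_dump(text):
    """Keep only lines that look like tuples and split them into fields."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("(") and line.endswith(")"):
            rows.append(split_top_level(line[1:-1]))
    return rows


def format_table(rows):
    """Render rows as a bordered table with left-aligned, padded columns."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    padded = [row + [""] * (width - len(row)) for row in rows]
    col_widths = [max(len(row[i]) for row in padded) for i in range(width)]
    border = "+" + "+".join("-" * (w + 2) for w in col_widths) + "+"
    lines = [border]
    for row in padded:
        cells = " | ".join(cell.ljust(w) for cell, w in zip(row, col_widths))
        lines.append("| " + cells + " |")
    lines.append(border)
    return "\n".join(lines)


# Hypothetical captured output: one log line followed by two tuples.
sample = "2023-11-22 INFO some log line\n(1,alice,{(a,b)})\n(22,bob,{})\n"
print(format_table(parse_dump(sample)))
```

Nested bags and tuples are kept as single cells here. A real tool would also have to deal with column names and very wide values.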

It’s incredibly easy to use and open source as well. You can find the repository here.

pip install pig-manager

You can simply use it as:

pig-manager -f "path/to/your-file.pig"

The same huge, incomprehensible output from above now looks like this:

You can also dump the logs and the original output to files using the following options:

-f, --file_path: Path to the Pig script file.
-dl, --dump_log: Flag to dump error output.
-do, --dump_out: Flag to dump standard output.
-l, --dump_loc: Location to dump output and error files.
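If you want to call the tool from another Python script, the options above can be put together as in the small sketch below. It assumes that -dl and -do are plain on/off switches and that -l takes a directory path, which is how the descriptions read. The build_command helper and the example paths are hypothetical.

```python
import shlex


def build_command(script_path, dump_log=False, dump_out=False, dump_loc=None):
    """Build a pig-manager argument list from the options listed above."""
    cmd = ["pig-manager", "-f", script_path]
    if dump_log:
        cmd.append("-dl")  # assumed to be a switch with no value
    if dump_out:
        cmd.append("-do")  # assumed to be a switch with no value
    if dump_loc is not None:
        cmd += ["-l", dump_loc]  # assumed to take a path
    return cmd


cmd = build_command("path/to/your-file.pig", dump_log=True, dump_out=True, dump_loc="dumps")
# Print the equivalent shell command instead of running it.
print(shlex.join(cmd))
```

To actually run it, pass the list to subprocess.run(cmd, check=True) on a machine where pig-manager and Pig are installed.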

If you think this makes your life easier, do drop in a clap and a follow :).

Do consider starring the repo!