Pretty Printing a pandas dataframe

How can I print a pandas dataframe as a nice text-based table, like the following?

+------------+---------+-------------+
| column_one | col_two | column_3 |
+------------+---------+-------------+
| 0 | 0.0001 | ABCD |
| 1 | 1e-005 | ABCD |
| 2 | 1e-006 | long string |
| 3 | 1e-007 | ABCD |
+------------+---------+-------------+

9 Answers

I've just found a great tool for that need, it is called tabulate.

It prints tabular data and works with DataFrame.

from tabulate import tabulate
import pandas as pd
df = pd.DataFrame({'col_two' : [0.0001, 1e-005 , 1e-006, 1e-007], 'column_3' : ['ABCD', 'ABCD', 'long string', 'ABCD']})
print(tabulate(df, headers='keys', tablefmt='psql'))
+----+-----------+-------------+
| | col_two | column_3 |
|----+-----------+-------------|
| 0 | 0.0001 | ABCD |
| 1 | 1e-05 | ABCD |
| 2 | 1e-06 | long string |
| 3 | 1e-07 | ABCD |
+----+-----------+-------------+

Note:

To suppress row indices for all types of data, pass showindex="never" or showindex=False.

pandas >= 1.0

If you want an inbuilt function to dump your data into some github markdown, you now have one. Take a look at to_markdown:

df = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=['a', 'a', 'b'])
print(df.to_markdown())
| | A | B |
|:---|----:|----:|
| a | 1 | 1 |
| a | 2 | 2 |
| b | 3 | 3 |

Here's what that looks like on github:

Note that you will still need to have the tabulate package installed.

A simple approach is to output as html, which pandas does out of the box:

df.to_html('temp.html')

If you are in Jupyter notebook, you could run the following code to interactively display the dataframe in a well formatted table.

This answer builds on the to_html('temp.html') answer above, but instead of creating a file displays the well formatted table directly in the notebook:

from IPython.display import display, HTML
display(HTML(df.to_html()))

Credit for this code due to example at: Show DataFrame as table in iPython Notebook

You can use prettytable to render the table as text. The trick is to convert the data_frame to an in-memory csv file and have prettytable read it. Here's the code:

from StringIO import StringIO
import prettytable
output = StringIO()
data_frame.to_csv(output)
output.seek(0)
pt = prettytable.from_csv(output)
print pt

I used Ofer's answer for a while and found it great in most cases. Unfortunately, due to inconsistencies between pandas's to_csv and prettytable's from_csv, I had to use prettytable in a different way.

One failure case is a dataframe containing commas:

pd.DataFrame({'A': [1, 2], 'B': ['a,', 'b']})

Prettytable raises an error of the form:

Error: Could not determine delimiter

The following function handles this case:

def format_for_print(df): table = PrettyTable([''] + list(df.columns)) for row in df.itertuples(): table.add_row(row) return str(table)

If you don't care about the index, use:

def format_for_print2(df): table = PrettyTable(list(df.columns)) for row in df.itertuples(): table.add_row(row[1:]) return str(table)

Following up on Mark's answer, if you're not using Jupyter for some reason, e.g. you want to do some quick testing on the console, you can use the DataFrame.to_string method, which works from -- at least -- Pandas 0.12 (2014) onwards.

import pandas as pd
matrix = [(1, 23, 45), (789, 1, 23), (45, 678, 90)]
df = pd.DataFrame(matrix, columns=list('abc'))
print(df.to_string())
# outputs:
# a b c
# 0 1 23 45
# 1 789 1 23
# 2 45 678 90

Maybe you're looking for something like this:

def tableize(df): if not isinstance(df, pd.DataFrame): return df_columns = df.columns.tolist() max_len_in_lst = lambda lst: len(sorted(lst, reverse=True, key=len)[0]) align_center = lambda st, sz: "{0}{1}{0}".format(" "*(1+(sz-len(st))//2), st)[:sz] if len(st) < sz else st align_right = lambda st, sz: "{0}{1} ".format(" "*(sz-len(st)-1), st) if len(st) < sz else st max_col_len = max_len_in_lst(df_columns) max_val_len_for_col = dict([(col, max_len_in_lst(df.iloc[:,idx].astype('str'))) for idx, col in enumerate(df_columns)]) col_sizes = dict([(col, 2 + max(max_val_len_for_col.get(col, 0), max_col_len)) for col in df_columns]) build_hline = lambda row: '+'.join(['-' * col_sizes[col] for col in row]).join(['+', '+']) build_data = lambda row, align: "|".join([align(str(val), col_sizes[df_columns[idx]]) for idx, val in enumerate(row)]).join(['|', '|']) hline = build_hline(df_columns) out = [hline, build_data(df_columns, align_center), hline] for _, row in df.iterrows(): out.append(build_data(row.tolist(), align_right)) out.append(hline) return "\n".join(out)
df = pd.DataFrame([[1, 2, 3], [11111, 22, 333]], columns=['a', 'b', 'c'])
print tableize(df)

Output:
+-------+----+-----+
| a | b | c |
+-------+----+-----+
| 1 | 2 | 3 |
| 11111 | 22 | 333 |
+-------+----+-----+

Update: an even better solution is to simply put the variable name of the dataframe on the last line of the cell. It will automatically print in a pretty format.

import pandas as pd
import numpy as np
df = pd.DataFrame({'Data1': np.linspace(0,10,11), 'Data2': np.linspace(10,0,11)})
df

Star Pop News

Pretty Printing a pandas dataframe

9 Answers

pandas >= 1.0

Your Answer

Sign up or log in

Post as a guest

You Might Also Like

What's the secret to playing Sniper?

How do I get to the second floor of the Lecture Building?

Order of Endings in NG for Sekiro

How to get the new minecraft launcher GUI