Converting pandas DataFrames into Publication-Ready Tables

Analysing, comparing and communicating the predictive performance of machine learning models is a crucial component of any empirical research effort. Pandas, a staple in the Python data analysis stack, not only helps with the data wrangling itself, but also provides efficient solutions for data presentation. Two of its lesser-known yet incredibly useful features are df.to_markdown() and df.to_latex(), which allow for a seamless transition from DataFrames to publication-ready tables. Here’s how you can use them!

Exporting DataFrames to Markdown

Markdown is widely used for its simplicity and readability, making it a go-to format for rendering your GitHub README or rebuttals on OpenReview. With the df.to_markdown() method, you can turn any DataFrame into a Markdown table with a single line of code.

import pandas as pd

# construct example DataFrame
results = pd.DataFrame(
    {
        "model": [
            "random forest",
            "support vector machine",
            "multi-layer perceptron",
        ],
        "AUC-ROC": [0.83, 0.79, 0.81],
        "AUC-PRC": [0.46, 0.48, 0.49],
        "ECE": [0.04, 0.09, 0.05],
        "runtime": [0.004, 0.003, 0.01],
    }
)

# convert it to Markdown
print(results.to_markdown(index=False))

This Markdown table can then be copied into any Markdown editor or platform that supports it (such as this website) and will be rendered as a neat table.

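| model                  | AUC-ROC | AUC-PRC | ECE  | runtime |
|:-----------------------|--------:|--------:|-----:|--------:|
| random forest          |    0.83 |    0.46 | 0.04 |   0.004 |
| support vector machine |    0.79 |    0.48 | 0.09 |   0.003 |
| multi-layer perceptron |    0.81 |    0.49 | 0.05 |   0.01  |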

This function uses the tabulate library (an optional pandas dependency that has to be installed separately), which additionally allows you to choose between a range of table styles using the tablefmt argument, e.g. tablefmt="grid" for a text grid like this:

+------------------------+-----------+-----------+-------+-----------+
| model                  |   AUC-ROC |   AUC-PRC |   ECE |   runtime |
+========================+===========+===========+=======+===========+
| random forest          |      0.83 |      0.46 |  0.04 |     0.004 |
+------------------------+-----------+-----------+-------+-----------+
| support vector machine |      0.79 |      0.48 |  0.09 |     0.003 |
+------------------------+-----------+-----------+-------+-----------+
| multi-layer perceptron |      0.81 |      0.49 |  0.05 |     0.01  |
+------------------------+-----------+-----------+-------+-----------+

Exporting DataFrames to LaTeX

LaTeX is the de facto standard for typesetting machine learning papers. The df.to_latex() method converts a DataFrame into a LaTeX tabular environment, which can be included directly in your LaTeX documents. Using the same example as above

import pandas as pd

# construct example DataFrame
results = pd.DataFrame(
    {
        "model": [
            "random forest",
            "support vector machine",
            "multi-layer perceptron",
        ],
        "AUC-ROC": [0.83, 0.79, 0.81],
        "AUC-PRC": [0.46, 0.48, 0.49],
        "ECE": [0.04, 0.09, 0.05],
        "runtime": [0.004, 0.003, 0.01],
    }
)

# convert it to LaTeX
print(results.to_latex(index=False))

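we can generate the LaTeX source for the corresponding table. The generated code relies on the \toprule, \midrule and \bottomrule commands, so the document preamble needs \usepackage{booktabs}.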

Similar to df.to_markdown(), df.to_latex() is quite flexible and allows you to customise the LaTeX output to a great extent. Here are some of the formatting options you can use, for example to align columns, add a caption and label, and standardise number formatting:

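# Custom LaTeX table with specialized formatting
latex_output = results.to_latex(
    index=False,
    column_format="|l|r|r|r|r|",    # left-align model names, right-align numbers
    caption="Model Performance Metrics.",
    label="tab:model_performance",  # reference it with \ref{tab:model_performance}
    multicolumn_format="c",         # only affects MultiIndex column headers
    escape=False,                   # leave LaTeX special characters unescaped
    header=["Model", "AUC-ROC", "AUC-PRC", "ECE", "Runtime (s)"],
    float_format="%.4f",            # print all floats with four decimal places
)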

print(latex_output)

resulting in a captioned and labelled table with relabelled column headers and all numbers printed to four decimal places.

Conclusion

Gone are the days of manually copy-pasting your results into Overleaf! Both df.to_markdown() and df.to_latex() are straightforward yet highly customisable tools that allow you to easily compile and present your results for papers, blog posts and GitHub documentation.

Author

Leo Klarner

Oxford Protein Informatics Group

or "OPIG" to friends
