Saving Plots

I use Jupyter notebooks extensively for data analysis and exploration. It’s fantastic to be able to see output quickly, including plots, and have it all saved and viewable on GitHub.
However, when it comes time to prepare for publication, I need to save high-resolution and/or vector versions of the plots for use in LaTeX or Word. The display in Jupyter does not have nearly high enough resolution to copy and paste into a document and have it look acceptably good.
Most of my projects, therefore, have a convenience function for plots that are going into the paper. This function saves the plot to disk (in both PDF and 600dpi PNG formats) and returns it so it can also be displayed in Jupyter. That way I don’t have two copies of the plot code — one for saving and one for interactive exploration — that can get out of sync.
Python Code
The make_plot function takes care of four things:
- Chaining the ggplot calls together (since the syntax is slightly less friendly in Python)
- Applying the theme I’m using for the notebook, along with additional theme options
- Saving the plots to both PDF and high-DPI PNG
- Returning the plot for notebook drawing
This function is built for plotnine, a Grammar of Graphics plotting library for Python that I currently use for most of my statistical visualization. It should be possible to write a similar function for raw Matplotlib, or for Plotly, but I have not yet done so.
The function uses a global variable _fig_dir to decide where to put the figures. The extra keyword arguments (kwargs) are passed directly to another theme call, to make per-figure theme customizations easy.
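To make the file handling concrete, here is a small standalone sketch of how a base file name resolves against the figure directory. The helper name output_paths and the directory name figures are illustrative assumptions, not part of the function below; only the pathlib behaviour is the same.

```python
import warnings
from pathlib import Path

# Assumed location for illustration; a real notebook would set this itself.
_fig_dir = Path('figures')


def output_paths(file, fig_dir=_fig_dir):
    """Return the (PDF, PNG) paths that a base file name resolves to."""
    outf = fig_dir / file
    if outf.suffix:
        # with_suffix() replaces an existing extension, so a name such as
        # 'plot.svg' still produces plot.pdf and plot.png.
        warnings.warn('file has suffix, ignoring')
    return outf.with_suffix('.pdf'), outf.with_suffix('.png')


pdf, png = output_paths('frac-known-books')
print(pdf.as_posix(), png.as_posix())
# figures/frac-known-books.pdf figures/frac-known-books.png
```

Passing the base name without an extension keeps the two output files in step, since each one is derived from the same stem.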
Code:
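import warnings
import plotnine as pn

# _fig_dir is a global pathlib.Path set elsewhere in the notebook.
# theme_paper (shown below) must already be defined when this def runs,
# because the default argument is evaluated at definition time.
def make_plot(data, aes, *args, file=None, height=5, width=7, theme=theme_paper(), **kwargs):
    # Chain the ggplot layers, scales, and labels together
    plt = pn.ggplot(data, aes)
    for a in args:
        plt = plt + a
    # Apply the notebook theme, then any per-figure theme options
    plt = plt + theme + pn.theme(**kwargs)
    if file is not None:
        outf = _fig_dir / file
        if outf.suffix:
            warnings.warn('file has suffix, ignoring')
        # Save both a vector PDF and a raster PNG
        plt.save(outf.with_suffix('.pdf'), height=height, width=width)
        plt.save(outf.with_suffix('.png'), height=height, width=width, dpi=300)
    # Return the plot so Jupyter can draw it
    return plt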
This can be used like this:
make_plot(data, pn.aes('DataSet', 'value', fill='gender'),
          pn.geom_bar(stat='identity'),
          pn.scale_fill_brewer('qual', 'Dark2'),
          pn.labs(x='Data Set', y='% of Books', fill='Gender'),
          pn.scale_y_continuous(labels=lbl_pct),
          file='frac-known-books', width=4, height=2.5)
The width and height are in inches.
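The lbl_pct helper passed to scale_y_continuous is not shown in this post. plotnine accepts a callable for labels that takes the break values and returns a list of strings, so one possible version, assuming the values are fractions between 0 and 1, might look like this (the body is a guess, not the original helper):

```python
def lbl_pct(breaks):
    """Format axis breaks as percentages.

    Assumes each break is a fraction in [0, 1]; if the data already
    holds percentages, drop the multiplication by 100.
    """
    return ['{:.0f}%'.format(b * 100) for b in breaks]


print(lbl_pct([0, 0.25, 0.5, 1]))
# ['0%', '25%', '50%', '100%']
```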
And here’s theme_paper, a custom theme that extends theme_minimal with some text cleanups:
class theme_paper(pn.theme_minimal):
    def __init__(self):
        pn.theme_minimal.__init__(self, base_family='Open Sans')
        self.add_theme(pn.theme(
            axis_title=pn.element_text(size=10),
            axis_title_y=pn.element_text(margin={'r': 12}),
            panel_border=pn.element_rect(color='gainsboro', size=1, fill=None)
        ), inplace=True)
I use these functions in the book author gender code.
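For the raw Matplotlib variant mentioned earlier, a minimal sketch might look like the following. The name save_figure, the figures directory, and the demo data are all assumptions for illustration. The 600 dpi default follows the description at the top of this post, and creating the directory on demand is a choice made for this sketch only.

```python
from pathlib import Path


def save_figure(fig, file, fig_dir=Path('figures'), width=7, height=5, dpi=600):
    """Save a Matplotlib figure as PDF and PNG, then return it for display."""
    fig.set_size_inches(width, height)  # size in inches, as in make_plot
    outf = Path(fig_dir) / file
    outf.parent.mkdir(parents=True, exist_ok=True)
    fig.savefig(outf.with_suffix('.pdf'))           # vector version
    fig.savefig(outf.with_suffix('.png'), dpi=dpi)  # high-resolution raster
    return fig


def demo():
    """Draw a placeholder bar chart and save it; requires Matplotlib."""
    # Imported here so save_figure itself has no import-time dependency.
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.bar(['A', 'B'], [0.4, 0.6])  # made-up values
    ax.set_ylabel('Fraction')
    return save_figure(fig, 'demo-bars', width=4, height=2.5)
```

Unlike make_plot, this sketch does not chain layers or apply a theme; it covers only the save-and-return step, so the plotting code still lives in one place.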
R Code
I also have an R version from some older projects, before I switched to Python. This one requires you to use + yourself; it doesn’t chain the ggplot calls together automatically.
make_plot = function(plot, file=NA, width=5, height=3, ...) {
    if (!is.na(file)) {
        png(paste(file, "png", sep="."), width=width, height=height, units='in', res=600, ...)
        print(plot)
        dev.off()
        cairo_pdf(paste(file, "pdf", sep="."), width=width, height=height, ...)
        print(plot)
        dev.off()
    }
    plot
}
You can use it like this:
make_plot(ggplot(frame, aes(x=DataSet, y=value, fill=gender))
          + geom_bar(stat='identity')
          + scale_fill_brewer(type='qual', palette='Dark2')
          + labs(x='Data Set', y='% of Books', fill='Gender')
          + scale_y_continuous(labels=lbl_pct),
          file="frac-known-books", width=4, height=2.5)
I also don’t have automatic theming in the R version, but it would be easy to add.