Maybe I missed the obvious, but it seems to me that citations - at least citations from existing BibTeX bibliographies in standard LaTeX style - are a bit non-trivial in Jupyter. Here are a few methods that I've tried; I'm still not totally satisfied with any of them, and my testing has been pretty brief so far. It'd be interesting to hear what experience other people have had here... and I will update this doc as things progress.
Why would one want to do this...? There are probably multiple use cases, but I am mainly interested in
1. including citations directly in notebooks,
2. writing standard scientific papers using notebooks (or, perhaps more precisely, using contents from notebooks as the main source for papers, with the potential of some post-processing).
In both cases I'd like to have citations in text, and a bibliography section, as per a standard manuscript and, ideally, directly in Jupyter notebook/lab (i.e. display without any post-processing). More on this at the end.
Image from the ipypublish docs
(To be clear, in this post Jupyter notebook and lab refer to the Jupyter environment or interface, running in a web browser, while notebook or document or source, used generically, refers to a .ipynb computational notebook file, which can be rendered/viewed by multiple environments, and exported to other formats. Since some extensions work within a Jupyter environment, they may only work for notebook or lab. Methods which post-process notebook files are environment agnostic. For more background and a general introduction, see the project Jupyter website. Note that notebooks support Markdown as standard, and also process and display LaTeX maths via MathJax in the browser - other LaTeX (usually) requires post-processing.)
Unless otherwise stated, the test machine was running Ubuntu 18 LTS, with Firefox v76.0.1, Jupyter Notebook v6.0.3, Python v3.7.6 (in an Anaconda virtual env.).
Manual Markdown citations
The obvious answer for dropping a couple of items into a notebook document is to manually code the Markdown. For stand-alone notebooks with just a couple of citations this is pretty easy, and probably the quickest thing to do - but it is a pain if you want to pipe things in from a BibTeX file and/or automate any part of the citation process.
For some more detailed notes on this:
- Discussion on Stack Exchange. Includes a manual version via Pandoc, and raw LaTeX inclusion in a notebook.
- Further discussion re: Markdown in notebooks.
- Via Markdown: standard reference style.
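As a rough illustration of the manual approach (my own sketch, not taken from the discussions linked above), the snippet below numbers citations in order of first use and builds the matching Markdown reference list. The class name, the key names and the reference entries are all placeholders, and it assumes that the Markdown renderer allows HTML anchors (<a id="...">) in Markdown cells.

```python
class ManualCitations:
    """Number citations in order of first use and emit plain Markdown."""

    def __init__(self, references):
        self.references = references  # key -> hand-written reference text
        self.order = []               # keys, in order of first citation

    def cite(self, key):
        """Return an in-text marker that links to the reference list entry."""
        if key not in self.references:
            raise KeyError(f"unknown reference key: {key}")
        if key not in self.order:
            self.order.append(key)
        number = self.order.index(key) + 1
        return f"[[{number}]](#ref-{key})"

    def bibliography(self):
        """Return a numbered Markdown list with one HTML anchor per entry."""
        lines = []
        for number, key in enumerate(self.order, start=1):
            lines.append(f'{number}. <a id="ref-{key}"></a>{self.references[key]}')
        return "\n".join(lines)


# Placeholder entries, typed by hand rather than read from a .bib file.
references = {
    "first2020": "First Author (2020). Placeholder title one.",
    "second2019": "Second Author (2019). Placeholder title two.",
}

cites = ManualCitations(references)
body = f"Earlier work {cites.cite('first2020')} was extended in {cites.cite('second2019')}."
print(body)
print(cites.bibliography())
```

In a notebook the two strings could be pasted into a Markdown cell, or displayed from a code cell with IPython.display.Markdown. Everything is still typed by hand, so this only removes the numbering chore, not the lack of BibTeX support.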
Pros
- Quick and easy.
- OK for small docs.
Cons
- No BibTeX support.
- No automation.
Cite2c
Cite2c is a handy tool for pulling references from Zotero into a notebook. It supports in-line citations, and a bibliography, in notebooks. (This works in a similar manner to the Nbconvert method below, with custom HTML tags included for the citations.) This is a notebook extension which requires installation, and provides cite and bibliography toolbar buttons once installed.
Pros
- Quick and easy.
- Looks good in notebook (I haven't yet tested in Jupyter lab, where it is likely not supported).
Cons
- No control over citation style (at least not obviously - likely there is a way if one goes deep here).
- No references in output when processed to PDF. (EDIT: there are some notes for getting this working here, via a custom nbconvert template.)
- Requires Zotero account.
Nbconvert
Citations are supported directly for LaTeX + BibTeX via nbconvert. Nbconvert is the standard tool for converting notebooks to other (static) formats, e.g. HTML or PDF, so this is very much a method for post-processing notebooks.
This uses HTML style formatted citations in the source notebook, e.g. <cite data-cite="citation">(manual label)</cite>, which are replaced in the nbconvert-processed output with standard LaTeX \cite{citation} commands. Note there is also a template file required for this processing to work.
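To make the replacement step concrete, here is a minimal stand-alone sketch of the same idea using only the Python standard library. It is not nbconvert's own code (nbconvert ships its own filter for this), and the function name and regular expression are my assumptions about a reasonable way to do it.

```python
import re

# Matches <cite data-cite="key">label</cite>; the manual label is discarded.
CITE_TAG = re.compile(r'<cite\s+data-cite="([^"]+)"\s*>.*?</cite>', re.DOTALL)


def cite_tags_to_latex(markdown_text):
    r"""Replace HTML cite tags with LaTeX \cite{key} commands."""
    # A function is used as the replacement so the backslash is taken literally.
    return CITE_TAG.sub(lambda match: r"\cite{" + match.group(1) + "}", markdown_text)


cell_source = 'As shown in <cite data-cite="citation">(manual label)</cite>, this works.'
print(cite_tags_to_latex(cell_source))
```

Note that the manual label is thrown away on conversion, which is why the in-notebook text and the final formatted citation can differ.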
Once set up, the notebook is converted in the usual way, with the required settings passed explicitly:
jupyter nbconvert --config ipython_nbconvert_config.py
(See the GitHub repo for an example config file.)
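For orientation, a config file of this kind might look something like the sketch below. This is a hypothetical example, not the file from the repo: the notebook and template file names are placeholders, the option names assume nbconvert 5.x-style settings, and the template is assumed to be the place where the \bibliography commands (and hence the .bib path) are added.

```python
# ipython_nbconvert_config.py - hypothetical example, not the file from the repo.
c = get_config()  # provided by Jupyter when this file is loaded via --config

# Notebook(s) to convert; "my_notebook.ipynb" is a placeholder name.
c.NbConvertApp.notebooks = ["my_notebook.ipynb"]

# Convert to LaTeX, so the \cite{} commands can be resolved by BibTeX afterwards.
c.NbConvertApp.export_format = "latex"

# Custom template that adds the bibliography commands; placeholder file name.
c.TemplateExporter.template_file = "citations.tplx"
```

The file only works when loaded by Jupyter (get_config is not defined in a plain Python session), and newer nbconvert versions use a different template layout, so treat the names above as a starting point to check against the nbconvert docs.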
Pros
- Works, agnostic for notebook/lab.
- Fairly transparent.
- Using the same method should also allow for direct use of \cite{} in raw LaTeX notebook cells (I didn't test this however).
- OK for all notebook formats that nbconvert supports.
- For more control (with more effort), one can convert to .tex and then use the standard LaTeX tool-chain. For a half-and-half solution, with LaTeX control via nbconvert templates, see the article Making publication ready Python Notebooks, and the nbconvert docs for more info - this is likely more effort than most people will want to make however!
Cons
- In the original notebook, citations use only the manually set label text, so may not match output format (if, e.g., it's set for automatic numerical refs.)
- No bibliography in source notebook (unless set up manually).
- There's a bit of set-up required on the back end, specifically a template file plus configuration options, which include the path to the .bib files. (There is probably an easy way to streamline this however.)
(Some) latex envs in Jupyter notebooks
This is a very handy notebook extension which provides enhanced LaTeX support.
- It can be installed as part of the nbextensions package (this provides a full control panel with a set of extensions), or independently from source (GitHub).
- It provides a range of functions, including citations, accessible from a menu and toolbar in Jupyter notebook.
- Citations are supported in the notebook directly, and propagated to outputs when processed with the supplied nbconvert templates (this basically converts the custom, in-notebook markup to standard LaTeX).
Pros
- Full LaTeX style citations directly in the notebook.
- Easy to use (includes new menus in notebook).
- Output LaTeX seems as advertised in brief testing.
Cons
- Doesn't support Jupyter Lab.
- In brief testing, I couldn't get the bibliography to render in a notebook, although citations appeared correctly in the notebook, and the output LaTeX looked OK (but required manually adding LaTeX bibliography calls). This might be a machine/browser issue (testing on Ubuntu 18 LTS, with Firefox v76.0.1, Jupyter Notebook v6.0.3, Python v3.7.6 - this last may be a known issue).
Other methods
A few other approaches I've stumbled over, but not yet thought further about or tested...
Other tool-chains/languages
- Via the reST/nbsphinx/Sphinx tool-chain:
- "nbsphinx is a Sphinx extension that provides a source parser for *.ipynb files. Custom Sphinx directives are used to show Jupyter Notebook code cells (and of course their results) in both HTML and LaTeX output."
- Citations/references supported in standard (manual) style.
- BibTeX support is also possible, via the sphinxcontrib-bibtex extension. Note this supports both reST and raw LaTeX.
- This probably makes most sense if one is planning to use reST to write the docs in the first place, and/or is outputting to a Sphinx based platform (e.g. readthedocs) - otherwise I'm not sure how it differs from using nbconvert directly. (I've been using this for ePSproc documentation, but this is a rather different use-case, which compiles HTML docs from a set of notebooks and also pulls reST directly from a bunch of python source code docstrings.)
- R Markdown, with knitr and Pandoc
- Book building in a similar fashion to the Jupyter-based tool-chain, but with RStudio and R Markdown.
- Supports multiple languages, including Python.
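For the Sphinx route listed above, the relevant part of a Sphinx conf.py might look like the following sketch. The project name and the .bib file name are placeholders, and the bibtex_bibfiles setting assumes sphinxcontrib-bibtex 2.0 or later (older versions take the .bib file as an argument to the bibliography directive instead).

```python
# conf.py - minimal Sphinx configuration sketch; names below are placeholders.
project = "my-notebook-docs"

extensions = [
    "nbsphinx",              # parse *.ipynb files as Sphinx sources
    "sphinxcontrib.bibtex",  # BibTeX-backed citations and bibliographies
]

# Used by sphinxcontrib-bibtex 2.0 and later; "refs.bib" is a placeholder path.
bibtex_bibfiles = ["refs.bib"]

# Keep build output and notebook checkpoint copies out of the build.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]
```

With this in place, citations in reST sources use the :cite: role and the reference list is produced by the bibliography directive. I haven't checked how well this carries over to Markdown cells inside notebooks.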
Large docs/books/other docs from Jupyter notebook source
- ipypublish:
- "A package for creating and editing publication ready scientific reports and presentations, from Jupyter Notebooks."
- "Combining features of the Jupyter Notebook, WYSIWYG editors, Latex document preparation system and Sphinx HTML creation"
- This is a full tool-chain for post-processing notebooks, including multiple formats, via Pandoc.
- Looks very powerful for use-case (2), but a little bit of custom stuff to learn.
- Possibly replaced (or just supplemented?) by the Jupyter Book project? See here for discussion, and notes below.
- Jupinx:
- "a build system for lectures."
- "Jupinx is an open source tool for converting ReStructuredText source files into a website via Jupyter Notebooks"
- Jupyter book project (part of the "Executable Book" project).
- "Jupyter Book is an open source project for building beautiful, publication-quality books and documents from computational material."
- "The Executable Book Project is essentially a collaboration between ipypublish, jupyter-book and jupinx to take the best bits of our packages and combine them."
- Basically a framework which uses the existing tools, and processing with Sphinx, to convert a bunch of files (.yml, .ipynb, .md ...) into a website/PDF.
- Some notes/discussion on differences/similarities with vanilla nbsphinx.
This last sounds generally very promising for any large project documentation, incorporating both computational and scientific manuscript-style content, although it might be overkill for stand-alone manuscripts:
The goal of the EBP is to build tools that facilitate creating professional computational narratives (books, lecture series, articles, etc.) using open source tools. We want users in the scientific, academic, and data science communities to be able to do the following:
- Write their content in either markdown text files, or Jupyter Notebooks. These files include rich content - outputs from running code, references and cross-references, equations, etc.
- Execute content and cache the results. Intelligent caching means that only modified code cells are re-run.
- Combine cached outputs with content files with a document model. Using the excellent Sphinx documentation stack, documents can include many features for publishing, such as equations, cross-references, and citations.
- Build interactive HTML or publication-quality PDF outputs. Sometimes users wish to create rich and interactive websites, other times they want to send a high-quality PDF to a publisher. This system will treat both as equal citizens.
- Control everything above with a simple command-line interface. Most users should not have to know anything about Sphinx, caching, etc. A simple user interface will hide most of the complexity of this process.
(Possibly) Deprecated Tools
- Calico/Calysto:
- I saw this mentioned in a few places, initially as Calico Document Tools - which seems to no longer exist (aside from an old Bitbucket repo) - and later as Calysto notebook extensions.
- It didn't install for me.
Summary
There's still some work/testing for me to do here, but I think that the (some) latex envs notebook extension is the tool closest to what I want for the most part (use-cases as listed above) - user-friendly and with all the key features, apart from Jupyter Lab support. Hopefully a bit more testing will make clear if this is the best tool. I'm also planning to test the ipypublish and/or Jupyter book frameworks to see what they offer above and beyond the basics of including citations.