Features
The kernel supports the following Jupyter features:
- Code execution.
- Autocompletion (TAB in Jupyter notebook).
- Code inspection (Shift-TAB up to 4 times in Jupyter notebook).
- Colored error message displays.
- Add Maven dependencies at runtime (see also magics).
- Display rich output, e.g. a chart produced by DFLib and ECharts.
- eval function (see also kernel). Note: the signature is Object eval(String) throws Exception. This evaluates the expression (a cell) in the user scope and returns the actual evaluation result instead of a serialized one.
- Configurable evaluation timeout.
Installation
Prerequisites
- Java 11 or newer
- If you already have another version of the jjava kernel installed, remove it with the following command:
    jupyter kernelspec remove java
Install Python and Jupyter
There are a few ways to install Python and Jupyter, depending on your OS and preferences. Below we provide a few specific recipes to help you get started (especially if you are a Java developer new to the Python environment). In general, Python is available from its official site, and Jupyter has its own installation instructions.
MacOS
If you are on MacOS, you can install both Python and Jupyter ("lab" and "notebook") with a single Homebrew command:
brew install jupyter
Windows
If you are on Windows, you can install Python using the official installer, and use "pip" for Jupyter:
- Go to https://www.python.org/downloads/windows/ and download the latest Python installer
- Run the installer
- Open Command Prompt (cmd), and run pip install jupyterlab (or pip install notebook if you prefer the "classic" notebook)
Install JJava
- Download JJava: go to GitHub releases, pick the latest version (or a specific one that you need) and under the "Assets" section download a file called jjava-${version}-kernelspec.zip
- Unzip the file into a temporary location
- Run the following command from the parent directory that contains the unzipped kernel folder:
    jupyter kernelspec install jjava-${version}-kernelspec --user --name=java
  The above is the most common install recipe. To see all available options, run jupyter kernelspec install --help.
- Check that the Java kernel is installed:
    jupyter kernelspec list
    Available kernels:
      python3    /path/to/python/kernel
      java       /path/to/java/kernel
Running Jupyter
Depending on which Jupyter environment you installed, there will be a specific command to run the notebook. E.g.:
jupyter notebook
jupyter lab
jupyter console --kernel=java
Configuring
JJava kernel behavior can be configured via environment variables. Here is an example on Windows using cmd:
set JJAVA_JVM_OPTS=-Xmx8192m
jupyter lab
And the same example on Linux or MacOS:
export JJAVA_JVM_OPTS=-Xmx8192m
jupyter lab
# alternatively, store it in ".bashrc" so that the
# variable is always set implicitly
# echo 'export JJAVA_JVM_OPTS=-Xmx8192m' >> ~/.bashrc
Sometimes you don’t fully control the startup environment. If that’s the case, you may store the same variables in the kernel.json file that is a part of the JJava installation. To locate this file, run the jupyter kernelspec list command. kernel.json should be located in the directory corresponding to the "java" kernel as listed by this command. In the file, look for the "env" section, and set any number of variables. E.g.:
{
  "argv": [
    "java",
    "-jar",
    "{resource_dir}/jjava-launcher.jar",
    "{resource_dir}/jjava.jar",
    "{connection_file}"
  ],
  "display_name": "Java (jjava)",
  "language": "java",
  "interrupt_mode": "message",
  "env": {
    "JJAVA_JVM_OPTS": "-Xmx8192m"
  },
  "metadata": {
  }
}
If you store your variables in kernel.json, be aware that they will be wiped out on subsequent kernel reinstalls, so you will have to set them again after every kernel upgrade.
Environment Variables
Environment variable | Default | Description |
---|---|---|
 | | A space delimited list of command line options that would be passed to the |
 | | A duration specifying a timeout (in milliseconds by default) for a single top level statement. If less than |
 | | A file path separator delimited list of classpath entries that should be available to the user code. Important: no matter what OS, this should use forward slash "/" as the file separator. Also each path may actually be a simple glob. |
 | | A file path separator delimited list of |
 | | A block of Java code to run when the kernel starts up. This may be something like |
 | | A space delimited list of command line options that would be passed to the |
 | | Option that controls the kernel extension autoloading feature. If you do not want third-party libraries to load anything implicitly you could turn it off by |
Glob Syntax
Variables that support this glob syntax may reference a set of files with a single path-like string. Basic glob queries are supported including:
- * to match 0 or more characters up to the next path boundary /
- ? to match a single character
- A path ending in / implicitly adds a * to match all files in the resolved directory
Any relative paths are resolved from the notebook server’s working directory. For example, the glob *.jar will match all jars in the directory where the jupyter notebook command was run.
Note: users on any OS should use / as a path separator.
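The matching rules above can be tried out with the JDK's own glob support. The following is a minimal sketch, not JJava's implementation: it assumes that java.nio.file glob semantics are close enough to the rules described here, and the GlobSketch class with its normalize and matches helpers is hypothetical, introduced only for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

/** Hypothetical illustration of the glob rules; not part of JJava. */
final class GlobSketch {

    /** A glob ending in "/" implicitly gets a "*" appended (all files in that directory). */
    static String normalize(String glob) {
        return glob.endsWith("/") ? glob + "*" : glob;
    }

    /** Checks a forward-slash path against a glob using the JDK's built-in matcher. */
    static boolean matches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault()
                .getPathMatcher("glob:" + normalize(glob));
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        // "*" stays within one path segment
        System.out.println(matches("*.jar", "dflib.jar"));
        System.out.println(matches("*.jar", "lib/dflib.jar"));
        // "?" stands for exactly one character
        System.out.println(matches("lib/?.jar", "lib/a.jar"));
        // a trailing "/" covers the files directly inside the directory
        System.out.println(matches("lib/", "lib/dflib.jar"));
    }
}
```

In the JDK matcher, * does not cross a / boundary, which is why *.jar does not pick up jars in subdirectories.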
Magics
"Magics" is an IPython concept adopted by JJava kernel. There are "line" and "cell" magics, both being syntactic sugar for invoking special kernel functions.
Line magics
Line magics are single-line function calls, with the magic name prefixed with %, and arguments separated by spaces:
%mavenRepo snapshots https://s01.oss.sonatype.org/content/repositories/snapshots/
%maven org.dflib:dflib-jupyter:1.1.0
List<String> addedJars = %jars C:/all/my/*.jar
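To make the "syntactic sugar" idea concrete, here is a small sketch of how a line magic could be broken into a function name and its arguments. It is an assumption-laden illustration, not JJava's parser: the LineMagicSketch and Call names are hypothetical, and the sketch only handles the simple case where the magic starts the line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not JJava's actual magic parser. */
final class LineMagicSketch {

    /** Parsed form of a line magic: its name plus space-separated arguments. */
    static final class Call {
        final String name;
        final List<String> args;

        Call(String name, List<String> args) {
            this.name = name;
            this.args = args;
        }
    }

    /** Splits "%name arg1 arg2" into a name and an argument list. */
    static Call parse(String line) {
        String trimmed = line.trim();
        // "%%" marks a cell magic, which this sketch does not handle
        if (!trimmed.startsWith("%") || trimmed.startsWith("%%")) {
            throw new IllegalArgumentException("Not a line magic: " + line);
        }
        String[] tokens = trimmed.substring(1).split("\\s+");
        List<String> args = new ArrayList<>(Arrays.asList(tokens).subList(1, tokens.length));
        return new Call(tokens[0], args);
    }

    public static void main(String[] args) {
        Call call = parse("%maven org.dflib:dflib-jupyter:1.1.0");
        System.out.println(call.name + " " + call.args);
    }
}
```

Conceptually, the parsed call then maps onto a kernel function named after the magic, with the arguments passed as strings.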
Cell magics
Cell magics are function calls, with the magic name prefixed with %% and using the body of the entire cell as the last argument:
%%loadFromPOM
<repository>
<id>snapshots</id>
<url>https://s01.oss.sonatype.org/content/repositories/snapshots/</url>
</repository>
<dependency>
<groupId>org.dflib</groupId>
<artifactId>dflib-jupyter</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
JJava Magics
JJava provides a number of magics that help with notebook dependency management. Some add local jars, others reference dependencies from local and remote Maven repositories:
jars (line)
Adds jars to the notebook classpath. Arguments are glob paths to jars on the local file system. If a glob matches a directory, all files in that directory will be added.
classpath (line)
Adds entries to the notebook classpath. Arguments are glob paths to entries on the local file system. This includes directories or jars.
maven, aka addMavenDependencies, addMavenDependency (line)
Adds Maven artifacts to the notebook classpath. All transitive dependencies are also added to the classpath. Arguments are in the form of "dependency coordinates" like groupId:artifactId:[packagingType:[classifier]]:version
mavenRepo, aka addMavenRepo (line)
Adds a Maven repository to be searched by the maven magic. Takes two arguments: <repo_id> <repo_url>.
loadFromPOM (either line or cell)
Loads any dependencies specified in a POM. It ignores repositories added with addMavenRepo, as the POM would likely specify its own. The cell magic is designed to make it simple to copy and paste Maven POM fragments from a README in order to depend on an artifact (including artifacts from repositories other than central).
Line magic arguments:
- path to local POM file
- list of scope types to filter the dependencies by. Defaults to compile, runtime, system, and import if not supplied.
Cell magic arguments:
- varargs list of scope types to filter the dependencies by. Defaults to compile, runtime, system, and import if not supplied.
- body: a partial POM literal.
If the body is an XML <project> tag, then the body is used as a POM without modification. Otherwise, the magic attempts to build a POM based on the XML fragments it gets. <modelVersion>, <groupId>, <artifactId>, and <version> are given default values if not supplied. All children of <dependencies> and <repositories> are collected along with any loose <dependency> and <repository> tags.
E.g., to add a dependency not in central, simply add a valid <repository> and <dependency>, and the magic will take care of putting them together into a POM:
%%loadFromPOM
<repository>
<id>snapshots</id>
<url>https://s01.oss.sonatype.org/content/repositories/snapshots/</url>
</repository>
<dependency>
<groupId>org.dflib</groupId>
<artifactId>dflib-jupyter</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
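The "dependency coordinates" form accepted by the maven magic, groupId:artifactId:[packagingType:[classifier]]:version, can be illustrated with a small parser. This is a sketch under assumptions, not the resolver JJava uses: the MavenCoordinatesSketch class is hypothetical, and optional parts that are omitted are simply left as null.

```java
/** Hypothetical illustration of the coordinate format; not JJava's resolver. */
final class MavenCoordinatesSketch {

    final String groupId;
    final String artifactId;
    final String packaging;   // null when omitted
    final String classifier;  // null when omitted
    final String version;

    MavenCoordinatesSketch(String groupId, String artifactId,
                           String packaging, String classifier, String version) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.packaging = packaging;
        this.classifier = classifier;
        this.version = version;
    }

    /** Accepts 3, 4 or 5 colon-separated parts; the version always comes last. */
    static MavenCoordinatesSketch parse(String coordinates) {
        String[] p = coordinates.split(":");
        switch (p.length) {
            case 3:
                return new MavenCoordinatesSketch(p[0], p[1], null, null, p[2]);
            case 4:
                return new MavenCoordinatesSketch(p[0], p[1], p[2], null, p[3]);
            case 5:
                return new MavenCoordinatesSketch(p[0], p[1], p[2], p[3], p[4]);
            default:
                throw new IllegalArgumentException("Unexpected coordinates: " + coordinates);
        }
    }

    public static void main(String[] args) {
        MavenCoordinatesSketch c = parse("org.dflib:dflib-jupyter:1.1.0");
        System.out.println(c.groupId + " / " + c.artifactId + " / " + c.version);
    }
}
```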
Kernel
All code running in JJava flows through the kernel. This makes it the place to register magics, add things to the classpath, and perform many Jupyter-related operations.
Notebook functions
JJava injects a function for getting the active kernel instance and additional helpers for making use of the kernel at runtime. These are defined in the runtime Kernel class.
JavaKernel getKernelInstance()
Get a reference to the current kernel. It may return null if called outside of a kernel context, but should be considered @NonNull when inside a notebook or similar. The kernel API has lots of goodies; see the JavaKernel class for more information. Specifically, it gives access to adding to the classpath, getting the magics registry and Maven resolver, and eval.
Object eval(String expr) throws Exception
The eval function provides full access to the code evaluation mechanism of the kernel. It evaluates the code in the same scope as the kernel and returns the resulting object itself: a live object inside the kernel, not a serialized copy.
The given expression can be anything you would write in a cell, including magics.
(int) eval("1 + 2") + 3
Display
One of the many great things about the Jupyter front ends is the support for display_data.
JJava interfaces with the base kernel’s high-level rendering API.
Notebook functions
JJava injects two functions into the user space for displaying data: display and render. Most use cases should prefer the former, but there is a necessary case for render that is outlined below. In addition, the updateDisplay function can be used to update a previously displayed object. All are defined in the runtime Display class.
All display/render functions include a text/plain representation in their output. By default, this is the String.valueOf(Object) value, but it can be overridden.
String display(Object o)
Display an object as its preferred types. If you don’t need a specific type, it is best to let the object decide how it is best represented.
The object is rendered and published on the display stream. An id is returned which can be passed to updateDisplay if desired.
String display(Object o, String... as)
Display an object as the requested types. In this case the kernel attempts to render the object as the MIME types given in as. There are no guarantees, though: if a type is unsupported, it will simply not appear in the output.
The object is rendered and published on the display stream. An id is returned which can be passed to updateDisplay if desired.
This is useful when a type has many potential representations but not all are preferred. For example, a CharSequence has many representations, but only text/plain is preferred. To display it as executable JavaScript we can use the following:
display("alert('Hello from JJava!');", "application/javascript");
Since some front ends may not support a given format, several can be given and the front end chooses the best one. For example, to display as HTML and Markdown:
display("<b>Bold</b>", "text/html", "text/markdown");
This will trigger a display message with values for text/html, text/markdown, and the implicit text/plain.
DisplayData render(Object o)
Renders an object as its preferred types and returns its rendered format. Similar to display(Object o) but without publishing the result.
DisplayData render(Object o, String... as)
Renders an object as the requested types and returns its rendered format. Similar to display(Object o, String... as) but without publishing the result.
When an expression is the last code unit in a cell, it is rendered with the render(Object o) semantics. If this is not desired, it can be overridden by wrapping the expression in a call to this function.
String md = "Hello from **JJava**";
render(md, "text/markdown")
This will cause the Out[_] result to be the pretty text/markdown representation rather than the boring text/plain representation.
void updateDisplay(String id, Object o)
Renders an object as its preferred types and updates an existing display with the given id to contain the new rendered object. Similar to display(Object o) but updates an existing displayed object instead of appending a new one.
void updateDisplay(String id, Object o, String... as)
Renders an object as the requested types and updates an existing display with the given id to contain the new rendered object. Similar to display(Object o, String... as) but updates an existing displayed object instead of appending a new one.
String id = display("<b>Countdown:</b> 3", "text/html");
for (int i = 3; i >= 0; i--) {
    updateDisplay(id, "<b>Countdown:</b> " + i, "text/html");
    Thread.sleep(1000L);
}
render("<b>Liftoff!</b>", "text/html")
Jupyter-Aware Java Libs
If you are writing a Java library that is specifically intended to work inside Jupyter, JJava provides a way for such a library to execute custom code when the kernel loads it. E.g., for user convenience it might add some library-specific import statements to the environment.
There are two pieces that need to be present in the library .jar for the above to work. The first is a Java class implementing the org.dflib.jjava.jupyter.Extension interface:
package my.lib;

import org.dflib.jjava.jupyter.Extension;
import org.dflib.jjava.jupyter.kernel.BaseKernel;

public class MyLibExtension implements Extension {

    @Override
    public void install(BaseKernel kernel) {
        try {
            // Adds common imports related to the library.
            kernel.eval("import my.lib.*");
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
The second piece is a file declaring this extension to the kernel. The file must have this exact location and name, META-INF/services/org.dflib.jjava.jupyter.Extension, and be placed on the classpath (e.g. under src/main/resources of a Maven project). It must contain a single line of text corresponding to the fully-qualified name of the Java class above:
my.lib.MyLibExtension
As long as those two files are in the library jar, adding the .jar as a notebook dependency (e.g. as %maven my.lib:my-lib:1.0.0) will cause the install(..) method to be executed.
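The declaration file follows the same layout as the JDK's standard java.util.ServiceLoader provider files. The source does not say how JJava performs the lookup, so the following is only a self-contained imitation of that style of discovery: read class names from the file content, instantiate each one reflectively, and call install. The Extension interface here is a simplified stand-in (the real one takes a BaseKernel), and ExtensionLoadingSketch and its load helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical imitation of provider-file discovery; not JJava's loader. */
final class ExtensionLoadingSketch {

    /** Simplified stand-in for org.dflib.jjava.jupyter.Extension. */
    interface Extension {
        void install(List<String> evaluatedSnippets);
    }

    /** Example extension that "evaluates" one import statement. */
    static final class MyLibExtension implements Extension {
        @Override
        public void install(List<String> evaluatedSnippets) {
            evaluatedSnippets.add("import my.lib.*");
        }
    }

    /** Instantiates every class named in the file content, one name per line. */
    static List<Extension> load(String servicesFileContent) {
        List<Extension> extensions = new ArrayList<>();
        for (String line : servicesFileContent.split("\\R")) {
            String className = line.trim();
            if (className.isEmpty() || className.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            try {
                Object instance = Class.forName(className)
                        .getDeclaredConstructor()
                        .newInstance();
                extensions.add((Extension) instance);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot load extension " + className, e);
            }
        }
        return extensions;
    }

    public static void main(String[] args) {
        // Stands in for the content of the META-INF/services file
        String servicesFile = MyLibExtension.class.getName() + "\n";
        List<String> evaluated = new ArrayList<>();
        for (Extension extension : load(servicesFile)) {
            extension.install(evaluated);
        }
        System.out.println(evaluated);
    }
}
```

This is why the file name must match the interface name exactly and why its content must be the fully-qualified class name: both are used as lookup keys.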
JJava core itself provides an extension that implicitly loads common imports like java.io, java.time, etc. in every notebook.
Another, more ad hoc mechanism to load custom code per Jupyter instance (instead of per notebook and per library) is setting the JJAVA_STARTUP_SCRIPT environment variable to point to a custom JShell script.
To disable all custom extensions for a given Jupyter process, including the JJava core extension, you can set the JJAVA_LOAD_EXTENSIONS variable to 0:
export JJAVA_LOAD_EXTENSIONS=0
Notebooks and Version Control
This is a common version control hint for Jupyter notebooks, not specific to Java.
Jupyter places data generated by the notebook in the notebook itself. Sometimes this is the desired behavior (e.g. when you want to share the notebook results with your audience via GitHub), but very often this is just a nuisance. Assuming you are using Git, you may automatically strip off the data before each commit by using Git hooks. Here is one possible recipe:
- Add a script for stripping outputs and execution counts somewhere in your repo. In our example it will be in bin/clean_ipynb.py:
    #!/usr/bin/env python3
    import sys
    import json

    nb = sys.stdin.read()
    json_in = json.loads(nb)

    def strip_output_from_cell(cell):
        if "outputs" in cell:
            cell["outputs"] = []
        if "execution_count" in cell:
            cell["execution_count"] = None

    for cell in json_in["cells"]:
        strip_output_from_cell(cell)

    json.dump(json_in, sys.stdout, sort_keys=True, indent=1, separators=(",",": "))
- Create a .gitconfig file in the root of your repo:
    [filter "clean_ipynb"]
        smudge = cat
        clean = bin/clean_ipynb.py
- Create a .gitattributes file in the root of your repo, referencing the filter from .gitconfig:
    *.ipynb filter=clean_ipynb
- All the files above should be version-controlled. Every time a user clones the repo, they will need to execute the following command manually to enable this configuration:
    git config --local include.path ../.gitconfig