---
title: "rolog: Prolog queries from R"
author: "Matthias Gondan
         (Department of Psychology, Universität Innsbruck, Austria)
         and Jan Wielemaker
         (Vrije Universiteit Amsterdam/SWI-Prolog Solutions b.v.)"
date: "2023-01-27"
bibliography: bibliography.bibtex
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{rolog: Prolog queries from R}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include=FALSE}
knitr::opts_chunk$set(collapse=TRUE, comment="#>")
library(htmltools)
library(DiagrammeR)
library(DiagrammeRsvg)
```

  Matthias Gondan\
  Universität Innsbruck\
  Department of Psychology\
  Innrain 9\
  A-6020 Innsbruck\
  Matthias.Gondan-Rochon@uibk.ac.at
  
# Abstract

Prolog is a classical logic programming language with many applications in
expert systems, computer linguistics and traditional, that is, symbolic 
artificial intelligence. The main strength of Prolog is
its concise representation of facts and rules for the representation of
knowledge and grammar, as well as its efficient built-in search engine for
closed world domains. R is a statistical programming language for
data analysis and statistical modeling which is widely used in academia and
industry. Besides the core library, a lot of packages have been developed for
all kinds of statistical problems, including statistics-based artificial
intelligence tools such as neural networks for machine learning and deep
learning. Whereas Prolog is weak in statistical computation, but strong in
symbolic manipulation, the converse may be said for the R language. SWI-Prolog
is a widely used Prolog system that offers a wide range of extensions for real
world applications, and there already exist two Prolog "packs" to invoke
R (`rserve-client`, `real`) from SWI-Prolog. Given the large user community of
R, there may also be a need for a connection in the reverse direction that
allows invoking Prolog queries in R computations. The R\ package `rolog`
connects to the SWI-Prolog system, thus enabling deterministic and
non-deterministic queries to the Prolog interpreter. Usage of `rolog` is
illustrated by a few examples.

## Keywords

Statistics; Logic Programming; Artificial Intelligence; R; Prolog

# 1. rolog: Prolog queries from R

The R [@R] programming language and environment is a widely used open source
software for statistical data analysis. The basic R is a functional language
with lots of support for storage and manipulation of different data types, and
a strong emphasis on operations involving vectors and arrays. Moreover, a huge
number of packages (e.g., CRAN, https://cran.r-project.org/) have been
contributed that cover problems from areas as diverse as bioinformatics, machine
learning, specialized statistical methods, web programming and connections to
other programming languages.

An interface to Prolog is lacking so far. Based on earlier work by Kowalski, the
logic programming language Prolog was invented in the 1970ies by Colmerauer and
Roussel [@Kowalski1988], mostly for the purpose of natural language processing.
Since then, logic programming has become an important driving force in research
on artificial intelligence, natural language processing, program analysis,
knowledge representation and theorem
proving [@Shoham1994;@Lally2011;@Carro2004;@Hsiang1987].
SWI-Prolog [@Wielemaker2012] is an open-source implementation of Prolog that
mainly targets developers of applications, with many users in academia, research
and industry. SWI-Prolog includes a large number of libraries for "the real
world", for example, a web server, encryption, interfaces to C/C++ and other
programming languages, as well as a development environment and debugger. In
addition, pluggable extensions (so-called packs) are available for specific
tasks to enhance its capabilities.

Unlike R, Prolog is a declarative programming language consisting of facts and
rules that define relations, for example, in a problem space [@Newell1972].
Prolog's major strength is its built-in query-driven search engine that
efficiently deals with complex structured data, with the data not necessarily
being numerical. In fact, Prolog only provides a basic collection of arithmetic
calculations via a purely functional interface (`is/2`). More complex
calculations such as matrix algebra, statistical models or machine learning need
help from other systems, for example, from R.

Angelopoulos et al. [-@Angelopoulos2013] summarize work at the intersection of
symbolic knowledge representation and statistical inference, especially in the
area of model fits [EM algorithms, MCMC, @Sato2001;@Angelopoulos2008] and
stochastic logic programs [@Cussens2000;@Kimmig2011]. One of the major strengths
of logic programming is handling constraints; and a number of systems for
constraint satisfaction tools have been developed (constraint logic programming
on booleans, finite domains, reals, and intervals) for that
purpose [e.g., @Fruehwirth1998;@Triska2018]. Some constraint handlers exist in
R (see the CRAN task view for optimization problems), but more of them would be
available via a bridge between R and Prolog. 

Earlier approaches to connect Prolog and R have been published as SWI-Prolog 
packs [real, rserve_client, @Angelopoulos2013;@Rserve] and as a YAP
module [YapR, @YapR]. Whereas `real` establishes a direct link to an embedded
instance of R, `rserve-client` communicates with a local or remote R
service [@Urbanek2021]. The former approach emphasizes speed, the latter might
be preferred from a security perspective, especially in systems such as
SWISH [@SWISH] that accept only a set of sandboxed commands for Prolog, but do
not impose restrictions on R. A common feature of the two packages is that they
provide an interface for R calls from Prolog, but not the other way round, that
is, querying Prolog from R is not possible, so far.

The present package fills this gap through Prolog queries in R scripts, for
example, to perform efficient symbolic computations, searches in complex graphs,
parsing natural language and definite clause grammars. In addition, two Prolog
predicates are provided that enable Prolog to ring back to the R system for
bidirectional communication. Similar to `real`, tight communication between the
two systems is established by linking to a shared library that embeds the
current SWI-Prolog runtime. The exchange of data is facilitated by the C++
interfaces of the two languages [@Edelbuettel2018;@Wielemaker2021]. A less tight
connection might be established using the recently developed machine query
interface [@Zinda2021] that allows socket-based communication between foreign
languages and SWI-Prolog (and, in fact, the `MQI` documentation includes an
example in which R is called).

A bidirectional bridge between R and Prolog might overcome the limitations of
both languages, thereby combining the extensive numerical and statistical power
of the R system with Prolog's skills in the representation of knowledge and
reasoning. In addition to the useful little tools shown in the examples
below, `rolog` can therefore contribute to progress at the intersection of
traditional artificial intelligence and contemporary statistical programming.

The next section presents the interface of `rolog` in detail. Section\ 3
presents possible extensions of the package at both ends, in R and Prolog.
Section\ 4 is a list of illustrative examples that offer useful extensions to
the R system. Conclusions and further perspectives are summarized in Section\ 5.

# 2. Basic syntax

`rolog` has a rather minimalistic syntax, providing only the basic ingredients
to establish communication with the SWI-Prolog runtime. Ways to extend the
interface are described in Section\ 3.

After installation with `install.packages("rolog")`, the package is loaded in
the standard way.

```{r}
library(rolog)
```

We can see a short message telling the user which SWI-Prolog was found. The
package searches for SWI-Prolog based on the environment
variable `SWI_HOME_DIR`, the registry (Windows only), an executable `swipl` in
the `PATH`, and if everything fails, R package `rswipl` [@rswipl]. The message
can be silenced by the usual option `quietly=TRUE` of the `library` command.

## R interface

Most of the work is done using the three R\ functions `query`, `submit`,
and `clear`. The R program in Listing\ 1 illustrates a query to
Prolog's `member/2` using `rolog`'s syntax rules.

```{r}
# member(1, [1, 2.0, a, "b", X, true])
query(call("member", 1L, list(1L, 2.0, quote(a), "b", expression(X), TRUE)))

# returns an empty list, stating that member(1, [1 | _]) is satisfied
submit()

# returns a list with constraints, stating that the query is also satisfied 
# if the fifth element of the list, X, is 1
submit()

# close the query
clear()
```

Listing\ 1.
: A query to Prolog's `member/2` predicate.

`query`. The function `query(call, options)` is used to create a Prolog
query (without invoking it yet). The first argument is a regular R call that is
created using R's function `call(name, ...)`. This call represents the Prolog
query that will be submitted in the later course. The creation of such
predicates and Prolog terms is described below and can become quite
contrived (see the examples in Section\ 4). The second argument, `options`, may
be used for ad hoc modifications of the translation between R and Prolog, see
the section below. The function returns `TRUE` on success. Note that `query`
does not check if a Prolog predicate corresponding to `call` actually
exists (see `submit()` below). Only a single query can be opened at a given
time. If a new query _Q_ is created while another query _R_ is still open, a
warning is shown and _R_ is closed.

`submit`. Once a query has been created, it can be submitted using `submit()`.
If the query fails, the return value is `FALSE`. If the query succeeds, a list
of constraints is returned, with bindings for the variables that satisfy the
query. Repeated calls to submit are possible, returning the different solutions
of a query (until it eventually fails). The distinction between the different
types of return values for success and failure (list vs. `FALSE`) is facilitated
by the R function `isFALSE(x)`.

`clear`. Closes the query. The name of the function was chosen to avoid name
clashes with R's own built-in function `close`. The function returns an
invisible `TRUE`, even if there is no open query.

Three more functions `consult`, `once`, and `findall` are provided for
convenience.

`consult`. In most applications, a number of Prolog facts and rules will be
loaded into the system. To facilitate this recurrent task, the Prolog
directive `consult/1` has been mirrored into R, `consult(filename)`, with
`filename` being a string or a vector of strings if multiple files are to be
consulted. The function returns `TRUE` on success; in case of problems, it
returns `FALSE` and an error message is shown.

`once` and `findall`. The function `once(call, options)` is a convenience
function that acts as a shortcut for `query(call, options)`, `submit()`, and
`clear()`. Similarly, `findall(call, options)` abbreviates the 
commands `query(call, options)`, repetition of `submit()` until failure,
and `clear()`, returning a list collecting the return values of the individual
submissions.

## Creating Prolog terms in R

Table\ 1 summarizes the rules for the translation from R objects to Prolog. Most
rules work in both directions, but a few exceptions exist.

Table\ 1
: Creating Prolog terms from R

|R                       |Prolog                    |Note/Alternatives        |
|:-----------------------|:-------------------------|:------------------------|
|`expression(X)`         |Variable X                |not necessarily uppercase|
|`as.symbol(abc)`        |Atom abc                  |`as.name`, `quote`       |
|`TRUE`, `FALSE`, `NULL` |Atoms true, false, null   |                         |
|`"abc"`                 |String \"abc\"            |                         |
|`3L`                    |Integer 3                 |                         |
|`3`                     |Float 3.0                 |                         |
|`call("term", 1L, 2L)`  |term(1, 2)                |                         |
|`list(1L, 2L, 3L)`      |List [1, 2, 3]            |                         |
|`list(a=1, b=2, c=3)`   |List [a-1, b-2, c-3]      |                         |
|`c(1, 2, 3, Inf)`       |##(1.0, 2.0, 3.0, 1.0Inf) |vectors of length > 1    |
|`c(1L, 2L, 3L)` or `1:3`|\'%%\'(1, 2, 3)           |                         |
|`c("a", "b", "c")`      |\$\$(\"a\", \"b\", \"c\") |                         |
|`c(TRUE, FALSE, NA)`    |!!(true, false, na)       |                         |
|`sin`                   |function(x) :- sin(x)     |primitive function       |
|`function(x) sin(x)`    |function(x) :- sin(x)     |self-written function    |
|`matrix(1:4, nrow=2)`   |\'%%%\'(\'%%\'(1, 3), ...)|see also ###, \$\$\$, !!!|

In R, the basic elements such as integers, floating point numbers, character
strings, and logicals are vectorized, and scalar entities are treated like
vectors with one element. Conversely, Prolog does not natively support vectors
or matrices. The problem is solved in the following way: 

* R vectors of length 0 are translated to Prolog's empty list.
* R vectors of length 1 are translated to Prolog scalars.
* R vectors of length $N > 1$ are translated to Prolog
  terms `##/N`, `%%/N`, `$$/N`, and `!!/N` for floating point numbers, integers,
  strings and logicals, respectively.
* R matrices are translated to Prolog
  terms `###/R`, `%%%/R`, `$$$/R`, and `!!!/R` with the respective row vectors
  as arguments.

In the reverse direction, Prolog terms like `##/N` are translated back to R 
vectors of length _N_, including the terms `##/0` and `##/1` that map to R
vectors of length 0 and 1, respectively. Translation of a polymorphic Prolog
term such as `##(a, 1.5)` to R will fail, since `rolog` expects the arguments to be numeric.

If a Prolog object cannot be translated to R (e.g., a cyclic term), an error is
raised. If an R object that lacks a suitable representation in
Prolog (e.g., S4 class), a warning is printed and the result is unified
with `na`.

To summarize, the rules for translation are not fully symmetrical. A quick check
for symmetry of the representation is obtained by a query to `=/2` or
even `r_eval/2` (see also below, subsection Prolog interface):

```{r}
Q <- call("=", expression(X), c(1, 2, NA, NaN, Inf))
once(Q, options=list(portray=TRUE))

Q <- call("r_eval", c(1, 2, NA, NaN, Inf), expression(X))
once(Q)
```

The optional argument `env` to query, once and findall allows to raise the
query (and, as a consequence, r_eval/1,2 in a specific environment.

## Package options

A few package-specific options have been defined to allow some fine-tuning of
the rules for translation between R and Prolog.

* *realvec* (string): Name of the Prolog term for vectors of
  floats (default is `##`)
* *realmat*: Name of the Prolog term for matrices of floats (default is `###`)
* *intvec*/*intmat*: same for vectors/matrices of
  integers (defaults are `%%`/`%%%`)
* *boolvec*/*boolmat*: same for vectors/matrices of 
  logicals (defaults are `!!`/`!!!`)
* *charvec*/*charmat* (string): same for vectors/matrices of
  character strings (defaults are `$$`/`$$$`). The single dollar cannot be
  used because it is the list operator in R.
* *scalar* (logical): if `TRUE` (default), R vectors of length\ 1 are translated
  to scalars in Prolog. If `FALSE` (rarely used), R\ vectors are always
  translated to `##/N`, or `%%/N`, `!!/N`, `$$/N`, even if they have only one
  element.
* *portray* (logical): if `TRUE` (default in `query`), the result
  of `query`, `once` and `findall` includes an attribute with a text
  representation of the query in Prolog.
* *preproc* (function with one argument): R hook that can be used
  to preprocess R terms before translation. The default is `rolog`'s own
  `preproc` function that maps R's `x <= y` to Prolog's `x =< y` and `!=` to
  `\=`. Preprocessing can be turned off by assigning the R
  function `dontCheck` to the preproc option.
* *postproc* (function with one argument): R hook that can be used
  to postprocess R terms after a query. The default is `rolog`'s own
  `postproc` function that reverses the mapping from `preproc`.

The command `rolog_options()` returns a list with all the options. The 
options can be globally modified with `options()` or in the optional
argument of `query`, `once`, and `findall`.

```{r}
options(rolog.intvec="iv")
Q <- call("member", expression(X), list(c(1L, 2L), c(3.5, 4.5)))
query(Q, options=list(realvec="rv"))
submit()
clear()
```

## Prolog interface to R

`rolog` offers some basic support to call R from Prolog, that is, connecting the
two systems in the reverse direction. Two predicates can be used for this
purpose, `r_eval(Call)` and `r_eval(Function, Result)`. The former just invokes
R with the command `Call` (ignoring the result); the latter evaluates `Function` 
and unifies the result with `Result`. Note that proper quoting of R functions is
needed at the Prolog end, especially with R functions that start with uppercase
letters and/or contain a dot in their name (see Section\ 4).

## Exceptions

Package `rolog` has limited support for exception handling. If Prolog raises an
exception, the error string is forwarded to R using the `stop` function.
The examples below illustrate this by querying an undefined Prolog predicate.

```{r}
Q <- call("membr", expression(X), list(1, 2, 3))
query(Q)
try(submit())
clear()
```

See Section\ 4 for another example with an error resulting from a malformed
query to `r_eval/2`.

# 3. Extending the package

R is a functional language, whereas Prolog is declarative. Obviously, there
cannot be a perfect one-to-one correspondence between the syntactic components
of two programming languages that follow completely different paradigms. Whereas
symbols, functions, numbers and character strings are easily mapped between R
and Prolog, there are loose ends at both sides. The package is intentionally
kept minimalistic, but can easily be extended by convenience functions at both
ends, Prolog and R, to facilitate recurrent tasks and/or avoid cumbersome
syntax. 

In particular, Prolog variables are translated from and to
R *expressions* (not to be confused with R symbols), and R vectors of
length greater than 1 are translated to the Prolog terms `#/N`, `%/N`, `!/N`, 
and `$$/N`, as mentioned above. These rules are, in principle, arbitrary
and can be intercepted at several stages.

* R functions that may be used to pre-process specific R elements before
  translation to Prolog (see, e.g., the R function `as.rolog`)
* Prolog wrappers that manipulate the term before it is called
  and afterwards (see the example with dicts below)
* R functions that post-process the result of a query

The process is illustrated in Figure\ 1.

```{r, echo=FALSE, fig.width=6, fig.height=2}
HTML(export_svg(grViz(
  'digraph G
   {
     rankdir=LR
     Query Result

     subgraph cluster_0 
     {
       style=filled
       color=lightgrey
       node [style=filled,color=white]
       r2rolog -> forth -> rolog_pl
     }

     subgraph cluster_1 
     {
       style=filled
       color=lightgrey
       node [style=filled,color=white]
       rolog2r -> back [dir=back]
       back -> pl_rolog [dir=back]
     }
  
     Query -> r2rolog
     rolog_pl:e -> Prolog
     pl_rolog:e -> Prolog [dir=back]
     Result -> rolog2r [dir=back]

     Query [shape=Mdiamond;width=0.7;height=0.7]
     r2rolog [shape=rect,label="preproc"]
     forth [label="(rolog)"]
     rolog_pl [shape=rect,label="preproc/2"]
     Prolog [shape=Mcircle]
     pl_rolog [shape=rect,label="postproc/2"]
     rolog2r [shape=rect,label="postproc"]
     back [label="(rolog)"]
     Result [shape=Msquare]
   }')))
```

Figure\ 1
: Workflow in rolog

## Preprocessing in R

`rolog` uses a default preprocessing function `preproc(query)` to map the R
operators `<=` and `!=` to their Prolog counterparts `=</2` and `\=/2`,
respectively.

However, we have seen above that raising even simple everyday Prolog queries
such as `member(X, [1, 2, 3, a, b])` require complicated R expressions
like `call("member", expression(X), list(1, 2, 3, quote(a), quote(b)))`. The R
function `as.rolog(query)` is meant to simplify this a bit by translating
symbols starting with a dot to Prolog variables, and calls
like `""[1, 2, 3, a, b]` to lists. In the example below, `as.rolog` is added to
the queue of preprocessing functions.

```{r}
a <- 5
Q <- quote(member(.X, ""[1, 2, 3, a, (a), 1 <= 2]))
once(Q, options=list(preproc=list(as.rolog, preproc), portray=TRUE))
```

Note that the name of the variable will still be `X` in the later course, 
not "dot-X". As illustrated by the example above, `as.rolog` treats the
argument `a` as a symbol; to evaluate the respective variable (i.e., "unquote"),
it can be put in parentheses.

Preprocessing can be turned off by setting the option `preproc` to the identity
function `dontCheck`.

Section\ 3 includes an example for mathematical rendering of R expressions. In 
that example, a preprocessing function is used to bring function calls with
named arguments to a canonical form which is then handled in Prolog. More
sophisticated work with quasi-quotations and unquoting expressions is described
in "Advanced R" [@Wickham2019].

## Postprocessing in R

In most cases, postprocessing will revert the manipulations during
preprocessing, and the default function `postproc(query)` actually translates
the Prolog operators `=<` and `\=` back to their respective counterparts in R.

Many Prolog programmers are used to operate with atoms, whereas character
strings are the preferred representation of symbolic information in R. In the
example below, a second hook is put in the queue that converts the result of a
query like `member(X, [a, b, c])` to strings.

```{r}
stringify <- function(x)
{
  if(is.symbol(x))
    return(as.character(x))

  if(is.call(x))
    x[-1] <- lapply(x[-1], FUN=stringify)

  if(is.list(x))
    x <- lapply(x, FUN=stringify)

  if(is.function(x))
    body(x) <- stringify(body(x))

  return(x)
}

Q <- quote(member(.X, ""[a, b, c]))
R <- findall(Q, options=list(preproc=list(as.rolog, preproc), 
       postproc=list(stringify, postproc)))
unlist(R)
```

In other words, the query is satisfied if `X` is either "a", or "b", or "c".

## Pre- and postprocessing in Prolog

Recent versions of SWI-Prolog support so-called dictionaries of the
form `Tag{Key1:Value1, Key2:Value2, ...}`. The tag is typically an
atom (but can be a variable, as well), the keys are unique atom or integers; the
values can be anything. Suppose we have a Prolog predicate that does something
with dicts, and we would like to query it from R. The simplest solution is a
wrapper in Prolog that translates *key*-*value*
pairs `[Key1-Value1, Key2-Value2, ...]` back and forth to dicts:

```prolog
do_something_with_pairs(Pairs0, Pairs1) :-
    dict_pairs(Dict0, my_dict, Pairs0),
    do_something_with_dicts(Dict0, Dict1),
    dict_pairs(Dict1, my_dict, Pairs1).
```

`do_something_with_pairs/2` can then be queried from R using, for example, lists
with named elements (see Table\ 1).

```r
once(call("do_something_with_pairs", list(a=1, b=2), expression(X)))
```

In the code above, `dict_pairs/2` takes the role of both `preproc/2`
and `postproc/2` in Figure\ 1. It illustrates that complicated syntax on the R
side can be much simplified when doing the conversion at the Prolog end. Ways to
extend Prolog by add-ons ("packs") are shown in the next section.

# 4. Examples and use cases

In this section we present a few usage examples for package `rolog` in
increasing complexity. Although the code snippets are mostly self-explanatory,
some familiarity with the Prolog language is helpful.

## Hello, world

Prolog's typical _hello world_ example is a search through a directed acyclic
graph (DAG), for example, a family tree like the one given in Listing\ 2.

```prolog
parent(pam, bob). parent(bob, ann). parent(bob, pat). parent(pat, jim).

ancestor(X, Z) :-
    parent(X, Z).

ancestor(X, Z) :-
    parent(X, Y),
    ancestor(Y, Z).
```

Listing\ 2
: A family tree in Prolog (see also family.pl)

Listing\ 2 is included in the package and is accessed using the
function `system.file`. Within Prolog, the normal workflow is to consult
the code with `[family]` and then to raise queries such
as `ancestor(X, jim)`, which returns, one by one, four solutions for the
variable *X*. In R, we obtain the following results:

```{r}
library(rolog)
consult(system.file(file.path("pl", "family.pl"), package="rolog"))
query(call("ancestor", expression(X), quote(jim)))
submit()        # solutions for X
submit()        # etc.
clear()         # close the query
```

As stated above, `consult` loads the facts and rules of Listing\ 2 into the
Prolog database. `query` initializes a query, and the subsequent calls
to `submit` return the conditions under which the query succeeds. In this
example, the query succeeds if `X` is either `pat`, `pam`, or `bob`. A query is
closed with `clear()`, or automatically if the query fails. If we are interested
in just the first solution, we can use `once(Call)` as a shortcut
to `query(Call)`, then `submit()`, then `clear()`. If we want to collect all
solutions of a query with a finite set of solutions, we can use `findall(Call)`.

As mentioned in Section\ 2, a simplified syntax is provided by `as.rolog`
that accepts quoted expressions with dots indicating Prolog variables:

```r
Q <- quote(ancestor(.X, jim))
findall(Q, options=list(preproc=as.rolog))
```

## Backdoor test

A useful application of DAGs is confounder adjustment in causal 
analysis [@greenland1999;@ggdag]. The Prolog file `backdoor.pl` is an
implementation of Greenland et al.'s criteria for the backdoor test 
for *d*-separation in DAGs, with a predicate `minimal/3` that searches for
minimally sufficient sets of variables for confounder adjustment on the causal
path between exposure and outcome. The nodes and arrows refer to Figure\ 12 in
Greenland et al.

```{r}
consult(system.file(file.path("pl", "backdoor.pl"), package="rolog"))

node <- function(N) invisible(once(call("assert", call("node", N))))
node("a"); node("b"); node("c"); node("f"); node("u")
node("e") # exposure
node("d") # outcome

arrow <- function(X, Y) invisible(once(call("assert", call("arrow", X, Y))))
arrow("a", "d"); arrow("a", "f"); arrow("b", "d"); arrow("b", "f")
arrow("c", "d"); arrow("c", "f"); arrow("e", "d"); arrow("f", "e")
arrow("u", "a"); arrow("u", "b"); arrow("u", "c")

R <- findall(call("minimal", "e", "d", expression(S)))
unlist(R)
```

The query to `minimal/3` returns two minimally sufficient sets of covariates for
confounder adjustment (namely, {a, b, c} and {f}).

## Definite clause grammars

One of the main driving forces of Prolog development was natural language 
processing [@Dahl1981]. Therefore, the next example is an illustration of 
sentence parsing using so-called definite clause grammars. As Listing\ 3 shows,
rolog can access modules from SWI's standard library (e.g., "dcg/basics.pl").

```prolog
:- use_module(library(dcg/basics)).

s(s(NP, VP)) --> np(NP, C), blank, vp(VP, C).
np(NP, C) --> pn(NP, C).
np(np(Det, N), C) --> det(Det, C), blank, n(N, C).
np(np(Det, N, PP), C) --> det(Det, C), blank, n(N, C), blank, pp(PP).
vp(vp(V, NP), C) --> v(V, C), blank, np(NP, _).
vp(vp(V, NP, PP), C) --> v(V, C), blank, np(NP, _), blank, pp(PP).
pp(pp(P, NP)) --> p(P), blank, np(NP, _).

det(det(a), sg) --> `a`.
det(det(the), _) --> `the`.
pn(pn(john), sg) --> `john`.
n(n(man), sg) --> `man`.
n(n(men), pl) --> `men`.
n(n(telescope), sg) --> `telescope`.
v(v(sees), sg) --> `sees`.
v(v(see), pl) --> `see`.
p(p(with)) --> `with`.

% Translate R string to code points and invoke phrase/2
sentence(Tree, Sentence) :-
    string_codes(Sentence, Codes),
    phrase(s(Tree), Codes).
```

Listing\ 3
: Simple grammar and lexicon. `sentence/2` preprocesses the R call.

As in the first example, we first consult a little Prolog program with a
minimalistic grammar and lexicon (Listing\ 3, see also `pl/telescope.pl`), and
then raise a query asking for the syntactic structure
of "john sees a man with a telescope". Closer inspection of the two results
reveals the two possible
meanings, "john sees a man *who carries* a telescope"
versus "john sees a man *through* a telescope". Further Prolog examples of
natural language processing are found in \citet{Blackburn2005}, including the
resolution of anaphoric references and the extraction of semantic meaning.

```{r}
consult(system.file(file.path("pl", "telescope.pl"), package="rolog"))
Q <- quote(sentence(.Tree, "john sees a man with a telescope"))
unlist(findall(Q, options=list(preproc=as.rolog)))
```

## Installation of add-ons for Prolog

In description of the previous example, we noted in passing that `rolog` can
access the built-in libraries of SWI-Prolog (e.g., by calls to `use_module/1`).
It is also possible to extend the installation by add-ons, including add-ons
that require compilation, if the build tools (essentially, RTools under Windows,
and xcode under macOS) are properly configured. This is illustrated below by the
demo add-on `environ` [@Environ] that collects the current environment
variables.

```r
once(call("pack_install", quote(environ), list(quote(interactive(FALSE)))))
once(quote(use_module(library(environ))))
once(call("environ", expression(X)))
```

The query then unifies *X* with a list with `Key=Value` terms. The purpose of
this example is obviously not to mimic the built-in function `Sys.getenv()` from
R, but to illustrate the installation and usage of Prolog extensions from within
R. In most situations, the user would install the pack from within Prolog
with `pack_install(environ).`.

## Term manipulation

Prolog is homoiconic, that is, code is data. In this example, we make use of
Prolog's ability to match expressions against given patterns and modify these
expressions according to a few predefined "buggy rules" [@Brown1978], inspired
by recurrent mistakes in the statistics exams of our students. Consider
the $t$-statistic for comparing an observed group average to a population mean:

$$
T = \frac{\overline{X} - \mu}{s / \sqrt{N}}
$$

Some mistakes may occur in this calculation, for example, omission of the
implicit parentheses around the numerator and the denominator when typing the
numbers into a calculator, resulting 
in $\overline{X} - \frac{\mu}{s} \div \sqrt{N}$, or forgetting the square root
around $N$, or both. Prolog code for the two buggy rules is given in
Listing\ 4.

```prolog
% Correct steps and mistakes
expert(tratio(X, Mu, S, N), frac(X - Mu, S / sqrt(N))).
buggy(frac(X - Mu, S / SQRTN), X - frac(Mu, S) / SQRTN).
buggy(sqrt(N), N).

% Apply expert and buggy rules, or enter expressions
step(X, Y) :-
    expert(X, Y) ; buggy(X, Y).
step(X, Y) :-
    compound(X),
    mapargs(search, X, Y),
    dif(X, Y).

% Search through problem space
search(X, X).
search(X, Z) :-
    step(X, Y),
    search(Y, Z).
```

Listing 4
: Manipulating terms in Prolog

The little e-learning system shown in Listing\ 4 produces six response
alternatives. The fourth and the sixth result are combinations of the same two
buggy rules (parenthesis, then square root, and the other way round). Some
additional filters would be needed to eliminate trivial and redundant solutions 
\citep[see, e.g., the chapter on generate-and-test in][]{Sterling1994}.

```{r}
consult(system.file(file.path("pl", "buggy.pl"), package="rolog"))
Q <- quote(search(tratio(x, mu, s, n), .S))
unlist(findall(Q, options=list(preproc=as.rolog)))
```

An important feature of such a term manipulation is that the evaluation of the
term can be postponed; for example, there is no need to instantiate the
variables *x*, *mu*, *s*, and *n* with given values before raising a query. This
is especially helpful for variables that may represent larger sets of data in
later steps.

It should be mentioned that R is homoiconic, too, and the Prolog code above can,
in principle, be rewritten in R using non-standard evaluation
techniques [@Wickham2019]. Prolog's inbuilt pattern matching algorithm
simplifies things a lot, though. 

## Rendering mathematical expressions

The R extension of the markdown language [@Xie2020] enables reproducible
statistical reports with nice typesetting in HTML, Microsoft Word, and Latex.
However, so far, R expressions such as `pbinom(k, N, p)` are typeset as-is; 
prettier mathematical expressions such as $P_\mathrm{Bi}(X \le k; N, p)$ require
Latex commands like `P_\mathrm{Bi}\left(X \le k; N, p\right)`, which are
cumbersome to type in and hard to read even if the expressions are simple.
Since recently, manual pages include support for mathematical
expressions [@Sarkar2022], which already is a big improvement.

Below Prolog's grammar rules are used for an _automatic_ translation of R calls
to MathML. The result can then be used for calculations or it can be rendered on
a web page. A limited set of rules for translation from R to MathML is found
in `pl/mathml.pl` of package `rolog`. A more comprehensive translator is
provided by the R package `mathml` [@mathml]. The relevant code snippets are
shown in the listings below, along with their output.

```{r}
library(rolog)
consult(system.file(file.path("pl", "mathml.pl"), package="rolog"))

# R interface to Prolog predicate r2mathml/2
mathml <- function(term)
{
  t <- once(call("r2mathml", term, expression(X)))
  cat(paste(t$X, collapse=""))
}
```

Listing 4
: Generate MathML from R expressions

The first example is easy. At the Prolog end, there is a handler for `pbinom/3`
that translates the term into a pretty MathML syntax like P_bi(X <= k; N, pi).

```{r, results="asis"}
term <- quote(pbinom(k, N, p))

# Pretty print
mathml(term)

# Do some calculations with the same term
k <- 10
N <- 22
p <- 0.4
eval(term)
```

The next example is interesting because Prolog needs to find out the name of
the integration variable for `sin`. For that purpose, rolog provides a
predicate `r_eval/2` that calls R from Prolog (i.e., the reverse direction, see
also next example). Here, the predicate is used for the 
R\ function `formalArgs(args(sin))`, which returns the name of the function
argument of `sin`, that is, `x`.

```{r, results="asis"}
term <- quote(integrate(sin, 0L, 2L*pi))
mathml(term)
eval(term)
```

Note that the Prolog end, the handler for `integrate/3` is rather rigid; it
accepts only these three arguments in that particular order, and without names,
that is, `integrate(sin, lower=0L, upper=2L * pi)` would not print the desired
result.

The extra R function `canonical()` applies `match.call()` to non-primitive R
calls, basically cleaning up the arguments and bringing them into the correct
order. Moreover, an extra handler maps the extractor function `$(Fn, "value")`
to `Fn`.

```{r, results='asis'}
canonical <- function(term)
{
  if(is.call(term))
  {
    f <- match.fun(term[[1]])
    if(!is.primitive(f))
      term <- match.call(f, term)
    
    # Recurse into arguments
    term[-1] <- lapply(term[-1], canonical)
  }

  return(term)
}

g <- function(u)
  sin(u)

# Mixture of (partially) named and positional arguments in unusual order
term <- quote(2L * integrate(low=-Inf, up=Inf, g)$value)
mathml(canonical(term))

# It is a bit of a mystery that R knows the result of this integral.
eval(term)
```

Note that both `sin` nor `g` in the above terms are R symbols, not R functions.
In order to render something like `call("integrate", low=-Inf, up=Inf, g)`,
or `call("integrate", low=-Inf, up=Inf, sin)`, with `g` and `sin` referring to
the respective functions, one would need to determine its name, which is not
possible in general.

```{r}
print(g)
```

## Calling R from Prolog

The basic workflow of the bridge from R to Prolog is to (A)\ translate an
R\ expression into a Prolog term (i.e., a predicate), (B)\ query the predicate,
and then, (C)\ translate the result (i.e., the bindings of the variables) back
to R (see also Figure\ 1). The reverse direction is straightforward, we start by
translating a Prolog term to an R\ expression (i.e. Step\ C), evaluate the
R\ expression, and then translate the result back to a Prolog\ term (Step\ A).
Package `rolog` provides two predicates for that purpose, `r_eval(Expr)`
and `r_eval(Expr, Res)`. The former is used to invoke an R\ expression `Expr`
for its side effects (e.g., initializing a random number generator); it does not
return a result. The latter is used to evaluate the R\ expression and return the
result `Res`. The code snippet in Listing\ 6 (`r_eval.pl`) illustrates this
behavior.

```prolog
r_seed(Seed) :-
    r_eval('set.seed'(Seed)).

r_norm(N, L) :-
    r_eval(rnorm(N), L).
```

Listing\ 6
: Calling R from Prolog using `r_eval/1` and `r_eval/2`. The R\ call `set.seed` is quoted because the dot is an operator in Prolog.

```{r}
consult(system.file(file.path("pl", "r_eval.pl"), package="rolog"))
invisible(once(call("r_seed", 123L)))
once(call("r_norm", 3L, expression(X)))
```

The example in Listing\ 6 is a bit trivial, basically illustrating the syntax
and the workflow. More serious applications of are shown in the next two
sections where `r_eval/2` is used to evaluate monotonically behaving
R\ functions and to obtain the names of function arguments in R.

As show below, the default environment of `rolog`'s `r_eval/2` is `.GlobalEnv`,
this can be changed in an optional argument to `once()`, `findall()`,
and `query()`.

```{r}
# Set variable in R, read in Prolog
env <- new.env()
with(env, a <- 1)
once(call("r_eval", quote(a), expression(X)), env=env)

# Set R variable in Prolog, read in R
invisible(once(call("r_eval", call("<-", quote(b), 2))))
cat("b =", b)
```

If the R call raises an exception, an error is propagated to Prolog
and finally to the `rolog` package:

```{r}
#try(once(quote(r_eval(rnorm(-1))))) # return "-1" random normals
```

## Interval arithmetic

Let $\langle\ell, u\rangle$ denote a number between $\ell$ and $u$, $\ell\le u$.
It is easily verified that the result of the
difference $\langle\ell_1, u_1\rangle - \langle\ell_2, u_2\rangle$ is somewhere
in the interval $\langle \ell_1 - u_2, u_1 - \ell_2\rangle$, and a number of
rules exist for basic arithmetic operations and (piecewise) monotonically
behaving functions [@Hickey2001]. For ratios, denominators with mixed sign yield
two possible intervals, for example,
$\langle 1, 2\rangle / \langle -3, 3\rangle = \langle -\infty, 3\rangle \cup \langle 3, \infty\rangle$,
as shown in Figure 4 in Hickey et al.'s article. The number of possible
candidates increases if more complicated functions are involved, as unions of
intervals themselves appear as arguments (e.g., if $I_1 \cup I_2$ is added
to $I_3 \cup I_4$, the result
is $I_1 + I_3 \cup I_1 + I_4 \cup I_2 + I_3 \cup I_2 + I_4$). As a consequence,
calculations in interval arithmetic are non-deterministic in nature, and the
number of possible results is not foreseeable and cannot, in general, be
vectorized as is often done in R. Use cases for interval arithmetic are the
limitations of floating-point representations in computer hardware, but
intervals can also be used to represent the result of measurements with limited
precision, or truncated intermediate results of students doing hand
calculations. A few rules for basic interval arithmetic are found
in `pl/interval.pl`; a few examples are shown below. Again, Prolog rings back to
R via `r_eval/2` to determine the result of `dbinom(X, Size, Prob, Log)`.

```{r}
#consult(system.file(file.path("pl", "interval.pl"), package="rolog"))

#Q <- quote(int(`...`(1, 2) / `...`(-3, 3), .Res))
#unlist(findall(Q, options=list(preproc=as.rolog)))

#D  <- quote(`...`(5.7, 5.8))
#mu <- 4
#s  <- quote(`...`(3.8, 3.9))
#N  <- 24L
#tratio <- call("/", call("-", D, mu), call("/", s, call("sqrt", N)))
#once(call("int", tratio, expression(Res)))

# Binomial density
#prob = quote(`...`(0.2, 0.3))
#once(call("int", call("dbinom", 4L, 10L, prob, FALSE), expression(Res)))
```

The slightly cumbersome syntax for entering an interval $\langle \ell, u\rangle$
is due to the fact that the ellipsis is a reserved symbol in R and cannot be
used as an infix operator. A powerful and comprehensive system for constraint
logic programming over intervals is available as a Prolog pack [@Workman2021]
and can easily be connected to R using, for example, the present package.

# 5. Conclusions

R has become the primary language for statistical programming and data science,
but is currently lacking support for traditional, symbolic artificial
intelligence. There are already two add-ons for SWI-Prolog that allow to run R
calculations from Prolog [@Angelopoulos2013;@Rserve], but a connection in the
other direction was missing, so far. `rolog` bridges this gap by providing an
interface to a SWI-Prolog distribution in an R package. The communication
between the two systems is mainly in the form of queries from R to Prolog, but
two predicates allow Prolog to ring back and evaluate terms in R. The design of
the package is minimalistic, providing three main
functions `query()`, `submit()`, and `clear()`, and a very limited set of
convenience tools (`consult()`, `once()`, and `findall()`) to facilitate
recurrent everyday actions. As both systems are homoiconic in nature, it was
easy to establish a one-to-one correspondence between many of the elements of
the two languages. Most exceptions (e.g., lack of R support for empty symbols)
can be avoided and/or circumvented by wrapper functions at both ends.

Simple ways to extend the package have been described in Section\ 2; such
extensions could, for example, include R objects and structures like those
returned by `lm()`, or S4 classes. In many use cases, this may be realized by
transforming the R object to a list with named elements, and rebuild the object
on the Prolog end on an as-needed basis. After a query, the process is reversed.
If speed is an issue, more of these steps can, in principle, be moved into the
package and implemented in `Rcpp`.

`rolog`, thus, opens up a wide of applications in logic programming for
statisticians and researchers at the intersection of symbolic and connectionist
artificial intelligence, where concise knowledge representation is combined with
statistical power. Moreover, `rolog` provides starting points for useful
small-scale solutions for everyday issues in data science (term transformations,
pretty mathematical output, interval arithmetic, see Section\ 3).

At its present stage, a major limitation of `rolog` is its relatively slow
speed. For example, translation of R lists or vectors to the respective elements
of the Prolog language (also lists, `#/N`) is done element-wise, in both
directions. The translation is optimized by using `Rcpp` [@Edelbuettel2018], but
there remains an upper bound in the efficiency, because Prolog does not support
vectors or matrices. Since Prolog's primary purpose is not vector or matrix
calculation, this limitation may not show up in real-world applications. Another
issue, maybe a bit annoying, is the rather cumbersome syntax of the interface,
with the need for quoted calls and R expressions for representing Prolog
variables. `rolog` was deliberately chosen to be minimalistic and, so far, only
depends on base R. A more concise representation might be obtained by tools from
the "Tidyverse" ecosystem, as described in Chapter\ 19 of
Advanced\ R [@Wickham2019]. Finally, at this stage, `rolog` is unable to deal
with cyclic
terms (e.g., `once(call("=", expression(A), call("f", expression(A))))`,
i.e., `A = f(A)` raises an error message).

`rolog` is available for R Version 4.2 and later, and can easily be installed
using the usual `install.packages("rolog")`. The source code of the package is
found at https://github.com/mgondan/rolog/, including installation instructions
for Unix, Windows and macOS.

# Acknowledgement

Development of the package profited substantially from the Prolog
packs `rserve_client` [@Rserve] and `real` [@Angelopoulos2013].

# Note

The results in this paper were obtained using
R\ `r paste(R.Version()[7:8], collapse = ".")` with
the `rolog`\ `r packageVersion("rolog")` package. R\ itself and all packages
used are available from the Comprehensive R Archive Network (CRAN) 
at https://CRAN.R-project.org/.

# References