The Ollama R library is the easiest way to integrate R with Ollama, which lets you run language models locally on your own machine. Main site: https://hauselin.github.io/ollama-r/
To use this R library, ensure the Ollama app is installed. Ollama can use GPUs for accelerating LLM inference. See Ollama GPU documentation for more information.
See Ollama’s GitHub page for more information. This library uses the Ollama REST API (see the API documentation for details).
Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
If you use this library, please cite this paper using the following BibTeX entry:
@article{Lin2024Aug,
author = {Lin, Hause and Safi, Tawab},
title = {{ollamar: An R package for running large language models}},
journal = {PsyArXiv},
year = {2024},
month = aug,
publisher = {OSF},
doi = {10.31234/osf.io/zsrg5},
url = {https://doi.org/10.31234/osf.io/zsrg5}
}
This library has been inspired by the official Ollama Python and Ollama JavaScript libraries. If you’re coming from Python or JavaScript, you should feel right at home. Alternatively, if you plan to use Ollama with Python or JavaScript, using this R library will help you understand the Python/JavaScript libraries as well.
Download and install Ollama.
Open/launch the Ollama app to start the local server.
Install either the stable or latest/development version of ollamar.

Stable version:

install.packages("ollamar")
For the latest/development version with more features/bug fixes (see latest changes here), you can install it from GitHub using the install_github function from the remotes library. If it doesn’t work or you don’t have the remotes library, run install.packages("remotes") in R or RStudio before running the code below.

# install.packages("remotes") # run this line if you don't have the remotes library
remotes::install_github("hauselin/ollamar")
ollamar uses the httr2 library to make HTTP requests to the Ollama server, so many functions in this library return an httr2_response object by default. If the response object says Status: 200 OK, the request was successful. See the Notes section below for more information.
library(ollamar)
test_connection() # test connection to Ollama server
# if you see Ollama local server running, it's working
# generate a response/text based on a prompt; returns an httr2 response by default
resp <- generate("llama3.1", "tell me a 5-word story")
resp
#' interpret httr2 response object
#' <httr2_response>
#' POST http://127.0.0.1:11434/api/generate # endpoint
#' Status: 200 OK # if successful, status code should be 200 OK
#' Content-Type: application/json
#' Body: In memory (414 bytes)
# get just the text from the response object
resp_process(resp, "text")
# get the text as a tibble dataframe
resp_process(resp, "df")
# alternatively, specify the output type when calling the function initially
txt <- generate("llama3.1", "tell me a 5-word story", output = "text")
# list available models (models you've pulled/downloaded)
list_models()
  name            size   parameter_size quantization_level modified
1 codegemma:7b    5 GB   9B             Q4_0               2024-07-27T23:44:10
2 llama3.1:latest 4.7 GB 8.0B           Q4_0               2024-07-31T07:44:33
Download a model from the ollama library (see API doc). For the list of models you can pull/download, see Ollama library.
pull("llama3.1") # download a model (the equivalent bash code: ollama pull llama3.1)
list_models() # verify you've pulled/downloaded the model
Delete a model and its data (see API doc). You can see what models you’ve downloaded with list_models(). To delete a model, specify its name.
list_models() # see the models you've pulled/downloaded
delete("all-minilm:latest") # returns a httr2 response object
Generate a response for a given prompt (see API doc).
resp <- generate("llama3.1", "Tomorrow is a...") # return httr2 response object by default
resp
resp_process(resp, "text") # process the response to return text/vector output
generate("llama3.1", "Tomorrow is a...", output = "text") # directly return text/vector output
generate("llama3.1", "Tomorrow is a...", stream = TRUE) # return httr2 response object and stream output
generate("llama3.1", "Tomorrow is a...", output = "df", stream = TRUE)
# image prompt
# use a vision/multi-modal model
generate("benzie/llava-phi-3", "What is in the image?", images = "image.png", output = 'text')
Generate the next message in a chat (see API doc). See the Notes section for details on how chat messages and chat history are represented/formatted.
messages <- create_message("what is the capital of australia") # default role is user
resp <- chat("llama3.1", messages) # default returns httr2 response object
resp # <httr2_response>
resp_process(resp, "text") # process the response to return text/vector output
# specify output type when calling the function
chat("llama3.1", messages, output = "text") # text vector
chat("llama3.1", messages, output = "df") # data frame/tibble
chat("llama3.1", messages, output = "jsonlist") # list
chat("llama3.1", messages, output = "raw") # raw string
chat("llama3.1", messages, stream = TRUE) # stream output and return httr2 response object
# create chat history
messages <- create_messages(
  create_message("end all your sentences with !!!", role = "system"),
  create_message("Hello!"), # default role is user
  create_message("Hi, how can I help you?!!!", role = "assistant"),
  create_message("What is the capital of Australia?"),
  create_message("Canberra!!!", role = "assistant"),
  create_message("what is your name?")
)
cat(chat("llama3.1", messages, output = "text")) # print the formatted output
# image prompt
messages <- create_message("What is in the image?", images = "image.png")
# use a vision/multi-modal model
chat("benzie/llava-phi-3", messages, output = "text")
messages <- create_message("Tell me a 1-paragraph story.")
# use "llama3.1" model, provide list of messages, return text/vector output, and stream the output
chat("llama3.1", messages, output = "text", stream = TRUE)
# chat(model = "llama3.1", messages = messages, output = "text", stream = TRUE) # same as above
Get the vector embedding of some prompt/text (see API doc). By default, the embeddings are normalized to length 1, so the cosine similarity of two embeddings can be computed as a simple dot product:
embed("llama3.1", "Hello, how are you?")
# don't normalize embeddings
embed("llama3.1", "Hello, how are you?", normalize = FALSE)
# get embeddings for similar prompts
e1 <- embed("llama3.1", "Hello, how are you?")
e2 <- embed("llama3.1", "Hi, how are you?")
# compute cosine similarity
sum(e1 * e2) # not equal to 1
sum(e1 * e1) # 1 (identical vectors/embeddings)
# non-normalized embeddings
e3 <- embed("llama3.1", "Hello, how are you?", normalize = FALSE)
e4 <- embed("llama3.1", "Hi, how are you?", normalize = FALSE)
If you don’t have the Ollama app running, you’ll get an error. Make sure to open the Ollama app before using this library.
test_connection()
# Ollama local server not running or wrong server.
# Error in `httr2::req_perform()` at ollamar/R/test_connection.R:18:9:
Parse httr2_response objects with resp_process()
ollamar uses the httr2 library to make HTTP requests to the Ollama server, so many functions in this library return an httr2_response object by default. You can either parse the output with resp_process() or use the output parameter in the function to specify the output format. Generally, the output parameter can be one of "df", "jsonlist", "raw", "resp", or "text".
resp <- list_models(output = "resp") # returns a httr2 response object
resp # <httr2_response>
# GET http://127.0.0.1:11434/api/tags
# Status: 200 OK
# Content-Type: application/json
# Body: In memory (5401 bytes)
# process the httr2 response object with the resp_process() function
resp_process(resp, "df")
# or list_models(output = "df")
resp_process(resp, "jsonlist") # list
# or list_models(output = "jsonlist")
resp_process(resp, "raw") # raw string
# or list_models(output = "raw")
resp_process(resp, "resp") # returns the input httr2 response object
# or list_models() or list_models("resp")
resp_process(resp, "text") # text vector
# or list_models("text")
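Beyond printing the response object, you can check the status programmatically with httr2's response helpers; a small sketch (assumes the Ollama app/server is running and uses httr2's resp_status() and resp_status_desc()):

```r
library(ollamar)
library(httr2)

resp <- list_models(output = "resp") # httr2_response object
resp_status(resp)      # 200 if the request succeeded
resp_status_desc(resp) # "OK"
```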
chat() function

Internally, messages are represented as a list of many distinct list messages. Each list/message object has two elements: role (can be "user", "assistant", or "system") and content (the message text). The example below shows how the messages/lists are represented.
list( # main list containing all the messages
list(role = "user", content = "Hello!"), # first message as a list
list(role = "assistant", content = "Hi! How are you?") # second message as a list
)
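Because each message is just a named list, you can inspect a chat history with standard R subsetting; a minimal sketch (no Ollama server needed):

```r
messages <- list( # main list containing all the messages
  list(role = "user", content = "Hello!"),
  list(role = "assistant", content = "Hi! How are you?")
)

messages[[1]]$role     # "user"
messages[[2]]$content  # "Hi! How are you?"
length(messages)       # 2 messages in the chat history
```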
To simplify the process of creating and managing messages, ollamar provides utility/helper functions to format and prepare messages for the chat() function.

create_messages(): create messages to build a chat history
create_message(): creates a chat history with a single message
append_message(): adds a new message to the end of the existing messages
prepend_message(): adds a new message to the beginning of the existing messages
insert_message(): inserts a new message at a specific index in the existing messages
delete_message(): deletes a message at a specific index in the existing messages
# create a chat history with one message
messages <- create_message(content = "Hi! How are you? (1ST MESSAGE)", role = "assistant")
# or simply, messages <- create_message("Hi! How are you?", "assistant")
messages[[1]] # get 1st message

# append (add to the end) a new message to the existing messages
messages <- append_message("I'm good. How are you? (2ND MESSAGE)", "user", messages)
messages[[1]] # get 1st message
messages[[2]] # get 2nd message (newly added message)

# prepend (add to the beginning) a new message to the existing messages
messages <- prepend_message("I'm good. How are you? (0TH MESSAGE)", "user", messages)
messages[[1]] # get 0th message (newly added message)
messages[[2]] # get 1st message
messages[[3]] # get 2nd message

# insert a new message at a specific index/position (2nd position in the example below)
# by default, the message is inserted at the end of the existing messages (position -1 is the end/default)
messages <- insert_message("I'm good. How are you? (BETWEEN 0 and 1 MESSAGE)", "user", messages, 2)
messages[[1]] # get 0th message
messages[[2]] # get between 0 and 1 message (newly added message)
messages[[3]] # get 1st message
messages[[4]] # get 2nd message

# delete a message at a specific index/position (2nd position in the example below)
messages <- delete_message(messages, 2)
# create a chat history with multiple messages
messages <- create_messages(
  create_message("You're a knowledgeable tour guide.", role = "system"),
  create_message("What is the capital of Australia?") # default role is user
)
You can convert data.frame, tibble, or data.table objects to a list() of messages and vice versa with functions from base R or other popular libraries.
# create a list of messages
messages <- create_messages(
  create_message("You're a knowledgeable tour guide.", role = "system"),
  create_message("What is the capital of Australia?")
)

# convert to dataframe
df <- dplyr::bind_rows(messages) # with dplyr library
df <- data.table::rbindlist(messages) # with data.table library

# convert dataframe to list with apply, purrr functions
apply(df, 1, as.list) # convert each row to a list with base R apply
purrr::transpose(df) # with purrr library
For the generate() and chat() endpoints/functions, you can specify output = 'req' in the function so the functions return httr2_request objects instead of httr2_response objects.
prompt <- "Tell me a 10-word story"
req <- generate("llama3.1", prompt, output = "req") # returns a httr2_request object
req # <httr2_request>
# POST http://127.0.0.1:11434/api/generate
# Headers:
# • content_type: 'application/json'
# • accept: 'application/json'
# • user_agent: 'ollama-r/1.1.1 (aarch64-apple-darwin20) R/4.4.0'
# Body: json encoded data
When you have multiple httr2_request objects in a list, you can make parallel requests with the req_perform_parallel function from the httr2 library. See the httr2 documentation for details.
library(httr2)
prompt <- "Tell me a 5-word story"

# create 5 httr2_request objects that generate a response to the same prompt
reqs <- lapply(1:5, function(r) generate("llama3.1", prompt, output = "req"))

# make parallel requests and get responses
resps <- req_perform_parallel(reqs) # list of httr2_response objects
# process the responses
sapply(resps, resp_process, "text") # get responses as text
# [1] "She found him in Paris." "She found the key upstairs."
# [3] "She found her long-lost sister." "She found love on Mars."
# [5] "She found the diamond ring."
Example sentiment analysis with parallel requests with the generate() function
library(httr2)
library(glue)
library(dplyr)
# text to classify
texts <- c('I love this product', 'I hate this product', 'I am neutral about this product')

# create httr2_request objects for each text, using the same system prompt
reqs <- lapply(texts, function(text) {
  prompt <- glue("Your only task/role is to evaluate the sentiment of product reviews, and your response should be one of the following: 'positive', 'negative', or 'other'. Product review: {text}")
  generate("llama3.1", prompt, output = "req")
})

# make parallel requests and get responses
resps <- req_perform_parallel(reqs) # list of httr2_response objects
# process the responses
sapply(resps, resp_process, "text") # get responses as text
# [1] "Positive" "Negative."
# [3] "'neutral' translates to... 'other'."
Example sentiment analysis with parallel requests with the chat() function
library(httr2)
library(dplyr)
# text to classify
texts <- c('I love this product', 'I hate this product', 'I am neutral about this product')

# create system prompt
chat_history <- create_message("Your only task/role is to evaluate the sentiment of product reviews provided by the user. Your response should simply be 'positive', 'negative', or 'other'.", "system")

# create httr2_request objects for each text, using the same system prompt
reqs <- lapply(texts, function(text) {
  messages <- append_message(text, "user", chat_history)
  chat("llama3.1", messages, output = "req")
})

# make parallel requests and get responses
resps <- req_perform_parallel(reqs) # list of httr2_response objects
# process the responses
bind_rows(lapply(resps, resp_process, "df")) # get responses as dataframes
# # A tibble: 3 × 4
# model role content created_at
# <chr> <chr> <chr> <chr>
# 1 llama3.1 assistant Positive 2024-08-05T17:54:27.758618Z
# 2 llama3.1 assistant negative 2024-08-05T17:54:27.657525Z
# 3 llama3.1 assistant other 2024-08-05T17:54:27.657067Z