An interface to Azure Cosmos DB, a NoSQL database service from Microsoft.
Azure Cosmos DB is a fully managed NoSQL database for modern app development. Single-digit millisecond response times, and automatic and instant scalability, guarantee speed at any scale. Business continuity is assured with SLA-backed availability and enterprise-grade security. App development is faster and more productive thanks to turnkey multi-region data distribution anywhere in the world, and open source APIs and SDKs for popular languages. As a fully managed service, Azure Cosmos DB takes database administration off your hands with automatic management, updates and patching. It also handles capacity management with cost-effective serverless and automatic scaling options that respond to application needs to match capacity with demand.
On the Resource Manager side, AzureCosmosR extends the AzureRMR class framework to allow creating and managing Cosmos DB accounts. On the client side, it provides a comprehensive interface to the Cosmos DB SQL/core API as well as bridges to the MongoDB and table storage APIs.
The primary repo for this package is at
https://github.com/Azure/AzureCosmosR; please submit issues and PRs
there. It is also mirrored at the Cloudyr org at
https://github.com/cloudyr/AzureCosmosR. You can install the development
version of the package with devtools::install_github("Azure/AzureCosmosR").
AzureCosmosR provides a suite of methods to work with databases, containers (tables) and documents (rows) using the SQL API.
library(dplyr)
library(AzureCosmosR)
endp <- cosmos_endpoint("https://myaccount.documents.azure.com:443/", key="mykey")
list_cosmos_databases(endp)
db <- get_cosmos_database(endp, "mydatabase")
# create a new container and upload the Star Wars dataset from dplyr
cont <- create_cosmos_container(db, "mycontainer", partition_key="sex")
bulk_import(cont, starwars)
query_documents(cont, "select * from mycontainer")
# an array select: all characters who appear in ANH
query_documents(cont,
"select c.name
from mycontainer c
where array_contains(c.films, 'A New Hope')")
You can easily create and execute stored procedures and user-defined functions:
proc <- create_stored_procedure(
    cont,
    "helloworld",
    'function () {
        var context = getContext();
        var response = context.getResponse();
        response.setBody("Hello, World");
    }'
)
exec_stored_procedure(proc)
create_udf(cont, "times2", "function(x) { return 2*x; }")
query_documents(cont, "select udf.times2(c.height) from cont c")
Aggregates take some extra work, as the Cosmos DB REST API only has
limited support for cross-partition queries. Set by_pkrange=TRUE in the
query_documents call,
which will run the query on each partition key range (pkrange) and
return a list of data frames. You can then process the list to obtain an
overall result.
# average height by gender, by pkrange
df_lst <- query_documents(
    cont,
    "select c.gender, count(1) n, avg(c.height) height
        from mycontainer c
        group by c.gender",
    by_pkrange=TRUE
)
# combine pkrange results
df_lst %>%
    bind_rows(.id="pkrange") %>%
    group_by(gender) %>%
    summarise(height=weighted.mean(height, n))
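To see why the combine step weights each partial average by its row count rather than taking a plain mean, here is a minimal offline sketch in base R. The two data frames and all their values are invented to stand in for per-pkrange results; only the column names (gender, n, height) follow the query above, and no Cosmos DB connection is involved.

```r
# Hypothetical per-pkrange results, shaped like the output of the grouped
# query above (all values are invented for illustration).
df_lst <- list(
    data.frame(gender=c("feminine", "masculine"), n=c(2, 8), height=c(160, 180)),
    data.frame(gender=c("feminine", "masculine"), n=c(6, 2), height=c(170, 190))
)

# Stack the partial results, tagging each row with its pkrange index
combined <- do.call(rbind, Map(function(df, id) cbind(pkrange=id, df),
                               df_lst, seq_along(df_lst)))

# Overall mean per gender = sum(n * height) / sum(n) across pkranges
overall <- do.call(rbind, lapply(split(combined, combined$gender), function(g) {
    data.frame(gender=g$gender[1], n=sum(g$n),
               height=weighted.mean(g$height, g$n))
}))
rownames(overall) <- NULL

# Unweighted mean of the partial averages, shown for comparison only
naive <- tapply(combined$height, combined$gender, mean)

print(overall)
print(naive)
```

With these made-up numbers the weighted result is 167.5 for feminine and 182 for masculine, whereas the unweighted mean of the partial averages would give 165 and 185. The same reasoning applies to any aggregate that is not simply additive across partitions: counts and sums can be added directly, but averages need their counts carried along.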
You can query data in a MongoDB-enabled Cosmos DB instance using the mongolite package. AzureCosmosR provides a simple bridge to facilitate this.
endp <- cosmos_mongo_endpoint("https://myaccount.mongo.cosmos.azure.com:443/", key="mykey")
# a mongolite::mongo object
conn <- cosmos_mongo_connection(endp, "mycollection", "mydatabase")
conn$find("{}")
For more information on working with MongoDB, see the mongolite documentation.
You can work with data in a table storage-enabled Cosmos DB instance using the AzureTableStor package.
endp <- AzureTableStor::table_endpoint("https://myaccount.table.cosmos.azure.com:443/", key="mykey")
tab <- AzureTableStor::storage_table(endp, "mytable")
AzureTableStor::list_table_entities(tab, filter="firstname eq 'Satya'")
On the ARM side, AzureCosmosR extends the AzureRMR class framework
with a new az_cosmosdb class representing a Cosmos DB account resource,
and methods for the az_resource_group resource group class.
rg <- AzureRMR::get_azure_login()$
    get_subscription("sub_id")$
    get_resource_group("rgname")
rg$create_cosmosdb_account("mycosmosdb", interface="sql", free_tier=TRUE)
rg$list_cosmosdb_accounts()
cosmos <- rg$get_cosmosdb_account("mycosmosdb")
# access keys (passwords) for this account
cosmos$list_keys()
# get an endpoint object -- detects which API this account uses
endp <- cosmos$get_endpoint()
# API-specific endpoints
cosmos$get_sql_endpoint()
cosmos$get_mongo_endpoint()
cosmos$get_table_endpoint()