* `Search()` and `Search_uri()` gain new parameter `ignore_unavailable` to determine what happens if an index name does not exist (#273)
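For example (a sketch only; the connection call and index name below are placeholders):

```r
# sketch: search an index that may not exist without triggering an error
x <- connect()
Search(x, index = "index_that_may_not_exist", ignore_unavailable = TRUE)
```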
* `connect()` gains new parameter `ignore_version`. Internally, elastic sometimes checks the Elasticsearch version the user is connected to in order to determine what to do. This may be useful when it's not possible to check the Elasticsearch version, e.g., when it's not possible to ping the root route of the API (#275)
* New `digits` parameter that is passed down to the `jsonlite::toJSON()` call used internally; thus, `digits` controls the number of decimal digits used in the JSON the package creates to be bulk loaded into Elasticsearch (#279)
* New function `index_shrink()` for index shrinking (#192)
* `docs_bulk()`: to allow pipeline attachments to work, all `docs_bulk` methods that do HTTP requests (i.e., not the prep functions) gain the parameter `query` to pass through query parameters to the HTTP request, including for example `pipeline`, `_source`, etc. (#253)
* `Search()` and `Search_uri()` gain the parameter `track_total_hits` (default: `TRUE`) (#262) thanks @orenov
* The `warn` parameter in `connect()` was not being used across the entire package; now all methods should capture any warnings returned in the Elasticsearch HTTP API headers (#261)
* `connect()` does not create a DBI-like connection object (#265)
* `index_analyze()`: the as-is method `I()` should only be applied if the input parameter is not `NULL`, to avoid a warning (#269)
* `docs_bulk_update()`: subsetting data.frames was not working correctly when a data.frame had only 1 column; fixed (#260)
* `es_ver()` in the `Elasticsearch` class is now more flexible in capturing the Elasticsearch version (#268)
* Newer `crul` version required; helps fix a problem with passing along authentication details (#267)
* (#87) The `connect()` function is essentially the same, with some changes, but now you pass the connection object to each function call. This will break existing code, which is why this is a major version bump. I do apologize for that, but I believe the change is outweighed by the upsides: passing the connection object matches behavior in similar R packages (e.g., all the SQL database clients); you can now manage as many different connection objects as you like in the same R session; and having the connection object as an R6 class allows us to have some simple methods on that object to ping the server, etc. In addition, all functions will error with an informative message if you don't pass the connection object as the first argument.
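A minimal sketch of the new calling pattern (the host, port, index, and query values here are placeholders, not from the docs):

```r
# sketch: create a connection object once, then pass it to every call
x <- connect(host = "127.0.0.1", port = 9200)
x$ping()                                      # simple methods live on the R6 object
Search(x, index = "shakespeare", q = "fear")  # the connection object goes first
```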
* New functions `pipeline_create()`, `pipeline_delete()`, `pipeline_get()`, `pipeline_simulate()`, and `pipeline_attachment()` (#191) (#226)
* New functions `docs_delete_by_query()` and `docs_update_by_query()` to delete or update many documents at once, respectively; and new function `reindex()` to reindex all documents from one index to another (#237) (#195)
* Now using `crul` for HTTP requests; this should only matter with respect to passing in curl options (#168)
* `connect()` (#241)
* New functions `docs_bulk_create()`, `docs_bulk_delete()`, and `docs_bulk_index()`, each of which is tailored to the operation in its name: creating, deleting, or indexing docs (#183)
* New utility function `type_remover()` to help users remove types from their files for bulk loading; it can be used on example files in this package or user-supplied files (#180)
* New function `alias_rename()` to rename aliases
* Fixed a `scroll()` example that wasn't working (#228)
* `alias_create()` (#230)
* `docs_get()` gains new parameters `source_includes` and `source_excludes` to include or exclude certain fields in the returned document (#246) thanks @Jensxy
* `index_create()` (#211)
* `Search()` and `Search_uri()` docs now describe how to use profiles (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-profile.html) (#194)
* `docs_bulk_prep()` for doing a mix of actions (i.e., delete, create, etc.)
* New `include_type_name` parameter in mappings functions (#250)
* `docs_bulk_update()` was not handling boolean values correctly; now fixed (#239) (#240) thanks to @dpmccabe
* The `info()` method has been moved inside of the connection object; after calling `x = connect()` you can call `x$info()`
* The `ping()` method has been marked as deprecated; instead, call `ping()` on the connection object created by a call to `connect()`
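For example:

```r
x <- connect()
x$info()  # replaces the standalone info()
x$ping()  # replaces the deprecated standalone ping()
```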
* New function `docs_bulk_update()` to do bulk updates to documents (#169)
* `id` is now optional in `docs_create()`: if you don't pass a document identifier, Elasticsearch generates one for you (#216) thanks @jbrant
* `docs_bulk()` gains new parameter `quiet` to optionally turn off the progress bar (#202)
* Fix to `docs_bulk()` for encoding in different locales (#223) (#224) thanks @Lchiffon
* `index_get()`: you can now only pass in one value to the `features` parameter (one of settings, mappings, or aliases) (#218) thanks @happyshows
* `index_create()` now handles a list body, in addition to a JSON body (#214) thanks @emillykkejensen
* Fix to `docs_bulk()` for document IDs as factors (#212) thanks @AMR-KELEG
* Files created internally by `docs_bulk()` (and taking up disk space) are cleaned up now (deleted), though if you pass in your own file paths you have to clean them up yourself (#208) thanks @emillykkejensen
* `character` and `list`
* The first parameter of `scroll()` and `scroll_clear()` is now `x`; this should only matter if you specified the parameter name for the first parameter
* The `scroll` parameter in the `scroll()` function is now `time_scroll`
* New parameter `asdf` (for "as data.frame") in `scroll()` to give back a data.frame (#163)
* `scroll()`: see parameter `stream_opts` in the docs and examples (#160)
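A hedged sketch combining the `scroll()` changes above (the index name and result handling are illustrative assumptions):

```r
# sketch: start a scrolled search, then fetch the next page as a data.frame
res <- Search(index = "shakespeare", time_scroll = "1m")
out <- scroll(x = res$`_scroll_id`, time_scroll = "1m", asdf = TRUE)
```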
* New functions `tasks()` and `tasks_cancel()` for the tasks API (#145)
* `Search()`: see parameter `stream_opts` in the docs and examples. The `scroll` parameter in `Search()` is now `time_scroll` (#160)
* New function `field_caps()` (for field capabilities), for ES v5.4 and greater
* New function `reindex()` for the reindex ES API (#134)
* New functions `index_template_get()`, `index_template_put()`, `index_template_exists()`, and `index_template_delete()` for the indices templates ES API (#133)
* New function `index_forcemerge()` for the ES index `_forcemerge` route (#176)
* `Search()` and `Search_uri()` docs now show how to display a progress bar (#162)
* `docs_bulk()` docs clarify what's allowed as first parameter input (#173)
* `docs_bulk()`: changed internal JSON preparation to use `na = "null"` and `auto_unbox = TRUE` in the `jsonlite::toJSON()` call. This means that `NA`'s in R become `null` in the JSON and atomic vectors are unboxed (#174) thanks @pieterprovoost
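For illustration, this is the `jsonlite` behavior those options give (a standalone example, not the package's internal code):

```r
library(jsonlite)
df <- data.frame(a = c(1, NA), b = c("x", "y"), stringsAsFactors = FALSE)
# NA becomes null in the JSON instead of being dropped
toJSON(df, na = "null", auto_unbox = TRUE)
```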
* `mapping_create()` gains `update_all_types` parameter; and a new man file explains how to enable fielddata if sorting is needed (#164)
* `suggest` is used through the query DSL instead of a route; added an example to `Search()` (#102)
* `ping()` calls are now cached, so after the first one we use the cached version if called again within the same R session. This should help speed up some code with respect to HTTP calls (#184) thanks @henfiber
* Content-type headers are now set, for the most part `application/json` (#197), though functions that work with the bulk API use `application/x-ndjson` (#186)
* `mapping_create()` examples (#199)
* Fixed `type_exists()` to work on ES versions both less than and greater than v5 (#189)
* `field_stats()` docs updated to indicate that it's no longer available in ES v5.4 and above, and that the `fields` parameter is gone in ES >= v5 (#190)
* New function `docs_update()` to do partial document updates (#152)
* New function `docs_bulk_prep()` to prepare bulk format files that you can use to load into Elasticsearch with this package, on the command line, or in any other context (Python, Ruby, etc.) (#154)
* `elastic` works with Elasticsearch v5. Note that not all v5 features are included here yet. (#153)
* `docs_bulk()` was not working on single-column data.frames; now fixed (#151) thanks @gustavobio
* `docs_*` functions now support ids with whitespace in them (#155)
* Fix to `docs_mget()` for requesting certain fields back
* `es_base` parameter in `connect()`: now, instead of calling `stop()` on `es_base` usage, we use its value for `es_host`. Only pass in one or the other of `es_base` and `es_host`, not both (#146) thanks @MarcinKosinski
* New functions `Search_template()`, `Search_template_register()`, `Search_template_get()`, `Search_template_delete()`, and `Search_template_render()` (#101)
* Docs for `docs_delete()`, `docs_get()`, and `docs_create()` now correctly list that both numeric and character values are accepted for the `id` parameter; before they stated that only numeric values were allowed (#144) thanks @dominoFire
* Fix to `Search()` and related functions where wildcards in indices didn't work. It turned out we URL-escaped twice unintentionally. Fixed now, and more tests added for wildcards (#143) thanks @martijnvanbeers
* Changed `docs_bulk()` to always return a list, whether it's given a file, data.frame, or list. For a file, a named list is returned, while for a data.frame or list an unnamed list is returned, as many chunks can be processed and we don't attempt to wrangle the list output. Inputs of data.frame and list used to return `NULL` as we didn't return anything from the internal for loop. You can wrap `docs_bulk()` in `invisible()` if you don't want the list printed (#142)
* Fix to `docs_bulk()` and `msearch()` in which base URL construction was not done correctly (#141) thanks @steeled!
* New function `scroll_clear()` to clear search contexts created when using `scroll()` (#140)
* New function `ping()` to ping an Elasticsearch server to see if it is up (#138)
* `connect()` gains new parameter `es_path` to specify a context path, e.g., the `bar` in `http://foo.com/bar` (#137)
* `httr::content()` calls now parse to plain text with UTF-8 encoding (#118)
* In `scroll()` all scores are zero because scores are not calculated/tracked (#127)
* `connect()` no longer pings the ES server when run; that can now be done separately with `ping()` (#139)
* `connect()` (#129)
* New `transport_schema` parameter in `connect()` to specify http or https (#130)
* `docs_bulk()` (#125)
* Changed the `docs_bulk()` function so that user-supplied `doc_ids` are not changed at all now (#123)
* Compatibility with many Elasticsearch versions has improved. We've tested on ES versions from the current (v2.1.1) back to v1.0.0, and `elastic` works with all of them. Some functions stop with a message on certain ES versions simply because older versions may not have had particular ES features. Please do let us know if you have problems with older versions of ES, so we can improve compatibility.
* New `index_settings_update()` function to allow updating index settings (#66)
* Elasticsearch errors are given back as JSON, and error parsing has thus changed in `elastic`. We now have two levels of error behavior: 'simple' and 'complete'. These can be set in `connect()` with the `errors` parameter. Simple errors often give back just that there was an error, sometimes with an explanatory message. Complete errors give more explanation and even the ES stack trace if supplied in the ES error response (#92) (#93)
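For example:

```r
connect(errors = "simple")    # terse errors
connect(errors = "complete")  # fuller explanation, plus the ES stack trace if supplied
```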
* New function `msearch()` to do multi-searches. This works by defining queries in a file, much like is done for a file to be used in bulk loading (#103)
* New function `validate()` to validate a search (#105)
* New percolator functions: `percolate_count()`, `percolate_delete()`, `percolate_list()`, `percolate_match()`, and `percolate_register()`. The percolator works by first storing queries into an index; you then define documents in order to retrieve those queries (#106)
* New function `field_stats()` to find statistical properties of a field without executing a search (#107)
* New function `cat_nodeattrs()`
* New function `index_recreate()`, a convenience function that detects if an index exists, and if so, deletes it first, then creates it again
* `docs_bulk()` now supports passing in document ids (to the `_id` field) via the parameter `doc_ids` for each input data.frame or list, and supports using ids already in data.frames or lists (#83)
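A sketch (the index/type names and ids are placeholders):

```r
# sketch: supply your own document ids when bulk loading a data.frame
df <- data.frame(name = c("a", "b"), value = c(1, 2))
docs_bulk(df, index = "things", type = "things", doc_ids = c(101, 102))
```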
* `cat_*()` functions cleaned up: previously, some functions had parameters that were essentially silently ignored. Those parameters are now dropped from the functions (#96)
* `/_search/exists`: removed in favor of using regular `_search` with `size=0` and `terminate_after=1` instead (#104)
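A hedged sketch of the replacement pattern (the index name is a placeholder, and it assumes `size` and `terminate_after` are passed through `Search()` as shown):

```r
# sketch: check whether anything matches without retrieving documents
res <- Search(index = "shakespeare", size = 0, terminate_after = 1)
res$hits$total > 0
```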
* New parameter `lenient` in `Search()` and `Search_uri()` to allow format-based failures to be ignored, or not ignored
* Fix for `docs_get()` when the document isn't found
* Fix to `docs_bulk()` in the use case where users use the function in a for loop, for example, and indexing started over, replacing documents with the same id (#83)
* Fix to `cat_()` functions in which they sometimes failed when `parse=TRUE` (#88)
* Fix to `docs_bulk()` in which user-supplied document IDs weren't being passed correctly internally (#90)
* Fix to `Search()` and `Search_uri()` where multiple indices weren't supported, whereas they should have been; supported now (#115)
* `mlt()`, `nodes_shutdown()`, `index_status()`, and `mapping_delete()` (#94) (#98) (#99) (#110)
* New `index_settings_update()` function to allow updating index settings (#66)
* Replaced `RCurl::curlEscape()` with `curl::curl_escape()` (#81)
* v1 of `httr`
* New function `Search_uri()` where the search is defined entirely in the URL itself. Especially useful for cases in which POST requests are forbidden, e.g., on a server that prevents POST requests (which the function `Search()` uses) (#58)
* `nodes_shutdown()` (#23)
* `docs_bulk()` gains the ability to push data into Elasticsearch via the bulk HTTP API from data.frame or list objects. Previously, this function would only accept a correctly formatted file. In addition, it gains new parameters: `index` (the index name to use), `type` (the type name to use), and `chunk_size` (size of each chunk) (#60) (#67) (#68)
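A sketch of the data.frame and list inputs (index/type names are placeholders):

```r
# sketch: bulk load a data.frame directly, in chunks
docs_bulk(mtcars, index = "mtcars", type = "mtcars", chunk_size = 16)
# or a list of documents
docs_bulk(list(list(a = 1), list(a = 2)), index = "demo", type = "demo")
```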
* `cat_*()` functions gain new parameters: `h` to specify what fields to return; `help` to output available columns and their meanings; `bytes` to give numbers back machine-friendly; and `parse` to parse to a data.frame or not
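For example (assuming `cat_indices()` as one of the `cat_*()` functions):

```r
cat_indices(parse = TRUE, bytes = TRUE)  # parse to a data.frame, machine-friendly numbers
cat_indices(help = TRUE)                 # show the available columns and their meanings
```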
* `cat_*()` functions can now optionally capture the returned data in a data.frame (#64)
* `Search()` gains new parameter `search_path` to set the path that is used for searching. The default is `_search`, but sometimes your configuration is set up so that you don't need that path, or it's a different path (023d28762e7e1028fcb0ad17867f08b5e2c92f93)
* `docs_mget()`: added an internal checker to make sure the user passes in the right combination of `index`, `type`, and `id` parameters, or `index` and `type_id`, or just `index_type_id` (#42)
* `index`, `type`, and `id` parameters are now required in the function `docs_get()` (#43)
* Changed `scroll()` to allow long `scroll_id`s by passing scroll ids in the body instead of as a query parameter (#44)
* In the `Search()` function's `error_parser()` function, check whether an `error` element is returned in the response body from Elasticsearch; if so, parse the error, if not, pass on the body (likely empty) (#45)
* In the `Search()` function, added a helper function to check that the `size` and `from` parameter values passed in are numbers (#46)
* For the `index` and `type` parameters, now using `RCurl::curlEscape()` to URL-escape. Other parameters passed in go through `httr` CRUD methods, which do URL escaping for us (#49)
* First version to go to CRAN.
* Added `scroll()` and a `scroll` parameter to the `Search()` function (#36)
* New function `explain()` to easily get at explanations of search results
* `?units-time` and `?units-distance`
* `?searchapis`
* New function `tokenizer_set()` to set tokenizers
* `connect()` is run on package load to set a default base url of `localhost` and port of `9200`; you can override this by running that function yourself, or by storing `es_base`, `es_port`, etc. in your `.Rprofile` file.
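A sketch of overriding those defaults yourself (the parameter values are placeholders, and the exact parameter names are an assumption based on the entry above):

```r
# sketch: override the localhost/9200 defaults set on package load
connect(es_base = "http://localhost", es_port = 9200)
```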
* `es_search()` changed to `Search()`
* Examples use `\dontrun` instead of `\donttest` so they don't fail on CRAN checks
* `es_search_body()` removed; body-based queries using the query DSL moved to the `Search()` function, passed into the `body` parameter
* `elastic` is now more in line with the official Elasticsearch Python client (http://elasticsearch-py.readthedocs.org/en/master/)
* An `index` manual page, and all functions prefixed with `index_()`. Thematic manual files are: `index`, `cat`, `cluster`, `alias`, `cdbriver`, `connect`, `documents`, `mapping`, `nodes`, and `search`
* `es_cat()` was changed to `cat_()`; we avoided `cat()` because, as you know, there is already a widely used function of that name in base R, see `base::cat()`
* `cat` functions were broken out into separate functions for each command, instead of passing the command in as an argument. For example, `cat('aliases')` becomes `cat_aliases()`
* The `es_` prefix remains only for `es_search()`, as we have to avoid conflict with `base::search()`
* Removed the `assertthat` package import, using `stopifnot()` instead (#14)