Local Solr setup

Scott Chamberlain

2020-04-22

The Solr version you are working with my differ from below. Don’t worry, just go with the version you want to use.

OSX

  1. Download most recent version from an Apache mirror https://solr.apache.org/
  2. Unzip/untar the file. Move to your desired location. Now you have Solr v.7.0.0
  3. Go into the directory you just created: cd solr-7.0.0
  4. Launch Solr: bin/solr start -e cloud -noprompt - Sets up SolrCloud mode, rather than Standalone mode. As far as I can tell, SolrCloud mode seems more common.
  5. Once Step 4 completes, you can go to http://localhost:8983/solr/ now, which is the admin interface for Solr.
  6. Load some documents: bin/post -c gettingstarted docs/
  7. Once Step 6 is complete (will take a few minutes), navigate in your browser to http://localhost:8983/solr/gettingstarted/select?q=*:*&wt=json and you should see a bunch of documents

Linux

You should be able to use the above instructions for OSX on a Linux machine.

Windows

You should be able to use the above instructions for OSX on a Windows machine, but with some slight differences. For example, the bin/post tool for OSX and Linux doesn’t work on Windows, but see https://solr.apache.org/guide/8_2/post-tool.html#PostTool-Windows for an equivalent.

solrium usage

First, setup a connection object

library(solrium)
(conn <- SolrClient$new())
## <Solr Client>
##   host: 127.0.0.1
##   path: 
##   port: 8983
##   scheme: http
##   errors: simple
##   proxy:

And we can now use the solrium R package to query the Solr database to get raw JSON data:

conn$search("gettingstarted", params = list(q = '*:*', rows = 3), raw = TRUE)
## [1] "{\n  \"responseHeader\":{\n    \"status\":0,\n    \"QTime\":5,\n    \"params\":{\n      \"q\":\"*:*\",\n      \"rows\":\"3\",\n      \"wt\":\"json\"}},\n  \"response\":{\"numFound\":48,\"start\":0,\"docs\":[\n      {\n        \"id\":\"TWINX2048-3200PRO\",\n        \"name\":[\"CORSAIR  XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) Dual Channel Kit System Memory - Retail\"],\n        \"manu\":[\"Corsair Microsystems Inc.\"],\n        \"manu_id_s\":\"corsair\",\n        \"cat\":[\"electronics\",\n          \"memory\"],\n        \"features\":[\"CAS latency 2,  2-3-3-6 timing, 2.75v, unbuffered, heat-spreader\"],\n        \"price\":[185.0],\n        \"popularity\":[5],\n        \"inStock\":[true],\n        \"store\":[\"37.7752,-122.4232\"],\n        \"manufacturedate_dt\":\"2006-02-13T15:26:37Z\",\n        \"payloads\":[\"electronics|6.0 memory|3.0\"],\n        \"_version_\":1664688417970061312},\n      {\n        \"id\":\"VS1GB400C3\",\n        \"name\":[\"CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail\"],\n        \"manu\":[\"Corsair Microsystems Inc.\"],\n        \"manu_id_s\":\"corsair\",\n        \"cat\":[\"electronics\",\n          \"memory\"],\n        \"price\":[74.99],\n        \"popularity\":[7],\n        \"inStock\":[true],\n        \"store\":[\"37.7752,-100.0232\"],\n        \"manufacturedate_dt\":\"2006-02-13T15:26:37Z\",\n        \"payloads\":[\"electronics|4.0 memory|2.0\"],\n        \"_version_\":1664688418109521920},\n      {\n        \"id\":\"VDBDB1A16\",\n        \"name\":[\"A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM\"],\n        \"manu\":[\"A-DATA Technology Inc.\"],\n        \"manu_id_s\":\"corsair\",\n        \"cat\":[\"electronics\",\n          \"memory\"],\n        \"features\":[\"CAS latency 3,   2.7v\"],\n        \"popularity\":[0],\n        \"inStock\":[true],\n        \"store\":[\"45.18414,-93.88141\"],\n        \"manufacturedate_dt\":\"2006-02-13T15:26:37Z\",\n        \"payloads\":[\"electronics|0.9 memory|0.1\"],\n        \"_version_\":1664688418113716224}]\n  }}\n"
## attr(,"class")
## [1] "sr_search"
## attr(,"wt")
## [1] "json"

Or parsed data to a data.frame (just looking at a few columns for brevity):

conn$search("gettingstarted", params = list(q = '*:*', fl = c('id', 'address_s')))
## # A tibble: 10 x 2
##    id                address_s                                                 
##    <chr>             <chr>                                                     
##  1 TWINX2048-3200PRO <NA>                                                      
##  2 VS1GB400C3        <NA>                                                      
##  3 VDBDB1A16         <NA>                                                      
##  4 SOLR1000          <NA>                                                      
##  5 adata             46221 Landing Parkway Fremont, CA 94538                   
##  6 apple             1 Infinite Way, Cupertino CA                              
##  7 asus              800 Corporate Way Fremont, CA 94539                       
##  8 ati               33 Commerce Valley Drive East Thornhill, ON L3T 7N6 Canada
##  9 belkin            12045 E. Waterfront Drive Playa Vista, CA 90094           
## 10 canon             One Canon Plaza Lake Success, NY 11042

Other Vignettes

See the other vignettes for more thorough examples: