Store local copies of remote imagery in GeoBlacklight.
This GeoBlacklight plugin captures remote images from geographic web services and saves them locally. It borrows the SolrDocumentSidecar concept from Spotlight: each non-ActiveRecord SolrDocument gets a matching ActiveRecord-based "sidecar" record. This lets us use Active Storage to attach images to our Solr documents.
- Background Job Processor
Sidekiq is an excellent choice if you need an opinion.
For GeoBlacklight v4 with Aardvark metadata, add the gem to your Gemfile.
gem "geoblacklight_sidecar_images", "~> 1.0"
For GeoBlacklight v3 with GBL v1.0 metadata, add the gem to your Gemfile.
gem "geoblacklight_sidecar_images", "~> 0.9.1", "< 1.0"
Run the generator.
$ bin/rails generate geoblacklight_sidecar_images:install
Run the database migration.
$ bin/rails db:migrate
Complete any necessary Active Storage setup steps, for example:
- Add a config/storage.yml file
local:
  service: Disk
  root: <%= Rails.root.join("storage") %>
- Add config/environments declarations, development.rb for example:
# Store uploaded files on the local file system (see config/storage.yml for options)
config.active_storage.service = :local
Create a new GeoBlacklight instance with the GBLSI code
$ rails new app-name -m https://raw.githubusercontent.com/geoblacklight/geoblacklight_sidecar_images/develop/template.rb
# Run your GBL instance
bundle exec rake geoblacklight:server
# Index the GBL test fixtures
bundle exec rake gblsci:sample_data:seed
Spawns background jobs to harvest images for all documents in your Solr index.
bundle exec rake gblsci:images:harvest_all
Harvests the image for a single document. Pass the document id in the DOC_ID environment variable.
DOC_ID='stanford-cz128vq0535' bundle exec rake gblsci:images:harvest_doc_id
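As a rough illustration of the DOC_ID convention, the sketch below shows one way a task could read and validate the variable. The `doc_id_from_env` helper is hypothetical and is not part of the plugin; it only demonstrates the pattern of failing early when the variable is missing.

```ruby
# Hypothetical helper (not the plugin's code): read DOC_ID from an
# environment-like hash and fail with a clear message when it is missing.
def doc_id_from_env(env = ENV)
  id = env["DOC_ID"].to_s.strip
  raise ArgumentError, "Set DOC_ID, e.g. DOC_ID='stanford-cz128vq0535'" if id.empty?

  id
end

# A plain Hash stands in for ENV so the example has no side effects.
doc_id = doc_id_from_env("DOC_ID" => "stanford-cz128vq0535")
```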
Reattempts image harvesting for all objects in a non-successful state.
bundle exec rake gblsci:images:harvest_retry
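The selection behind a retry can be pictured with plain Ruby. This sketch assumes that "non-successful" means any state other than succeeded; the record hashes, the `doc-a` style ids, and the `retry_candidates` helper are hypothetical stand-ins, not the plugin's own code.

```ruby
# Hypothetical in-memory stand-ins for sidecar records; only the state matters.
sidecars = [
  { doc_id: "doc-a", state: :succeeded },
  { doc_id: "doc-b", state: :failed },
  { doc_id: "doc-c", state: :placeheld },
  { doc_id: "doc-d", state: :initialized }
]

# Assumption: every state other than :succeeded is eligible for a retry.
def retry_candidates(records)
  records.reject { |record| record[:state] == :succeeded }
end

to_retry = retry_candidates(sidecars).map { |record| record[:doc_id] }
```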
bundle exec rake gblsci:images:harvest_states
We use a state machine library to track success/failure of our harvest tasks. The states we track are:
- initialized - SolrDocumentSidecar created, no harvest attempt run
- queued - Harvest attempt queued as background job
- processing - Harvest attempt at work
- succeeded - Harvest was successful, image attached
- failed - Harvest failed, no image attached, error logged
- placeheld - Harvest was not successful, placeholder imagery will be used
SolrDocumentSidecar.in_state(:succeeded) => [#<SolrDocumentSidecar:0x0000000170697960 ... ]
# sidecar is a single SolrDocumentSidecar record, e.g. SolrDocumentSidecar.in_state(:placeheld).first
sidecar.image.attached? => false
sidecar.image_state.current_state => "placeheld"
sidecar.image_state.last_transition => #<SidecarImageTransition id: 207, to_state: "placeheld", metadata: {"solr_doc_id"=>"stanford-cg357zz0321", "solr_version"=>1616509329754554368, "placeheld"=>true, "viewer_protocol"=>"wms", "image_url"=>"http://geowebservices-restricted.stanford.edu/geoserver/wms/reflect?&FORMAT=image%2Fpng&TRANSPARENT=TRUE&LAYERS=druid:cg357zz0321&WIDTH=300&HEIGHT=300", "service_url"=>"http://geowebservices-restricted.stanford.edu/geoserver/wms/reflect?&FORMAT=image%2Fpng&TRANSPARENT=TRUE&LAYERS=druid:cg357zz0321&WIDTH=300&HEIGHT=300", "gblsi_thumbnail_uri"=>false, "error"=>"Faraday::Error::ConnectionFailed"},...>
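To make the harvest states more concrete, here is a minimal plain-Ruby sketch of a state machine over the six states listed above. It is not the plugin's implementation: the `HarvestStateSketch` class is hypothetical, and the `TRANSITIONS` table is an assumption about which states may follow which.

```ruby
# Plain-Ruby sketch of the harvest states; NOT the plugin's implementation.
# State names come from the README; TRANSITIONS is an assumed table.
class HarvestStateSketch
  STATES = %i[initialized queued processing succeeded failed placeheld].freeze

  # Hypothetical transition table: which states may follow which.
  TRANSITIONS = {
    initialized: %i[queued],
    queued: %i[processing],
    processing: %i[succeeded failed placeheld],
    succeeded: %i[queued],
    failed: %i[queued],
    placeheld: %i[queued]
  }.freeze

  attr_reader :current_state, :history

  def initialize
    @current_state = :initialized
    @history = []
  end

  def can_transition_to?(new_state)
    TRANSITIONS.fetch(current_state).include?(new_state)
  end

  # Move to new_state and record the metadata, or raise if not allowed.
  def transition_to!(new_state, metadata = {})
    unless can_transition_to?(new_state)
      raise ArgumentError, "cannot go from #{current_state} to #{new_state}"
    end

    @history << { to_state: new_state, metadata: metadata }
    @current_state = new_state
  end

  def last_transition
    history.last
  end
end

harvest = HarvestStateSketch.new
harvest.transition_to!(:queued)
harvest.transition_to!(:processing)
harvest.transition_to!(:placeheld, "error" => "Faraday::Error::ConnectionFailed")
```

Keeping the metadata on each transition, as in this sketch, is what makes it possible to inspect later why a harvest ended in a given state.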
Remove all sidecar objects and attached images
bundle exec rake gblsci:images:harvest_purge_all
Remove all sidecar objects, and their attached images, that have no corresponding Solr document
bundle exec rake gblsci:images:harvest_purge_orphans
Remove sidecar objects and attached images via a CSV file of document ids
bundle exec rake gblsci:images:harvest_destroy_batch
Generate a CSV file of sidecar objects and associated image state. Useful for debugging problem items.
bundle exec rake gblsci:images:harvest_report
Prints details of harvest objects in the failed state to stdout
bundle exec rake gblsci:images:harvest_failed_state_inspect
If you add a thumbnail URI to your GeoBlacklight Solr documents...
{
...
"dc_format_s":"TIFF",
"dc_creator_sm":["Minnesota. Department of Highways."],
"thumbnail_path_ss":"https://umedia.lib.umn.edu/sites/default/files/imagecache/square300/reference/562/image/jpeg/1089695.jpg",
"dc_type_s":"Still image",
...
}
Then you can edit your GeoBlacklight settings.yml file to point at that Solr field (Settings.GBLSI_THUMBNAIL_FIELD). Any documents in your index that have a value for that field will harvest the image at that URI instead of trying to retrieve an image via IIIF or the other web services.
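The preference described above can be sketched in a few lines of plain Ruby. The `image_source_for` helper and its return shape are hypothetical, and the `thumbnail_field` argument simply stands in for the value of Settings.GBLSI_THUMBNAIL_FIELD; the plugin's real lookup may differ.

```ruby
# Hypothetical helper (not the plugin's code): decide where the image for a
# Solr document hash should come from. `thumbnail_field` stands in for the
# value of Settings.GBLSI_THUMBNAIL_FIELD.
def image_source_for(document, thumbnail_field:)
  uri = document[thumbnail_field].to_s.strip
  return { source: :thumbnail_field, url: uri } unless uri.empty?

  # No thumbnail URI present: fall back to IIIF or the other web services.
  { source: :web_service, url: nil }
end

with_thumbnail = {
  "dc_format_s" => "TIFF",
  "thumbnail_path_ss" => "https://umedia.lib.umn.edu/sites/default/files/imagecache/square300/reference/562/image/jpeg/1089695.jpg"
}
without_thumbnail = { "dc_format_s" => "TIFF" }

preferred = image_source_for(with_thumbnail, thumbnail_field: "thumbnail_path_ss")
fallback = image_source_for(without_thumbnail, thumbnail_field: "thumbnail_path_ss")
```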
Use basic Active Storage patterns to display imagery in your application.
# Is there an image?
document.sidecar.image.attached?
# Can the image size be manipulated?
document.sidecar.image.variable?
# Example image_tag with resize
<%= image_tag document.sidecar.image.variant(resize_to_fit: [100, 100]), {class: 'media-object'} %>
This GBL plugin includes a custom catalog/_index_split_default.html.erb file. Look there for examples of calling the image method.
Example for adding a thumbnail to the show page sidebar.
catalog/_show_sidebar.html.erb
# Add to end of file
<% if @document.sidecar.image.attached? %>
  <% if @document.sidecar.image.variable? %>
    <div class="card">
      <div class="card-header">Thumbnail</div>
      <div class="card-body">
        <%= image_tag @document.sidecar.image.variant(resize_to_fit: [200, 200]), {class: 'mr-3'} %>
      </div>
    </div>
  <% end %>
<% end %>
# Run test suite
bundle exec rake ci
# Launch test app server
cd .internal_test_app/
bundle exec rake geoblacklight:server
# Load test fixtures
bundle exec rake gblsci:sample_data:seed
# Run harvest
bundle exec rake gblsci:images:harvest_all
# Tail image service log file
tail -f log/image_service_development.log