Write a Darlingtonia importer
- Write a CSV importer using the darlingtonia ruby gem
- Be able to point to the parts of the importer
If you have uncommitted changes in your current branch (you can check with git status), save them before starting this lesson, which uses a separate branch:
git checkout -b your_branch_name
git add .
git commit -m 'checkpoint before beginning darlingtonia importer'
git checkout simple_importer
NOTE: If you make experimental changes and want to get back to the minimal code state necessary to run this lesson, you can check the starting code out again using:
git checkout simple_importer
Since we want to achieve the same goals for our second importer, the test is going to look pretty familiar.
Make a file in the spec/importers folder called modular_importer_spec.rb and paste the following content into it:
# frozen_string_literal: true
require 'rails_helper'
require 'active_fedora/cleaner'

RSpec.describe ModularImporter, :clean do
  let(:modular_csv) { 'spec/fixtures/csv_files/modular_input.csv' }
  let(:user) { ::User.batch_user }

  before do
    ENV['IMPORT_PATH'] = File.expand_path('../fixtures/images', File.dirname(__FILE__))
    DatabaseCleaner.clean
    ActiveFedora::Cleaner.clean!
  end

  it "imports a csv" do
    expect { ModularImporter.new(modular_csv).import }.to change { Image.count }.by 3
  end

  it "puts the title into the title field" do
    ModularImporter.new(modular_csv).import
    expect(Image.where(title: 'A Cute Dog').count).to eq 1
  end

  it "puts the url into the source field" do
    ModularImporter.new(modular_csv).import
    expect(Image.where(source: 'https://www.pexels.com/photo/animal-blur-canine-close-up-551628/').count).to eq 1
  end

  it "creates publicly visible objects" do
    ModularImporter.new(modular_csv).import
    imported_work = Image.first
    expect(imported_work.visibility).to eq 'open'
  end

  it "attaches files" do
    allow(AttachFilesToWorkJob).to receive(:perform_later)
    ModularImporter.new(modular_csv).import
    expect(AttachFilesToWorkJob).to have_received(:perform_later).exactly(3).times
  end
end
Run this test and you should again see an error saying it can't find the expected class:
NameError:
uninitialized constant ModularImporter
Make a file called app/importers/modular_importer.rb that contains just enough of an importer class that your test can run and give a meaningful error:
class ModularImporter
  def initialize(csv_file)
    @csv_file = csv_file
    raise "Cannot find expected input file #{csv_file}" unless File.exist?(csv_file)
  end

  def import
  end
end
- Run your test:
bundle exec rspec spec/importers/modular_importer_spec.rb
It should fail with a message like
expected `Image.count` to have changed by 3, but was changed by 0
So, at this point, your test is running, but the importer isn't yet creating any records.
- Add the darlingtonia gem to your Gemfile and run bundle install:
gem 'darlingtonia', '~> 2.0'
- Edit app/importers/modular_importer.rb so it looks like this:
require 'darlingtonia'

class ModularImporter
  def initialize(csv_file)
    @csv_file = csv_file
    raise "Cannot find expected input file #{csv_file}" unless File.exist?(csv_file)
  end

  def import
    # The block form of File.open closes the file for us, even if the import raises.
    File.open(@csv_file) do |file|
      Darlingtonia::Importer.new(
        parser: Darlingtonia::CsvParser.new(file: file),
        record_importer: Darlingtonia::HyraxRecordImporter.new
      ).import
    end
  end
end
- Now your test should pass with output something like this:
ModularImporter
Creating record: ["A Cute Dog"].Record created at: jw827b648Record created at: jw827b648Creating record: ["An Interesting Cat"].Record created at: 3n203z084Record created at: 3n203z084Creating record: ["A Flock of Birds"].Record created at: wm117n96bRecord created at: wm117n96b imports a csv
Finished in 7.56 seconds (files took 9.06 seconds to load)
1 example, 0 failures
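To make the division of labor in the import method above easier to see, here is a minimal plain-Ruby sketch of the same shape: a top-level importer that hands records from a parser to a record importer. The classes ToyParser, ToyRecordImporter, and ToyImporter are hypothetical stand-ins written for this illustration. They are not Darlingtonia's classes and only mimic the keyword arguments used in this lesson.

```ruby
require 'csv'
require 'stringio'

# Hypothetical stand-ins, NOT Darlingtonia's classes. They only mimic the
# roles visible in ModularImporter#import.

# Parser: turns an open IO into records (one hash per CSV row).
class ToyParser
  def initialize(file:)
    @file = file
  end

  def records
    CSV.new(@file, headers: true).map(&:to_h)
  end
end

# Record importer: decides what "importing" one record means.
# Here it only logs the title and keeps the record in memory.
class ToyRecordImporter
  attr_reader :imported

  def initialize(info_stream: $stdout)
    @imported = []
    @info_stream = info_stream
  end

  def import(record:)
    @info_stream.puts "Creating record: #{record['title']}"
    @imported << record
  end
end

# Top-level importer: wires a parser to a record importer.
class ToyImporter
  def initialize(parser:, record_importer:)
    @parser = parser
    @record_importer = record_importer
  end

  def import
    @parser.records.each { |record| @record_importer.import(record: record) }
  end
end

csv = StringIO.new("title,source\nA Cute Dog,https://example.com/dog\n")
record_importer = ToyRecordImporter.new
ToyImporter.new(parser: ToyParser.new(file: csv), record_importer: record_importer).import
puts record_importer.imported.length
```

Because each part has a single job, you can swap one out (for example, a parser for a different file format) without touching the others.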
Make a file called lib/tasks/darlingtonia_import.rake and paste the following code into it:
# frozen_string_literal: true
namespace :csv_import do
  desc "Load sample CSV"
  task darlingtonia_import: :environment do
    ENV['IMPORT_PATH'] = Rails.root.join('spec', 'fixtures', 'images').to_s
    Rake::Task["hyrax:default_admin_set:create"].invoke
    Rake::Task["hyrax:default_collection_types:create"].invoke
    Rake::Task["hyrax:workflow:load"].invoke
    load_csv_sample
  end

  def load_csv_sample
    csv_sample = Rails.root.join('spec', 'fixtures', 'csv_files', 'modular_input.csv')
    ModularImporter.new(csv_sample).import
  end
end
Run the rake task (rake csv_import:darlingtonia_import) and visit localhost:3000/catalog to see the imported objects.
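Both the spec and the rake task set ENV['IMPORT_PATH'] to the fixture images folder before importing. As a rough sketch of what that setting is for, the snippet below resolves a file name against that directory. It assumes file names in the CSV are relative to IMPORT_PATH. The helper resolve_import_file and the file name dog.jpg are made up for this illustration and are not part of Darlingtonia.

```ruby
# Hypothetical helper (not part of Darlingtonia): resolve a file name from the
# CSV against the directory named by IMPORT_PATH.
def resolve_import_file(file_name, import_path: ENV['IMPORT_PATH'])
  raise 'IMPORT_PATH is not set' if import_path.nil? || import_path.empty?

  File.join(import_path, file_name)
end

# Outside of Rails we build the path by hand; the rake task uses Rails.root.join.
ENV['IMPORT_PATH'] = File.expand_path('spec/fixtures/images')
puts resolve_import_file('dog.jpg') # 'dog.jpg' is a made-up file name
```

Failing early when the variable is missing is a design choice in this sketch; it makes a misconfigured environment obvious before any records are created.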
Note: You can see the changes we made in this section on GitHub.
- How do you attach more than one file to an object in this importer?
- How do you specify where the files are on disk?
- What happens if a future version of our CSV file has the headings in a different order?
- What do you need to do if you want to add another of the core Hyrax metadata fields to the data?
- Can you identify the parts of an importer we talked about? Where is the:
- top level kickoff?
- parser?
- mapper?
- record importer?
- logger?
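As a starting point for the column-order question, the sketch below uses Ruby's standard csv library to show that rows parsed with headers: true are looked up by header name, so reordering the columns does not change the result. Whether Darlingtonia's CsvParser behaves the same way is an assumption you should verify against the gem; the sample data here is made up.

```ruby
require 'csv'

# Two versions of the same data with the columns in a different order.
original  = "title,source\nA Cute Dog,https://example.com/dog\n"
reordered = "source,title\nhttps://example.com/dog,A Cute Dog\n"

# Look values up by header name rather than by column position.
def titles(csv_text)
  CSV.parse(csv_text, headers: true).map { |row| row['title'] }
end

puts titles(original) == titles(reordered)
```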