Islandora 8 Webinar: Migrating from Islandora 7 to Islandora 8

  • Introduction (0:00)
    • The presenters are focusing heavily on migration because they want people to have as smooth a transition as possible. This matters because migrating is not trivial: Drupal and Fedora have both had major updates between Islandora 7 and 8.
  • Two Options (0:44)
    • Pull directly from 7.x using the migrate_7x_claw code (the focus of this video).
    • Clean up exported metadata and import that. This involves pulling the metadata into OpenRefine and cleaning it up there. From there you export a CSV file, which can be imported with their migrate_islandora_csv code. The presenter says there is a video for this, but I haven’t seen it yet.
  • Migration Prerequisites (1:44)
    • Either way, you need to know your metadata: both its extent and its contents. In this case in particular, you need to know what’s in your Solr index and your XML metadata.
    • You have to configure Drupal 8 to accept your metadata. The Islandora defaults should approximate basic MODS and be a good starting place.
    • You have to configure Drupal 8 to import your metadata as well.
  • Migrate API (4:11)
    • Everything is built on Drupal’s Migrate API, which is built into Drupal 8. It is widely used, but not especially well documented.
    • Anatomy of a Migration (4:50)
      • Migrations are YML files that follow the ETL (Extract, Transform, Load) pattern: you pull your data, transform it into the correct format, and then load it onto the new platform.
      • Each step maps to a plugin type: extract -> source plugin; transform -> process plugin; load -> destination plugin.
      • Source plugins (6:08) contain everything you need to get the body of data and break it into individual units of work. You can also define constants in a source plugin. Source plugins are available for JSON, XML, and CSV, and they can read files from your server or from the web (including REST APIs). How you extract the data is straightforward and depends on the file type: column name for CSV, key name for JSON, and XPath for XML. The presenters also wrote their own custom source plugin for Islandora 7 that works with JSON from Solr and FOXML from Fedora.
      • Process plugins (8:32) transform source data. They are used to format dates, manipulate strings, and look up other entities (generating them if they don’t exist yet). You can do a lot here, including cleaning up your data, though that isn’t done in this video. Process plugins can be chained together to make pipelines.
      • Destination plugins (10:33) take the process results and put them somewhere. Which one you use comes down to what type of entity you’re making in Drupal 8. Every time you need a different destination plugin, you need another YML file. Full migrations are thus a sequence of smaller migrations: nodes, media, files, subjects, locations, people, etc.
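
The source -> process -> destination flow described above can be sketched in plain Python. This is only an illustration of the ETL pattern and of chaining process steps into a pipeline; it is not Drupal's Migrate API (which is configured in YML and implemented in PHP). The sample CSV, the field names, and every function name here are made up for the example.

```python
import csv
import io
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical sample data standing in for an exported metadata CSV.
SAMPLE_CSV = (
    "title,date_created\n"
    "  Harbour photograph ,1923-05-01\n"
    "Town map,1890-01-15\n"
)

Row = Dict[str, str]
Step = Callable[[str], str]


def source(csv_text: str) -> Iterator[Row]:
    """Extract: yield one row per CSV record, keyed by column name."""
    yield from csv.DictReader(io.StringIO(csv_text))


def chain(*steps: Step) -> Step:
    """Combine several process steps into one pipeline, applied in order."""
    def run(value: str) -> str:
        for step in steps:
            value = step(value)
        return value
    return run


# Transform: destination field -> (source column, pipeline of process steps).
PROCESS: Dict[str, Tuple[str, Step]] = {
    "title": ("title", chain(str.strip)),
    "year": ("date_created", chain(str.strip, lambda d: d[:4])),
}


def process(row: Row) -> Row:
    """Build one destination record from one source row."""
    return {dest: pipeline(row[src]) for dest, (src, pipeline) in PROCESS.items()}


def migrate(csv_text: str) -> List[Row]:
    """Load: here the 'destination' is just an in-memory list."""
    store: List[Row] = []
    for row in source(csv_text):
        store.append(process(row))
    return store


if __name__ == "__main__":
    for record in migrate(SAMPLE_CSV):
        print(record)
```

In a real migration each destination (nodes, media, files, and so on) would get its own YML file, which roughly corresponds to having one `PROCESS` mapping and one destination per entity type.
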
  • migrate_7x_claw (12:30)
    • This queries Solr to get the list of objects to migrate; you control the query. For each object, it retrieves fields from Solr, retrieves the FOXML, pulls over every datastream listed in the FOXML, and extracts metadata from XML datastreams to make other entities.
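
As a rough illustration of the two lookups described above (a Solr query for the object list, then reading the datastream list out of each object's FOXML), here is a small Python sketch. It is not the migrate_7x_claw code itself. The Solr host, core name, query, and field list are placeholders, and the FOXML is a hand-written minimal sample, so treat all of them as assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple
from urllib.parse import urlencode

# Namespace used by FOXML documents exported from Fedora 3.
FOXML_NS = {"foxml": "info:fedora/fedora-system:def/foxml#"}

# Hand-written minimal FOXML sample (hypothetical object and datastreams).
SAMPLE_FOXML = """
<foxml:digitalObject xmlns:foxml="info:fedora/fedora-system:def/foxml#" PID="islandora:1">
  <foxml:datastream ID="MODS" STATE="A" CONTROL_GROUP="X">
    <foxml:datastreamVersion ID="MODS.0" MIMETYPE="text/xml"/>
  </foxml:datastream>
  <foxml:datastream ID="OBJ" STATE="A" CONTROL_GROUP="M">
    <foxml:datastreamVersion ID="OBJ.0" MIMETYPE="image/jpeg"/>
  </foxml:datastream>
</foxml:digitalObject>
""".strip()


def solr_query_url(base: str, query: str, fields: List[str], rows: int) -> str:
    """Build a Solr select URL; the query decides which objects get migrated."""
    params = {"q": query, "fl": ",".join(fields), "rows": rows, "wt": "json"}
    return f"{base}/select?{urlencode(params)}"


def list_datastreams(foxml_text: str) -> List[Tuple[str, Optional[str]]]:
    """Return (datastream ID, MIME type of its last listed version) pairs."""
    root = ET.fromstring(foxml_text)
    result = []
    for ds in root.findall("foxml:datastream", FOXML_NS):
        versions = ds.findall("foxml:datastreamVersion", FOXML_NS)
        mimetype = versions[-1].get("MIMETYPE") if versions else None
        result.append((ds.get("ID"), mimetype))
    return result


if __name__ == "__main__":
    # Placeholder host and core name; substitute your own Solr endpoint.
    url = solr_query_url("http://localhost:8080/solr/collection1", "*:*", ["PID"], 10)
    print(url)
    for ds_id, mimetype in list_datastreams(SAMPLE_FOXML):
        print(ds_id, mimetype)
```

The sketch only builds the URL and parses a local string, so it can be run without a Solr or Fedora server; fetching the real responses is left out on purpose.
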
  • Demo (14:24)
  • Questions (51:10)
    • (51:10) Could you give us a quick reminder about why we’re using Features and not Composer?
    • (52:30) Does the deduplication for generating entities (people, for example) normalize out punctuation, or is it doing an exact string match? If we have a URI in the @valueURI attribute, will that be taken into consideration when doing the deduplication?
    • (53:39) Clarifying on the features question: is this essentially done to make it more command friendly and less command-line/developer oriented?
    • (55:45) As I understand it, Islandora 8 really only uses Fedora as a file system. Was today’s demo a pull from Fedora to Fedora? And if so, could you also migrate to a flattish file system? (Yes, yes, and yes)
    • (57:36) What is the status on multi-page content? The newspaper module in Islandora 7, for instance.