Most of the Electronic Document and Records Management System (EDRMS) projects that I’ve worked on, or have had visibility of, started from the premise that the organisation needed to migrate its old documents from legacy systems (usually shared network folders) onto the new EDRMS. This is not surprising considering the issues and challenges that most organisations have with managing their business documents in shared network folders (covered in my previous post).
However, despite all the good intentions to migrate, when it came to it, most of these organisations actually performed very little document migration. Perhaps it was when the realisation hit home about everything that needed to be done to get those documents migrated and the challenges, effort and cost involved, that they questioned if there really was a business case for doing it? In the end, most only migrated the minimum number of documents.
To use an analogy, it is like documents attempting to emigrate to “ECMland” from their shared folder homeland, with most of them being turned away at the border: “Sorry, you are not coming in. You’ve not gained enough ‘entry points’ and you are simply too much hassle for me to process” (see the key challenges discussed below).
There are, of course, many organisations that have performed very large document migrations, and continue to do so. However, there are also many organisations that shy away from document migration and this post reflects my observations as to why this might be so.
Key challenges to document migration
Marc Fresko has written a good paper on the what, why, how, who, when and where of document migration. Focusing in on the “what”, the starting point of any document migration exercise, this is usually done in the form of an “as is” inventory or information audit whose purpose is to build a picture of the type, usage, owner and volume of documents across the shared folders. Many EDRMS vendors and third parties (such as www.foldersizes.com) provide tools to assist in this process by crawling and analysing the shared folders, reporting on what they find, identifying potential duplicates and recommending what documents could potentially be cleansed (retired / deleted / moved elsewhere).
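To make the inventory step a little more concrete, here is a minimal sketch of the kind of crawl such tools perform. It is not how any particular vendor tool works; the function names (`file_digest`, `inventory`) and the record fields are my own illustrative choices. It flags potential duplicates by comparing SHA-256 hashes of file contents, which only catches byte-for-byte copies and not near-duplicates or re-saved versions.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def inventory(root):
    """Walk `root` and return (records, duplicate_groups).

    `records` holds one dict per file; `duplicate_groups` is a list of
    path lists whose contents are byte-for-byte identical.
    """
    records = []
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            stat = os.stat(path)
            record = {
                "path": path,
                "extension": os.path.splitext(name)[1].lower(),
                "size_bytes": stat.st_size,
                "modified": stat.st_mtime,  # seconds since the epoch
                "sha256": file_digest(path),
            }
            records.append(record)
            by_digest[record["sha256"]].append(path)
    duplicates = [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
    return records, duplicates
```

Calling `inventory` on the root of a shared folder gives a flat list that can be summarised by extension, size or age, plus candidate duplicate groups for someone to review before anything is cleansed.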
Whilst some manual review and validation work is certainly required to get that overall “what documents do we have” perspective, it is small in comparison to the manual work required to identify and decide which documents to actually migrate, and to prepare them for the migration.
For example, some of the common challenges include:
- Version Control – Finding different versions of the same document scattered across shared folders, identifying which is the latest version and configuring the migration tool to upload the documents in the right order (so that all the versions are uploaded in sequence);
- Metadata – Documents need to be classified and tagged with appropriate metadata on upload into the EDRMS. This can involve significant manual work, although it is usually possible to semi-automate aspects of the process. Techniques include inferring metadata from the folder structure in which a document is stored, inferring metadata from file naming conventions, extracting property fields from Office documents (and mapping them to metadata fields) and using classification tools to analyse the contents of a document and assign a “best-guess” metadata value;
- Security and Access Control – Rules can be set up to assign appropriate security and access control to documents on upload into the EDRMS. However, it can be dangerous to apply such rules across the board without review, so some manual intervention will be required;
- Relationships and Dependencies – Many documents may have embedded links within them to other documents on shared folders. Once the documents are migrated, these links will no longer work unless the migration tool is clever enough to automatically re-link them to the associated (and also migrated) documents in the EDRMS.
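As an illustration of the semi-automation mentioned under Version Control and Metadata, the sketch below infers a few metadata values from a document's folder path and file name, then sorts the results so that versions of the same document would be uploaded in sequence. Everything about the conventions is an assumption made for the example: that the first two folder levels mean department and document type, that versions are marked with a `_v<number>` suffix, and the names `infer_metadata` and `upload_order` themselves. Real shared folders are rarely this tidy, which is exactly why manual review is still needed.

```python
import re
from pathlib import PurePosixPath

# Assumed naming convention: "<title>_v<number>" (also accepts " v" or "-v").
VERSION_PATTERN = re.compile(r"^(?P<title>.+?)[ _-]v(?P<version>\d+)$", re.IGNORECASE)


def infer_metadata(relative_path):
    """Guess metadata from a path relative to the shared folder root."""
    path = PurePosixPath(relative_path)
    folders = path.parts[:-1]
    match = VERSION_PATTERN.match(path.stem)
    return {
        "path": relative_path,
        # Assumed folder convention: <department>/<document type>/...
        "department": folders[0] if len(folders) > 0 else None,
        "document_type": folders[1] if len(folders) > 1 else None,
        "title": match.group("title") if match else path.stem,
        "version": int(match.group("version")) if match else None,
    }


def upload_order(relative_paths):
    """Group by title and order each group from lowest to highest version.

    Files with no recognisable version number are treated as version 0,
    which is an arbitrary choice for this sketch.
    """
    items = [infer_metadata(p) for p in relative_paths]
    return sorted(items, key=lambda m: (m["title"].lower(), m["version"] or 0))
```

For example, `infer_metadata("Finance/Contracts/Supplier Contract_v2.doc")` yields a department of `Finance`, a document type of `Contracts`, a title of `Supplier Contract` and a version of `2`. Note that the version is compared as a number, so `v10` sorts after `v2` rather than before it.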
Although good document migration tools can significantly ease the burden of migration, the overall resource effort required to analyse, design and implement the migration can still be quite significant. The work also requires a lot of “air-time” with the business owners of the documents, unfortunately often the type of people who have little free time to give, which can add further complications and delays to the migration process.
Faced with the challenges, effort and costs involved in performing a large-scale document migration, many organisations decide to scale back on the scope and number of documents to migrate.
The rationale is that it isn’t worth it when you consider that, for most organisations, only a small subset of documents, usually the most recent, are accessed on a regular basis. This concept is illustrated in the diagram below.
As a compromise to a full migration, the approach that I’ve often seen taken involves migrating frequently used / important documents up-front, leaving the remainder in the shared folders and progressively migrating them on demand (as and when required).
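A first cut at deciding what goes up-front can be sketched as a simple split on how recently each file was touched. This is only a rough proxy: it uses the last-modified time because last-access times are often unreliable on shared drives, it says nothing about importance, and the one-year threshold and the function name `split_by_recency` are assumptions for the example.

```python
import os
import time


def split_by_recency(root, max_age_days=365, now=None):
    """Split files under `root` into (migrate_now, migrate_on_demand).

    A file is a candidate for up-front migration if it was modified
    within the last `max_age_days` days.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    migrate_now, migrate_on_demand = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime >= cutoff:
                migrate_now.append(path)
            else:
                migrate_on_demand.append(path)
    return sorted(migrate_now), sorted(migrate_on_demand)
```

The resulting lists are a starting point for discussion with the business owners rather than a final migration scope.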
Although less intensive on manpower, this approach does come with some complications:
- There may be people and process issues around the co-existence of migrated documents in the EDRMS and non-migrated documents still in the shared folders. This needs to be mitigated by careful end-user training with clear operational guidance;
- What happens to the shared folders? Most organisations tend to make the (designated) shared folders read-only but indexed by a search engine, so that all new business documents are stored and managed in the EDRMS, with a longer-term view to decommissioning the shared folders.
Out of interest, I know of some organisations that decided not to migrate any documents, starting completely fresh in their new EDRMS.
I would welcome your feedback on what I’ve written in this post and on the approach that you, or your customers, have taken to document migration. Arlene Spence provides an interesting discussion on the handling of shared drives in her Scrubbing Content series of posts.