Posted by: mcgratha | May 15, 2011

Why most documents don’t emigrate

Most of the Electronic Document and Records Management System (EDRMS) projects that I’ve worked on, or have had visibility of, have started from the premise that they needed to migrate their old documents from legacy systems (usually shared network folders) onto the new EDRMS. This is not surprising considering the issues and challenges that most organisations have with managing their business documents in shared network folders (covered in my previous post).

However, despite all the good intentions to migrate, when it came to it, most of these organisations actually performed very little document migration. Perhaps it was when the realisation hit home about everything that needed to be done to get those documents migrated and the challenges, effort and cost involved, that they questioned if there really was a business case for doing it? In the end, most only migrated the minimum number of documents.

The analogy is of documents attempting to emigrate to “ECMland” from their shared folder homeland, with most of them being turned away at the border: “Sorry, you are not coming in. You’ve not gained enough ‘entry points’ and you are simply too much hassle for me to process” (see the key challenges discussed below).

Document Arrivals

There are, of course, many organisations that have performed very large document migrations, and continue to do so. However, there are also many organisations that shy away from document migration and this post reflects my observations as to why this might be so.

Key challenges to document migration

Marc Fresko has written a good paper on the what, why, how, who, when and where of document migration. The “what” is the starting point of any document migration exercise and usually takes the form of an “as is” inventory or information audit, whose purpose is to build a picture of the type, usage, owner and volume of documents across the shared folders. Many EDRMS vendors and third parties (such as www.foldersizes.com) provide tools to assist in this process by crawling and analysing the shared folders, reporting on what they find, identifying potential duplicates and recommending which documents could be cleansed (retired / deleted / moved elsewhere).
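As a rough illustration of what such a crawl involves, here is a minimal sketch in Python. It is not how any particular vendor tool works: the function names `file_digest` and `build_inventory` are made up for this example, and it assumes that "potential duplicate" simply means "byte-for-byte identical content", which real tools usually go well beyond.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def build_inventory(root):
    """Walk `root` and summarise files by extension and by content hash.

    Returns (summary, duplicates): `summary` maps each file extension to a
    count and a total size in bytes; `duplicates` is a list of groups of
    paths whose contents are identical (assumption: identical bytes only).
    """
    by_extension = defaultdict(lambda: {"count": 0, "bytes": 0})
    by_digest = defaultdict(list)
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                size = os.path.getsize(path)
                digest = file_digest(path)
            except OSError:
                continue  # unreadable file: skip it rather than abort the crawl
            ext = os.path.splitext(name)[1].lower() or "(none)"
            by_extension[ext]["count"] += 1
            by_extension[ext]["bytes"] += size
            by_digest[digest].append(path)
    duplicates = [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
    return dict(by_extension), duplicates
```

Calling `build_inventory` on the root of a shared folder gives a per-extension volume summary plus groups of identical files as candidates for cleansing. Ownership and usage, the other parts of the inventory, still have to come from file system permissions and from talking to the business.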

Whilst some manual review and validation work is certainly required to get that overall “what documents do we have” perspective, it is small in comparison to the manual work required to identify and decide which documents to actually migrate, and to prepare them for the migration.

For example, some of the common challenges include:

  • Version Control – Finding different versions of the same document scattered across shared folders, identifying which is the latest version and configuring the migration tool to upload the documents in the right order (so that all the versions are uploaded in sequence);
  • Metadata – Documents need to be classified and tagged with appropriate metadata on upload into the EDRMS. This can involve significant manual work, although it is usually possible to semi-automate aspects of the process. Techniques include inferring metadata from the folder structure where a document is stored, inferring metadata from file name conventions, extracting property fields from Office documents (and mapping them to metadata fields) and using classification tools to analyse the contents of the document and assign a “best-guess” value for metadata;
  • Security and Access Control – Rules can be set up to assign appropriate security and access control to documents on upload into the EDRMS. However, it can be dangerous to apply such rules unilaterally, so some manual intervention will be required;
  • Relationships and Dependencies – Many documents may have embedded links within them to other documents on shared folders. Once the documents are migrated, these links will no longer work unless the migration tool is clever enough to automatically re-link them to the associated (and also migrated) documents in the EDRMS.
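To make the version control and metadata points above a little more concrete, here is a minimal sketch in Python. Everything about the conventions is an assumption made up for the example: a folder layout of `<department>/<project>/...`, a file name convention of `<title>_v<number>.<ext>`, forward-slash paths relative to the shared-folder root, and the names `infer_metadata` and `upload_plan`. Real shared folders are rarely this tidy, which is exactly why the manual review is needed.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed convention: "<title>_v<number>.<ext>", e.g. "Budget_v2.xlsx".
VERSION_PATTERN = re.compile(r"^(?P<title>.+?)_v(?P<version>\d+)$", re.IGNORECASE)


def infer_metadata(relative_path):
    """Guess metadata from a path relative to the shared-folder root."""
    path = PurePosixPath(relative_path)
    folders = path.parts[:-1]
    match = VERSION_PATTERN.match(path.stem)
    return {
        "path": relative_path,
        # Assumed layout: first folder is the department, second the project.
        "department": folders[0] if len(folders) > 0 else None,
        "project": folders[1] if len(folders) > 1 else None,
        "title": match.group("title") if match else path.stem,
        # Assumption: a file with no version suffix is treated as version 0.
        "version": int(match.group("version")) if match else 0,
        "extension": path.suffix.lower(),
    }


def upload_plan(relative_paths):
    """Group paths that look like versions of one document, lowest version first.

    Grouping is by (lower-cased title, extension) regardless of folder, so the
    groups are only candidates and need a manual check before upload.
    """
    groups = defaultdict(list)
    for relative_path in relative_paths:
        meta = infer_metadata(relative_path)
        groups[(meta["title"].lower(), meta["extension"])].append(meta)
    return {
        key: sorted(versions, key=lambda meta: meta["version"])
        for key, versions in groups.items()
    }
```

Comparing version numbers as integers rather than as text matters here, since a plain alphabetical sort would put `_v10` before `_v2`. Matching on title alone will also pair up unrelated documents that happen to share a name, so the output is best treated as a worklist for the document owners rather than as a final upload order.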

Although good document migration tools can significantly ease the burden of migration, the overall resource effort required to analyse, design and implement the migration can still be quite significant. The work also requires a lot of “air-time” with the business owners of the documents, who unfortunately are often the people with little free time to give, which can add further complications and delays to the migration process.

Reality check

Faced with the challenges, effort and costs involved in performing a large-scale document migration, many organisations decide to scale back on the scope and number of documents to migrate.

The rationale is that it isn’t worth it when you consider that, for most organisations, only a small subset of documents, usually the most recent, are accessed on a regular basis. This concept is illustrated in the diagram below.

Long tail of document access

An approach often taken

As a compromise to a full migration, the approach that I’ve often seen taken involves migrating frequently used / important documents up-front, leaving the remainder in the shared folders and progressively migrating them on demand (as and when required).
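A first cut of the “migrate up-front” list could be produced along the following lines. This is only a sketch in Python under stated assumptions: the shared folders do not record how often a document is used, so the last-modified time stands in as a crude proxy, and the one-year cut-off and the function name `split_by_recency` are invented for the example. Documents that are important but rarely changed would have to be added by hand.

```python
import os
import time


def split_by_recency(root, max_age_days=365, now=None):
    """Split files under `root` into (migrate_now, leave_for_later) lists.

    A file goes into `migrate_now` if it was modified within the last
    `max_age_days` days (assumption: recent modification implies active use).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds in a day
    migrate_now, leave_for_later = [], []
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                modified = os.path.getmtime(path)
            except OSError:
                continue  # file vanished or is unreadable: skip it
            if modified >= cutoff:
                migrate_now.append(path)
            else:
                leave_for_later.append(path)
    return sorted(migrate_now), sorted(leave_for_later)
```

The second list is what stays behind in the shared folders to be migrated on demand, so its size gives a quick feel for how long the tail is.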

Although less intensive on manpower, this approach does come with some complications:

  • There may be people and process issues around the co-existence of migrated documents in the EDRMS and non-migrated documents still in the shared folders. This needs to be mitigated by careful end-user training with clear operational guidance;
  • What happens to the shared folders? Most organisations tend to make the (designated) shared folders read-only but indexed by a search engine, with the goal that all new business documents are stored and managed in the EDRMS and with a longer term view to decommission the shared folders.

Out of interest, I know of some organisations that decided not to migrate any documents, starting completely fresh in their new EDRMS.

I would welcome your feedback on what I’ve written in this post and on the approach that you, or your customers, have taken to document migration. Arlene Spence provides an interesting discussion on the handling of shared drives in her Scrubbing Content series of posts.

Responses

  1. Good post! Two points I’d make:
    – the ‘intelligence’ of auto-indexing engines has come on a long way over the last few years & they have the advantage that they never get tired! If you have a lot of content to auto-index I’d choose automation over people – on average the results will be better.
    – All your comments also apply to migrating paper to ECM, but more so, as the costs are even higher. The reality check is essential to get the right cut-off, as the more paper you can archive (or destroy!) the bigger the savings.

  2. Thanks Mark. Auto-indexing and classification tools have certainly come along leaps and bounds, especially useful when linked to a taxonomy. I also agree that it mostly applies to paper files too – I initially had paper included in the first draft of this post but then dropped it so as to keep the focus on shared folders as a follow-up to my previous post, as I would have needed to go off on a few tangents to cover paper.

    Best regards,

    Adrian

  3. It’s a shame your blog seems to have died. This is a great resource.

    • Hi Melbs,

      I know, I’ve not written anything on this blog for nearly a year which is terrible. I’ve simply been struggling for time due to workload. However, I do have many ideas for new posts and will be back again soon. Thanks for the “nudge”.

      Regards,

      Adrian

      • Nice article. I’m writing my dissertation, which has something to do with this subject. I share your opinion but I would like to find some academic articles to support it. Can you give me some advice on which articles I could use?

        Best Regards,

        Lars

