ECM Document Migration – Part 3: Analysis and Design – Knowing Thy Repositories

To a large degree, the success of a migration project depends on how well the source repository and its documents are understood, and on how well the target repository has been architected and implemented.  Naturally, these cannot be considered in isolation, so effectively bridging the two repositories and correctly mapping migrated documents between them is the final key to a winning migration analysis and design.

Understanding the full scope of document migration analysis and framework design parameters starts with completing a comprehensive inventory and assessment of the production source repository.

  • Source repository content taxonomy analysis and documentation provides a great opportunity to review and refine the organization’s existing classification and storage schemes.  The taxonomy document should include detailed information on document counts, image types, storage used, age, metadata, data value analysis, and the like.  These technical details on the population of documents are most useful if represented in a business context, preferably broken down by document class and other relevant line-of-business process categories.

  • Source repository document features determine what functionality has been implemented for the source documents and data.  Some features are fairly obvious and easy to spot, such as versioning and annotations.  Other features can be well-hidden and only used by a small, though important, set of documents; custom security, linking, and redaction spring to mind.

  • Records retention policy (sometimes referred to as a file plan) can play a major role in defining the scope of documents migrating to a new ECM production system.  It’s usually not a good idea to migrate documents that are already (or soon to be) eligible for offline archival or destruction.  Doing so would unnecessarily increase the administrative demand on records management resources, because the same documents would then have to be tracked across multiple systems.  Archiving and purging documents that have already met their retention criteria in the source repository, before the migration starts, can exclude a significant percentage of documents from the migration.  This leads in turn to a shorter, less costly migration and less burden on the records disposition process after the migration.

  • Compliance, risk management, and governance form part of the records management umbrella and are either layered on top of records retention or handled in addition to it.  Either way, documents and data may have specialized functionality, formatting, and processing in order to support legal, corporate, and governmental requirements.  Some examples include redaction, e-Discovery, and lifecycle management.

  • Source repository architectural blueprints should already exist and reflect all changes since inception.  If they do not, defining the new target architecture becomes harder, since baseline requirements for number of users, environments, testing, disaster recovery, and the like will be difficult to develop for the proposed target repository.

  • Documented information on stakeholders, business processes, and application profiles with dependencies on the source repository is critical to the design of the migration process.  Without comprehensive current state designs and profiles of existing processes and applications, the migration design will lack completeness and the document migration may miss critical requirements.  In particular, failure to identify and communicate with the stakeholders, especially the day-to-day maintainers and users of the system, will lead to unpleasant outcomes during the project.
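
As an illustration of the retention point above, here is a minimal sketch of splitting a source inventory into documents to migrate and documents to exclude.  Everything in it is an assumption made for the example: the SourceDocument fields, the RETENTION_YEARS file plan, and the grace_days window for "soon to be" eligible documents are hypothetical and do not come from any particular ECM product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SourceDocument:
    doc_id: str
    doc_class: str
    retention_start: date  # hypothetical field: when the retention clock started


# Hypothetical file plan: retention period in years per document class.
RETENTION_YEARS = {"invoice": 7, "correspondence": 3}


def add_years(start, years):
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def split_by_retention(docs, as_of, grace_days=0):
    """Split documents into (migrate, exclude) lists.

    A document is excluded when its retention period ends within
    `grace_days` of `as_of` or has already ended.  Classes missing from
    the file plan are always migrated, so nothing is dropped without an
    explicit rule.
    """
    migrate, exclude = [], []
    for doc in docs:
        years = RETENTION_YEARS.get(doc.doc_class)
        if years is None:
            migrate.append(doc)
            continue
        expires = add_years(doc.retention_start, years)
        if (expires - as_of).days <= grace_days:
            exclude.append(doc)
        else:
            migrate.append(doc)
    return migrate, exclude


docs = [
    SourceDocument("A", "invoice", date(2010, 1, 1)),
    SourceDocument("B", "invoice", date(2020, 1, 1)),
    SourceDocument("C", "unknown", date(2000, 1, 1)),
    SourceDocument("D", "correspondence", date(2021, 7, 1)),
]
migrate, exclude = split_by_retention(docs, as_of=date(2024, 6, 1), grace_days=30)
print([d.doc_id for d in migrate], [d.doc_id for d in exclude])
```

In practice the exclusion list would be reviewed with records management before anything is archived or purged; the sketch only shows how a file plan can be turned into a migration scope.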

Target repository architecture and taxonomy

With a full understanding of the source repository, one can turn to the target repository.  In the case of an existing target production repository, there should be the same full inventory discussed above for the source repository.  Alternatively, the architecture of a brand new target repository will be able to leverage the information gathered from the source repository assessment.  In either case, a fully designed and documented target repository architecture and taxonomy will help ensure readiness of the requisite environments, hardware, and software required for a successful migration.

While assessing the target repository, certain key considerations apply for the migration:

  • How will documents and data map from the source to the target repository?
  • Will document consolidation occur?
  • How do source document formats and data need to be transformed for use in the target system?
  • How will metadata clean-up processes be applied?
  • What document and data functionality differences exist between the repositories?
  • What new document and data related requirements exist for the target repository?
  • How will security map between the repositories?
  • How will records retention information and processes be integrated seamlessly?
  • Which internal source document properties, such as document creation date and document identifier, must be retained and mapped?
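
Several of these questions (mapping, clean-up, and preservation of source properties) usually end up in a field-level mapping table.  The sketch below shows one possible shape for such a table.  The field names (ACCT_NO, DOC_TYPE, CREATED, DOC_ID), the target property names, and the clean-up rules are all hypothetical, chosen only to illustrate the idea.

```python
from datetime import datetime

# Hypothetical mapping: source field -> (target property, clean-up function).
FIELD_MAP = {
    "ACCT_NO": ("account_number", lambda v: v.strip().lstrip("0")),
    "DOC_TYPE": ("document_class", lambda v: v.strip().title()),
    # Preserve the original creation date as an ISO 8601 string.
    "CREATED": (
        "source_created",
        lambda v: datetime.strptime(v.strip(), "%m/%d/%Y").date().isoformat(),
    ),
    # Preserve the source identifier so documents can be traced back.
    "DOC_ID": ("source_document_id", lambda v: v.strip()),
}


def map_metadata(source_row):
    """Map one source metadata row (dict of strings) to target properties.

    Returns (target_properties, problems).  Problems are collected rather
    than raised so that a whole batch can be assessed in one pass.
    """
    target, problems = {}, []
    for src_field, (tgt_field, clean) in FIELD_MAP.items():
        raw = source_row.get(src_field)
        if raw is None or raw.strip() == "":
            problems.append(f"missing {src_field}")
            continue
        try:
            target[tgt_field] = clean(raw)
        except ValueError as exc:
            problems.append(f"bad {src_field}: {exc}")
    unmapped = sorted(set(source_row) - set(FIELD_MAP))
    if unmapped:
        problems.append("unmapped source fields: " + ", ".join(unmapped))
    return target, problems


row = {
    "ACCT_NO": " 000123 ",
    "DOC_TYPE": "invoice",
    "CREATED": "03/15/2009",
    "DOC_ID": "F-991",
    "LEGACY_FLAG": "Y",
}
target, problems = map_metadata(row)
print(target)
print(problems)
```

Reporting unmapped source fields explicitly is a deliberate choice in this sketch: it forces a decision on whether each one is dropped, consolidated, or given a new home in the target taxonomy.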

IT Management Checklist for Planning an ECM Document Migration

In order to make the preceding sections more concrete, it helps to have a checklist!  Who doesn’t love a checklist?

The following IT-centric list details the numerous inputs to be considered for the migration project planning, analysis, and design.  Some of these topics may not be directly applicable to your current or planned production scenarios, but all are worthy of full consideration since plans and requirements often change.

  • Physical Document Handling and Electronic Capture Requirements
    • Transporting physical documents and data
    • Centralized capture
    • Distributed capture – software and processes for endpoint workstations, multi-function devices
    • Advanced capture – software and processes for classification and auto-indexing
  • Infrastructure Requirements (Source, Target repository)
    • Remote Access
    • Migration workstations / VMs
    • Environments involved (DEV, TEST, PROD, etc.)
    • Media availability/storage
    • Specialized media hardware (e.g. optical jukebox)
    • Federation
    • Storage (migration DB, images, data extracts, media)
    • Database
    • Network
    • Users and access
    • Hours of migration operation
    • Maintenance schedules
    • System upgrades
    • Administrative (backups, anti-virus, firewalls, caches, etc.)
  • Document and Metadata Requirements (Source, Target repository)
    • Source repository: data access, formatting, metadata fields of interest
    • External (non-repository) data sources: data access, formatting, fields of interest, additional processing (e.g. account lookup, paid-out information)
    • Mapping from source repository metadata to target repository metadata
    • Data cleanup
    • Document exclusion
    • Records management – retention; compliance; risk management; governance
    • Prioritization of documents
    • Image format transformations
    • Documents with large content (>100 MB)
    • Documents with mixed image types
    • Special data types (multi-value, binary, object)
    • Annotations
    • Versioning
    • Preservation of system data (document ids, user ids, dates)
    • Other features (custom objects, custom security, linking, folders, redaction, life cycle management, etc.)
  • Development and Testing (Source, Target repository)
    • Sample data / images
    • Software requirements (e.g. C#/Visual Studio, Java/Eclipse)
    • Source and target imaging system client software / SDKs
    • Migration verification scope and methodology
  • Migration Operations - Execution
    • Monitoring
    • Ramp up and throughput testing
    • Migration workload versus repository resources and bandwidth
    • Exception handling
    • Existence of search indexes
    • Data updates for source documents during migration
    • Handling new source documents added during migration
  • Project Management
    • Status reporting (periodic, exceptions, final reconciliation)
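
The "migration verification" and "final reconciliation" items in the checklist above can be made concrete with a simple manifest comparison.  The sketch below assumes that each side can produce a manifest of document identifier to content digest, and that source identifiers are preserved in the target.  Both are assumptions for the example, and a digest comparison only applies to content migrated unchanged; documents that go through an image format transformation would need a different check (page counts, for instance).

```python
import hashlib


def content_digest(content):
    """Return the SHA-256 hex digest of a document's content bytes."""
    return hashlib.sha256(content).hexdigest()


def reconcile(source_manifest, target_manifest):
    """Compare two manifests of {document id: content digest}.

    Returns a report listing documents missing from the target,
    documents present only in the target, documents whose content
    differs, and the count of documents that match exactly.
    """
    source_ids = set(source_manifest)
    target_ids = set(target_manifest)
    common = source_ids & target_ids
    mismatched = sorted(
        doc_id for doc_id in common
        if source_manifest[doc_id] != target_manifest[doc_id]
    )
    return {
        "missing_in_target": sorted(source_ids - target_ids),
        "unexpected_in_target": sorted(target_ids - source_ids),
        "content_mismatch": mismatched,
        "verified": len(common) - len(mismatched),
    }


# Hypothetical manifests built from in-memory content for illustration.
source = {
    "D1": content_digest(b"alpha"),
    "D2": content_digest(b"bravo"),
    "D3": content_digest(b"charlie"),
}
target = {
    "D1": content_digest(b"alpha"),
    "D2": content_digest(b"bravo (truncated)"),
    "D4": content_digest(b"delta"),
}
report = reconcile(source, target)
print(report)
```

A report of this kind can feed both the exception handling during execution and the final reconciliation delivered to the project stakeholders.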


The outcome of the migration analysis and design should include the following documents:

  • A list of contacts for the business users of the documents and data
  • An up-to-date inventory, assessment, and system architecture of the source repository
  • An up-to-date inventory, assessment, and system architecture of the target repository (as applicable)
  • A completed document taxonomy for source and target repositories plus a mapping from source to target repository
  • A complete records management file plan
  • Other required compliance, legal, governance, and risk management documentation
  • Migration framework requirements


Imagine Solutions ECM Migration Services can help make your next ECM Migration project very successful!


About the author

Sean Leino, Senior Systems Engineer

Sean Leino works for Imagine Solutions, Inc. as the Principal Conversion Specialist.  He has over 10 years of experience planning and executing document migrations for dozens of clients, ranging from small departmental repositories up to high volume, Fortune 100 enterprise systems with counts of a billion images and more.  Imagine Solutions, maker of Encapture: Distributed Capture System and an IBM Premier Business Partner, provides the full range of ECM services including migration, software, consulting, solutions, and platform services.