cloud.net

Tuesday, June 7, 2011

Migrating documents to SharePoint the easy way

OK, easy is relative, but if you don't mind writing a couple of XML configuration files, it's about as easy as it gets with the Folders2SP PowerShell script.

Folders2SP Codeplex Project

Before writing the script I evaluated a number of products, but none offered the flexibility I wanted, and at ~$2500 I reasoned I could do better; hence Folders2SP. Sure, it doesn't have a UI and it doesn't support permission migration... but it's free, it's 42 KB, it doesn't soil your production servers, it needs no custom web services (for item-level permissions) and you can extend it to your heart's content... so you can add a permissions method.

Description

Folder2SP.ps1 is a PowerShell script which helps you migrate documents from SharePoint and from folder structures to SPS2010, preserving versions and metadata and converting folder names to managed metadata terms.
Use case: You have documents in a MOSS library and in a network share folder structure which need to be migrated to an SPS2010 library. You want to use the words in the folder names as taxonomy terms in your SPS2010 library and run regular expressions against the file name or path.
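
To make the use case concrete, here is a minimal sketch of the idea in Python (not the script's own PowerShell, and not lifted from it). The helper names folder_terms and year_from_name, the sample path and the year pattern are all made up for illustration.

```python
import re
from pathlib import PureWindowsPath


def folder_terms(file_path, root):
    """Return the folder names between root and the file as candidate terms."""
    relative = PureWindowsPath(file_path).relative_to(PureWindowsPath(root))
    # Everything except the last part (the file name) is a folder.
    return list(relative.parts[:-1])


def year_from_name(file_path):
    """Pull a four-digit 20xx year out of the file name, if there is one."""
    name = PureWindowsPath(file_path).name
    match = re.search(r"(?<!\d)(20\d{2})(?!\d)", name)
    return match.group(1) if match else None


# Hypothetical file under the package's "files" directory.
path = r"C:\Folders2SP\files\Finance\Reports\Budget 2010.txt"
print(folder_terms(path, r"C:\Folders2SP\files"))
print(year_from_name(path))
```

Each folder name becomes a candidate term, and the regular expression supplies a value for something like the Year field.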

Case setup

In this example we're going to get some documents from the MOSS document library "Documents" and upload them, along with the sample documents included in this package, to a "Codeplex" document library in an SPS2010 site.
  1. Create a working directory either on the SPS2010 server or on a network share, e.g. C:\Folders2SP.
  2. Extract the contents of this project to your working directory. You should have Folders2SP.ps1, run.ps1, codeplex.xml and a "files" directory with some more directories and text files.
  3. Create a destination document library in your SPS2010 site, e.g. Codeplex.
  4. Add the required fields to the document library, e.g. Year (choice with 2008-2011), an Enterprise Metadata field (site settings > Enterprise Metadata and Keywords Settings), LibMeta1 (Managed Metadata single), LibMeta2 (Metadata multi) and LibMeta3 (Metadata multi), and connect them to an appropriate term set or term.
    • Modify the example codeplex.xml. Use your own MOSS library (an SPS2010 document library works too).
    • Lookup site: the site you want to get the documents from. The script uses web services. To interrogate the source site I recommend you try my FireWS script, 'cos it rocks.
      library: the library to get the files from.
      view: if you only want specific files, enter the view GUID here.
      rootfolder: the folder to look in; the default is the name of the library.
    • Target site: the only attribute you need to change for this demo.
    • The XML file is commented, so you can see the options.
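
As a rough illustration of how those attributes hang together, here is a Python sketch that reads a config of this general shape. The element layout, the Target library attribute and the Target URL are my assumptions for the example; the commented codeplex.xml in the package is the real reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: only the attribute names site, library, view and
# rootfolder come from the description above; the rest is assumed.
SAMPLE_CONFIG = """
<Config>
  <Lookup site="http://spsdev" library="Documents" view="" />
  <Target site="http://sps2010/sites/demo" library="Codeplex" />
</Config>
"""


def read_config(xml_text):
    """Return the Lookup and Target attributes as two dictionaries."""
    root = ET.fromstring(xml_text.strip())
    lookup = dict(root.find("Lookup").attrib)
    target = dict(root.find("Target").attrib)
    # rootfolder falls back to the library name when it is not set.
    if not lookup.get("rootfolder"):
        lookup["rootfolder"] = lookup["library"]
    return lookup, target


lookup, target = read_config(SAMPLE_CONFIG)
print(lookup["rootfolder"], target["site"])
```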

Process overview

  1. Run PowerShell as Administrator from your SPS2010 server (make sure your execution policy is set to RemoteSigned). run.ps1 includes Folder2SP.ps1 and calls the function Folder2SP, passing Codeplex.xml as the configuration file.
  2. Folder2SP loads the config file, assigns a bunch of variables and creates a log file.
  3. Goes to http://spsdev/documents and starts traversing the library... if you get access denied, it's your fault; try running the scripts from C:\ or use a local MOSS library.
  4. Downloads all the files it finds to the "Folders" path.
  5. Loops through all the files in the Folders path and (if replicatefolders=true) recreates the folder hierarchy in the SPS document library.
    • Connects to the Lookup site
    • Gets Metadata
    • Downloads versions to the working directory
    • Uploads the file to the Target document library
    • Loops through each field and checks whether it's configured in the XML config file. This is an inefficient way of doing it, but the script started out as just a way to copy all field data from the Lookup to the Target.
    • If it's a taxonomy field, it looks up the value; if it can't find the value, it creates it. There are a couple of options here, so read the comments in the example XML config file.
    • It then loops through the "Field" "Value"s and performs the logic associated with the Value type. I've only created three Value types: "FileDate" uses properties from the file, "RegEx" takes an expression and matches it against the full file name, and "Lookup" uses a value from the web service call.
    • If there's a user field, it first attempts to find the user; if it can't find them, it attempts to add them; if that fails, it uses the account running the script.
    • Sets the new item value.
  6. At the end you should have a migration report, and all the files from the Lookup site and your Folder path in the SPS2010 library with the metadata.
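
The Value type handling and the user fallback in step 5 can be sketched like this in Python. It's an illustration under my own assumptions rather than the script's PowerShell: resolve_value, resolve_user, the dictionary shapes and the date format are all made up for the example.

```python
import re
from datetime import datetime, timezone


def resolve_value(value_type, setting, file_info, lookup_row):
    """Work out a field value for one of the three Value types."""
    if value_type == "FileDate":
        # setting names a file property holding a Unix timestamp.
        stamp = file_info[setting]
        return datetime.fromtimestamp(stamp, tz=timezone.utc).strftime("%Y-%m-%d")
    if value_type == "RegEx":
        # setting is an expression matched against the full file name.
        match = re.search(setting, file_info["full_name"])
        return match.group(0) if match else None
    if value_type == "Lookup":
        # setting names a field returned by the web service call.
        return lookup_row.get(setting)
    raise ValueError(f"Unknown Value type: {value_type}")


def resolve_user(login, find_user, add_user, running_account):
    """Find the user, else try to add them, else use the running account."""
    user = find_user(login)
    if user is None:
        try:
            user = add_user(login)
        except Exception:
            user = None
    return user if user is not None else running_account


file_info = {
    "full_name": r"C:\Folders2SP\files\2010\report.txt",  # hypothetical file
    "modified": 1307448000,  # 7 June 2011, 12:00 UTC
}
lookup_row = {"Title": "Annual report"}  # hypothetical web service row

print(resolve_value("FileDate", "modified", file_info, lookup_row))
print(resolve_value("RegEx", r"20\d{2}", file_info, lookup_row))
print(resolve_value("Lookup", "Title", file_info, lookup_row))
```

A new field type handler would be one more branch in resolve_value, which is roughly what "Add more field type handlers" amounts to.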


To do

  1. View Fields ($wsFields) for the web service call should be moved to the XML config file.
  2. Permissions could also be migrated if an item-level permissions web service were available on the Lookup server.
  3. Improve error handling and make variable naming more consistent.
  4. Improve the XML config file
  5. Add more field type handlers