Links to Unstructured Data blogs worth taking a look at
Posted by Steve Akers on Wed, Mar 25, 2009 @ 11:03 AM
I recently came across some blog posts that you may find interesting and/or helpful. They each reference real-world problems in the areas of compliance or unstructured data management. They don't focus on the same topic, but they do refer to the same problem: why managing unstructured data assets is difficult, important, or both. They also all relate, in one way or another, to the concepts of data volume complexity, location complexity, and format complexity mentioned in my last blog post.
The first is an excellent blog entry that explains some issues around SharePoint servers. These powerful and useful collaboration servers are causing a number of problems within enterprise environments. They present a great way of aggregating content, but as they grow in popularity, the amount of content spread across a large number of servers creates litigation discovery issues and compliance risk. As employees move from project to project, the amount of data they place in SharePoint environments grows. This ever-expanding content presents a risk to litigation discovery professionals, who need to find all copies of existing content, and to compliance officers, who need to ensure that all of these large repositories comply with content directives (content that is confidential is secured; certain topics are not present in community archives; and proper retention policies are in place for content that must be retained).
Both the volume and location complexity axes of the problem exist with SharePoint. There is a lot of data in multiple locations, and it is tedious, if not impractical or impossible, to search. The only way to know what content exists in the SharePoint "universe" is to use an intelligent classification device that can organize and present it to reviewers in a useful way. (The Digital Reef "single pane of glass" approach provides a single view of all data stored within a SharePoint environment.)
This Byte and Switch blog, detailing Mimosa functionality, is interesting from my perspective because Mimosa and Digital Reef used together could provide the capability to implement "content specific archives" so that content of certain types could be archived intelligently.
This next blog entry from IT Toolbox is focused on information management. This post has been around for a while, but it is still valid because the classification and categorization problems standing in the way of ILM (information lifecycle management) are only now being solved. We (Digital Reef) feel that the missing link in the ILM strategies of most organizations is that they don't know what data/content exists in their environments. Meaningful ILM requires intelligence and scale. We are working with some customers to solve these problems now. More on that topic in future blog posts.
The last blog entry, Solving the Enterprise Search Dilemma, is from Tony Asaro. It points out how the old technology of keyword search cannot be used to implement the vision of efficient SharePoint management or anything as aggressive as ILM.