ECM & Image Capture in SharePoint 2010 Q & A

 

This webinar included several demonstrations and discussed:

  • The new features and capabilities in SharePoint 2010 that make it a true ECM platform
  • 2010's Managed Metadata Service and other improvements for content findability
  • SharePoint's enhanced Records Management capabilities that transform how your organization manages content
  • Going from paper to digital with capture tools that integrate with SharePoint
  • How to manage and control document capture and workflows in SharePoint
  • Best practices and how to avoid common pitfalls

Whether you are new to SharePoint or upgrading from a previous version, gain a thorough understanding of how to utilize the ECM features in SharePoint 2010, in conjunction with document capture technologies, to create a comprehensive ECM solution with SharePoint 2010 as the foundation.

Webinar presented by:
SharePoint Certified Master Candidate and MVP '04 Jim Duncan
and
Marketing and Communications Manager for KnowledgeLake, Kevin Ells

  • Question: Will a recorded version of the webinar be made available?

Answer: Yes, the recording is available here, and the slides are available here.
Note that some people experience issues viewing the demos from the presentation; we are working with LiveMeeting support on a resolution.

  • Question: Are the Records Management features available in the Standard Edition of SharePoint Server 2010?

Answer: Yes, all of the Records Management features presented in the webinar are available with SharePoint Server 2010 Standard licensing.

  • Question: Is this product that you're referring to, KnowledgeLake, something SMBs can afford? Please give minimums to make this financially feasible.

Answer: The KnowledgeLake solution can be implemented at various levels of complexity. Hundreds of SMBs have purchased the solution.

  • Question: What tools are available to protect Office and non-Office documents?

Answer: Windows Information Rights Management is available for Microsoft Office document formats and LiveCycle Rights Management is available for Adobe PDF documents.

  • Question: Will KnowledgeLake Capture work with any scanner brand?

Answer: Yes, any scanner that supports the industry-standard TWAIN or ISIS drivers will work. You can also use MFPs or network-enabled copiers.

  • Question: Do you have experience in integration between SharePoint and Documentum?

Answer: Both ShareSquared and KnowledgeLake have built solutions that work with Documentum and have also performed migrations from Documentum into SharePoint.

  • Question: Do you have the ability to pull an image from an application, similar to the OnBase Application Enabler?

Answer: The KnowledgeLake Connect product provides the ability to scan documents, capture electronic files and search from Line of Business applications.

  • Question: Do you have any experience tying captured content, such as invoices, into QuickBooks ledgers (e.g. to tie invoices to check register entries)?

Answer: As indicated above, the KnowledgeLake Connect product provides the ability to scan documents, capture electronic files and search from Line of Business (LOB) applications. The content is stored in SharePoint and can be referenced by the LOB application.

  • Question: How do you manage access control for documents?

Answer: For documents stored in SharePoint, the built-in security model is used for access control. Permissions can be applied at the Site Collection, Site, Library, Folder or individual Document level. Best practice is to keep permissions as broad as possible, because permissions set at the individual document level can quickly become too much to manage effectively.
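
The inheritance behavior behind this advice can be illustrated with a small model. This is only a sketch in Python, not SharePoint's API: the `Scope` class, its fields, and the sample names are hypothetical, and they simply mimic how an object uses its parent's permissions until unique permissions are set on it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Scope:
    """Hypothetical stand-in for a securable object (site collection, site, library, folder, document)."""
    name: str
    parent: Optional["Scope"] = None
    # None means "inherit from the parent"; a dict means unique permissions were set here.
    unique_permissions: Optional[Dict[str, str]] = None

    def effective_permissions(self) -> Dict[str, str]:
        """Walk up the hierarchy until a scope with its own permissions is found."""
        scope = self
        while scope is not None:
            if scope.unique_permissions is not None:
                return scope.unique_permissions
            scope = scope.parent
        return {}


# Permissions are defined once, at the broadest boundary, and inherited below it.
site_collection = Scope("Customer A", unique_permissions={"Customer A Staff": "Contribute"})
site = Scope("Invoices site", parent=site_collection)
library = Scope("Invoices", parent=site)
folder = Scope("2010", parent=library)
document = Scope("invoice-001.tif", parent=folder)

# A document with unique permissions is one more exception someone has to track.
restricted = Scope("payroll.tif", parent=folder, unique_permissions={"HR": "Read"})
```

Here `document.effective_permissions()` resolves to the site collection's setting, while `restricted` carries its own entry. Every such exception is a separate thing to audit, which is why broad assignments are easier to manage.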

  • Question: We have multiple remote branches that would need to add 100 documents / month. This would be done by multiple users at each branch. How do you cost effectively license your product to do this?

Answer: There are a number of options for remote capture, including dedicated scanning stations or the use of MFP devices. Depending on the requirements, different solutions can be recommended to provide the most cost-effective implementation.

  • Question: With barcodes, can you save specific barcodes in a template and add them to subsequent pages so that documents are automatically saved with the metadata attached?

Answer: Yes, please contact us to discuss your requirements.

  • Question: Besides capture, what is 2010 lacking to be a complete DMS/ECM solution without the need for third party software?

Answer: SharePoint search can be improved with KnowledgeLake Imaging, which provides an index-based search web part that can be added to any site or page. Third-party workflow products can be added for workflows more complex than the out-of-the-box SharePoint offering supports. If you need more robust records management, there are solutions for that as well.

  • Question: Have you done performance testing on bulk scanning if the SharePoint server is located on the west and high volume scanning is located on the east? What are the results?

Answer: KnowledgeLake Branch Capture server is designed for management of remote capture. It can be used for a number of tasks, including throttling of bandwidth and scheduled transfers.

  • Question: How does SharePoint/KnowledgeLake connect to Exchange/Outlook?

Answer: SharePoint Lists and Libraries can be ‘connected’ to Outlook so that they appear in Outlook as a folder. Outlook will synchronize content with SharePoint on a default (but customizable) schedule. SharePoint Workspace 2010 can also be used for Offline scenarios. KnowledgeLake Connect provides the ability to capture both emails and attachments into SharePoint with metadata and content types.

  • Question: What tools/guidance are available to assist in migration from other ECM products (Open Text, Kwiktag, eDrawer, Documentum)?

Answer: In addition to help from ShareSquared and KnowledgeLake, there are third-party tools available to assist with migration from other legacy ECM products.

  • Question: Does annotation work on Office docs?

Answer: No.

  • Question: Does RMS work with the Standard edition of Microsoft Office?

Answer: The ability to create content or e-mail messages that have restricted permission by using IRM is available in Microsoft Office Professional Plus 2010, and in the stand-alone versions of Microsoft Excel 2010, Microsoft Outlook 2010, Microsoft PowerPoint 2010, Microsoft InfoPath 2010, and Microsoft Word 2010. IRM content that is created in Office 2010 can be viewed in Microsoft Office 2003, the 2007 Microsoft Office system, or Office 2010. (Quoted from “Plan for Information Rights Management in Office 2010” – emphasis added)

  • Question: Is it possible to scan from a desktop scanner directly to Office365?

Answer: KnowledgeLake Capture will have a solution for Office365 in the 4th Quarter of 2010

  • Question: Can you elaborate more on auditing, security, access control (e.g. one customer cannot see another customer’s documents)?

Answer: When you send a document to a content organizer, the document’s version history is erased. To keep a history of who changed a document and when each change was made, you will need an audit log. We recommend that you enable the auditing policy in all site collections that contain active document libraries. For more information about the auditing policy, see Governance overview (SharePoint Server 2010). (Quoted from “Planning for eDiscovery”)
Access control will depend significantly on your Information Architecture and how you structure the locations where content is stored. Permissions can be set at many different locations in SharePoint (Site Collection, Site, Library, Folder, Document) and will be inherited by default. Architecting the locations where content is stored so that they align with your planned security boundaries (e.g. one Site Collection per customer) will allow you to follow the best practice of defining permissions at the highest boundary possible while still meeting your security and access control requirements.

  • Question: Does the KnowledgeLake Imaging Viewer require a plug-in?

Answer: KnowledgeLake Imaging Viewer uses Silverlight technology. No plug-in is required if the client has Silverlight enabled in their browser.

  • Question: Do you need to use a traditional bar code or can you use a data matrix or QR code?

Answer: You can use a variety of barcodes including standard and QR codes.

  • Question: Is it possible to send an Outlook email directly to SharePoint?

Answer: Yes, SharePoint supports Incoming Email whereby you define an email address for each List or Library that you want to receive messages. Sending content to SharePoint then becomes as easy as addressing messages to the email address of the List or Library.

  • Question: Our current implementation of managing documents with SharePoint versions on both metadata and content changes. We use minor and major versioning, but we end up publishing so that metadata only changes get applied and can be used for processing. We want versioning to be content changes only. Is this possible?

Answer: Not without a custom or third-party solution. For example, Event Receivers could be created and attached to the Libraries to perform the metadata edits via custom code that doesn’t create a new version of the document.

  • Question: Are content type hubs across farms? or is it for each farm?

Answer: You can have one content type hub per farm, but only site collections within a farm can subscribe to the content types published by that farm’s content type hub. Content types cannot be published across a farm boundary.

  • Question: Does the Document ID field exist in every list? or only in document libraries?

Answer: The Document ID service is only implemented for libraries. There is no comparable service available for Lists. List items are assigned a unique ID, but it is only unique within a list and does not persist if the item is moved.

  • Question: Who has access to the drop off library?

Answer: Security for the drop off library is handled just like any other library and can be modified to limit access. The one exception is that the drop off library in each site is used by the officialfile.asmx web service. This web service can be configured in Central Administration as a Custom SendTo destination for a site. When used in this way a user doesn’t need permission to drop a file in the drop off library.

  • Question: Does the Document Set get a unique ID similar to physical documents?

Answer: Yes, Document Sets are a content type just like a regular document and receive a Document ID when created and stored in a library. Each document inside a Document Set will also get a unique ID, so you can refer to them collectively using the Document Set ID or individually using their own Document ID.

  • Question: How accurate is OCR?

Answer: It really depends on the type of document that is scanned. As a reference point, consider letter-type documents scanned as 300 DPI TIFF Group 4 images: for this document type, accuracy on the top four OCR engines will be 99.9%, which is as accurate as triple-pass data entry. To determine OCR accuracy for your scenario, you need a sample set of your documents and representative test runs on the various products.
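
One way to score such a test run is character-level accuracy: compare each OCR result against a hand-verified transcription and count the edits needed to make them match. The sketch below is an illustration under stated assumptions, not a vendor tool. The function names and the sample strings are hypothetical, and real evaluations often also measure field-level or word-level accuracy.

```python
def edit_distance(expected: str, actual: str) -> int:
    """Levenshtein distance: minimum single-character edits that turn expected into actual."""
    previous = list(range(len(actual) + 1))
    for i, e_char in enumerate(expected, start=1):
        current = [i]
        for j, a_char in enumerate(actual, start=1):
            cost = 0 if e_char == a_char else 1
            current.append(min(previous[j] + 1,          # character missing from OCR output
                               current[j - 1] + 1,       # extra character in OCR output
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


def character_accuracy(samples):
    """samples: list of (ground_truth, ocr_output) pairs; returns accuracy between 0 and 1."""
    total_chars = sum(len(truth) for truth, _ in samples)
    if total_chars == 0:
        raise ValueError("sample set contains no ground-truth text")
    total_errors = sum(edit_distance(truth, ocr) for truth, ocr in samples)
    return max(0.0, 1.0 - total_errors / total_chars)


# Hypothetical sample set: the first line has one misread character (zero read as the letter O).
samples = [
    ("Invoice 10452", "Invoice 1O452"),
    ("Total due: 99.50", "Total due: 99.50"),
]
accuracy = character_accuracy(samples)
```

With these two short lines, one error in 29 characters already pulls accuracy down to roughly 96.6%, which shows why a sample set needs to be large and representative before its score means much.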

  • Question: Any favorites for a self-paced training program for someone new to SP?

Answer: Check out the resources section of the SharePoint 2010 web site. The SharePoint team also maintains an active blog, which is a good place to start.

  • Question: Are the Imaging and Data Capture features in the demos part of SP 2010, or is a 3rd party product required for this?

Answer: What we demonstrated were features in SharePoint 2010 that make document imaging and full ECM easier to implement. The actual input of images or converted documents requires a document scanner (most likely) and a third-party imaging tool or data capture application. Because of the number of variables associated with document capture, this is exactly what you want: it is hard to come up with a one-size-fits-all document capture solution. SharePoint 2010 makes integrating third-party tools easier and more efficient than ever before.

  • Question: Can this or other capture products automatically feed SharePoint from a fax received with a fax modem?

Answer: At a high level the answer is yes. In the end it depends on your fax server software. Most of these applications either already have a hook into SharePoint OR they have all the functionality required to create one. Because these systems output images to the file system it’s possible, even without a direct hook, to integrate and store the images in SharePoint 2010 using the Content Organizer feature.

  • Question: How do you section a large document based on some specific information?

Answer: Document separation can be achieved in several ways, with varying degrees of complexity and accuracy. Most commonly, companies that batch scan multi-page documents separate them with blank pages or barcode pages. In the demo we used barcode pages to separate each waybill. In the most complex applications, separation can be based on document classification (what type of document it is) or on keywords pulled off the document using OCR. Document separation is used more often than not in scanning applications of significant volume.
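
The separator-page approach boils down to a simple grouping step. The following is a minimal sketch, assuming each scanned page has already been reduced to a string and that a separator sheet is recognized by a fixed value. `SEPARATOR` and `split_batch` are hypothetical names, not part of any capture product.

```python
from typing import List

SEPARATOR = "BARCODE-SEPARATOR"  # hypothetical value read from a separator sheet


def split_batch(pages: List[str], separator: str = SEPARATOR) -> List[List[str]]:
    """Group scanned pages into documents; separator pages start a new document and are discarded."""
    documents: List[List[str]] = []
    current: List[str] = []
    for page in pages:
        if page == separator:
            if current:  # ignore leading or repeated separator sheets
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents


batch = [
    SEPARATOR, "waybill 1 page 1", "waybill 1 page 2",
    SEPARATOR, "waybill 2 page 1",
    SEPARATOR, SEPARATOR, "waybill 3 page 1",
]
documents = split_batch(batch)
```

Blank-page separation follows the same pattern with a different test for what counts as a separator. Classification-based separation replaces the equality check with a decision about whether a page starts a new document.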

  • Question: Can someone explain how SP 2010 can scale to very large document volumes on the order of millions differently from 2007?

Answer: The problem with storing millions of documents in SharePoint has never been storage; it’s always been retrieval. In SharePoint 2007 if you tried to retrieve a list of too many documents at one time, usually around 2,000, your system would bog down and time out. SharePoint 2010 has added several improvements that alleviate this problem. First, large list throttling makes it possible for administrators to cap the number of documents returned by any request to a content database. Where a poorly formed request in 2007 would time out, the same request in 2010 will simply return the first X documents requested and inform the user that they have exceeded the threshold. The user can then refine their request to limit it further. Other features like the new Content Organizer have been added to facilitate easy segmentation of large libraries into organized folders making larger libraries manageable.

  • Question: Can SharePoint 2010 be used as an email repository for archiving?

Answer: Yes, the drop off library demoed in the webinar can be email enabled like any other document library. Emails sent to that library will be filed based on metadata associated with the incoming email by the Content Organizer.

  • Question: Do you need Office 2010 in order to use all of the SharePoint 2010 features? or will Office 2007 work?

Answer: Office 2010 is more tightly integrated with many of these features than previous versions of Office. However, Office 2010 is not required for features like the drop off library or Document IDs to be used. Even a manually uploaded file is given a Document ID. That said, some of these features may not be directly accessible in previous versions of Office.

  • Question: Does Microsoft plan on adding any kind of Outlook-to-SharePoint connectivity?

Answer: Synchronization of content between Outlook and SharePoint is already available in SharePoint 2007. If you are referring to the use of SharePoint to archive emails from Outlook, this is also available today by configuring the Journaling features in Outlook and pointing them at a SharePoint drop off library. This does require some manual configuration, but it is very possible using the current technology.

  • Question: Does SharePoint 2010 have the ability for Email Management & Archiving?

Answer: Yes, please see previous answers about email enabling the drop off library of a site.

  • Question: Can the barcode be on a separator page, and does the barcode identify the vendor or any other metadata?

Answer: It’s completely up to you. In the demo, the barcode provided no information but rather was used simply for document separation. All of the data was extracted via OCR. If you have barcodes that contain valuable information then the barcode can be used both for separation and data extraction. The great thing about barcodes is they are very accurate and can be read at any angle.

  • Question: Can permissions be assigned to content types?

Answer: Permissions are assigned at the Library or Item level. There is no built in provision for applying permissions to a content type. However, relatively simple event receivers can be built that assign permissions based on the content type of a document.

  • Question: Can you assign metadata to sites?

Answer: Sites have metadata associated with them just like other objects in SharePoint. If, however, you are asking if metadata can be assigned to ALL items in a site the way that metadata assigned to a Document Set applies to all documents in the set, then no, you can’t assign metadata to a site in that manner.

  • Question: For high volume, someone said the system can be configured to store scanned documents to file server; where can I find more info on this?

Answer: All the documents stored in SharePoint libraries are stored in content databases. A new SharePoint 2010 feature called Remote Blob Storage (RBS) makes it possible to store the binary large objects (Blobs) that are the files in a separate database system specifically designed for large blob storage. But you cannot store SharePoint document files on a network file share. That design was tried in SharePoint 2001 and caused innumerable problems because users could access, move, and delete content without going through SharePoint.

  • Question: I see you used a Kodak product for the image capture process; is this the vendor you recommend? If yes, why is that? I've come across another 3rd party vendor called KnowledgeLake, what are your thoughts of their product?

Answer: The simple answer is because Kodak was a sponsor of the webinar. There are many capture products in the market. KnowledgeLake is one that we also specialize in. These products all have their strengths and weaknesses as related to business process, work flow, price and capture quality. It really depends on the types of documents you are processing and the depth of capture/OCR you want to achieve.

  • Question: Is SharePoint 2010’s rich media support like the "external blob interface" usable with SharePoint 2007? If we are not going to 2010 soon, which vendor's external blob interface will be compatible with 2010 rich media support?

Answer: The new rich media support does not specifically involve the new remote blob storage (RBS), which is similar to the external blob storage (EBS) feature in SharePoint 2007. Rich media support is focused on providing web parts and library enhancements that make viewing and managing rich media available in SharePoint. RBS and EBS are specifications that provide similar functionality but are different implementations. EBS will continue to be supported in SharePoint 2010. RBS support, which does not require third-party vendor involvement, has been added.

  • Question: Is this functionality you are showing part of SharePoint Foundation, or is it part of the full MOSS?

Answer: Most of the ECM features we demonstrated like the Document ID Service, Managed Metadata Service and the Content Organizer are not included in SharePoint Foundation. They require SharePoint Server 2010 standard edition or above.

  • Question: What happens to current documents in SP 2007 with regards to the new Document IDs in 2010? i.e. When you upgrade to 2010, does SharePoint go back and assign document IDs to existing documents?

Answer: Document libraries and content types in SharePoint 2010 do not include Document IDs by default. When you enable the Document ID feature a periodic timer job is enabled that applies Document IDs to existing documents that do not already have them. This would be true for documents that were brought into SharePoint 2010 through the upgrade process as well.

  • Question: With the additional space required of the new image capture features, what is the suggestion now for how large a content database should be? Should these databases be kept less than 100GB? For better performance, should multiple site collections with separate databases be created or is this no longer an issue in SharePoint 2010?

Answer: It has always been a best practice to manage the size of your SharePoint content databases. The 100GB size is an often-quoted preference, but the real determinant is the capability of your backup/restore system. Content databases can easily grow well beyond 100GB with no real reduction in performance. However, if you can’t restore a failed database in an emergency, then it's too big. 100GB is a good rule of thumb, but the real size limit is very specific to your environment, your usage pattern, and your service level requirements.
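
The restore test above can be turned into rough arithmetic: divide the database size by the throughput your restore process actually sustains and compare the result with the recovery time your service level allows. The sketch below uses assumed numbers. The 50 MB/s throughput and the function names are illustrative only, so measure your own environment before relying on any figure.

```python
def estimated_restore_hours(db_size_gb: float, restore_mb_per_second: float) -> float:
    """Rough restore time, ignoring setup, log replay and verification overhead."""
    seconds = (db_size_gb * 1024) / restore_mb_per_second
    return seconds / 3600


def fits_recovery_window(db_size_gb: float, restore_mb_per_second: float, allowed_hours: float) -> bool:
    """True when the estimated restore finishes within the allowed recovery time."""
    return estimated_restore_hours(db_size_gb, restore_mb_per_second) <= allowed_hours


# Assumed throughput of 50 MB/s, chosen only for illustration.
hours_100gb = estimated_restore_hours(100, 50)   # a little under 0.6 hours
hours_500gb = estimated_restore_hours(500, 50)   # a little under 2.9 hours
```

Under these assumptions a 500GB database would miss a two-hour recovery target while a 100GB database would meet it comfortably. That is the sense in which the size limit depends on your environment rather than on SharePoint.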

  • Question: FYI: Regarding the download files from the webinar, including ampersands in the filename requires us to rename the file to save it into SharePoint file shares (and the error message given gives no clue as to the fix)

Answer: Sorry about that! You are correct that because SharePoint stores metadata like the file name in SQL there are more restrictions on the use of special characters when uploading a file to SharePoint. Live Meeting doesn’t have the same restrictions so we didn’t catch that when we uploaded the file.
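
If you need to clean file names before uploading, a small helper can do it. This is a sketch, assuming the commonly documented set of characters that SharePoint 2010 rejects in file names; verify the list against your version. The function name and the underscore replacement are arbitrary choices for illustration.

```python
import re

# Assumed list of characters not allowed in SharePoint 2010 file names; verify for your version.
INVALID_CHARS = set('~"#%&*:<>?/\\{|}')


def sharepoint_safe_name(filename: str, replacement: str = "_") -> str:
    """Replace disallowed characters and tidy up periods so the name can be uploaded."""
    cleaned = "".join(replacement if ch in INVALID_CHARS else ch for ch in filename)
    # Consecutive periods and leading or trailing periods are also assumed to be rejected.
    cleaned = re.sub(r"\.{2,}", ".", cleaned)
    return cleaned.strip(". ")


safe = sharepoint_safe_name("ECM & Image Capture Q&A.pptx")
```

For the file name in question this produces `ECM _ Image Capture Q_A.pptx`, which keeps the name recognizable while removing the ampersands.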

  • Question: How do you redline the images?

Answer: I believe you are referring to annotation capabilities, which is essentially what you are doing when you redline an image. This can happen either at the point of capture or afterwards inside SharePoint. The method you choose depends on who you want to perform the annotation: is it a tracking and administration task at the point of scan, or something the person consuming the document is responsible for doing? Most capture applications can annotate images prior to export or do what is called stamping. There are also tools built into SharePoint as web parts that allow you to view and annotate images after they are stored.

  • Question: In SharePoint 2010, are documents still stored in BLOBS or can we now use file storage?

Answer: All content in SharePoint 2010 is still stored in SQL content databases. You cannot store them on a network file share. That design was tried in SharePoint 2001 and caused innumerable problems because users could access, move, and delete content without going through SharePoint.

  • Question: What is the difference between document source and document volume?

Answer: Document source is the mode in which you receive a document, e.g. email attachment, fax, physical mail etc. Document volume is how many documents you image on a monthly basis (typically). Both are critical determinants in choosing an appropriate capture product.

  • Question: Can the drop off library be processed manually? Can a person pick the destination if all the rules fail?

Answer: If the rules in the Content Organizer cannot determine where to put a document, it is left in the drop off library. For example, if insufficient metadata is available to make a storage decision, the document remains in the drop off library. In that case a user may edit the properties of the document in the drop off library, and once they are saved the Content Organizer will store it based on the added information. You can also choose not to add information and move the document to its destination manually.
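
The fallback behavior can be pictured as rule matching with a default. The sketch below is a simplified model, not the Content Organizer's actual implementation: the rule format, the first-match ordering, and the library names are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Metadata = Dict[str, str]
Rule = Tuple[Callable[[Metadata], bool], str]  # (condition, destination library)

DROP_OFF = "Drop Off Library"

# Hypothetical rules: an invoice can only be routed once a vendor has been supplied.
rules: List[Rule] = [
    (lambda m: m.get("ContentType") == "Invoice" and bool(m.get("Vendor")), "Invoices"),
    (lambda m: m.get("ContentType") == "Waybill", "Waybills"),
]


def route(metadata: Metadata, rules: List[Rule]) -> str:
    """Return the destination of the first matching rule, or leave the item in the drop off library."""
    for condition, destination in rules:
        if condition(metadata):
            return destination
    return DROP_OFF


incomplete = {"ContentType": "Invoice"}                  # vendor missing, so no rule matches
completed = {"ContentType": "Invoice", "Vendor": "Contoso"}
```

Routing `incomplete` leaves the item in the drop off library. Once a user fills in the missing property, routing the `completed` metadata sends it to the `Invoices` library, mirroring the edit-and-save step described in the answer.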

  • Question: Regarding the Managed Metadata Service; the demo showed "single value" pick lists. Does 2010 support picking multiple values from a given list for a specific doc? e.g. "this document relates to Cincinnati AND Columbus”.

Answer: Managed metadata keywords can be entered as a semicolon delimited list in a managed metadata field, so you can enter multiple values.

  • Question: Regarding the Managed Metadata Service; What happens if the data for a particular record in a lookup list changes? Do the documents linked to that record also reflect the changed data, or do the documents reflect the old (unchanged) data? e.g. a "department" lookup has a value called "Personnel", but then 2 years later that needs to be changed to "Human Resources" - Do all the documents categorized to "Personnel" need to be updated somehow?

Answer: I suspect what you are asking is whether, when a keyword in the Managed Metadata Service changes, the change is applied to all records tagged with that keyword. Such changes are only applied if you edit the metadata of the item itself; otherwise the old keyword will still display in the list. You can, however, overwrite existing keyword entries on a document with new entries.