Thursday, March 8, 2012

Identify issues & risks in your SharePoint – Part 1

One important, recurring task of a SharePoint administrator is to identify issues and risks in the SharePoint environment. Most likely, the first task of a newly recruited SharePoint administrator will be to understand and report the current status of the SharePoint system. The objective of this article is to describe different ways to identify issues and risks. Some of the techniques and tools mentioned here can be used for regular monitoring, and others are useful when you encounter a symptom of an issue. I will split the article into two parts based on the SharePoint version.

Part 1: Identify issues and risks in WSS 3.0 and SharePoint 2007 (MOSS)
Part 2: Identify issues and risks in SharePoint Server 2010 and SharePoint Foundation

Identify issues and risks in WSS 3.0 and SharePoint 2007 (MOSS)

Although these SharePoint versions are old, they are still used in small to large organizations, so there is a good chance that you will have to manage a WSS 3.0 or MOSS system. They may even be selected over SharePoint 2010 for new installations because of business decisions (e.g. standards, compatibility, cost).

So let’s assume we have an older version of SharePoint and need to monitor it for issues and risks. The following sections describe different tools that we can use to identify problems in WSS 3.0 or MOSS systems.

1. Site Collection Usage summary


I believe this should be the starting point of our analysis. You have to do this exercise for every site collection in your farm. By doing this you will get a better idea of which site collections are heavily used and which are not being accessed at all.

The most important sections in this report are the current storage used, the maximum storage allowed (site collection quota), the number of users, and recent bandwidth use. By monitoring the first two we can identify whether there is a risk of SharePoint going out of service due to lack of storage. If the storage growth rate (change in storage used / time period) is alarmingly high, we may temporarily reduce the maximum file upload size until a proper solution is in place (this should be done after checking the Storage space allocation report). The bandwidth section shows the number of megabytes per day of network utilization attributable to sites in the site collection. If this figure is too high compared to the available bandwidth, we may have a network-related risk or issue.
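The storage growth rate mentioned above can be worked out from a few readings of the usage summary taken on different days. The following is only a minimal Python sketch: the function names and the sample figures are made up for illustration, and it assumes you note down the "current storage used" value by hand (it does not read anything from SharePoint).

```python
from datetime import date


def storage_growth_per_day(samples):
    """Average growth in MB/day between the earliest and latest sample.

    samples: list of (date, used_mb) pairs noted from the usage summary page.
    """
    ordered = sorted(samples)
    (first_day, first_mb), (last_day, last_mb) = ordered[0], ordered[-1]
    elapsed_days = (last_day - first_day).days
    if elapsed_days <= 0:
        raise ValueError("need samples from at least two different days")
    return (last_mb - first_mb) / elapsed_days


def days_until_quota(samples, quota_mb):
    """Rough number of days until the quota is reached at the current rate.

    Returns None when storage is not growing.
    """
    rate = storage_growth_per_day(samples)
    if rate <= 0:
        return None
    latest_mb = sorted(samples)[-1][1]
    return max(quota_mb - latest_mb, 0) / rate


# Hypothetical readings for one site collection (values are invented).
readings = [(date(2012, 3, 1), 4000.0), (date(2012, 3, 8), 4700.0)]
print(storage_growth_per_day(readings))   # MB per day
print(days_until_quota(readings, 5000.0)) # days left before the quota is hit
```

A straight line between two readings is a crude estimate, but it is usually enough to decide whether the quota is a problem for next week or for next year.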

2. Storage space allocation

This feature is only available if a quota template is applied to the site collection. If the current storage is approaching the quota, you can check what is taking up the most storage. Following are some practical scenarios where this report is extremely helpful:

There was an instance in one of our SharePoint systems where the recycle bin had occupied more than 50GB.

If you see large document libraries, you need to take special care to avoid possible performance degradation. You can use this Microsoft guideline as a framework to tackle this kind of situation.


3. Event viewer

We can get a list of recent errors, warnings and notifications related to SharePoint using this console. Most SharePoint-related items appear in the Application log.


4. SharePoint ULS

The SharePoint Unified Logging Service (ULS) writes SharePoint events to trace logs located in “C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\LOGS\”. As a general practice, we can first monitor Windows application events in Event Viewer. If there is a SharePoint-related event, we can study it further in the SharePoint trace logs. There are some third-party tools for viewing ULS logs in a user-friendly manner.
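If no viewer tool is at hand, the trace logs can also be filtered with a small script. The sketch below is in Python and rests on assumptions you should verify against your own log files: that each line is tab-separated, that the columns come in the order listed in ULS_COLUMNS, and that the level names in SEVERE_LEVELS are the ones worth a closer look. The two sample lines are invented for illustration.

```python
# Assumed column order of a trace log line (tab-separated); verify on your farm.
ULS_COLUMNS = ["timestamp", "process", "tid", "area", "category",
               "event_id", "level", "message", "correlation"]

# Assumed level names that deserve attention (compared case-insensitively).
SEVERE_LEVELS = {"critical", "unexpected", "high", "exception", "error"}


def parse_uls_line(line):
    """Split one trace log line into a dict, or return None if it is too short."""
    fields = [f.strip() for f in line.rstrip("\r\n").split("\t")]
    if len(fields) < len(ULS_COLUMNS) - 1:  # the last column may be missing
        return None
    fields += [""] * (len(ULS_COLUMNS) - len(fields))
    return dict(zip(ULS_COLUMNS, fields))


def severe_entries(lines):
    """Yield parsed entries whose level is in SEVERE_LEVELS."""
    for line in lines:
        entry = parse_uls_line(line)
        if entry and entry["level"].lower() in SEVERE_LEVELS:
            yield entry


# Invented sample lines, only to show the expected shape of the input.
SAMPLE = [
    "03/08/2012 10:15:02.11 \tw3wp.exe (0x0A1C)\t0x0B34\t"
    "Windows SharePoint Services\tGeneral\t8kh7\tHigh\tCannot complete this action.\t",
    "03/08/2012 10:15:03.40 \tOWSTIMER.EXE (0x07D0)\t0x0F10\t"
    "Windows SharePoint Services\tTimer\t5uuf\tVerbose\tJob completed.\t",
]

for entry in severe_entries(SAMPLE):
    print(entry["timestamp"], entry["level"], entry["message"])
```

To run it against a real log, pass an open file object instead of SAMPLE, after checking which text encoding the log files on your server use.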

We can control the level of detail of the Windows event logs and ULS logs using the event throttling settings. I will describe that in a separate post.

5. SharePoint diagnostic tool

This tool can be downloaded from Microsoft using this URL. You can set up the tool according to this guideline. It analyzes several areas and finally provides a report.


6. Microsoft Performance Monitor

Various performance counters are available in Performance Monitor. We can select the most relevant counters for our analysis.


Following are some relevant performance counters. You can select more counters as well.

Counter | Threshold | Target
Logical Disk / % Free Space | 15% | Storage
Physical Disk / % Idle Time | 20% | Storage
Memory / Cache Bytes | 300MB | Memory
Memory / % Committed Bytes In Use | 80% | Memory
Memory / Available MBytes | 205MB | Memory
Processor / % Processor Time | 80% | Processor
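Once counter values have been collected, comparing them with the thresholds is easy to script. The Python sketch below is only an illustration: the key names are labels I made up for the six counters above, the sample values are invented, and the direction of each comparison ("below" or "above") is my assumption, since the thresholds above do not say on which side the problem lies.

```python
# label: (threshold, direction). Directions are assumptions, adjust as needed.
THRESHOLDS = {
    "logical_disk_free_pct": (15.0, "below"),       # Logical Disk / % Free Space
    "physical_disk_idle_pct": (20.0, "below"),      # Physical Disk / % Idle Time
    "memory_cache_mb": (300.0, "above"),            # Memory / Cache Bytes, in MB
    "memory_committed_pct": (80.0, "above"),        # Memory / % Committed Bytes In Use
    "memory_available_mb": (205.0, "below"),        # Memory / Available MBytes
    "processor_time_pct": (80.0, "above"),          # Processor / % Processor Time
}


def breached(samples):
    """Return the sorted labels of counters whose value crosses its threshold.

    samples: dict mapping a label from THRESHOLDS to a measured value.
    Labels that are not in THRESHOLDS are ignored.
    """
    alerts = []
    for label, value in samples.items():
        if label not in THRESHOLDS:
            continue
        limit, direction = THRESHOLDS[label]
        if direction == "below" and value < limit:
            alerts.append(label)
        elif direction == "above" and value > limit:
            alerts.append(label)
    return sorted(alerts)


# Invented readings from one server.
readings = {
    "logical_disk_free_pct": 12.0,
    "processor_time_pct": 55.0,
    "memory_available_mb": 150.0,
}
print(breached(readings))
```

A single reading over a threshold is rarely a problem by itself, so in practice it makes sense to run such a check on values averaged over a period.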

7. Microsoft Operations Manager 2005 (MOM)

This is a little more advanced, but it is the recommended approach for WSS as well as MOSS. WSS and MOSS Management Packs (MP) are available to download for this purpose. If your SharePoint is a multi-server farm, this will be the practical monitoring tool, as data has to be collected from each and every server of the farm. MOM has predefined performance counters, so it will track and respond to events based on them. Furthermore, it is capable of alerting the administrator if it detects a problem.

8. Microsoft System Center Operations Manager (SCOM)

This can also be used to monitor WSS 3.0 and MOSS. Since it is applicable to SharePoint Server 2010 as well, I will describe it in the next part of the article.

Those are some of the techniques we use regularly to monitor WSS 3.0 and MOSS 2007 and identify issues and risks. Hope this will help someone.