Developing a “Get Current” Strategy to Simplify Planning and Improve Performance and Reliability


Intended Audience
Business Decision Makers, IT Professionals, and IT Executives
Products & Technologies
·   System Center Configuration Manager
·   Microsoft SharePoint 2010
·   SQL Server Reporting Services

Introduction

Microsoft IT (MSIT) wanted to optimize operations, simplify planning, and improve performance and reliability by reducing the number of versions of retail software installed and managed in the production environment. MSIT developed a “Get Current” strategy to define hardware, operating systems, and software standards and to measure and publish rates of compliance.
To support the “Get Current” strategy, MSIT developed a technical solution that aggregated server information from diverse sources and published the information in easily consumable formats to help IT leadership, business owners, and individual teams objectively measure success.
Why “Get Current”?
MSIT found that they were spending a great deal of time addressing outages in the environment that had known but unapplied fixes. They also found it increasingly difficult to maintain separate support groups for every version of software still running in the environment. Because it often takes twice as long to resolve support issues related to non-compliance, the growth in the volume of support incidents was becoming unmanageable.
The “Get Current” program removes an entire segment of errors: those due to non-compliant hardware and software. Outages can then be classified more clearly as known error types for current platforms. The goal was to reduce variability in the environment by ensuring that servers were running the latest versions of operating systems, server software, drivers, and security patches on compliant hardware. This would allow MSIT to introduce higher reliability, decrease downtime, and reduce the number of support tickets and costs.
MSIT also wanted to improve the service they provide to product groups through First and Best initiatives. MSIT has a role in helping product groups release quality enterprise software and solutions by deploying and running beta versions in the Microsoft environment. By providing feedback about their experience to product groups during First and Best initiatives, MSIT often drives improvements or helps to identify and resolve issues that enterprise customers may face. MSIT cannot be successful in deploying beta software if the environment is not running the latest released versions of product offerings.
Solution
In designing and implementing a “Get Current” strategy, MSIT first had to define their compliance standards. MSIT adopted the same standards, or product support lifecycle, as external customers who receive product support from Microsoft.
The MSIT compliance standard covers several areas or layers of compliance:
1.     SQL Server® version
2.     Operating system version
3.     Operating system security updates
4.     Network and storage drivers
5.     Hardware models
States of compliance were then defined and categorized as follows:
·         Current. The most recent version of operating system, SQL Server, or the newest hardware models available.
·         Supported. Operating system and SQL Server versions still being supported by MSIT, and hardware models that are not yet at end of life. These are considered compliant.
·         < 18 Mo. A term used to describe operating systems and SQL Server versions that will no longer be compliant within 18 months, but are currently considered compliant.
·         EOW (End of Warranty). A term used for hardware that is still compliant but no longer under warranty by the manufacturer.
·         Extended. A term used to describe operating systems and SQL Server versions that are no longer compliant by MSIT standards; however, they do receive security updates.
·         EOL (End of Life). A term used for operating systems, SQL Server, and hardware versions that are no longer compliant by MSIT standards and do not receive security updates.
·         Non-STD (Non-Standard). A term used for hardware that is non-compliant, and was never a standard purchase option.
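The software states above can be sketched as a small classification function. This is a minimal illustration, not MSIT's implementation: the field names (is_latest, compliance_ends, security_updates_end), the 548-day approximation of 18 months, and the order of the checks are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Approximation of 18 months in days; the exact cutoff rule is an assumption.
EIGHTEEN_MONTHS = timedelta(days=548)

# States the text describes as compliant.
COMPLIANT_STATES = {"Current", "Supported", "< 18 Mo", "EOW"}


@dataclass
class SoftwareVersion:
    name: str
    is_latest: bool              # hypothetical flag: most recent released version
    compliance_ends: date        # hypothetical: date the version stops being compliant
    security_updates_end: date   # hypothetical: date security updates stop


def software_state(version: SoftwareVersion, today: date) -> str:
    """Map one operating system or SQL Server version to a compliance state."""
    if version.is_latest:
        return "Current"
    if today >= version.security_updates_end:
        return "EOL"
    if today >= version.compliance_ends:
        return "Extended"
    if version.compliance_ends - today <= EIGHTEEN_MONTHS:
        return "< 18 Mo"
    return "Supported"


def is_compliant(state: str) -> bool:
    """True for the states the standard treats as compliant."""
    return state in COMPLIANT_STATES
```

Hardware states (EOW, Non-STD) would need a similar function keyed on warranty dates and the list of standard purchase options, which the text does not detail.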
Obtain Approval
The next phase involved one of the most important aspects for the success of the Get Current strategy: securing approval from the highest levels of IT leadership and governance.
MSIT presented their strategy for building business intelligence and predictability into an ongoing program that measures the environment against defined current standards, and obtained sign-off from IT executives for their proposed policy changes. Once the iterative process was complete and all of the policy changes were ratified, MSIT was able to move forward and begin implementation.
Build a BI Engine
Once the standards were defined and the necessary approvals were in place, MSIT brought in a small team of business intelligence (BI) people to build a scanning engine that automated information collection and reporting. All server owners could have taken MSIT’s standards, manually measured their compliance, and reported it to IT leadership, but it would have required more management and resources. There also would have been the challenge of integrating and aggregating all of that information to be able to report it to IT executives in a meaningful format that would facilitate planning, measure progress, or drive action.
Several technologies and resources were available to the BI team for use in gathering source data:
·         IT Service Management Tool. Used to inventory servers to develop the Get Current master list of servers. The IT service management tool also provided the configuration item owners, in this case representing who owned the server. That information was used to align with the organizational taxonomy.
·         System Center Configuration Manager (SCCM). Used to deploy an agent on each server that collected detailed data, including information about the hardware and installed operating system, SQL Server version, drivers, service packs, patches, and security. The agent reported the collected data to a central SCCM server. One problem was that the tool could be deployed only if the servers were organized in the same group in Active Directory®. In many cases at Microsoft, a team’s servers span several organizational groups, including those in the production environment, in research or test labs, or in a secure environment because of information security requirements. Because the SCCM agent could not be used for all of the groups, the BI team needed to engage with other teams and use other tools to cover the gap.
·         Service Health Checker. A central team that had access to most of the servers ran a script that collected much of the same information as the SCCM agent.
·         Information Security. For servers where neither the SCCM agent nor the Service Health Checker was able to access and extract the detailed data required because of security constraints, generic server information was provided by the Information Security team.
·         SQL Server Operations. Similar to the Service Health Checker, the SQL Server operations team ran a script that provided the detailed server information for the SQL servers.
·         Stay Current. A database created to map server ownership to the standard organizational structure. This central repository was where all of the information was aggregated and data logic was used to map the information from the different sources. Business owners' plans to maintain or achieve compliance also were stored in this repository so that progress could be measured against planned and achieved server upgrades.
One challenge the BI team needed to address was the creation of an organizational taxonomy. “Get Current” measures and manages different layers of standards compliance against different teams and, depending on the standard, the sources of data are different. The BI team therefore took the output of these technologies, grouped it, and aggregated it into a conforming organizational taxonomy. Creating an organizational taxonomy standardized the components for reporting to teams, organizations, and IT leadership.
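The aggregation step might look something like the following sketch. It assumes a master list of servers and owners (as supplied by the IT service management tool) and one dictionary of reported fields per source. The source keys, the field names, and the precedence order are hypothetical; the text does not say how conflicts between sources were resolved.

```python
# Assumed precedence: the most detailed source wins when sources disagree.
SOURCE_PRIORITY = ["sccm", "health_checker", "sql_ops", "info_sec"]

# Hypothetical fields needed to score a server against the standards.
REQUIRED_FIELDS = ["os_version", "hardware_model"]


def merge_server_records(master_list, sources):
    """Combine per-source data into one record per server on the master list."""
    merged = {}
    for server, owner in master_list.items():
        record = {"owner": owner}
        for source_name in SOURCE_PRIORITY:
            reported = sources.get(source_name, {}).get(server, {})
            for field, value in reported.items():
                # Keep the value from the highest-priority source that has it.
                record.setdefault(field, value)
        merged[server] = record
    return merged


def data_gaps(merged):
    """List the required fields that no source supplied, per server."""
    gaps = {}
    for server, record in merged.items():
        missing = [f for f in REQUIRED_FIELDS if f not in record]
        if missing:
            gaps[server] = missing
    return gaps
```

Grouping the merged records by owner would then give the per-team view that the organizational taxonomy is meant to standardize.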
Measure, Report, and Manage Gaps to Compliance
Once MSIT began measuring what they had running in the production environment, they compared that to the standards. That was the key measurement for this program, providing a comparison of the “as is” and the “desired” states of compliance. That measurement exposed the gaps.
Each team that owned servers in the production environment had to visit the portal, identify servers that were not compliant, and then come up with a plan to upgrade or request exemptions for servers that were not to be upgraded. For example, MSIT would not necessarily upgrade a server that was due to be retired in the near future. IT leadership wanted to have those plans known so they could identify actual targets for compliance in fiscal planning. Because current standards evolve, there is no expectation that the environment will ever be 100 percent compliant with the standards, but IT leadership wanted to make sure the reasons for any non-compliance were known and available.
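The comparison of the “as is” state with the standards can be illustrated with a short sketch. The record layout (a team name plus a state per compliance layer) is an assumption for the example, and a server is counted as compliant here only if every measured layer is in a compliant state, which the text implies but does not state outright.

```python
from collections import defaultdict

# States the standard treats as compliant.
COMPLIANT_STATES = {"Current", "Supported", "< 18 Mo", "EOW"}


def find_gaps(servers):
    """Return {server: [non-compliant layers]} for servers that have gaps."""
    gaps = {}
    for name, info in servers.items():
        failing = sorted(
            layer for layer, state in info["layers"].items()
            if state not in COMPLIANT_STATES
        )
        if failing:
            gaps[name] = failing
    return gaps


def compliance_by_team(servers):
    """Return the fraction of each team's servers that have no gaps."""
    total = defaultdict(int)
    compliant = defaultdict(int)
    gaps = find_gaps(servers)
    for name, info in servers.items():
        total[info["team"]] += 1
        if name not in gaps:
            compliant[info["team"]] += 1
    return {team: compliant[team] / total[team] for team in total}
```

The output of find_gaps is the list a team would work from when writing an upgrade plan or requesting an exemption.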
Reporting
Stakeholders required a consolidated reporting view of the current states of compliance against the standards, as well as visibility into teams’ plans to bring the out-of-compliance servers up to the standard. MSIT built a SharePoint® portal that contained tracking details in various reports that are consumable by a large and diverse audience. IT leadership, business owners, service owners, and system administrators all rely on the reported information to help them start planning and to measure their progress over time.
Microsoft technology provided flexibility in how to present the information in a manner that meets the varying needs of the different roles that make up the portal audience. Some 80 percent of the information that is required for scorecards, business reviews, and monthly presentations to IT leadership is needed in a standard format. Those standard formats were built with SQL Server Reporting Services or Excel and made available through the SharePoint portal. The data is refreshed every weekend and current reports are available by 12:00 AM every Monday.
Some of the reports available include:
·         Compliance by Team
·         Data Gaps by Team
·         Compliance by Server
·         Deferral and Exemption Notes
·         Detailed Server Report
·         Compliance by Application
·         Compliance by Service
·         Server Details by Application
If someone requires information that is not in a standard report available on the portal, they can build ad-hoc reports to meet their special needs. Users can either use all of the exposed data on the portal, or they can connect to the back end to create their own report.
A repository of all the data, including the mappings and all of the interlinking of the data, can be exposed through a SQL Server view so that someone with knowledge of SQL Server can connect and do their own analysis or build reports in Excel. Alternatively, users can connect to a BI analysis report with Excel and pivot on the analysis information.
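As a rough illustration of ad-hoc analysis against such a view, the sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere. The table, the view name v_server_compliance, the columns, and the rows are all invented for the example; the real schema is not described here.

```python
import sqlite3


def build_demo_repository():
    """Create an in-memory stand-in for the repository with invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE server_compliance (server TEXT, team TEXT, layer TEXT, state TEXT);
        INSERT INTO server_compliance VALUES
            ('srv1', 'Team A', 'os', 'Current'),
            ('srv2', 'Team A', 'os', 'EOL'),
            ('srv3', 'Team B', 'os', 'Supported');
        -- Hypothetical view name; the real view is not described in the text.
        CREATE VIEW v_server_compliance AS
            SELECT server, team, layer, state FROM server_compliance;
        """
    )
    return conn


def count_by_team_and_state(conn):
    """Pivot-style summary: number of servers per team and compliance state."""
    query = """
        SELECT team, state, COUNT(*) AS servers
        FROM v_server_compliance
        GROUP BY team, state
        ORDER BY team, state
    """
    return conn.execute(query).fetchall()
```

Against a real SQL Server view, the same GROUP BY query could be issued from Excel or any SQL client; only the connection step would differ.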
Planning, Deferrals, and Exemptions
Business groups are held accountable to the standards and their compliance needs to be re-certified every year. Non-compliance requires either a deferral or an exemption. A deferral states how the issue will be resolved in the current fiscal year, while an exemption indicates how the issue will be resolved or planned for in the next fiscal year’s budget cycle. No deferrals or exemptions are granted for non-compliance with security standards or policies.
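These rules can be expressed as a simple check. The sketch below is one reading of the policy, not the actual approval workflow: the function name, the "security" layer label, and the representation of fiscal years as integers are assumptions.

```python
def review_request(layer, kind, resolution_fiscal_year, current_fiscal_year):
    """Return (accepted, reason) for a deferral or exemption request.

    layer: hypothetical label for the compliance layer, e.g. "os" or "security".
    kind: "deferral" or "exemption".
    """
    if layer == "security":
        return False, "no deferrals or exemptions for security standards"
    if kind == "deferral":
        if resolution_fiscal_year == current_fiscal_year:
            return True, "resolved in the current fiscal year"
        return False, "a deferral must be resolved in the current fiscal year"
    if kind == "exemption":
        if resolution_fiscal_year == current_fiscal_year + 1:
            return True, "planned for the next fiscal year"
        return False, "an exemption must be planned for the next fiscal year"
    return False, "unknown request kind"
```

For example, review_request("os", "deferral", 2011, 2011) is accepted, while any request for the security layer is rejected regardless of its dates.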
Benefits
“Get Current” has minimized risks to applications and services and has increased reliability by keeping servers compliant with MSIT-supported standards. There are fewer support incidents and server downtime has been reduced by 50 percent. Reducing variability in the environment has optimized operations costs by minimizing the number of versions of retail software managed in MSIT data centers.
In fiscal year 2010, prior to “Get Current,” there was an average of 1.1 tickets per server per month. As this program progressed, MSIT saw a decrease in that average to about 0.25 for compliant systems.
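To put those two figures in proportion, the relative reduction can be worked out directly. Note that the earlier figure is an average across all servers while the later one is for compliant systems only, so this is an indicative comparison rather than a like-for-like measurement.

```python
tickets_before = 1.1   # average tickets per server per month, fiscal year 2010
tickets_after = 0.25   # approximate average for compliant systems

# Relative reduction in tickets per server per month.
reduction = (tickets_before - tickets_after) / tickets_before

print(f"Reduction: {reduction:.0%}")  # prints "Reduction: 77%"
```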
As illustrated below, there was a direct correlation between the increase in number of compliant servers and a reduction in incident to asset ratios.  
Figure 1. Ticket trending as number of compliant servers increases
The red and blue lines reflect the number of tickets per asset (per month) generated for compliant and non-compliant production servers within one of the business units. The green line reflects the percentage increase in production servers that were compliant with regard to SQL Server versions, operating system, and hardware. As the rates of compliance increased, the number of tickets per asset decreased. With fewer incidents on compliant systems, there was more capacity available to address known errors on end-of-life (non-compliant) systems.

Conclusion
Developing an overall initiative that focuses on getting and keeping the environment acceptably compliant has removed the need for every team to create a business justification when seeking approval for their server upgrade plans. The investment and business justification have been built directly into the program itself.
With the creation of the BI engine and the reporting portal, teams can now focus on creating their plans to become more compliant with the standards and on executing those plans.
For More Information
For more information about Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Order Centre at (800) 933-4750. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information via the World Wide Web, go to:
This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY. Microsoft, Active Directory, and SharePoint are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners. 
