Join guest speaker Paul Lanthier as he discusses how to sustain a reliability culture. Topics include:
- How to maintain focus after the project
- Transition from a project phase to an operational phase
- Developing a control plan
- Measure and manage sustainment
- Continuous improvement
- Rewards and recognition
- Celebrate success
Guest speaker Paul Lanthier discusses how to manage a successful reliability initiative. Topics include:
• Defining Goals
• Scoping the Project
• Organization/Communication Strategy
• Measuring Success
• Stages — Analyze/Implement/Clean Up/Apply/Living Program
• Asset vs. Site Focus
• Getting People on the Playing Field
• How to Reset the Bar while Maintaining Focus
Watch this short course to hear an expert’s take on it…
Most agree that Reliability Centered Maintenance (or at least RCM thinking) is the foundation of modern asset management. Yet many programs fail or never get off the ground successfully. So what is the key to success?
• What organizational structure is needed to ensure success?
• How do we manage change?
• Where do we start? Is planning and scheduling fixed first?
• What are the minimum tools needed to make this work?
• What are our transition and communication plans?
• Who will actively own and sponsor the program?
• How do we measure our progression and success?
If you can answer these questions, you are on the right path to launching a successful reliability initiative. Listen in on a short course to hear an expert’s take on it…
Re-using, or templating, RCM content has been a contentious issue in the past due to concerns that the content will be applied without the requisite consideration given to operating context. However, with the appropriate guidelines in place, valuable time can be saved in the analysis phase through the re-use of existing RCM2 analysis content. The reusable content includes: Operating Context, Functions, Functional Failures, Failure Modes, Failure Effects, and Recommended Actions. Re-using content in no way absolves the RCM Facilitator from following the RCM2 process as diligently as if the new analysis were being conducted from scratch. However, the steps are slightly different. The following steps assume the analyses have been created in EXP Enterprise or EXP Professional and that RCM2 analysis content will be copied into a new RCM2 analysis. At a high level, the process is as follows:
1. Develop the Operating Context – The Operating Context from the source analysis will form the basis of the Operating Context for the new analysis, with the differences of the new system, if any, added to the new Operating Context.
2. Copy the reusable content – RCM2 analyses will for the most part be copied to RCM2 analyses. This allows all of the information to be leveraged, including Functions, Functional Failures, Failure Modes, Failure Effects and Recommended Actions.
3. Review the copied analysis with the area SMEs – The analysis meetings will consist of a review of the Operating Context with the SMEs (same team members as a regular analysis), a review of the Functions and Functional Failures (adding or changing any due to system differences), and then a review of the Failure Modes (FMs) and Failure Effects (FEs) as though a technical audit were being conducted (again modifying any information as required due to system differences). If the team deems that the new asset is too different from the old one, then the review must be moved to a full-fledged RCM2 analysis.
4. Present the findings to Management – The management report and presentation will be based on that from the source analysis with the differences of the new system being accounted for. The management presentation will be a review of the differences, unless the copied analysis originated from an area that had a different management team, in which case the entire analysis will be reviewed with the management team.
Experience has shown that following these steps can significantly reduce SME involvement (by up to 70%) while maintaining a high-quality analysis.
One of the common challenges faced when preparing to conduct Reliability Centered Maintenance analyses is obtaining the commitment to dedicate resources for the duration of the analysis. Managers are always looking for ways to reduce the involvement of their most valued resources (equipment specialists, operators, tradespeople) in the RCM2 analysis. In today’s typically short-staffed environment, the required resources are often legitimately unavailable.
So how can we possibly conduct analyses on our most critical equipment when we can’t get the Subject Matter Experts to participate for the planned duration of the analysis? Are there any techniques we can use to reduce the SMEs’ involvement? More on SMEs and capturing their knowledge in the next post…
People often confuse reliability and availability. Simply put, availability is a measure of the percentage of time the equipment is in an operable state, while reliability is a measure of how long the item performs its intended function. We can refine these definitions by considering the desired performance standards.
Availability is an Operations parameter: presumably, if the equipment is available 85% of the time, we are producing at 85% of the equipment’s technical limit. This usually equates to the financial performance of the asset. Of course, quality and machine speed also need to be considered to properly represent how close we are to this technical limit; the combined measure is called Overall Equipment Effectiveness (OEE). Availability can be measured as: Uptime / Total time, where Total time = Uptime + Downtime.
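As a rough illustration of the availability formula, and of how quality and machine speed fold into OEE, here is a minimal Python sketch. It assumes the common convention that OEE is the product of availability, performance (speed) and quality ratios. The function names and the 95% speed and 98% quality figures are made up for the example; only the 85% availability comes from the text above.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Fraction of total time the equipment was in an operable state."""
    total_hours = uptime_hours + downtime_hours
    if total_hours <= 0:
        raise ValueError("total time must be positive")
    return uptime_hours / total_hours


def oee(availability_ratio: float, performance_ratio: float, quality_ratio: float) -> float:
    """OEE as the product of its three factors (assumed convention)."""
    return availability_ratio * performance_ratio * quality_ratio


# 85 hours up and 15 hours down gives the 85% availability used in the text.
a = availability(uptime_hours=85.0, downtime_hours=15.0)
print(f"Availability: {a:.0%}")

# Hypothetical speed (95%) and quality (98%) ratios, for illustration only.
print(f"OEE: {oee(a, 0.95, 0.98):.1%}")
```

With these illustrative inputs, OEE comes out lower than availability alone, which is why availability by itself can overstate how close we are to the technical limit.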
Reliability is a measure of the probability that an item will perform its intended function for a specified interval under stated conditions. There are two commonly used measures of reliability:
- Mean Time Between Failures (MTBF), which is defined as: total time in service / number of failures
- Failure Rate (λ), which is defined as: number of failures / total time in service.
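The two measures above are reciprocals of one another, which a short Python sketch makes easy to see. The function names and the sample figures (1,000 hours in service, 4 failures) are hypothetical and chosen only to keep the arithmetic simple.

```python
def mtbf(time_in_service_hours: float, failures: int) -> float:
    """Mean Time Between Failures: total time in service / number of failures."""
    if failures <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return time_in_service_hours / failures


def failure_rate(time_in_service_hours: float, failures: int) -> float:
    """Failure rate: number of failures / total time in service."""
    if time_in_service_hours <= 0:
        raise ValueError("time in service must be positive")
    return failures / time_in_service_hours


# Hypothetical data: 4 failures over 1,000 hours in service.
hours, n_failures = 1000.0, 4
print(f"MTBF = {mtbf(hours, n_failures):.0f} hours")
print(f"Failure rate = {failure_rate(hours, n_failures):.3f} failures per hour")
```

Because each value is the inverse of the other, knowing either one is enough to recover the second.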
A piece of equipment can be available but not reliable. For example, suppose a machine is down 6 minutes every hour. This translates into an availability of 90% but a reliability (MTBF) of less than 1 hour. That may be okay in some circumstances, but what if this is a paper machine? It will take at least 30 minutes of run time to get to the point that we are producing good paper.
Generally speaking, a reliable machine has high availability, but an available machine may or may not be very reliable!
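To put numbers on the paper machine example, the sketch below compares availability with the share of each hour actually spent making good paper. It assumes, purely for illustration, that every stop is followed by a full 30-minute run-up before good paper is produced. The function name and that assumption are mine, not part of the original example.

```python
def run_and_good_fractions(cycle_min: float, down_min: float, ramp_up_min: float):
    """Return (availability, fraction of the cycle producing good product).

    Assumes one stop per cycle and a full ramp-up after every stop.
    """
    run_min = cycle_min - down_min
    good_min = max(run_min - ramp_up_min, 0.0)
    return run_min / cycle_min, good_min / cycle_min


# Down 6 minutes every hour, 30 minutes of run time before good paper.
avail, good = run_and_good_fractions(cycle_min=60.0, down_min=6.0, ramp_up_min=30.0)
print(f"Availability: {avail:.0%}")
print(f"Time producing good paper: {good:.0%}")
```

Under that assumption the machine is running 54 minutes of every hour, yet only 24 of those minutes yield good paper, which is the gap between being available and being reliable.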
Reliability can be defined as “the probability that an item will perform its intended functions for a specified interval under stated conditions”. It can be measured by Mean Time Between Failures (MTBF) and predicted as Reliability (R) = e^(-λt), where λ is the failure rate and t is the time in service.
Availability can be defined as “the probability that an item is in an operable or committable state…” Availability is generally calculated as uptime / (uptime + downtime) or MTBF / (MTBF + MDT), where MDT = mean downtime.
So, what is the availability and reliability of a system over an 8 hour period, where the required operating period is 8 hours, and the system is up for 2 hours, down for 1, back up for 2 hours, down for another hour, then up for 2 hours?
Answer: Availability = 6 / (6 + 2) = 75%; Reliability = e^(-0.333 × 8) ≈ 0.07, or about 7%, where λ = 2 failures / 6 hours in service ≈ 0.333 failures per hour.
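For readers who want to check the arithmetic, here is a minimal Python sketch of the calculation. It assumes that each down period counts as exactly one failure and that time in service means uptime only. The variable names are illustrative.

```python
import math

# Timeline from the question, in hours: (state, duration).
timeline = [("up", 2), ("down", 1), ("up", 2), ("down", 1), ("up", 2)]

uptime = sum(hours for state, hours in timeline if state == "up")
downtime = sum(hours for state, hours in timeline if state == "down")
# Assumption: each down period corresponds to one failure.
failures = sum(1 for state, _ in timeline if state == "down")

availability = uptime / (uptime + downtime)
mtbf = uptime / failures          # mean time between failures
mdt = downtime / failures         # mean downtime
failure_rate = failures / uptime  # failures per hour in service

mission_hours = 8                 # required operating period
reliability = math.exp(-failure_rate * mission_hours)

print(f"Availability = {availability:.0%}")
print(f"MTBF = {mtbf:.1f} h, MDT = {mdt:.1f} h")
print(f"Failure rate = {failure_rate:.3f} per hour")
print(f"Reliability over {mission_hours} h = {reliability:.3f}")
```

Both availability formulas agree here: 6 / (6 + 2) and MTBF / (MTBF + MDT) = 3 / (3 + 1) each give 75%. Evaluating the exponential with a failure rate of 2/6 per hour over 8 hours gives a reliability of roughly 0.07.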
What is the significance? Watch for my next posting…
Recently on the RCM2 LinkedIn Group, an interesting discussion emerged on the differences between root cause and failure mode. Worth sharing…
Adhen Utomo, Mechanical Engineer at Kaltim Prima Coal in Indonesia, asked this question: “What is the difference between root cause and failure mode?”
Denis Marshment, long-time RCM2 Practitioner and Director of Asset Dynamics Asia, stated that “Root cause and failure modes are essentially the same thing. Both are causes of failure. The only difference is in the techniques that we use to identify them. With Root Cause Analysis (RCA) we are studying a failure event that has already occurred with the aim of preventing its re-occurrence. To do this we must understand all the contributing factors that led to the failure and identify the likely causes. We stop listing failure modes (causes) when it becomes possible to implement a suitable failure management policy – this is the “Root Cause”. RCA is concerned with a single failure event and is applied after the failure has occurred, so in this sense its scope is limited and it is a reactive approach.
The other technique that uses root causes/failure modes is the RCM approach. With RCM we try to identify all the likely causes of failure and their consequences for an asset or system and identify suitable failure management policies to address each failure mode. RCM is a proactive approach that is hopefully applied before the asset has failed with the objective being to mitigate or prevent the failure consequences. The output of RCM is a maintenance plan for the asset or system that covers all the likely failure modes.
Both approaches are extremely useful and quite complementary.”
Steve Turner of OMCS added that he has a different understanding. “To me, if we are talking assets, there are failure modes which by definition from a dictionary, means the “manner or form” by which things fail. The word “cause” can mean the same thing. However when someone adds the word root to either failure mode or cause, they are meaning the failure mode or cause that was at the root of the problem. My understanding of causal relationships is that they are a continuum and they can have multiple trees. You can keep asking why forever and so you can never get to the root cause in reality…. there is no such thing. There are numbers of causes and events that end up causing failure. So with Denis’s definition, his root cause is when we can put in a suitable failure management policy… Under his definition, Root Cause is a post failure thing. My definition is root cause can be established ahead of failures… RCA can be done in the future tense but it is not called by that name in futuristic studies.
I don’t want to add to the confusion, but some people use the term failure mechanism too and they say this term differs from failure mode. Maintenance is full of words that people use differently. You can see that Denis has a different understanding to me… If I get into a technical conversation I often ask what they mean by the words they use so I understand the definitions being applied.”
Adhen Utomo, who asked the question, realized that his company needs to develop its own dictionary of maintenance terms, since there is a lot of confusion and debate over maintenance “words”. He agreed with Denis because “We have the same information from John Moubray, but there are a lot that might be we must search and digging for details of this issue in maintenance.”
Matt Thompson of Rio Tinto added, “The only thing I would like to add to the discussion is that although failure modes (from an equipment-centric point of view) can often be thought of as causes, not all causes can be thought of as failure modes. For example, administrative causes or failures of procedures or management policies can be root causes of an RCA cause and effect chain but are not equipment or component failure modes. Root cause and failure mode are classified as two separate things.”
Matt offered a simple example to illustrate the differences:
EQUIPMENT: e.g. furnace tube boiler
FAILURE: (what happened) e.g. Catastrophic failure of the welded joint between the furnace tube and tube plate.
FAILURE MODE: (by definition is what the equipment or component failed from) e.g. Corrosion fatigue.
ROOT CAUSE/S: (by definition, what caused the failure mode to occur AND what can be changed to prevent re-occurrence. Remember there can be more than one!) e.g. Poor feed water treatment accelerated corrosion; Rapid firing, particularly from cold, increased thermal stress on the boiler; Overpressurization and temperature cycles.
Poorly defined failure effects in an RCM2 analysis can waste significant time and result in a weak analysis. When uncertain about the purpose of the resulting information, an inexperienced Facilitator can spend an inordinate amount of time elaborating failure effects down to the minutest detail while providing little of the information needed to select the appropriate consequence mitigation strategy. Alternatively, the Facilitator may write cursory failure effects, making it difficult to select an appropriate strategy. This also leads to wasted time and a weak analysis.
As a basic rule, failure effects should be written such that the Facilitator, Auditor, and future Caretakers of the strategy can make a conclusive jump from the Information Worksheet to the Decision Worksheet. To do this:
- The failure effects written on the Information Worksheet must describe (in short and compact sentences) how the failure mode will result in its associated functional failure. (John Moubray referred to this as the evidence that failure has happened.)
- The failure effects must provide all necessary information such that the auditor(s) and caretakers who were not present in the analysis can conclusively agree with the strategy or decide whether the decision is correct.
As such, they need enough information to:
- Conclusively recognize the consequences of failure (HSEON).
- Select the appropriate technically feasible strategy.
- Judge the extent of the failure consequences in terms of cost or risk so that these failures can be compared and contrasted with the recommended strategy to decide if the task is worth doing.
This article was written and submitted by Aladon Network Member, Roy Korompis, President of PT Relogica, Indonesia.
Over the last couple of years we have seen growing interest from many Russian companies looking to use new methodologies and new maintenance techniques in various industries. Some of the strongest interest is in the RCM2 methodology and asset performance management systems.
Major catastrophic incidents that have occurred recently in Russia (e.g. Sayano-Shushenskaya Dam station) have increased the focus on safety in manufacturing facilities. It is evident that previous time-based maintenance strategies do not guarantee reliable asset performance.
Our network member in Russia, EAM Systems (Moscow), is one of the leaders in the Russian asset management market, specializing in enterprise asset management (EAM) solutions based on an IBM Maximo platform. EAM Systems is an IBM Premier Business Partner, an Ivara Authorized Reseller and a member of the Aladon Network of Reliability-Centered Maintenance (RCM) consultants.
Together we have developed many new initiatives and activities in contact with the biggest Russian companies in five key industries: utilities, metallurgy, oil and gas, chemicals, and paper mills.
Our marketing strategy targets production and maintenance professionals: engineers, production and maintenance managers, and IT specialists.
We are building relationships with clients who are interested in improving their asset reliability and who are open to new methods and new software tools.
Our main efforts are directed at gaining the client’s trust and attention and at building interest in combined EAM+RCM solutions. Practice shows that the correct approach and systematic work with «target audience #1» is the basis of a successful project start. Most important are accurate arguments based on specific knowledge of production and maintenance, demonstration of positive results and advantages, and simplicity of use.
We have tried to reach our clients in various ways: Internet webinars, direct mail offering specific asset management solutions for the client, and reference visits to existing clients.
One of the most powerful tools in market development is an Internet webinar for a specific audience. As a result of the May webinar for steel makers, we arranged 3 day RCM2 courses in Mariupol and Enakievo for Metinvest Holding (Ukraine), a world-class mining and steel company operating in Europe and worldwide. We have also provided 3 day introductory RCM2 courses for SIBUR, the leading petrochemical company in Russia and Eastern Europe, and SUEK, Siberian Coal Energy Company.
We have had many discussions with maintenance personnel on how to switch from the existing reactive maintenance strategy to a proactive one. I have heard a lot of arguments like “This is not the Russian way, this is western way to do maintenance” and “We have a different culture and it should be taken into account”. The positive result of these discussions is that at the end of the courses we agreed that the right, reliable maintenance does not belong to any one culture, and we can build this maintenance program anywhere in Russia.
The next step in our working relationship with clients is a pilot project on the client’s site to develop real maintenance programs based on RCM2 methodologies. The pilot project will start soon at Kemerovo Chemical Plant (SIBUR). During this project we have to prove that RCM2 is the key to developing the right maintenance activities at the right time.
A very positive sign of growing interest in RCM practices is that the SUEK management team decided to participate in the Ivara Reliability Summit in Denver last September.
Over the last couple of months we have had a lot of meetings with the management teams of different companies. This became possible only after diligent work reviewing each company’s profile, history, and existing maintenance practices in order to offer solutions for improving their asset reliability. A major part of this work was done by our partner EAM Systems.
From my experience I can say that only diligent, professional work with clients produces successful results.
This article was written and submitted by Eugene Gavrikov, Reliability Consultant, Ivara Corporation