
Construction of Casebase

The OREDA database contains around 33000 objects and several hundred features. Table 5.1 shows some of the features; these were extracted by Sintef for use in the NOEMIE project with the SQL search given in Appendix B.


 

 
Table 5.1: Some of the Features in OREDA.
FAILURE_EVENT.F_ID M_ID INVENTORY.I_ID FM_CODE
FD_NO FAILURE_EVENT.REM_CODE FC_CODE OM_CODE
FAILURE_EVENT.EC_CODE F_DETECTED_DATE F_DETECTED_DAY F_DETECTED_MONTH
F_DETECTED_YEAR MAIN_EVENT.SU_CODE MAC_CODE MC_CODE
MAIN_EVENT.MC_CODE MAIN_EVENT.EC_CODE M_MAINT_DATE M_MAINT_DAY
M_MAINT_MONTH M_MAINT_YEAR M_MEC_MANHOUR M_EL_MANHOUR
M_INST_MANHOUR M_PATF_PERS M_ACTIVE_MAINT M_DOWNTIME
INVENTORY_OWNER_ID INVENTORY.INST_ID OP_CODE INVENTORY.EE_CODE
DC_CODE INVENTORY.EC_CODE I_CHECKED_BY I_CHECKED_DATE
I_TAG_NO I_SURV_START_DATE I_SURV_START_DATE I_SURV_END_DATE
I_SURV_END_DAY I_SURV_END_MONTH I_SURV_END_YEAR I_OPER_TIME
I_OPER_TIME_CODE I_NO_OF_STARTS I_INSTALLED_DATE I_INSTALLED_DAY
I_INSTALLED_MONTH I_INSTALLED_YEAR I_SCRAPPED_AT_END M_OTHER_MANHOUR
M_RES_DRILL_RIG M_RES_DIVING_VESSEL M_RES_SERVICE_VESSEL M_RES_DIVERS

For our use, we reduced the number of objects and features. This was done to:

We selected the features that are relevant for compressors, on the advice of an OREDA database expert.

The value COMP in the column INVENTORY.EC_CODE was replaced by #COMP# so that only the compressors could be selected with the following Unix command:


grep '#COMP#' oreda332.txt > comp.txt

This produced 4646 objects.
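
The same selection can be made by comparing the column value directly, which makes the #COMP# marker unnecessary. The following Python sketch assumes a tab-separated export with a header row; the actual layout of oreda332.txt is not described here, so the delimiter, the function name select_rows and the sample rows are all illustrative.

```python
import csv
import io

def select_rows(text, column, value, delimiter="\t"):
    """Return the header and the rows whose `column` equals `value` exactly."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = [row for row in reader if row[column] == value]
    return reader.fieldnames, rows

# Made-up sample, not OREDA data. The second row carries the string COMP
# in a different column, which a plain text search for COMP would also match.
sample = (
    "INVENTORY.EC_CODE\tFM_CODE\tM_DOWNTIME\n"
    "COMP\tA\t12\n"
    "PUMP\tCOMP\t3\n"
    "COMP\tB\t40\n"
)

header, compressors = select_rows(sample, "INVENTORY.EC_CODE", "COMP")
print(len(compressors), "compressor rows selected")
```

Matching on the column rather than on the whole line is what the #COMP# marker achieves in the grep version, since the marker only occurs in INVENTORY.EC_CODE.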

The database was then loaded into a text editor that could handle large files, where it was prepared in Rosetta format by adding a line that states whether each feature is a string, an integer or a float. Missing values were given the value "Undefined". The data was then imported into the data mining tool Rosetta [23] for manual inspection. We found that the feature MC_CODE had only one value, "CORRECT", so it was removed. This gave the features described in Table 5.2. A description of some of the codes used as feature values is given in Appendix G.
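
The preparation steps described above were done by hand in an editor and in Rosetta, but they can be sketched in a few lines of Python. This is an illustration only: the type keywords Integer, Float and String, the function names, and the sample rows are assumptions, and the exact header syntax that Rosetta expects should be checked against its documentation.

```python
def infer_type(values):
    """Classify a column as Integer, Float or String from its non-missing values."""
    present = [v for v in values if v != ""]
    for name, cast in (("Integer", int), ("Float", float)):
        try:
            for v in present:
                cast(v)
            return name
        except ValueError:
            continue
    return "String"

def prepare(names, rows, missing="Undefined"):
    """Drop single-valued columns, add a type line, and fill missing values.

    A column is dropped when it has at most one distinct non-missing value.
    Returns a list of lines: feature names, feature types, then the data rows.
    """
    columns = list(zip(*rows))
    keep = [
        i for i, col in enumerate(columns)
        if len({v for v in col if v != ""}) > 1
    ]
    lines = [
        [names[i] for i in keep],
        [infer_type(columns[i]) for i in keep],
    ]
    for row in rows:
        lines.append([row[i] if row[i] != "" else missing for i in keep])
    return lines

# Made-up sample rows; an empty string stands for a missing value.
names = ["FM_CODE", "MC_CODE", "M_DOWNTIME", "I_OPER_TIME"]
rows = [
    ["A", "CORRECT", "12", "1500.5"],
    ["B", "CORRECT", "", "800"],
    ["A", "CORRECT", "40", ""],
]

for line in prepare(names, rows):
    print("\t".join(line))
```

In this sample MC_CODE is dropped because it only ever takes the value CORRECT, mirroring the manual inspection step.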


 
Table 5.2: Description of Features Used.
Feature                   Description
INVENTORY_INST_ID         Installation identification. Unique number.
FM_CODE                   Failure mode.
FD_NO                     Failure description number.
FAILURE_EVENT_REM_CODE    Failure remark.
FC_CODE                   Failure consequence code.
MAIN_EVENT_SU_CODE        Subunit code.
M_DOWNTIME                Downtime in hours.
DC_CODE                   Design class code.
I_OPER_TIME               Operational time in hours.

 


The ranges of the features M_DOWNTIME, I_OPER_TIME and FD_NO were reduced in Rosetta, and the ranges of the features FC_CODE and FM_CODE were reduced with the Perl script given in Appendix C. This produced the casebase used in the experiments, with the features listed in Table 5.2.
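
As an illustration of what reducing a feature's range can look like, the Python sketch below bins a numeric feature into a few intervals and maps symbolic codes onto coarser groups. It is not the Perl script from Appendix C, nor the procedure used in Rosetta: the interval boundaries, the group names and the codes C1 to C3 are hypothetical placeholders, not values taken from OREDA.

```python
from bisect import bisect_right

def bin_value(value, edges, labels):
    """Map a number to the label of the interval it falls in.

    `edges` must be sorted, and len(labels) must equal len(edges) + 1.
    A value equal to an edge belongs to the interval that starts at it.
    """
    return labels[bisect_right(edges, value)]

def group_code(code, groups, default="Other"):
    """Map a symbolic code to its group, or to `default` if it is unlisted."""
    return groups.get(code, default)

# Hypothetical interval boundaries for downtime, in hours.
DOWNTIME_EDGES = [8, 24, 168]
DOWNTIME_LABELS = ["<8h", "8-24h", "24-168h", ">=168h"]

# Hypothetical grouping of codes; C1, C2 and C3 are placeholders.
CODE_GROUPS = {"C1": "G1", "C2": "G1", "C3": "G2"}

def reduce_downtime(raw):
    """Bin a downtime string, leaving the missing-value marker untouched."""
    if raw == "Undefined":
        return raw
    return bin_value(float(raw), DOWNTIME_EDGES, DOWNTIME_LABELS)

for raw in ["5", "8", "30.5", "400", "Undefined"]:
    print(raw, "->", reduce_downtime(raw))
```

Fewer distinct values per feature means fewer states per variable when the casebase is later used to build a model, which is the usual motivation for this kind of reduction.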


Torgeir Dingsoyr
2/26/1998