See the Troubleshooting section for ideas on how to deal with these problems.
| E/W | Name | Condition on EPICS Variables |
| E | FRC Firmware Error | TRDFVER != TRDF Ver. Ref. OR (*) SCLFVER != SCLF Ver. Ref. OR (*) BMVER != BM Ver. Ref. OR (*) PCI1VER != PCI-1 Ver. Ref. OR PCI2VER != PCI-2 Ver. Ref. OR PCI3VER != PCI-3 Ver. Ref. (Current firmware version references coming soon.) (*) Not yet available in EPICS. |
| E | FRC Status Error | (TRDFST & TRDF bit mask) > 0 OR (SCLFST & SCLF bit mask) > 0 OR (BM0ST & BM0 bit mask) > 0 OR (BM1ST & BM1 bit mask) > 0 OR (PCI1ST & PCI-1 bit mask) > 0 OR (PCI2ST & PCI-2 bit mask) > 0 OR (PCI3ST & PCI-3 bit mask) > 0. (Bit masks are built from the individual status register bits that signal errors.) |
| W | FRC Status Warning | Defined as above, but with different bit masks |
| E | CTT Error | lCTT-ERR > 5 |
| W | CTT Warning | lCTT-ERR > 0 |
| E | Event Error | lL1BXERR > 5 OR lTURNERR > 5 |
| W | Event Warning | lL1BXERR > 0 OR lTURNERR > 0 |
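
The conditions in the table above could be evaluated along the following lines. This is only a sketch: the mask values in `ERROR_MASKS` and `WARNING_MASKS` are hypothetical placeholders (the real masks come from the individual status register bits and are not listed here), and the function names are made up for illustration. Only the count thresholds (> 5 for an error, > 0 for a warning) are taken from the table.

```python
# Hypothetical masks: the real error/warning bit masks are not given in this page.
ERROR_MASKS = {"TRDFST": 0x0000000F, "SCLFST": 0x00000003}
WARNING_MASKS = {"TRDFST": 0x000000F0, "SCLFST": 0x0000000C}

# Count thresholds from the table: > 5 is an error, > 0 is a warning.
ERROR_COUNT_LIMIT = 5
WARNING_COUNT_LIMIT = 0


def status_alarm(values):
    """Return 'E', 'W', or None for the status registers in `values`."""
    if any(values.get(name, 0) & mask for name, mask in ERROR_MASKS.items()):
        return "E"
    if any(values.get(name, 0) & mask for name, mask in WARNING_MASKS.items()):
        return "W"
    return None


def count_alarm(*counts):
    """Return 'E', 'W', or None for one or more per-cycle error counters."""
    if any(c > ERROR_COUNT_LIMIT for c in counts):
        return "E"
    if any(c > WARNING_COUNT_LIMIT for c in counts):
        return "W"
    return None


def frc_alarms(values):
    """Evaluate the FRC status, CTT, and event conditions on EPICS values."""
    return {
        "FRC Status": status_alarm(values),
        "CTT": count_alarm(values.get("lCTT-ERR", 0)),
        "Event": count_alarm(values.get("lL1BXERR", 0),
                             values.get("lTURNERR", 0)),
    }
```

An error always takes precedence over a warning here, so a register that sets bits in both masks is reported once, as `E`.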

| E/W | Name | Condition on EPICS Variables |
| E | BC Firmware Error | BCVER != BC Ver. Ref. (*) (Current firmware version references coming soon.) (*) Not yet available in EPICS? |
| E | BC Status Error | (BCST & BC bit mask) > 0. (Bit masks are built from the individual status register bits that signal errors.) |
| W | BC Status Warning | Defined as above, but with different bit masks |
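
The firmware checks in both tables compare each version variable against a reference value. A minimal sketch of that comparison follows, assuming the versions and references are available as plain dictionaries keyed by EPICS name. The reference values used in any real check are not yet published on this page, so the function takes them as an argument, and its name is hypothetical.

```python
def firmware_mismatches(versions, references):
    """Return the sorted EPICS names whose version differs from its reference.

    Variables missing from `versions` are skipped rather than flagged, on the
    assumption that they are among those not yet available in EPICS.
    """
    mismatched = []
    for name, reference in references.items():
        if name not in versions:
            continue  # nothing to compare against yet
        if versions[name] != reference:
            mismatched.append(name)
    return sorted(mismatched)
```

A non-empty result would correspond to a Firmware Error (E) for the board in question.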

| E/W | Class | Symptom | Known Cause | Solution |
| E | Firmware | Version | Incorrect download script | Call STT expert |
| E | TRDF | No CTT Data | Corrupted data from CTT | CTT Problems; Reseat STT cards/cables |
| E | TRDF | RR FIFO Full (Latched) | Corrupted data from CTT | CTT Problems; PCI Hang |
| W | TRDF | BOE or EOE Missing | Corrupted data from CTT | SCL Init |
| W | TRDF | BX or TURN Mismatch | Corrupted data from CTT | SCL Init |
| E | SCLF | SCL Mezz Data Err | Bad transaction with mezzanine card | SCL Init |
| E | SCLF | SCL Sync Err | loss of synch with SCL | SCL Init |
| E | PCI1/2 | L1 FIFO Full | Bad PCI transaction | SCL Init; PCI Hang if persistent |
| E | PCI1/2 | EOE PCI 33 | Bad PCI transaction (likely Master Abort) | SCL Init; PCI Hang if persistent |
| E | PCI1/2 | Timeout Latch | Bad PCI transaction | SCL Init; PCI Hang if persistent |
| W | PCI3 | ??? | | SCL Init if persistent |
| E | BM-0 | Get Done Timeout | internal L3 readout problem | SCL Init |
| E | BM-0 | Put Done Timeout | L3 readout hung in a board | SCL Init |
| E | BM-0 | L1 FIFO Full | BM not reading out - many possible causes | SCL Init |
| E | BM-0 | L2 FIFO Full | BM not reading out - many possible causes | SCL Init |
| E | BM-0 | PCI3 L3 FIFO Full | BM not reading out - many possible causes | SCL Init |
| E | BM-0 | L3 XFER Number FIFO Full | BM not reading out - many possible causes | SCL Init |
| E | BM-1 | L1/L2 Busy | generally indicates STT is hung | SCL Init |
| E | BM-1 | Overflow Error (L1/L2) | Busy was ignored by framework | SCL Init |
| E | BM-1 | STT Output FIFO Full | BC output FIFO overflowed; all subsequent events are garbage | SCL Init |
| W | BM-1 | L1/L2 Error | error sent back to SCL hub | keep track of occurrences |
| E | Counts | > 25 Error Counts in last cycle | | SCL Init; call expert if persistent |
| W | Counts | > 0 Error Counts in last cycle | | SCL Init if persistent |
| W | Trk-Cnt | all cumulative events in bin 0 | CTT data problem | ask for CTT Fix |
| E | Timer | SCL Int Count > huge value | system stuck in SCL Init | ??? |
| W | Timers | timer > 10*average | | SCL Init if persistent |
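
The Counts, Trk-Cnt, and Timers rows above are simple numeric tests and can be sketched as below. The thresholds (> 25, > 0, 10 times the average, everything in bin 0) come from the table. The function names are hypothetical, and the table does not say what the "average" is taken over, so this sketch assumes it is the mean of the same timer over earlier monitoring cycles.

```python
def counts_alarm(error_count):
    """'E' above 25 error counts in the last cycle, 'W' above 0, else None."""
    if error_count > 25:
        return "E"
    if error_count > 0:
        return "W"
    return None


def timer_alarm(timer, history):
    """'W' when a timer exceeds 10x its average over `history`, else None.

    `history` is assumed to hold the same timer from earlier cycles.
    """
    if not history:
        return None  # no baseline to compare against
    average = sum(history) / len(history)
    return "W" if timer > 10 * average else None


def track_count_alarm(cumulative_hist):
    """'W' when every cumulative event sits in bin 0 of the track histogram."""
    total = sum(cumulative_hist)
    if total > 0 and cumulative_hist[0] == total:
        return "W"
    return None
```

An empty histogram returns `None` rather than a warning, since with no events recorded there is nothing to suggest a CTT data problem.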

| E/W | Class | Symptom | Known Cause | Solution |
| E | Status | DB Timeout | L3 Data lost in DB | STT Missing |
| E | Status | DB Busy/Error | a board is hung | STT Missing |
| E | Status | LM DB L3 Wait (when data not flowing) | L3 readout is hung on a board | STT Missing |
| E | Status | PCI LM TSR6 & TSR7 = 1 | both should never be set | Call STT expert |
| W | Status | L1/L2 BX Errors > 1 | | keep track of when these happen |

| No. | Bits | EPICS Name / FRC Name | Src:Class | Description |
| Variables from FRC Firmware | ||||
| 0-48 | 31..0 | lCTTTRnn / CTT_Track | TRDF:Last (hist) | Histogram of the number of tracks in CTT data since the previous monitoring cycle. These are 16-bit counters. |
| 49a | 15..0 | lCTT-ERR / CTT_EVENT_ERR | TRDF:Last | No. of CTT event errors (MISSING DATA, BOE, or EOE) since the previous monitoring cycle |
| 49b | 31..16 | lL1BXERR / L1_BX_ERR | TRDF:Last | No. of L1 BX mismatches detected since the previous monitoring cycle |
| 50a | 15..0 | lC-ERR-A / CTT_EVENT_ERR_ACCUM | TRDF:spec | No. of CTT event errors accumulated since the previous CPU_CLR issued to TRDF |
| 50b | 31..16 | lTURNERR / TURN_ERR | TRDF:Last | No. of L1 TURN mismatches detected since the previous monitoring cycle |
| 51 | 31..0 | lC-DELAY / MAX_CTT_DELAY | TRDF:Last | Max delay between L1 ACC and the beginning of CTT data input (in units of 30 ns). This is an 8-bit counter. |
| 52-67 | 31..0 | lL1QULnn / QUAL0_MON | SCLF:Last (hist) | Histogram of L1 Qualifier Bits since the previous monitoring cycle. These are 16-bit counters. |
| 68 | 31..0 | lL1-PER / L1_PERIOD | SCLF:Last | No. of L1 Periods since the previous monitoring cycle. This is a 16-bit counter. |
| 69 | 31..0 | lL2-REJ / L2_REJECT | SCLF:Last | No. of L2 Rejects since the previous monitoring cycle. This is a 16-bit counter. |
| 70 | 31..0 | lL2-ACC / L2_ACCEPT | SCLF:Last | No. of L2 Accepts since the previous monitoring cycle. This is a 16-bit counter. |
| 71 | 31..0 | lL2-PER / L2_PERIOD | SCLF:Last | No. of L2 Periods since the previous monitoring cycle. This is a 16-bit counter. |
| 72 | 31..0 | lMONTIME / RAW_MON_COUNT | BM:Last | Time since the previous monitoring cycle (in units of 30 ns). |
| 73 | 31..0 | lMON-LEN / RAW_INT_COUNT | BM:Event | Length of the monitoring cycle (in units of 30 ns). |
| 74 | 31..0 | lL1-PROC / L1_PROC_COUNT | BM:Last | Cumulative time of L1 processing since the previous monitoring cycle (in units of 30 ns). |
| 75 | 31..0 | lL2-PROC / L2_PROC_COUNT | BM:Last | Cumulative time of L2 processing since the previous monitoring cycle (in units of 30 ns). |
| 76 | 31..0 | lSLV-RDY / SLV_RDY_COUNT | BM:Last | Cumulative time of L3 readout via SBC since the previous monitoring cycle (in units of 30 ns). |
| xx | 31..0 | TRDFST / TRDF_STATUS | TRDF:Event | TRDF Status Registers (details) |
| xx | 31..0 | SCLFST / SCLF_STATUS | SCLF:Event | SCLF Status Registers (details) |
| xx | 31..0 | BM0ST / BM0_STATUS | BM:Event | BM-0 Status Registers (details) |
| xx | 31..0 | BM1ST / BM1_STATUS | BM:Event | BM-1 Status Registers (details) |
| xx | 31..0 | PCI1ST / PCI1_STATUS | PCI1:Event | PCI-1 bus Status Registers (details) |
| xx | 31..0 | PCI2ST / PCI2_STATUS | PCI2:Event | PCI-2 bus Status Registers (details) |
| xx | 31..0 | PCI3ST / PCI3_STATUS | PCI3:Event | PCI-3 bus Status Registers (details) |
| xx | 31..0 | TRDFVER / TRDF_FIRMWARE_VER | TRDF:Event | TRDF Firmware version |
| xx | 31..0 | SCLFVER / SCLF_FIRMWARE_VER | SCLF:Event | SCLF Firmware version |
| xx | 31..0 | BMVER / BM_FIRMWARE_VER | BM:Event | BM Firmware version |
| xx | 31..0 | PCI1VER / PCI1_FIRMWARE_VER | PCI1:Event | PCI1 Firmware version |
| xx | 31..0 | PCI2VER / PCI2_FIRMWARE_VER | PCI2:Event | PCI2 Firmware version |
| xx | 31..0 | PCI3VER / PCI3_FIRMWARE_VER | PCI3:Event | PCI3 Firmware version |
| xx | 31..0 | lSCL-INT / SCL_INT_COUNT | BM:Last | Cumulative time with SCL Init interrupt set since the previous monitoring cycle (in units of 30 ns). Is this possible/sensible? |
| xx | 31..0 | lTSR6-7 / TSR6_AND_TSR7 | BM:Last | Cumulative time with both TSR6 and TSR7 = 1 since the previous monitoring cycle (in units of 30 ns). |
| Variables Accumulated in the CPU | ||||
| xx-xx | 31..0 | cCTTTRnn | CPU:Cumul (hist) | Cumulative version of lCTTTRnn |
| xx | 31..0 | cCTT-ERR | CPU:Cumul | Cumulative version of lCTT-ERR |
| xx | 31..0 | cL1BXERR | CPU:Cumul | Cumulative version of lL1BXERR |
| xx | 31..0 | cTURNERR | CPU:Cumul | Cumulative version of lTURNERR |
| xx | 31..0 | cC-DELAY | CPU:Cumul | Cumulative version of lC-DELAY |
| xx-xx | 31..0 | cL1QULnn | CPU:Cumul (hist) | Cumulative version of lL1QULnn |
| xx | 31..0 | cL1-PER | CPU:Cumul | Cumulative version of lL1-PER |
| xx | 31..0 | cL2-REJ | CPU:Cumul | Cumulative version of lL2-REJ |
| xx | 31..0 | cL2-ACC | CPU:Cumul | Cumulative version of lL2-ACC |
| xx | 31..0 | cL2-PER | CPU:Cumul | Cumulative version of lL2-PER |
| xx | 31..0 | cMONTIME | CPU:Cumul | Cumulative version of lMONTIME |
| xx | 31..0 | cMON-LEN | CPU:Cumul | Cumulative version of lMON-LEN |
| xx | 31..0 | cL1-PROC | CPU:Cumul | Cumulative version of lL1-PROC |
| xx | 31..0 | cL2-PROC | CPU:Cumul | Cumulative version of lL2-PROC |
| xx | 31..0 | cSLV-RDY | CPU:Cumul | Cumulative version of lSLV-RDY |
| xx | 31..0 | cTSR6-7 | CPU:Cumul | Cumulative version of lTSR6-7 |
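
Three details of the table above lend themselves to a short sketch: words 49 and 50 each pack two 16-bit fields (bits 15..0 and 31..16), the timers are reported in units of 30 ns, and each `c...` variable is the running total of the matching `l...` variable. The code below illustrates these under stated assumptions. All function and class names are hypothetical, and plain summation is an assumption that fits counters and timers; how the CPU actually accumulates a maximum such as lC-DELAY is not stated here, so the caller should pass only variables for which a sum is meaningful.

```python
TICK_NS = 30  # timers in the table are reported in units of 30 ns


def unpack_halves(word):
    """Split a 32-bit word into (bits 15..0, bits 31..16)."""
    return word & 0xFFFF, (word >> 16) & 0xFFFF


def ticks_to_seconds(ticks):
    """Convert a timer value in 30 ns ticks to seconds."""
    return ticks * TICK_NS * 1e-9


class Accumulator:
    """Keep running totals of per-cycle 'l...' values under 'c...' names."""

    def __init__(self):
        self.totals = {}

    def add_cycle(self, last_values):
        """Add one monitoring cycle; names not starting with 'l' are ignored."""
        for name, value in last_values.items():
            if not name.startswith("l"):
                continue  # e.g. status registers have no cumulative version
            cumulative_name = "c" + name[1:]
            self.totals[cumulative_name] = (
                self.totals.get(cumulative_name, 0) + value
            )
        return self.totals
```

For word 49, for example, `unpack_halves` yields lCTT-ERR as the first element and lL1BXERR as the second, and the pair can be passed to `Accumulator.add_cycle` once per monitoring cycle.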