See the Troubleshooting section for ideas on how to deal with these problems.
E/W | Name | Condition on EPICS Variables |
E | FRC Firmware Error | TRDFVER != TRDF Ver. Ref. OR (*) SCLFVER != SCLF Ver. Ref. OR (*) BMVER != BM Ver. Ref. OR (*) PCI1VER != PCI-1 Ver. Ref. OR PCI2VER != PCI-2 Ver. Ref. OR PCI3VER != PCI-3 Ver. Ref. (Current firmware version references coming soon) (*) not yet available in EPICS |
E | FRC Status Error | TRDFST & TRDF bit mask > 0 OR SCLFST & SCLF bit mask > 0 OR BM0ST & BM-0 bit mask > 0 OR BM1ST & BM-1 bit mask > 0 OR PCI1ST & PCI-1 bit mask > 0 OR PCI2ST & PCI-2 bit mask > 0 OR PCI3ST & PCI-3 bit mask > 0 (Bit masks are built from the individual status register bits that signal errors) |
W | FRC Status Warning | Defined as above but different bit masks |
E | CTT Error | lCTT-ERR > 5 |
W | CTT Warning | lCTT-ERR > 0 |
E | Event Error | lL1BXERR > 5 OR lTURNERR > 5 |
W | Event Warning | lL1BXERR > 0 OR lTURNERR > 0 |
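The conditions in the table above can be sketched in a few lines of Python. This is only an illustration of the logic, not the monitoring code itself: the function names are hypothetical, and the mask values in the usage lines are made up because the real error and warning bit masks are not listed here. Note that "XST & mask > 0" is meant as "(XST & mask) > 0"; the parentheses are written out because C-like languages would otherwise evaluate the comparison first.

```python
def count_alarm(value, warn_above=0, error_above=5):
    """Classify a per-cycle counter such as lCTT-ERR: 'E', 'W', or None."""
    if value > error_above:
        return "E"
    if value > warn_above:
        return "W"
    return None


def status_alarm(status, error_mask, warning_mask):
    """Classify a status word; the masks select the bits that signal trouble."""
    if (status & error_mask) > 0:
        return "E"
    if (status & warning_mask) > 0:
        return "W"
    return None


def event_alarm(l1bxerr, turnerr):
    """Event Error/Warning: the worse of the lL1BXERR and lTURNERR levels."""
    levels = {count_alarm(l1bxerr), count_alarm(turnerr)}
    if "E" in levels:
        return "E"
    if "W" in levels:
        return "W"
    return None


# Usage with made-up values (the masks are placeholders, not real ones).
ctt_level = count_alarm(3)
trdf_level = status_alarm(0b0100, error_mask=0b0100, warning_mask=0b0010)
```

An error takes precedence over a warning, which matches the table: every value that satisfies an E condition also satisfies the corresponding W condition.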
E/W | Name | Condition on EPICS Variables |
E | BC Firmware Error | BCVER != BC Ver. Ref. (*) (Current firmware version references coming soon) (*) not yet available in EPICS ? |
E | BC Status Error | BCST & BC bit mask > 0 (Bit masks are built from the individual status register bits that signal errors) |
W | BC Status Warning | Defined as above but different bit masks |
E/W | Class | Symptom | Known Cause | Solution |
E | Firmware | Version | Incorrect download script | Call STT expert |
E | TRDF | No CTT Data | Corrupted data from CTT | CTT Problems; Reseat STT cards/cables |
E | TRDF | RR FIFO Full (Latched) | Corrupted data from CTT | CTT Problems; PCI Hang |
W | TRDF | BOE or EOE Missing | Corrupted data from CTT | SCL Init |
W | TRDF | BX or TURN Mismatch | Corrupted data from CTT | SCL Init |
E | SCLF | SCL Mezz Data Err | Bad transaction with mezzanine card | SCL Init |
E | SCLF | SCL Sync Err | Loss of sync with SCL | SCL Init |
E | PCI1/2 | L1 FIFO Full | Bad PCI transaction | SCL Init; PCI Hang if persistent |
E | PCI1/2 | EOE PCI 33 | Bad PCI transaction (likely Master Abort) | SCL Init; PCI Hang if persistent |
E | PCI1/2 | Timeout Latch | Bad PCI transaction | SCL Init; PCI Hang if persistent |
W | PCI3 | ??? | SCL Init if persistent | |
E | BM-0 | Get Done Timeout | internal L3 readout problem | SCL Init |
E | BM-0 | Put Done Timeout | L3 readout hung in a board | SCL Init |
E | BM-0 | L1 FIFO Full | BM not reading out - many possible causes | SCL Init |
E | BM-0 | L2 FIFO Full | BM not reading out - many possible causes | SCL Init |
E | BM-0 | PCI3 L3 FIFO Full | BM not reading out - many possible causes | SCL Init |
E | BM-0 | L3 XFER Number FIFO Full | BM not reading out - many possible causes | SCL Init |
E | BM-1 | L1/L2 Busy | generally indicates STT is hung | SCL Init |
E | BM-1 | Overflow Error (L1/L2) | Busy was ignored by framework | SCL Init |
E | BM-1 | STT Output FIFO Full | BC output FIFO overflowed; all subsequent events are garbage | SCL Init |
W | BM-1 | L1/L2 Error | error sent back to SCL hub | keep track of occurrences |
E | Counts | > 25 Error Counts in last cycle | SCL Init; call expert if persistent | |
W | Counts | > 0 Error Counts in last cycle | SCL Init if persistent | |
W | Trk-Cnt | all cumulative events in bin 0 | CTT data problem | ask for CTT Fix |
E | Timer | SCL Int Count > huge value | system stuck in SCL Init | ??? |
W | Timers | timer > 10*average | SCL Init if persistent |
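As a rough illustration of the "timer > 10*average" warning, the sketch below compares the latest timer value with the mean of earlier monitoring cycles. The function names are hypothetical, and taking the "average" over a list of earlier cycles is an assumption, since the table does not say how the average is formed. The 30 ns tick size is the unit quoted for the timer variables in the variable table.

```python
TICK_NS = 30  # timer variables are quoted in units of 30 ns


def ticks_to_us(ticks):
    """Convert a timer value in 30 ns ticks to microseconds."""
    return ticks * TICK_NS / 1000.0


def timer_warning(current, history, factor=10):
    """True if `current` exceeds `factor` times the mean of `history`.

    `history` holds the same timer from earlier monitoring cycles
    (assumption); with no history there is nothing to compare against.
    """
    if not history:
        return False
    average = sum(history) / len(history)
    return current > factor * average


# Usage with made-up tick counts for one timer.
previous_cycles = [100, 120, 80]          # mean is 100 ticks
stuck = timer_warning(1500, previous_cycles)
normal = timer_warning(900, previous_cycles)
```

The comparison is strict, so a timer sitting exactly at ten times the average does not raise the warning.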
E/W | Class | Symptom | Known Cause | Solution |
E | Status | DB Timeout | L3 Data lost in DB | STT Missing |
E | Status | DB Busy/Error | a board is hung | STT Missing |
E | Status | LM DB L3 Wait (when data not flowing) | L3 readout is hung on a board | STT Missing |
E | Status | PCI LM TSR6 & TSR7 = 1 | both should never be set | Call STT expert |
W | Status | L1/L2 BX Errors > 1 | keep track of when these happen |
No. | Bits | EPICS Name / FRC Name | Src:Class | Description |
Variables from FRC Firmware | ||||
0-48 | 31..0 | lCTTTRnn / CTT_Track | TRDF:Last (hist) | Histogram of No. of tracks in CTT data since last mon cycle. These are 16-bit counters. |
49a | 15..0 | lCTT-ERR / CTT_EVENT_ERR | TRDF:Last | No. of CTT event errors (MISSING DATA, BOE, or EOE) since prev mon cycle |
49b | 31..16 | lL1BXERR / L1_BX_ERR | TRDF:Last | No. of L1 BX mismatches detected since prev mon cycle |
50a | 15..0 | lC-ERR-A / CTT_EVENT_ERR_ACCUM | TRDF:spec | No. of accumulated CTT event errors since prev. CPU_CLR issued to TRDF |
50b | 31..16 | lTURNERR / TURN_ERR | TRDF:Last | No. of L1 TURN mismatches detected since prev mon cycle |
51 | 31..0 | lC-DELAY / MAX_CTT_DELAY | TRDF:Last | Max delay btw L1 ACC and beginning of CTT data input (in units of 30 ns). This is an 8-bit counter. |
52-67 | 31..0 | lL1QULnn / QUAL0_MON | SCLF:Last (hist) | Histogram of L1 Qualifier Bits since prev mon cycle. These are 16-bit counters. |
68 | 31..0 | lL1-PER / L1_PERIOD | SCLF:Last | No. of L1 Periods since prev mon cycle. This is a 16-bit counter. |
69 | 31..0 | lL2-REJ / L2_REJECT | SCLF:Last | No. of L2 Rejects since prev mon cycle. This is a 16-bit counter. |
70 | 31..0 | lL2-ACC / L2_ACCEPT | SCLF:Last | No. of L2 Accepts since prev mon cycle. This is a 16-bit counter. |
71 | 31..0 | lL2-PER / L2_PERIOD | SCLF:Last | No. of L2 Periods since prev mon cycle. This is a 16-bit counter. |
72 | 31..0 | lMONTIME / RAW_MON_COUNT | BM:Last | Time from previous monitoring cycle (in units of 30 ns). |
73 | 31..0 | lMON-LEN / RAW_INT_COUNT | BM:Event | Length of monitoring cycle (in units of 30 ns). |
74 | 31..0 | lL1-PROC / L1_PROC_COUNT | BM:Last | Cumulative time of L1 processing since prev mon cycle (in units of 30 ns). |
75 | 31..0 | lL2-PROC / L2_PROC_COUNT | BM:Last | Cumulative time of L2 processing since prev mon cycle (in units of 30 ns). |
76 | 31..0 | lSLV-RDY / SLV_RDY_COUNT | BM:Last | Cumulative time of L3 readout via SBC since prev mon cycle (in units of 30 ns). |
xx | 31..0 | TRDFST / TRDF_STATUS | TRDF:Event | TRDF Status Registers (details) |
xx | 31..0 | SCLFST / SCLF_STATUS | SCLF:Event | SCLF Status Registers (details) |
xx | 31..0 | BM0ST / BM0_STATUS | BM:Event | BM-0 Status Registers (details) |
xx | 31..0 | BM1ST / BM1_STATUS | BM:Event | BM-1 Status Registers (details) |
xx | 31..0 | PCI1ST / PCI1_STATUS | PCI1:Event | PCI-1 bus Status Registers (details) |
xx | 31..0 | PCI2ST / PCI2_STATUS | PCI2:Event | PCI-2 bus Status Registers (details) |
xx | 31..0 | PCI3ST / PCI3_STATUS | PCI3:Event | PCI-3 bus Status Registers (details) |
xx | 31..0 | TRDFVER / TRDF_FIRMWARE_VER | TRDF:Event | TRDF Firmware version |
xx | 31..0 | SCLFVER / SCLF_FIRMWARE_VER | SCLF:Event | SCLF Firmware version |
xx | 31..0 | BMVER / BM_FIRMWARE_VER | BM:Event | BM Firmware version |
xx | 31..0 | PCI1VER / PCI1_FIRMWARE_VER | PCI1:Event | PCI1 Firmware version |
xx | 31..0 | PCI2VER / PCI2_FIRMWARE_VER | PCI2:Event | PCI2 Firmware version |
xx | 31..0 | PCI3VER / PCI3_FIRMWARE_VER | PCI3:Event | PCI3 Firmware version |
xx | 31..0 | lSCL-INT / SCL_INT_COUNT | BM:Last | Cumulative time with SCL Init interrupt set since prev mon cycle (in units of 30 ns). Is this possible/sensible? |
xx | 31..0 | lTSR6-7 / TSR6_AND_TSR7 | BM:Last | Cumulative time with both TSR6 and TSR7 = 1 since prev mon cycle (in units of 30 ns). |
Variables Accumulated in the CPU | ||||
xx-xx | 31..0 | cCTTTRnn | CPU:Cumul (hist) | Cumulative version of lCTTTRnn |
xx | 31..0 | cCTT-ERR | CPU:Cumul | Cumulative version of lCTT-ERR |
xx | 31..0 | cL1BXERR | CPU:Cumul | Cumulative version of lL1BXERR |
xx | 31..0 | cTURNERR | CPU:Cumul | Cumulative version of lTURNERR |
xx | 31..0 | cC-DELAY | CPU:Cumul | Cumulative version of lC-DELAY |
xx-xx | 31..0 | cL1QULnn | CPU:Cumul (hist) | Cumulative version of lL1QULnn |
xx | 31..0 | cL1-PER | CPU:Cumul | Cumulative version of lL1-PER |
xx | 31..0 | cL2-REJ | CPU:Cumul | Cumulative version of lL2-REJ |
xx | 31..0 | cL2-ACC | CPU:Cumul | Cumulative version of lL2-ACC |
xx | 31..0 | cL2-PER | CPU:Cumul | Cumulative version of lL2-PER |
xx | 31..0 | cMONTIME | CPU:Cumul | Cumulative version of lMONTIME |
xx | 31..0 | cMON-LEN | CPU:Cumul | Cumulative version of lMON-LEN |
xx | 31..0 | cL1-PROC | CPU:Cumul | Cumulative version of lL1-PROC |
xx | 31..0 | cL2-PROC | CPU:Cumul | Cumulative version of lL2-PROC |
xx | 31..0 | cSLV-RDY | CPU:Cumul | Cumulative version of lSLV-RDY |
xx | 31..0 | cTSR6-7 | CPU:Cumul | Cumulative version of lTSR6-7 |
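Words 49 and 50 each carry two 16-bit fields (bits 15..0 and bits 31..16), and each "c" variable is described as the cumulative version of an "l" variable. The sketch below shows one way to unpack such a word and to keep running totals in the CPU. The function names are hypothetical, and forming the cumulative value as a plain sum over monitoring cycles is an assumption: it is natural for counters, but may not be how a maximum such as lC-DELAY is accumulated.

```python
def split_word(word):
    """Split a 32-bit word into (bits 15..0, bits 31..16)."""
    low = word & 0xFFFF
    high = (word >> 16) & 0xFFFF
    return low, high


def accumulate(cumulative, last_cycle):
    """Add the 'l' values of one monitoring cycle to the matching 'c' totals.

    An 'l' name maps to its 'c' name by swapping the first letter,
    e.g. lCTT-ERR -> cCTT-ERR (summing is an assumption, see above).
    """
    for name, value in last_cycle.items():
        c_name = "c" + name[1:]
        cumulative[c_name] = cumulative.get(c_name, 0) + value
    return cumulative


# Usage with a made-up word 49: lCTT-ERR in the low half, lL1BXERR in the high half.
ctt_err, l1bx_err = split_word(0x00030007)
totals = {}
accumulate(totals, {"lCTT-ERR": ctt_err, "lL1BXERR": l1bx_err})
accumulate(totals, {"lCTT-ERR": 1, "lL1BXERR": 0})
```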