TPL Communication Timeout and Data Loss during Multi-Port Initialization (MC33665 + MC33774)

BillWen
Contributor III

Hi NXP Community,

I am currently migrating a Battery Management System (BMS) project from RTD 3.0 to RTD 6.0 on an S32K358 platform. While the exact same hardware and logic worked perfectly in RTD 3.0, I am encountering critical communication issues in RTD 6.0 when operating multiple TPL ports via the MC33665 Gateway.

  • Port 1: 1x BJB (MC33774)

  • Port 2: 3x CMUs (MC33774 in Daisy Chain)

  • Communication: SPI @ 8 MHz, TPL @ 8 MHz

  • OS: FreeRTOS

In RTD 6.0, the enumeration process fails immediately at the first device (BCC_Device_Addr = 1). Bms_TD_Send(&TD_Cmu, &PhyError); returns OK, but the following Td_Wait(&TD_Cmu); ends with PHY_TS_TIMEOUT where PHY_TS_FINISHED is expected. Do you have any idea what could cause this?

I have attached our MC33774 initialization function code. It works on RTD 3.0.0 but not on RTD 6.0.0.

By the way, all of the MC33665 MCAL settings are the same as in the RTD 3.0.0 project. Please help us check.

Thanks

BR, BillWen

PetrS
NXP TechSupport

Hi,

To move this forward, I think it would help a lot if you could share a bit more concrete data:

  1. Please specify the exact project you are migrating

  2. Could you provide SPI bus measurements (logic analyzer or scope) for the failing case?
    Ideally tied to the exact code path (e.g. which SPI API and configuration).

  3. If possible, please capture the same SPI transaction on the RTD300-based project that is working and compare it against the new RTD.
    Differences in CS timing, inter-frame gaps, clock behavior and missing frames could point directly to the root cause.

Side-by-side measurements between RTD300 and the new RTD would likely make the root cause very obvious.

BR, Petr

BillWen
Contributor III

Hi,

I have a possible explanation. Could you check it?

Environment:

S32K358, BMS GEN2 SDK 0.9.1, S32K3 RTD 6.0 (S32K3_RTD_6_0_0_D2506)

PHY: TPL33665, DUAL_SPI_MASTER_SLAVE, ICU sideband (EIRQ16)

BCC: MC33774A (TPL3 variable) + MC33772C (TPL2 fixed 48L)

RTOS: FreeRTOS

Previously working on RTD 3.0 (S32K3_RTD_3_0_0_P01_D2303)

Problem:
After migrating to RTD 6.0, all PHY TDs complete with PHY_TS_TIMEOUT instead of PHY_TS_FINISHED, even when valid response data is received (ResponseMsgNumAct > 0). The timeout handler also always reports PHY_NO_ERROR, making it impossible to distinguish successful transactions from real failures.

RTD 3.0 (correct): Successful → Status=FINISHED, PhyError=NO_ERROR | Failed → Status=TIMEOUT, PhyError=HW_ERROR
RTD 6.0 (buggy): Both cases → Status=TIMEOUT, PhyError=NO_ERROR

Root Cause (2 bugs in Phy_665a driver):

  1. Race condition in IcuReqQueueLowNotification(): bRxExpectedFlag is cleared before DSpiRequestQueueLowIrq() is called. If the SPI slave RX interrupt fires in between, the response is ignored and SpiFinishStatusUpdate() is never called, so the GPT timeout always fires.

  2. SpiTimeoutStatusUpdate() is always called with PHY_NO_ERROR, even when no device responds. RTD 3.0 correctly reported PHY_HW_ERROR.
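
To make the ordering problem in the first bug concrete, here is a minimal host-side simulation. It is not the Phy_665a source: every function and variable below (SimSpiSlaveRxIsr, SimRequestQueueLow, NotifyClearFirst, NotifyClearLast, RunScenario, rxFiresInWindow, finishCalled) is a hypothetical stand-in, and only bRxExpectedFlag borrows its name from the driver. It assumes the RX handler drops a response whenever the flag is not set, which is how the behavior is described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated driver state. Hypothetical model, not the actual driver code. */
static bool bRxExpectedFlag;  /* true while a response is still awaited */
static bool finishCalled;     /* models SpiFinishStatusUpdate() being reached */
static bool rxFiresInWindow;  /* test knob: inject an RX event between the two steps */

/* Stand-in for the SPI slave RX handler: a response is processed only
 * while it is expected; otherwise it is dropped and only the timeout
 * can end the transaction. */
static void SimSpiSlaveRxIsr(void)
{
    if (bRxExpectedFlag) {
        finishCalled = true;
    }
}

/* Stand-in for DSpiRequestQueueLowIrq(); it does nothing in this model. */
static void SimRequestQueueLow(void)
{
}

/* Simulates the RX interrupt arriving exactly between the two steps. */
static void InjectRx(void)
{
    if (rxFiresInWindow) {
        SimSpiSlaveRxIsr();
    }
}

/* Order described in the post: flag cleared first, request issued second. */
static void NotifyClearFirst(void)
{
    bRxExpectedFlag = false;
    InjectRx();
    SimRequestQueueLow();
}

/* Alternative order for comparison: request first, flag cleared last. */
static void NotifyClearLast(void)
{
    SimRequestQueueLow();
    InjectRx();
    bRxExpectedFlag = false;
}

/* Runs one notification with a response pending; returns whether the
 * finish path was reached. */
static bool RunScenario(void (*notify)(void), bool rxInWindow)
{
    bRxExpectedFlag = true;
    finishCalled = false;
    rxFiresInWindow = rxInWindow;
    notify();
    return finishCalled;
}
```

In this model, RunScenario(NotifyClearFirst, true) never reaches the finish path, while RunScenario(NotifyClearLast, true) does. Whether reordering alone is safe in the real driver, or whether a critical section around both steps is needed, is not something this thread establishes.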

Workarounds applied:

  • Check ResponseMsgNumAct == 0 instead of Status == PHY_TS_FINISHED to detect real failures

  • Check only PhyError for write-only TDs

  • Split large multi-device TDs into per-device TDs to avoid GPT timeout truncation
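
The first two workarounds can be sketched as a single success check. The types below are hypothetical stand-ins that only reuse the enumerator and field names mentioned in this thread. The real definitions come from the BMS SDK headers and may differ, and the expectsResponse field and the TdSucceeded helper are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the SDK types; names reused from the thread. */
typedef enum { PHY_TS_FINISHED, PHY_TS_TIMEOUT } TdStatus;
typedef enum { PHY_NO_ERROR, PHY_HW_ERROR } PhyErrorCode;

typedef struct {
    TdStatus status;            /* reported status; unreliable on RTD 6.0 per the post */
    uint16_t responseMsgNumAct; /* number of responses actually received */
    bool     expectsResponse;   /* assumed field: false for write-only TDs */
} TdResult;

/* Decide success without trusting the status field:
 *  - any reported PHY error is a failure;
 *  - a write-only TD is judged on the PHY error alone;
 *  - a TD that expects data succeeded only if responses arrived. */
static bool TdSucceeded(const TdResult *td, PhyErrorCode phyError)
{
    if (phyError != PHY_NO_ERROR) {
        return false;
    }
    if (!td->expectsResponse) {
        return true;
    }
    return td->responseMsgNumAct > 0u;
}
```

For example, a read TD that ends with PHY_TS_TIMEOUT but has ResponseMsgNumAct = 3 is treated as successful, while the same status with ResponseMsgNumAct = 0 is treated as a real failure.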

 

Thanks

BR, BillWen
