assert failed: twai_handle_tx_buffer_frame twai.c:183 (p_twai_obj->tx_msg_count >= 0) if CONFIG_TWAI_ERRATA_FIX_TX_INTR_LOST=y (IDFGH-8204) #9697
Comments
Tested with ESP-IDF v5.0-dev-4379-g36f49f361c and got the same error. |
if
then it does NOT crash. if
then it does crash |
It seems
is the reason for the crash. |
Stumbled across the same problem. This did not happen with the earlier (CAN, not TWAI) libraries. I do not think there is a solution when using the Arduino environment, right? |
I am having the same issue. In my case I powered up my device without any power on the other side of the CAN transceiver (ISO1042 Isolated). Is there any update on this issue? |
@Dazza0 any news? |
@diplfranzhoepfinger @abombay @f-hoepfinger-hr-agrartechnik By disturbing the CAN/TWAI bus, you are likely generating errors that trigger the HW errata conditions. The errata workarounds have been merged to master and backported all the way back to ESP-IDF v4.2.x. However, these workarounds are not enabled by default before ESP-IDF v5.0. If you are using an ESP-IDF version older than v5.0, please enable all of the
@EmbeddedDevver Arduino should also have these workarounds enabled starting from v2.0.5 |
I have the same problem. My workaround is to avoid calling twai_initiate_recovery() and instead uninstall, reinstall, and restart the TWAI driver. This may help you until the problem is fixed in the TWAI library. |
@diplfranzhoepfinger @StehlikPhotoneo @abombay @f-hoepfinger-hr-agrartechnik I suspect what's happening is the In
#ifdef CONFIG_TWAI_ERRATA_FIX_TX_INTR_LOST
if ((interrupts & TWAI_LL_INTR_TI || hal_ctx->state_flags & TWAI_HAL_STATE_FLAG_TX_BUFF_OCCUPIED) && status & TWAI_LL_STATUS_TBS) {
#else
to
#ifdef CONFIG_TWAI_ERRATA_FIX_TX_INTR_LOST
if ((interrupts & TWAI_LL_INTR_TI || hal_ctx->state_flags & TWAI_HAL_STATE_FLAG_TX_BUFF_OCCUPIED)
&& status & TWAI_LL_STATUS_TBS
&& !(state_flags & TWAI_HAL_STATE_FLAG_BUS_OFF)) {
#else
Does this end up resolving the issue? |
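For anyone trying to reason about the proposed condition without flashing a board, here is a minimal host-side sketch of the predicate. The bit values and the name `tx_done_event` are hypothetical stand-ins, not the real `TWAI_LL_*` / `TWAI_HAL_*` definitions, and the sketch assumes `state_flags` and `hal_ctx->state_flags` hold the same value at the time of the check. Only the boolean structure follows the patch above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for illustration only; the real definitions
 * live in the ESP-IDF HAL headers and may differ. */
#define SKETCH_INTR_TI                (1u << 0) /* stand-in for TWAI_LL_INTR_TI */
#define SKETCH_STATUS_TBS             (1u << 1) /* stand-in for TWAI_LL_STATUS_TBS */
#define SKETCH_FLAG_TX_BUFF_OCCUPIED  (1u << 2) /* stand-in for TWAI_HAL_STATE_FLAG_TX_BUFF_OCCUPIED */
#define SKETCH_FLAG_BUS_OFF           (1u << 3) /* stand-in for TWAI_HAL_STATE_FLAG_BUS_OFF */

/* Returns true when the patched condition would report a TX-done event. */
bool tx_done_event(uint32_t interrupts, uint32_t state_flags, uint32_t status)
{
    /* Either the TX interrupt fired, or the HAL believes a frame is in the buffer. */
    bool tx_signalled = (interrupts & SKETCH_INTR_TI) ||
                        (state_flags & SKETCH_FLAG_TX_BUFF_OCCUPIED);
    /* The hardware reports the TX buffer as free again. */
    bool buffer_free = (status & SKETCH_STATUS_TBS) != 0;
    /* New in the proposed patch: never report TX-done while bus-off. */
    bool bus_off = (state_flags & SKETCH_FLAG_BUS_OFF) != 0;

    return tx_signalled && buffer_free && !bus_off;
}
```

With this model, a set "buffer occupied" flag plus a free TX buffer produces an event unless the bus-off flag is also set, which is the only case the patch changes.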
I will check! |
I am also having this issue. All of my TWAI_ERRATA fixes are enabled. I'm using ESP-IDF 4.4.4 with an ESP32MINI-01 rev3 device. I will try Dazza0's code patch right now and let you know. |
Unfortunately this doesn't fix it. I should note that I am using single-shot mode with no TX queue. I handle retries myself. This is the only function I use to send out CAN messages. My transmit function is as follows:
IRAM_ATTR esp_err_t tts_twai_send_message(twai_message_t* msg, uint32_t paceTime100us, uint32_t timeoutMS)
{
#define MAX_TX_RETRIES (16)
uint32_t alerts;
uint32_t paceTimeTicks;
esp_err_t err;
int retryCount = MAX_TX_RETRIES;
xSemaphoreTake(_semTxLock, portMAX_DELAY);
msg->flags = TWAI_MSG_FLAG_SS;
TickType_t tickTimeout = xTaskGetTickCount() + pdMS_TO_TICKS(timeoutMS);
if (twai_read_alerts(&alerts, 0) == ESP_OK)
{
#if TWAI_REPORT_AND_HANDLE_ALERTS
tts_twai_report_alerts(alerts);
#endif
}
RetryTx:
err = twai_transmit(msg, pdMS_TO_TICKS(100));
if (err != ESP_OK)
{
ESP_LOGE(EXAMPLE_TAG, "twai_transmit failed! Error=%08X", err);
alerts = 0;
twai_read_alerts(&alerts, pdMS_TO_TICKS(100));
#if TWAI_REPORT_AND_HANDLE_ALERTS
tts_twai_report_alerts(alerts);
#endif
if (err == ESP_ERR_INVALID_STATE)
{
if (twai_initiate_recovery() == ESP_OK)
{
ESP_LOGW(EXAMPLE_TAG, "Bus recovery initiated..");
// wait for bus to recover
while (xTaskGetTickCount() <= tickTimeout)
{
if (twai_read_alerts(&alerts, pdMS_TO_TICKS(100)) == ESP_OK)
{
#if TWAI_REPORT_AND_HANDLE_ALERTS
tts_twai_report_alerts(alerts);
#endif
if (alerts & TWAI_ALERT_BUS_RECOVERED)
{
if (--retryCount <= 0)
{
ESP_LOGE(EXAMPLE_TAG, "Max TX retry count exceeded!");
goto Done;
}
err = twai_start();
ESP_LOGW(EXAMPLE_TAG, "Bus Recovered. twai_start err=%i", err);
vTaskDelay(1);
goto RetryTx;
}
}
}
ESP_LOGE(EXAMPLE_TAG, "Timeout waiting for bus recovery");
}
else
{
ESP_LOGW(EXAMPLE_TAG, "TX failed for unknown reason");
if (--retryCount <= 0)
{
ESP_LOGE(EXAMPLE_TAG, "Max TX retry count exceeded!");
goto Done;
}
vTaskDelay(1);
}
}
else
{
ESP_LOGE(EXAMPLE_TAG, "TX failed Error=%08X", err);
if (--retryCount <= 0)
{
ESP_LOGE(EXAMPLE_TAG, "Max TX retry count exceeded!");
goto Done;
}
vTaskDelay(1);
goto RetryTx;
}
// This is caused by the loss of a TX interrupt (see errata).
// It should no longer occur, as there is a workaround in place deep in the TWAI driver stack.
//ESP_LOGE(EXAMPLE_TAG, "tts_twai_send_message1 Error=%08X", err);
//twai_stop();
//vTaskDelay(pdMS_TO_TICKS(20));
//twai_start();
//vTaskDelay(pdMS_TO_TICKS(20));
//err = twai_transmit(msg, pdMS_TO_TICKS(100));
}
else
{
// TX was successful. we should receive either an idle alert or some error promptly
// clear any alerts
if (twai_read_alerts(&alerts, pdMS_TO_TICKS(1000)) == ESP_OK)
{
#if TWAI_REPORT_AND_HANDLE_ALERTS
tts_twai_report_alerts(alerts);
#endif
if (alerts & TWAI_ALERT_TX_FAILED)
{
if (--retryCount <= 0)
{
ESP_LOGE(EXAMPLE_TAG, "Max TX retry count exceeded!");
err = ESP_ERR_TIMEOUT;
goto Done;
}
vTaskDelay(1);
goto RetryTx;
}
}
// honor pacetime
paceTimeTicks = pdMS_TO_TICKS(((TickType_t)paceTime100us + (TickType_t)9) / (TickType_t)10);
if (paceTimeTicks > 0)
{
vTaskDelay(paceTimeTicks);
}
else
{
uint64_t timeEnd = esp_timer_get_time() + (paceTime100us * 100);
while (timeEnd > esp_timer_get_time())
{
// taskYIELD() is used to request a context switch to another task. However, if there are
// no other tasks at a higher or equal priority to the task that calls taskYIELD() then
// the RTOS scheduler will simply select the task that called taskYIELD() to run again.
taskYIELD();
}
}
}
Done:
xSemaphoreGive(_semTxLock);
if (retryCount != MAX_TX_RETRIES)
{
//ESP_LOGI(EXAMPLE_TAG, "TX took %d retries", MAX_TX_RETRIES - retryCount);
}
return err;
} |
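The "honor pacetime" step in the function above is easy to misread, so here is a small host-side sketch of the arithmetic. The helper names are hypothetical, and `ms_to_ticks` assumes the usual FreeRTOS definition of `pdMS_TO_TICKS` (milliseconds times tick rate, divided by 1000, rounded down); check your own FreeRTOS configuration before relying on it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pace time arrives in 100 us units; (x + 9) / 10 rounds up to whole milliseconds. */
uint32_t pace_100us_to_ms(uint32_t pace_100us)
{
    return (uint32_t)(((uint64_t)pace_100us + 9u) / 10u);
}

/* Assumed equivalent of pdMS_TO_TICKS for a given tick rate (rounds down). */
uint32_t ms_to_ticks(uint32_t ms, uint32_t tick_rate_hz)
{
    return (uint32_t)(((uint64_t)ms * tick_rate_hz) / 1000u);
}

/* True when the function above would take the esp_timer busy-wait branch
 * (tick count of zero) instead of calling vTaskDelay(). */
bool pace_uses_busy_wait(uint32_t pace_100us, uint32_t tick_rate_hz)
{
    return ms_to_ticks(pace_100us_to_ms(pace_100us), tick_rate_hz) == 0;
}
```

Under these assumptions, a 500 us pace (value 5) busy-waits at a 100 Hz tick rate but becomes a one-tick vTaskDelay() at 1000 Hz, so the effective pacing depends on the tick rate as well as on the requested value.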
If I use this:
#ifdef CONFIG_TWAI_ERRATA_FIX_TX_INTR_LOST
//
if ((interrupts & TWAI_LL_INTR_TI || hal_ctx->state_flags & TWAI_HAL_STATE_FLAG_TX_BUFF_OCCUPIED) && status & TWAI_LL_STATUS_TBS && p_twai_obj->tx_msg_count) {
#else
it solves the problem. This is not a good fix, but it works in my case because I'm not using any kind of TX queuing. It does provide more proof that the problem is with this errata "fix". FYI, You have to manipulate some code to gain access to
Hopefully this will help someone else find a proper fix. :) |
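To illustrate why a guard on the pending-message count makes the assert go away, here is a toy host-side model. It assumes the driver decrements `tx_msg_count` once per TX-done event and then asserts `tx_msg_count >= 0`, which matches the assert text in the issue title; the struct and function names are hypothetical and this is not the driver's code.

```c
#include <stdbool.h>

/* Hypothetical model of the driver's pending-TX bookkeeping. */
typedef struct {
    int tx_msg_count; /* frames handed to the controller, not yet completed */
} tx_model_t;

/* Unguarded: every TX-done event decrements the counter.
 * Returns false when the invariant (count >= 0) is violated,
 * i.e. where the real driver would hit its assert. */
bool handle_tx_done_unguarded(tx_model_t *m)
{
    m->tx_msg_count--;
    return m->tx_msg_count >= 0;
}

/* Guarded: a TX-done event is ignored when nothing is pending,
 * which mirrors the effect of adding "&& tx_msg_count" to the condition. */
bool handle_tx_done_guarded(tx_model_t *m)
{
    if (m->tx_msg_count > 0) {
        m->tx_msg_count--;
    }
    return m->tx_msg_count >= 0;
}
```

In this model a single spurious event after the last real completion is enough to break the unguarded version. The guard hides the symptom but does not explain where the extra event comes from, which fits the description above of it being a workaround rather than a proper fix.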
I'm running into this after 40 minutes of runtime (more than 120,000 CAN messages sent). |
@igrr - any comments? |
@andrew-elder Have you tried the code changes from Dazza0 above, #9697 (comment)? |
@wanckl I have not yet tried the fix from #9697 (comment). I will try it. I am using the default settings, i.e. not using
|
@wanckl - the test failed after an hour with
The suggested change checks for bus-off, but that isn't expected to happen in my setup. The CAN bus is hardwired to another device that is continuously sending messages to the ESP32. Thank you for your help so far. Any other suggestions? |
Have you tried the fix I posted above? To use it, you have to disable the TX queue. It is not a good fix but it does solve the problem. I have been using it in a production product for a while now. |
@travis012 - I have not tried the fix you posted above. I will do so. |
@travis012 - an implementation very close to yours works for me, i.e. I no longer observe the assert() error. I don't observe that
@wanckl - does Espressif have plans to release a fix for this? |
Is it better than mine? Can you use the TX queue? I'd like to improve what I have in our production devices. Please share your fix. |
No, it's not better than yours. I removed the sem for example because I have a single thread sending messages. I have a hardwired CAN connection to another device, so there is never a BUS-OFF condition, so twai_initiate_recovery() always returns ESP_ERR_INVALID_STATE if I force it to trigger. |
Any update on this? I'm having the same issue. |
I have to check. |
Has anyone come up with a solution? This bug has just turned 2 years old, in this thread alone. Hey @diplfranzhoepfinger, help us out here, haha! I just encountered this issue in my project. If the bug has not been fixed yet, I would appreciate help with a safe way to restart my MCU to work around the problem. My system allows me to reset all peripherals ‘hanging’ on the CAN bus. I mention this because I noticed that the MCU’s return when restarted by |
Hi @willianaraujo, here's an example of how I did it on ESP-IDF v4.4.7. Using twai_initiate_recovery() causes a crash.
twai_general_config_t g_config = TWAI_GENERAL_CONFIG_DEFAULT(CAN_TX_PIN, CAN_RX_PIN, TWAI_MODE_NORMAL);
twai_timing_config_t t_config;
port_err_t start_driver(int can_speed)
{
switch (can_speed)
{
case 25000:
t_config = TWAI_TIMING_CONFIG_25KBITS();
break;
case 50000:
t_config = TWAI_TIMING_CONFIG_50KBITS();
break;
case 100000:
t_config = TWAI_TIMING_CONFIG_100KBITS();
break;
case 125000:
t_config = TWAI_TIMING_CONFIG_125KBITS();
break;
case 250000:
t_config = TWAI_TIMING_CONFIG_250KBITS();
break;
case 500000:
t_config = TWAI_TIMING_CONFIG_500KBITS();
break;
case 800000:
t_config = TWAI_TIMING_CONFIG_800KBITS();
break;
case 1000000:
t_config = TWAI_TIMING_CONFIG_1MBITS();
break;
default:
t_config = TWAI_TIMING_CONFIG_500KBITS();
ESP_LOGE(tagCanDriver, "Requested speed is NOT STANDARD, falling back to 500KBITS");
break;
}
twai_filter_config_t f_config = TWAI_FILTER_CONFIG_ACCEPT_ALL();
esp_err_t install_result = twai_driver_install(&g_config, &t_config, &f_config);
if (install_result != ESP_OK)
{
ESP_LOGE(tagCanDriver, "twai_driver_install failed: %s", esp_err_to_name(install_result));
return PORT_FAIL;
}
if (twai_start() != ESP_OK)
{
ESP_LOGE(tagCanDriver, "twai_start failed");
return PORT_FAIL;
}
ESP_LOGI(tagCanDriver, "Driver started with speed %d", can_speed);
return PORT_OK;
}
port_err_t stop_driver()
{
esp_err_t error = twai_stop();
if (error != ESP_OK && error != ESP_ERR_INVALID_STATE)
{
ESP_LOGE(tagCanDriver, "Failed to stop can, error was %s", esp_err_to_name(error));
return PORT_FAIL;
}
else if (error == ESP_ERR_INVALID_STATE)
{
ESP_LOGI(tagCanDriver, "Driver was not started");
}
error = twai_driver_uninstall();
if (error != ESP_OK && error != ESP_ERR_INVALID_STATE)
{
ESP_LOGE(tagCanDriver, "Failed to uninstall can, error was %s", esp_err_to_name(error));
return PORT_FAIL;
}
else if (error == ESP_ERR_INVALID_STATE)
{
ESP_LOGI(tagCanDriver, "Driver was not installed");
}
ESP_LOGI(tagCanDriver, "Driver Stopped");
return PORT_OK;
}
bool handleBusStatus(twai_status_info_t &status_info)
{
// Print the status info
ESP_LOGI(tagCanDriver, "State: %d, RX Error Counter: %d, TX Error Counter: %d, RX Queue Length: %d, TX Queue Length: %d",
status_info.state, status_info.rx_error_counter, status_info.tx_error_counter, status_info.msgs_to_rx, status_info.msgs_to_tx);
// Based on state, do something
if (status_info.state == TWAI_STATE_RUNNING)
{
ESP_LOGI(tagCanDriver, "CAN bus is running");
if (status_info.msgs_to_rx > 0)
{
ESP_LOGI(tagCanDriver, "Queued %d packages, ready to process...", status_info.msgs_to_rx);
}
}
else if (status_info.state == TWAI_STATE_STOPPED)
{
stop_driver();
delay(100);
start_driver(api.sensors.can.serviceSpeed);
ESP_LOGE(tagCanDriver, "Driver restarted");
}
else if (status_info.state == TWAI_STATE_BUS_OFF)
{
ESP_LOGE(tagCanDriver, "CAN bus is in bus-off state");
stop_driver();
delay(100);
start_driver(api.sensors.can.serviceSpeed);
ESP_LOGE(tagCanDriver, "Driver restarted");
}
else if (status_info.state == TWAI_STATE_RECOVERING)
{
ESP_LOGI(tagCanDriver, "CAN bus is recovering (does not really work)");
stop_driver();
delay(100);
start_driver(api.sensors.can.serviceSpeed);
ESP_LOGE(tagCanDriver, "Driver restarted");
ESP_LOGE(tagCanDriver, "twai_stop() and twai_driver_uninstall() called");
}
else
{
ESP_LOGE(tagCanDriver, "Unknown CAN bus state");
stop_driver();
delay(100);
start_driver(api.sensors.can.serviceSpeed);
ESP_LOGE(tagCanDriver, "Driver restarted");
}
return true;
} |
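The four non-running branches in handleBusStatus() above all do the same thing (stop, uninstall, wait, reinstall, start), so the policy boils down to one decision. A minimal host-side sketch of that decision follows; `sketch_state_t` is a hypothetical stand-in for the driver's `twai_state_t`, its numeric values are illustrative, and `needs_driver_restart` is not an ESP-IDF function.

```c
#include <stdbool.h>

/* Stand-in for the driver's state enum; names mirror the states checked
 * in the post above, numeric values are illustrative only. */
typedef enum {
    SKETCH_STATE_STOPPED,
    SKETCH_STATE_RUNNING,
    SKETCH_STATE_BUS_OFF,
    SKETCH_STATE_RECOVERING,
    SKETCH_STATE_UNKNOWN
} sketch_state_t;

/* Policy from the post above: anything other than "running" triggers a
 * full driver reinstall instead of twai_initiate_recovery(). */
bool needs_driver_restart(sketch_state_t state)
{
    return state != SKETCH_STATE_RUNNING;
}
```

Expressed this way, the restart sequence could live in a single helper that is called once when `needs_driver_restart()` returns true, rather than being repeated in each branch.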
Oof! I thought enabling these errata workarounds would improve stability.
Could someone please explain what that errata workaround is meant to improve? The help says:
Just curious: what would I be missing by not having that errata workaround? Does it lead to a few missed TX CAN frames? |
Answers checklist.
IDF version.
v4.4.2
Operating System used.
Linux
How did you build your project?
Eclipse IDE
If you are using Windows, please specify command line type.
No response
Development Kit.
Atom M5
Power Supply used.
USB
What is the expected behavior?
Not crashing
What is the actual behavior?
Instead it crashes
assert failed: twai_handle_tx_buffer_frame twai.c:183 (p_twai_obj->tx_msg_count >= 0)
Steps to reproduce.
Debug Logs.
More Information.
No response