Skip to content

Conversation

@3djc
Copy link
Collaborator

@3djc 3djc commented Dec 31, 2025

Based in conversation between ELRS and ETX teams, it has been determined that CRSF external module speed needs to be limited to 3.75 Mbps for F4 radios.

While plenty of users THINK they have a very well working setup, since no obvious error message is present, there are however a significant number of invisible CRC errors leading to telemetry frames been lost, without any way for the user to be aware of that.

Analysis by ELRS team:
image

This is very present for F4 radio, but much less so for newer H7 radio, so the decision has been made to limit it just for F4 class radio

This PR will prevent 5.25 to be selected on those radio, and also limits the value to 3.75Mbps for existing models.

@elecpower you might want a companion side for it too, for consistency (albeit the radio will limit speed even if a model was edited to 5.25 with companion).

PS: don't shoot the messenger !!!!

@elecpower
Copy link
Collaborator

@elecpower you might want a companion side for it too, for consistency (albeit the radio will limit speed even if a model was edited to 5.25 with companion).

@3djc you wish is my command

@CapnBry
Copy link
Contributor

CapnBry commented Dec 31, 2025

It is true. For external modules, 3.75M baud has an incoming (to handset) error rate in the ballpark of roughly 1 in 1M, where at 5.25M it is around 1 in 50K. They're all missing bytes, not bit errors or otherwise corrupted data, and the UART register overflow error bit is set.

For internal modules, 5.25M works fantastically, with an error rate orders of magnitude lower than even the external at 3.75M. I'm not even sure you'll even see 1 error in a full day of data at the fastest ELRS link rate.

@3djc
Copy link
Collaborator Author

3djc commented Dec 31, 2025

@elecpower you might want a companion side for it too, for consistency (albeit the radio will limit speed even if a model was edited to 5.25 with companion).

@3djc you wish is my command

Thanks a lot !

@pfeerick pfeerick changed the title fix(F4 radio): limit CRSF external speed to 3.75Mbps for F4 radio fix(f4): limit CRSF external speed to 3.75Mbps Jan 1, 2026
Copy link
Member

@pfeerick pfeerick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code and rationale looks good, haven't tried it yet though.

PS: If you have an issue with the messenger, you gotta go through me first! :-P

But seriously, we can't allow for known bad settings to be selectable... just like the default hardware related settings should generally "just work" for everyone where possible.

pfeerick

This comment was marked as outdated.

@pfeerick pfeerick added this to the 2.12.0 milestone Jan 5, 2026
@richardclli
Copy link
Collaborator

richardclli commented Jan 5, 2026

What radios are tested for this problem? I am thinking we may need to do more analysis to check for the root cause first before doing some quick fix change.

@gismo2004
Copy link

I am running 3.75 on my V16 since quite some time without issues. Setting it higher leads to problems with elrs.lua and also telemetry issues from time to time.

So V16 can be seen as tested I think.

@richardclli
Copy link
Collaborator

I am running 3.75 on my V16 since quite some time without issues. Setting it higher leads to problems with elrs.lua and also telemetry issues from time to time.

So V16 can be seen as tested I think.

Did you tested with Internal ELRS module or external ELRS module? I spot that many radios did not have RX DMA enabled for external S.Port, and this maybe the root cause of unstable ELRS telemetry instead.

@gismo2004
Copy link

Yes, my understanding was that this is only a problem with external Moduls. However I must admit, that I haven't tested with older firmware as initially when I got my V16 I can't think of problems in that regards and I always was running the highest rate back then. But this could also be an always present problem but not visible for some reason

@richardclli
Copy link
Collaborator

Yes, my understanding was that this is only a problem with external Moduls. However I must admit, that I haven't tested with older firmware as initially when I got my V16 I can't think of problems in that regards and I always was running the highest rate back then. But this could also be an always present problem but not visible for some reason

Then the RX DMA maybe a root cause then, in 2.11, the horus target, only the X12S has setup the TELEMETRY RX DMA properly. Let me see if it is possible to setup the RX DMA for external module first.

@richardclli
Copy link
Collaborator

Checked, v16 did not have telemetry rx dma enabled because the dma1 stream5 is occupied by the dac sound.

However, Flysky nv14/el18/pl18u did not use dac for sound, so should be ok to use telemetry dma rx.

@richardclli
Copy link
Collaborator

@3djc What is the most convenient way to check the CRC errors? I want to check if enabling RX DMA can solve the problem.

@3djc
Copy link
Collaborator Author

3djc commented Jan 6, 2026

Those test have been conducted by ELRS team, but you can replicate something similar by doing a specific firmware with this trace enabled. It does require fairly long runtime as it is cyclic and may not manifest quickly

@CapnBry
Copy link
Contributor

CapnBry commented Jan 6, 2026

Here's the diff we were using which hacks in a crsfErrCnt along with a ms-since-last-update (should be centered around 240ms) and puts them out in the SD Log file. The actual number of dropped packets is usually more than this reported value, due to the parser not being able to recover a valid packet which follows a packet with a dropped byte (see #6919)
tlmlogtime.patch

@pfeerick
Copy link
Member

pfeerick commented Jan 11, 2026

Here's the diff we were using which hacks in a crsfErrCnt along with a ms-since-last-update (should be centered around 240ms) and puts them out in the SD Log file. The actual number of dropped packets is usually more than this reported value, due to the parser not being able to recover a valid packet which follows a packet with a dropped byte (see #6919) tlmlogtime.patch

When you've been testing this against STM32F4 radios, has this been with older versions of EdgeTX than 2.12 / nightly? Just want to rule out any possible recent (recent being 6-7 months 🤪 ) timing issues that may or may not have been just fixed.

@CapnBry
Copy link
Contributor

CapnBry commented Jan 11, 2026

It's been a mix versions ranging from 2.11.2/3 and whatever was main at the time. The testing wasn't particularly focused on EdgeTX code and therefore no statistical comparisons by version were made.

The "Telemetry Lost" callout has been something we've spent a lot of time interacting with frooooommmm I'd say July 2025 to Dec 2025 as we were dealing with some really rarely occurring connection loss in our 4.0 code so we all had many handsets running 2h-24h endurance tests and using the callout as an indicator as a potential failure in our code for review. Once those issues are resolved (?) there was still the occasional callout which lead to the instrumenting to assess the error rate, maybe starting in late Oct / early Nov.

I can't say if there's a notable increase or decrease in dropped bytes in any particular versions that we've used in testing. I wasn't really too aware of internal changes that had been made to the CRSF-handling code and was only vaguely familiar with some of the internal changes with regards to UART processing that you all have done. I didn't know there was that many changes being made in this area and just sort of assumed it hadn't really had enough change to warrant comparative evaluation. That's ignorance on my part though if there's modern branches that would display different behavior.

@3djc
Copy link
Collaborator Author

3djc commented Jan 11, 2026

There wasn't, just that for the last 6 month or so, f4 where running without cache, so with decreased performance, that have either created or amplified some issues. But yes, we have been fighting with that issue for much longer

@pfeerick
Copy link
Member

ranging from 2.11.2/3 and whatever was main at the time.

That's perfect. I was just making sure there was no confusion, as it was main/3.0 which had the issue since #6183 was merged. Since 2.11 was in the mix, this is completely unrelated, and the underlying premise of the need for the PR is still sound.

3djc and others added 4 commits January 21, 2026 02:08
The limitation is intentional due to CRC errors, not a temporary workaround. It would be implicitly removed when all STM32F4 support is eventually removed in the future - so is not a todo item in of itself.
@pfeerick pfeerick force-pushed the 3djc/limit-ext-crsf-baudrate branch from 93775d1 to 1d260e1 Compare January 21, 2026 02:44
@pfeerick
Copy link
Member

@elecpower Can you re-add the equivalent of this checkbox for me (added in #6932)? My QT6 Designer is messing around with all the formatting, wants to remove a lot of the alignment stuff, so I opted to leave the file as it is when you last changed it in this PR 😆 Went in the first ... column? ... of options.

         <widget class="QCheckBox" name="enableAETR">
          <property name="text">
           <string>Enable AETR</string>
          </property>
         </widget>
        </item>

@elecpower
Copy link
Collaborator

@elecpower Can you re-add the equivalent of this checkbox for me (added in #6932)? My QT6 Designer is messing around with all the formatting, wants to remove a lot of the alignment stuff, so I opted to leave the file as it is when you last changed it in this PR 😆 Went in the first ... column? ... of options.

         <widget class="QCheckBox" name="enableAETR">
          <property name="text">
           <string>Enable AETR</string>
          </property>
         </widget>
        </item>

Done

@pfeerick pfeerick added the bug 🪲 Something isn't working label Jan 24, 2026
@pfeerick
Copy link
Member

pfeerick commented Jan 24, 2026

@elecpower What do you think about this? The value was/is 5.25M, so it will stay blank because the value is ... invalid... and the value written unchanged. Should it be caught on read and automatically bumped down?

image

@elecpower
Copy link
Collaborator

In this case the code allowed this no longer valid value so the code should fix.

Best place to fix is in the yaml decode with a version referenced comment.

Do you want me to do the commit or will you?

@pfeerick
Copy link
Member

That was what I was thinking. If you could add it that would be great as I can continue with testing a few more PRs :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug 🪲 Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants