RapidSetupX Crashing because of Recorders

We are seeing RapidSetupX lose its connection to the server, with the following error being repeatedly logged. The controller and network are started in advance by our app, and then we attempt to connect to the server. Immediately after connecting, the server disconnects and we get the error below.

5/5/2025 5:07:11 PM Error Status(StatusCode="ResourceExhausted", Detail="Server Threadpool Exhausted") Grpc.Core.RpcException Line 0 at RSI.Shared.Services.Rmp.RmpGrpcServiceClient.RecorderAsync(RecorderRequest request)
at RSI.Shared.Services.Rmp.RecorderService.<.ctor>b__0_0(Object request)
at CallSite.Target(Closure, CallSite, Object)
at RSI.Shared.Services.Rmp.RapidService.CallRpcAndUpdateAsync(Object request)
at CallSite.Target(Closure, CallSite, Object)
at RSI.Shared.Services.Rmp.RapidService.RequestAsync(Object configOrActionMessage, MessageType messageType)
at RSI.Shared.Services.Rmp.RapidService.ActionSetAsync(Object actionMessage)
at RapidSetupX.Models.Scope.ScottPlotScopeX.Records_UpdateFromServerAsync()

Because the error mentioned recorders, I commented out the section of our app that starts recorders, and afterwards the error no longer materializes. Our app starts ~7 recorders and tries to log around 200 metrics. The sampling period and collection period do not affect whether the error happens, although I would like to sample at 10 ms.

There are no errors in the RapidServer logs nor any errors being logged by our app. The failure seems to be limited to RapidSetupX.

We have seen this particular message before. It typically shows up when the server is overwhelmed with too many concurrent requests or tasks and its thread pool cannot accommodate additional work. To address this, we will improve the RPC throttling to prevent server exhaustion.
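
To illustrate what we mean by throttling, here is a minimal sketch (not the actual RapidSetupX code; fetchRecorderDataAsync and the recorder indices are placeholders for the real gRPC call): cap the number of in-flight recorder RPCs with a semaphore so the server's thread pool is never asked to service every recorder at once.

```csharp
// Sketch only: cap the number of concurrent recorder RPCs on the client side.
// fetchRecorderDataAsync is a placeholder for the real gRPC recorder call.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledRecorderReader
{
    // Allow at most 2 recorder requests in flight at any time.
    private static readonly SemaphoreSlim Gate = new SemaphoreSlim(2);

    public static async Task ReadAllRecordersAsync(
        IEnumerable<int> recorderIndices,
        Func<int, Task> fetchRecorderDataAsync)
    {
        var tasks = recorderIndices.Select(async index =>
        {
            await Gate.WaitAsync();                  // wait for a free slot
            try
            {
                await fetchRecorderDataAsync(index); // stand-in for the RecorderAsync RPC
            }
            finally
            {
                Gate.Release();                      // free the slot for the next recorder
            }
        });

        await Task.WhenAll(tasks);
    }
}
```

With a cap of 2, reading 7 recorders still completes; the requests just arrive in waves instead of a single burst that exhausts the server's thread pool.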

But the main issue seems to be that RapidSetupX is automatically trying to read your recorder data, and it's polling all 7 recorders. In testing with two instances of RapidSetupX running, we observed that the second instance intermittently "steals" data records, resulting in incomplete data sets.

This is unintended behavior, and we are prioritizing a fix. In the meantime, we recommend not running RapidSetupX while recorders are active to prevent data conflicts. Thank you for bringing this to our attention, and we'll provide updates on the resolution soon.

Thank you @jp.

Do you know the maximum number of recorders/metrics I can use safely without triggering this failure mode?

We were planning to use recorders to grab data and then use Prometheus to export that data to Telegraf. I don't need or want to see the metrics in RapidSetupX, but in this use case the recorders would be running continuously, so we would not be able to use RapidSetupX at all.
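
For context, here is roughly the export path we have in mind, sketched with the prometheus-net NuGet package; Telegraf would scrape the /metrics endpoint via its prometheus input plugin. ReadLatestSample() is a hypothetical stand-in for however we end up pulling values out of the recorders, not an RMP/RapidCode API.

```csharp
// Sketch of the planned export path: recorder samples -> Prometheus endpoint -> Telegraf scrape.
// Uses the prometheus-net NuGet package; ReadLatestSample() is a hypothetical stand-in for
// whatever code ends up reading values from the recorders (it is not an RMP/RapidCode API).
using System.Threading;
using Prometheus;

class MetricsExporter
{
    static void Main()
    {
        // Expose http://localhost:9184/metrics for Telegraf's prometheus input plugin to scrape.
        var server = new MetricServer(port: 9184);
        server.Start();

        var position = Metrics.CreateGauge(
            "axis_actual_position", "Actual position per axis", "axis");

        while (true)
        {
            for (int axis = 0; axis < 4; axis++)
            {
                // Hypothetical helper returning the most recent recorder sample for this axis.
                position.WithLabels(axis.ToString()).Set(ReadLatestSample(axis));
            }
            Thread.Sleep(10); // aiming for the ~10 ms sampling period mentioned above
        }
    }

    static double ReadLatestSample(int axis) => 0.0; // placeholder
}
```

Telegraf would then point an [[inputs.prometheus]] section at that URL.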

If you don't expect to have a fix within a couple of weeks, we have a few other options:

  1. reduce the number of metrics we are logging to avoid this error
  2. find an alternative way to log metrics to telegraf without recorders
  3. accelerate the development of our own gui so we don’t need to use RapidSetupX

Hey @edco, I’ve identified and resolved the issue, and I’ve submitted a pull request. I’ll check in with the developers tomorrow to discuss the timeline for the next release and let you know.

Sorry for the inconvenience.

The issue here stems from RapidSetupX performing an unintended operation. Regardless of whether one or multiple recorders are active, RapidSetupX currently attempts to fetch data from them at the same time as your application, causing potential data conflicts and overwhelming the server.

Does this only happen if the customer application is using recorders that RapidSetupX would want to use, say recorders 0-n?
For example, in my application, I reserve the first 4 recorders for the (old) Motion Scope tool and don’t use them. How many recorders will RapidSetupX poll? All of them?

I haven’t noticed this problem when using RapidSetupX and my application, and my application always has recorders running.

Hello @todd_mm,

Here’s a breakdown of the current behavior:

  1. RapidSetupX connects to RapidServer.
  2. It retrieves all object counts from the controller.
    Note: At this point, your recorders may have already started recording.
  3. For each recorder, RapidSetupX creates a corresponding UI object.
    • During creation, it attaches a listener to the IsRecordingChanged event.
  4. When IsRecording changes for any recorder, RapidSetupX attempts to start plotting.
    • This was designed to capture recorders that may have been auto-started via a motion trigger.

The issue:
This behavior lacks additional verification to determine how the recorder was triggered. Specifically, we should distinguish between motion triggers initiated via the RapidSetupX UI and those initiated externally (e.g., through RapidCode or another application). The missing logic here is what leads to the unintended plotting and data hijacking.
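
To make the missing check concrete, here is a rough sketch of the handler logic (illustrative only, not the actual RapidSetupX source; RecorderViewModel, StartedByThisUi, and StartPlotting are made-up names):

```csharp
// Illustrative only, not the actual RapidSetupX source. RecorderViewModel, StartedByThisUi,
// and StartPlotting are made-up names; the real fix tracks which recorders RapidSetupX
// itself started and ignores IsRecording changes on recorders owned by another application.
using System;

public class RecorderViewModel
{
    public bool StartedByThisUi { get; set; }   // set when recording is started from the UI
    public event Action<RecorderViewModel, bool> IsRecordingChanged;
}

public class ScopePlotter
{
    public void Attach(RecorderViewModel recorder)
    {
        recorder.IsRecordingChanged += OnIsRecordingChanged;
    }

    private void OnIsRecordingChanged(RecorderViewModel recorder, bool isRecording)
    {
        if (!isRecording)
            return;

        // Old behavior: plot for any recorder that starts recording, even one started
        // externally (RapidCode or another application), which hijacks its data.
        // Fixed behavior: only plot recorders this RapidSetupX instance started itself.
        if (recorder.StartedByThisUi)
        {
            StartPlotting(recorder);
        }
    }

    private void StartPlotting(RecorderViewModel recorder)
    {
        // fetch records from the server and plot them
    }
}
```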

My guess as to why you're not seeing the issue:
Maybe your recorders are already started when you launch RapidSetupX and their state does not change while RapidSetupX is running? If that is the case, the IsRecordingChanged event would never fire and we would not try to hijack your data.

While the issue above may not be exactly the one you’re experiencing, @edco, your stack trace suggests it’s related.

So let me try to explain more clearly:

Your error:

Error Status(StatusCode="ResourceExhausted", Detail="Server Threadpool Exhausted")
...
at RapidSetupX.Models.Scope.ScottPlotScopeX.Records_UpdateFromServerAsync()

This is happening because RapidSetupX is calling Records_UpdateFromServerAsync simultaneously for all your recorders. The server can only handle a limited number of concurrent requests, and when that limit is exceeded, it throws a ResourceExhausted error.

At the same time, RapidSetupX is also sending a ping request to check if the server is alive. If the server is already overwhelmed, this ping fails too—causing RapidSetupX to disconnect.

In the next release, a couple of improvements have been made to RapidSetupX:

  • RapidSetupX will not disconnect on the first failed ping, but only after the third consecutive failure (see the sketch after this list).
  • RapidSetupX will no longer try to automatically read recorder data from all recorders simultaneously.
    (This directly addresses the ResourceExhausted issue caused by the parallel calls to Records_UpdateFromServerAsync().)
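
For the first bullet, the change amounts to a consecutive-failure counter around the ping, roughly like this (a sketch, not the exact implementation; pingAsync and disconnect stand in for the real client calls):

```csharp
// Sketch of the new ping behavior: disconnect only after three consecutive failed pings.
// pingAsync and disconnect are placeholders for the real client calls.
using System;
using System.Threading.Tasks;

public class ConnectionMonitor
{
    private const int MaxConsecutiveFailures = 3;
    private int _consecutiveFailures;

    public async Task CheckServerAsync(Func<Task<bool>> pingAsync, Action disconnect)
    {
        bool alive;
        try
        {
            alive = await pingAsync();
        }
        catch (Exception)
        {
            alive = false;   // a throwing ping (e.g. ResourceExhausted) counts as a failure
        }

        if (alive)
        {
            _consecutiveFailures = 0;      // a successful ping resets the counter
        }
        else if (++_consecutiveFailures >= MaxConsecutiveFailures)
        {
            disconnect();                  // give up only on the third failure in a row
        }
    }
}
```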

Future:
We are working on better ways to benchmark gRPC request limits and plan to document them clearly to help you all understand and manage these limitations more effectively.

Fixed and released in 10.6.8, available on the portal.