question

GoldinJonathon-6344 asked asergaz answered

ProvisioningDeviceClient leaking file descriptors on failure to provision

I have been seeing file descriptors build up over time during connect/disconnect cycles with the Azure IoT Python SDK (azure-iot-device v2.5.0). I've narrowed at least one of those leaks down to provisioning with ProvisioningDeviceClient.
I create the client with:
ProvisioningDeviceClient.create_from_x509_certificate() or ProvisioningDeviceClient.create_from_symmetric_key()
and then call:
await client.register()

To reliably cause a provisioning failure, I block the device in the Azure IoT Central portal from the Manage device menu:
146136-image.png
I then let my system repeatedly try to re-provision.
As expected, each attempt fails with:
ClientError('Unexpected failure') caused by ServiceError('Query Status operation returned a failed registration status with a status code of 200')

After enough failed attempts, the lsof output for the process looks like this:

python3 7082 root 990u IPv4 4183756 0t0 TCP localhost:39219 (LISTEN)
python3 7082 root 991u IPv4 3752021 0t0 TCP localhost:46158->localhost:41733 (ESTABLISHED)
python3 7082 root 992u IPv4 3752022 0t0 TCP localhost:41733->localhost:46158 (ESTABLISHED)
python3 7082 root 993u IPv4 2219704 0t0 TCP localhost:38190->localhost:39981 (ESTABLISHED)
python3 7082 root 994u IPv4 2219705 0t0 TCP localhost:39981->localhost:38190 (ESTABLISHED)
python3 7082 root 995u IPv4 2228111 0t0 TCP <USER>:38373->23.96.222.45:https (ESTABLISHED)
python3 7082 root 996u IPv4 2228107 0t0 TCP localhost:36778->localhost:34745 (ESTABLISHED)
python3 7082 root 997u IPv4 2228108 0t0 TCP localhost:34745->localhost:36778 (ESTABLISHED)
python3 7082 root 998u IPv4 2233216 0t0 TCP <USER>:40459->20.49.99.105:https (ESTABLISHED)
python3 7082 root 999u IPv4 2224629 0t0 TCP localhost:38436->localhost:45177 (ESTABLISHED)
python3 7082 root 1000u IPv4 2224630 0t0 TCP localhost:45177->localhost:38436 (ESTABLISHED)
python3 7082 root 1001u IPv4 2242130 0t0 TCP <USER>:37297->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1002u IPv4 2239931 0t0 TCP localhost:35740->localhost:44409 (ESTABLISHED)
python3 7082 root 1003u IPv4 2239932 0t0 TCP localhost:44409->localhost:35740 (ESTABLISHED)
python3 7082 root 1004u IPv4 2246264 0t0 TCP <USER>:55565->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1005u IPv4 2244990 0t0 TCP localhost:33896->localhost:41849 (ESTABLISHED)
python3 7082 root 1006u IPv4 2244991 0t0 TCP localhost:41849->localhost:33896 (ESTABLISHED)
python3 7082 root 1007u IPv4 2888225 0t0 TCP <USER>:34721->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1008u IPv4 2257048 0t0 TCP localhost:41728->localhost:37823 (ESTABLISHED)
python3 7082 root 1009u IPv4 2257049 0t0 TCP localhost:37823->localhost:41728 (ESTABLISHED)
python3 7082 root 1010u IPv4 2262215 0t0 TCP <USER>:37229->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1011u IPv4 2262213 0t0 TCP localhost:57022->localhost:35511 (ESTABLISHED)
python3 7082 root 1012u IPv4 2263728 0t0 TCP localhost:35511->localhost:57022 (ESTABLISHED)
python3 7082 root 1013u IPv4 2482462 0t0 TCP <USER>:52517->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1014u IPv4 2271152 0t0 TCP localhost:37584->localhost:37143 (ESTABLISHED)
python3 7082 root 1015u IPv4 2271153 0t0 TCP localhost:37143->localhost:37584 (ESTABLISHED)
python3 7082 root 1016u IPv4 2277756 0t0 TCP <USER>:51283->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1017u IPv4 2280846 0t0 TCP localhost:55030->localhost:45309 (ESTABLISHED)
python3 7082 root 1018u IPv4 2280847 0t0 TCP localhost:45309->localhost:55030 (ESTABLISHED)
python3 7082 root 1019u IPv4 2278378 0t0 TCP <USER>7:43427->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1020u IPv4 2285742 0t0 TCP localhost:42732->localhost:36379 (ESTABLISHED)
python3 7082 root 1021u IPv4 2285743 0t0 TCP localhost:36379->localhost:42732 (ESTABLISHED)
python3 7082 root 1022u IPv4 3342593 0t0 TCP localhost:44599 (LISTEN)
python3 7082 root 1023u IPv4 3341441 0t0 TCP localhost:54166->localhost:44599 (ESTABLISHED)


Is this expected, and how can I work around it? If this goes on long enough, my program fails because it can no longer create new file descriptors.
Is there some cleanup that should happen when provisioning fails?
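
For anyone trying to confirm a leak like this from inside the process rather than with lsof, here is a minimal sketch. It assumes a Linux system where /proc/self/fd is available; count_open_fds is a hypothetical helper name and is not part of the SDK.

```python
import os


def count_open_fds():
    """Return the number of file descriptors currently open in this process.

    Assumption: Linux, where /proc/self/fd has one entry per open descriptor.
    The count includes the descriptor used to read the directory itself, so
    compare counts taken at different times instead of relying on the
    absolute value.
    """
    return len(os.listdir("/proc/self/fd"))
```

Logging this value before and after each provisioning attempt should show whether the count keeps growing across retries.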



azure-iot-sdk, azure-iot-dps

Hello @GoldinJonathon-6344, could you try to reproduce this with the latest version of the SDK, 2.9.0? I would also recommend opening a bug in the SDK GitHub repo; you can reference this thread there.

Thank you!



Thanks @GoldinJonathon-6344 for opening a bug in the SDK. Here is the related link for others to follow: https://github.com/Azure/azure-iot-sdk-python/issues/899


1 Answer

asergaz answered

Hello @GoldinJonathon-6344, in the related GitHub thread, Carter found the solution to your issue. Could you please mark it as the answer here as well? Thanks a lot!

Solution Summary: "I suspect the issue is likely that you're instantiating the client again every time you retry, thus opening a new fd without allowing the previous one to close. Try moving the client instantiation outside of the loop."
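
A minimal sketch of that suggestion follows. It assumes the client is created once and passed in, and that only the register() call is retried. register_with_retry, max_attempts and delay_s are hypothetical names for illustration, not part of azure-iot-device, and catching every exception is a simplification.

```python
import asyncio


async def register_with_retry(client, max_attempts=5, delay_s=1.0):
    """Retry client.register() on the SAME client object.

    `client` is expected to be an already-created provisioning client, for
    example one built once (outside any loop) with
    azure.iot.device.aio.ProvisioningDeviceClient.create_from_symmetric_key().
    No new client is instantiated here, so retries do not create
    additional client objects.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return await client.register()
        except Exception as exc:  # simplification: narrow this in real code
            last_error = exc
            if attempt < max_attempts:
                await asyncio.sleep(delay_s)
    # All attempts failed: surface the last error to the caller.
    raise last_error
```

The calling code would create the client a single time with one of the create_from_* methods mentioned in the question and then await register_with_retry(client), instead of calling create_from_* again on every failed attempt.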

147764-image.png


