Myactuator status broadcast #1

Open
wants to merge 7 commits into
base: main


ioio2995
Contributor

@ioio2995 ioio2995 commented Jun 6, 2024

Hello,

I have been using your SDK since late 2023 and have also adapted a hardware interface for ROS2_Control. Given the high quality of your code, I took the liberty of forking it and adapting a controller that publishes the motor_status 1 and motor_status 3 information on a topic myactuator_rmd_status, similar to the joint_state_broadcaster.

The main modifications are as follows:

  • ROS Interface myactuator_rmd_interface: Contains the message structures used by the myactuator_rmd_broadcaster topic.
  • ROS2_Control Controller myactuator_rmd_state_broadcaster: Functions similarly to the joint_state_broadcaster. It queries the state_interfaces of the hardware interface to publish this data on the /myactuator_rmd_broadcaster topic.
  • ROS2_Control Hardware Interface myactuator_rmd_hardware: Slightly modified to retrieve and return additional information from motor_status 1 and 3.

The published data includes:

  • rmd_brake
  • rmd_current
  • rmd_current_phase_a
  • rmd_current_phase_b
  • rmd_current_phase_c
  • rmd_error_code
  • rmd_temperature
  • rmd_voltage

Having adapted this code to your main branch, I am submitting this pull request.

Thank you for considering this contribution. I am available for any questions or clarifications.

Best regards,

@2b-t
Owner

2b-t commented Jun 6, 2024

Thank you, @ioio2995 !
I will have a look at it in more detail over the coming weekend.
Sorry, I did not realize that you had already written a hardware interface leveraging my driver. I had planned to write a ros2_control hardware interface from the beginning but after writing the CAN driver SDK I did not have much time for a few months to work on it. I have written this ROS 2 integration pretty much on two evenings and it is still a work in progress (I still have a few optimizations and fixes to apply to it, e.g. currently the controller will send the motor to the position 0.0 at the beginning which probably is not what one would want and I will implement a simple low-pass filter for torque and velocity as they are pretty noisy otherwise). If you have any additional input, it is highly appreciated!

From what I have seen so far working with the actuator - at least on my system - any request to the actuator takes around 1 millisecond. I am unsure whether that is related to my kernel configuration (see $ cat /boot/config-$(uname -r) | grep HZ, e.g. CONFIG_HZ_1000=y), the CAN adapter that I am using (I would like to eliminate this one from my set-up as these are unreliable anyway) or the actuator itself. This means that the maximum control frequency is limited by the number of CAN commands sent to the actuator in a single control loop (this is why I put the CAN communication in an asynchronous thread, so that I am at least not blocking the controller). Currently in my implementation only one command is sent to the actuator and therefore - at least theoretically - the hardware interface could run at 1 kHz; with the two additional reads that you have added this would drop to 333 Hz. I have not yet had the time to benchmark what the maximum attainable control frequency in ros2_control is in practice and whether this would be a limitation. I think it would be desirable to be able to turn publishing this information on and off (so that nobody is forced to run at a lower frequency due to the additional reads) or to read this information at a lower fixed frequency (e.g. 2, 5 or 10 Hz). Additionally, for outputting the information we might as well leverage the ros2_control GPIOs (see here for an example) instead of the dedicated myactuator_rmd_broadcaster. This is also how the Universal Robots driver does it (see here).
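The frequency estimate above is plain arithmetic: if every request blocks for one round trip, the loop rate is bounded by the inverse of latency times requests per cycle. A minimal sketch of that reasoning follows; the function names are hypothetical and not part of the driver, and the 1 ms figure is the one quoted in the comment.

```python
def max_loop_frequency_hz(request_latency_s: float, requests_per_cycle: int) -> float:
    """Upper bound on the loop rate when each request blocks for one round trip."""
    if request_latency_s <= 0.0 or requests_per_cycle < 1:
        raise ValueError("latency must be positive and at least one request is needed")
    return 1.0 / (request_latency_s * requests_per_cycle)


def average_requests_per_cycle(base_requests: int, extra_requests: int,
                               loop_hz: float, extra_hz: float) -> float:
    """Average request count per cycle when the extra reads run at a lower rate."""
    return base_requests + extra_requests * (extra_hz / loop_hz)


# One command per cycle at about 1 ms per request
print(round(max_loop_frequency_hz(1e-3, 1)))  # 1000
# One command plus two extra status reads in every cycle
print(round(max_loop_frequency_hz(1e-3, 3)))  # 333
# Two extra reads issued at only 10 Hz inside a 500 Hz loop (illustrative numbers)
print(average_requests_per_cycle(1, 2, loop_hz=500.0, extra_hz=10.0))
```

Reading the extra statuses at a few hertz keeps the average request count per cycle close to one, which is the motivation for the on/off switch or fixed low rate suggested above.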

@ioio2995
Contributor Author

ioio2995 commented Jun 7, 2024

Hello,

> Thank you, @ioio2995 !
> I will have a look at it in more detail over the coming weekend.
> Sorry, I did not realize that you had already written a hardware interface leveraging my driver. I had planned to write a ros2_control hardware interface from the beginning but after writing the CAN driver SDK I did not have much time for a few months to work on it. I have written this ROS 2 integration pretty much on two evenings and it is still a work in progress (I still have a few optimizations and fixes to apply to it, e.g. currently the controller will send the motor to the position 0.0 at the beginning which probably is not what one would want and I will implement a simple low-pass filter for torque and velocity as they are pretty noisy otherwise)

Yes, indeed I had the same constraint. Unfortunately, the problem remains complex because, given that the Hall effect sensor magnet is placed on the motor shaft (input shaft), it does not undergo the reduction. It works perfectly when the controller is powered, but if the output shaft is manipulated by more than 43 degrees (motor off), for the X8pro V2 which undergoes a 1/9 reduction for example, we lose motor turns and the output shaft is offset when powered back on. A solution was implemented for the single turn with the V3 versions and the second encoder (provided that myActuator support responds to our requests for an updated firmware). The workaround I put in place is the systematic reset of the encoder via a 0x63 command. I did some tests with robot_localization and an IMU which are promising. But it remains a workaround.

> From what I have seen so far working with the actuator - at least on my system - any request to the actuator takes around 1 millisecond. I am unsure whether that is related to my kernel configuration (see $ cat /boot/config-$(uname -r) | grep HZ, e.g. CONFIG_HZ_1000=y), the CAN adapter that I am using (I would like to eliminate this one from my set-up as these are unreliable anyway) or the actuator itself. This means that the maximum control frequency is limited by the number of CAN commands sent to the actuator in a single control loop (this is why I put the CAN communication in an asynchronous thread, so that I am at least not blocking the controller). Currently in my implementation only one command is sent to the actuator and therefore - at least theoretically - the hardware interface could run at 1 kHz; with the two additional reads that you have added this would drop to 333 Hz. I have not yet had the time to benchmark what the maximum attainable control frequency in ros2_control is in practice and whether this would be a limitation. I think it would be desirable to be able to turn publishing this information on and off (so that nobody is forced to run at a lower frequency due to the additional reads) or to read this information at a lower fixed frequency (e.g. 2, 5 or 10 Hz).

I have taken your proposals into account; activating the retrieval of motor_status 1 and 3 on demand is indeed a good approach. I had initially considered creating a second thread, but reading the CAN interface obviously poses problems. I tried to make progress, but without real success.

> Additionally, for outputting the information we might as well leverage the ros2_control GPIOs (see here for an example) instead of the dedicated myactuator_rmd_broadcaster. This is also how the Universal Robots driver does it (see here).

I am not too convinced of the use of GPIOs: for using the brake it would be perfect, but for retrieving the various hardware indicators of the motor, a low-speed broadcast still seems the best solution.

Also, while going through your code and modifying the command_mode_switch functions, I realized that a hardware interface represents an actuator. How do you plan to manage multiple actuators for a single CAN port in the future (the ability to chain actuators is one of the advantages of CAN)? I think this is a real plus and indispensable in my use case.

I have also created a branch on my side for implementing timeout management, but it lacks passing the value as an argument in the launch file. I will do this over the weekend if my kids give me a break.

For your information, I am a former system engineer who came to ROS to stay connected with hardware, so I am not a developer, but I am doing my best to keep learning.

English is not my strong suit, so please excuse any mistakes I might make.

Best regards

@2b-t
Owner

2b-t commented Jun 7, 2024

Hmmm, I did not know about the Hall effect sensor being only motor-sided. That sounds like a very limiting design decision that seems to have no advantages to me.
Yes, writing to the same CAN participant from two different threads can indeed cause problems, and similarly I have no good solution for sending commands to multiple participants yet. I would have to think through thread-safety and the way that I have implemented the CAN driver class. If you have any ideas, let me know. The way that I initially designed it was with one CAN driver per CAN participant and therefore also with one hardware interface per actuator. My use case is fairly simple and I only have a single actuator, so that is what I mainly designed it for. I think for most use cases this should be fine.
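One common way to let several actuator drivers share a single CAN port is to serialize each request/response transaction behind a lock. The sketch below only illustrates that idea and is not how the driver is implemented: `SharedCanBus` and `fake_transact` are hypothetical names, and the transport is a stand-in so the example runs without hardware.

```python
import threading
from typing import Callable, Dict, List


class SharedCanBus:
    """Serializes request/response transactions issued by several threads."""

    def __init__(self, transact: Callable[[int, bytes], bytes]) -> None:
        self._transact = transact      # assumed blocking send-then-receive call
        self._lock = threading.Lock()  # only one transaction on the bus at a time

    def request(self, can_id: int, payload: bytes) -> bytes:
        with self._lock:
            return self._transact(can_id, payload)


def fake_transact(can_id: int, payload: bytes) -> bytes:
    """Stand-in for real hardware: echoes the command byte and the low ID byte."""
    return bytes([payload[0], can_id & 0xFF])


bus = SharedCanBus(fake_transact)
results: Dict[int, List[bytes]] = {0x141: [], 0x142: []}


def worker(can_id: int) -> None:
    # Each thread plays the role of one actuator's driver
    for _ in range(100):
        results[can_id].append(bus.request(can_id, bytes([0x9A]) + bytes(7)))


threads = [threading.Thread(target=worker, args=(can_id,)) for can_id in results]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print({hex(can_id): len(replies) for can_id, replies in results.items()})
```

The lock keeps replies matched to their requests, at the cost that the actuators on one bus share the available request rate.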
What are you working on if I may ask? What is your use case?
Don't worry, your English is fine. I am also not a native speaker. I just happen to live in the UK now.

@2b-t
Owner

2b-t commented Jun 7, 2024

Regarding the noisy torque: Just implemented a simple low-pass filter of the torque and the velocity. The filter coefficients can be adjusted through another parameter inside the URDF.
In the following example I commanded the motor through its velocity interface and used a filter coefficient alpha of 0.07, which according to sample_frequency/cutoff_frequency = (1-alpha)*2*pi/alpha corresponds to a ratio of sampling to cut-off frequency of around 125, meaning any oscillations with a higher frequency than around 8 Hz will be filtered out. The velocity is much less noisy, so it could do with less filtering.
In the future I might implement a higher-order Butterworth filter or leverage an existing library like iir1 for this purpose.
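For readers who want to experiment with the coefficient, here is a minimal sketch of a first-order low-pass filter together with the frequency relation quoted above. It assumes the usual exponential-smoothing form y = alpha*x + (1-alpha)*y_prev, which is consistent with that relation but is not confirmed to be the driver's implementation; the class and function names are hypothetical. Plugging alpha = 0.07 into the relation as quoted gives a ratio of about 83, while a ratio near 125 corresponds to alpha of roughly 0.048; which of the two figures was intended is not clear from the thread.

```python
import math
from typing import Optional


class LowPassFilter:
    """First-order low-pass: y[k] = alpha * x[k] + (1 - alpha) * y[k-1]."""

    def __init__(self, alpha: float) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.state: Optional[float] = None

    def update(self, sample: float) -> float:
        if self.state is None:
            self.state = sample  # initialise with the first measurement
        else:
            self.state = self.alpha * sample + (1.0 - self.alpha) * self.state
        return self.state


def sampling_to_cutoff_ratio(alpha: float) -> float:
    """sample_frequency / cutoff_frequency = (1 - alpha) * 2 * pi / alpha."""
    return (1.0 - alpha) * 2.0 * math.pi / alpha


def alpha_for_ratio(ratio: float) -> float:
    """Inverse of the relation above: alpha = 2 * pi / (2 * pi + ratio)."""
    return 2.0 * math.pi / (2.0 * math.pi + ratio)


print(f"ratio for alpha 0.07: {sampling_to_cutoff_ratio(0.07):.1f}")
print(f"alpha for ratio 125:  {alpha_for_ratio(125.0):.3f}")
```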

[Images: without low-pass filter (left) and with low-pass filter (right)]

@ioio2995
Contributor Author

ioio2995 commented Jun 8, 2024

Getting out of bed and between two episodes of Paw Patrol, I implemented a new logic for retrieving additional status data. Instead of using a second thread, I opted for a sleep-wake mechanism. What do you think?

I also moved the initialization of variables related to the use of extra-statuses to the on_init function to avoid their loading by the resource_manager if not used.

I tested your implementation of the low-pass filter and it's night and day, excellent improvement!

To merge our branches, could you provide your requirements? It might be useful to create a story to centralize this discussion.

My GitHub profile is still empty, but I will update it. The project I'm working on is a two-wheeled robot with an articulated tail ending in an omnidirectional wheel. The tail adds an extra level of mobility. Although I am working alone to showcase it, the project will remain open source. https://github.com/ioio2995/rhacobot

Finally, what do you think of my second PR? I noticed it conflicts with the master branch, I will fix it so you can merge it if it suits you.

@2b-t
Owner

2b-t commented Jun 9, 2024

Hi again @ioio2995,
I have performed some tests with my X8 Pro V2 on different Linux kernels (different real-time patches and tick timer granularity up to 1000 Hz) in combination with the MKS CANable V1.0 USB-to-CAN adapter that I am using (I am not a huge fan of it for this kind of application). I tracked the latency between sending a CAN message to the actuator and receiving its response using Wireshark. In all my tests the latency was around 1.7-2 ms (meaning that reading all three motor statuses would decrease the frequency that the hardware interface can run at from around 500 Hz to around 150 Hz). I have ordered a PEAK-System PCAN-PCI card and will test again once I have received it.
What does your CAN communication set-up look like? Would you mind testing what the latency between the sent and the received CAN frames is for you?

@ioio2995
Contributor Author

Hello,

I have just done some tests on two X8PRO3-H.
After adapting my Ros2_Control configuration, I have partially switched to your interface. I am currently using it only with the DiffDriveController. For the tail part, given that I have physical limits on movement, I would first like to submit a new PR to add a motor encoder reset. Perhaps I will find another workaround; I will think about it during the week.
I have also done some latency tests, and after analysis via a Python script, I found better results than you: it seems that I am at an average of 1.1 ms.

lorl@ros:~$ python3 average_time_difference.py dump_X8pro-H.csv 2
Emitter Messages (ID: 0x142):
No. Time Source Destination Protocol Length Info ID
0 1 0.000000 NaN NaN CAN 32 ID: 322 (0x142), Length: 8 322
4 5 0.019325 NaN NaN CAN 32 ID: 322 (0x142), Length: 8 322
11 12 0.023845 NaN NaN CAN 32 ID: 322 (0x142), Length: 8 322
16 17 0.025923 NaN NaN CAN 32 ID: 322 (0x142), Length: 8 322
20 21 0.029142 NaN NaN CAN 32 ID: 322 (0x142), Length: 8 322
... ... ... ... ... ... ... ... ...
183686 183687 122.209922 NaN NaN CAN 32 ID: 322 (0x142), Length: 8 322
183690 183691 122.212137 NaN NaN CAN 32 ID: 322 (0x142), Length: 8 322
183694 183695 122.214169 NaN NaN CAN 32 ID: 322 (0x142), Length: 8 322
183698 183699 122.216221 NaN NaN CAN 32 ID: 322 (0x142), Length: 8 322
183702 183703 122.218485 NaN NaN CAN 32 ID: 322 (0x142), Length: 8 322

[45898 rows x 8 columns]
Receiver Messages (ID: 0x242):
No. Time Source Destination Protocol Length Info ID
1 2 0.000119 NaN NaN CAN 32 ID: 578 (0x242), Length: 8 578
6 7 0.019550 NaN NaN CAN 32 ID: 578 (0x242), Length: 8 578
13 14 0.024612 NaN NaN CAN 32 ID: 578 (0x242), Length: 8 578
17 18 0.026812 NaN NaN CAN 32 ID: 578 (0x242), Length: 8 578
22 23 0.029838 NaN NaN CAN 32 ID: 578 (0x242), Length: 8 578
... ... ... ... ... ... ... ... ...
183684 183685 122.208931 NaN NaN CAN 32 ID: 578 (0x242), Length: 8 578
183688 183689 122.211077 NaN NaN CAN 32 ID: 578 (0x242), Length: 8 578
183692 183693 122.213305 NaN NaN CAN 32 ID: 578 (0x242), Length: 8 578
183696 183697 122.215344 NaN NaN CAN 32 ID: 578 (0x242), Length: 8 578
183700 183701 122.217432 NaN NaN CAN 32 ID: 578 (0x242), Length: 8 578

[45897 rows x 8 columns]
Average time difference between CAN ID 0x142 and 0x242: 0.0011649465004684092
lor@ros:~$ python3 average_time_difference.py dump_X8pro-H.csv 1
Emitter Messages (ID: 0x141):
No. Time Source Destination Protocol Length Info ID
2 3 0.003129 NaN NaN CAN 32 ID: 321 (0x141), Length: 8 321
5 6 0.019442 NaN NaN CAN 32 ID: 321 (0x141), Length: 8 321
8 9 0.021505 NaN NaN CAN 32 ID: 321 (0x141), Length: 8 321
10 11 0.023246 NaN NaN CAN 32 ID: 321 (0x141), Length: 8 321
14 15 0.025092 NaN NaN CAN 32 ID: 321 (0x141), Length: 8 321
... ... ... ... ... ... ... ... ...
183683 183684 122.208456 NaN NaN CAN 32 ID: 321 (0x141), Length: 8 321
183687 183688 122.210395 NaN NaN CAN 32 ID: 321 (0x141), Length: 8 321
183691 183692 122.212703 NaN NaN CAN 32 ID: 321 (0x141), Length: 8 321
183695 183696 122.214697 NaN NaN CAN 32 ID: 321 (0x141), Length: 8 321
183699 183700 122.216819 NaN NaN CAN 32 ID: 321 (0x141), Length: 8 321

[45954 rows x 8 columns]
Receiver Messages (ID: 0x241):
No. Time Source Destination Protocol Length Info ID
3 4 0.003799 NaN NaN CAN 32 ID: 577 (0x241), Length: 8 577
7 8 0.020718 NaN NaN CAN 32 ID: 577 (0x241), Length: 8 577
9 10 0.022259 NaN NaN CAN 32 ID: 577 (0x241), Length: 8 577
12 13 0.023952 NaN NaN CAN 32 ID: 577 (0x241), Length: 8 577
15 16 0.025816 NaN NaN CAN 32 ID: 577 (0x241), Length: 8 577
... ... ... ... ... ... ... ... ...
183685 183686 122.209469 NaN NaN CAN 32 ID: 577 (0x241), Length: 8 577
183689 183690 122.211498 NaN NaN CAN 32 ID: 577 (0x241), Length: 8 577
183693 183694 122.213761 NaN NaN CAN 32 ID: 577 (0x241), Length: 8 577
183697 183698 122.215764 NaN NaN CAN 32 ID: 577 (0x241), Length: 8 577
183701 183702 122.217919 NaN NaN CAN 32 ID: 577 (0x241), Length: 8 577

[45954 rows x 8 columns]
Average time difference between CAN ID 0x141 and 0x241: 0.0011618012903991206

I am using a homemade USB-to-CAN adapter based on an ESP32 (https://github.com/ioio2995/twai_slcan).
In my test configuration, I also use an Innomaker USB to CAN converter to sniff the CAN bus.

@ioio2995
Contributor Author

Indeed, the use of extra_status will add additional latency. But let’s not forget that my latest commits have made the request frequency much more flexible and they are no longer dependent on the cycle_time. Therefore, nothing prevents us from sending an extra_status request every second.
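A time-based trigger is one way to decouple the extra-status requests from the control cycle as described above. The following is only a sketch of that idea, not the code from the commits; `ExtraStatusScheduler`, `period_ms` and the 2 ms cycle time are hypothetical, and integer milliseconds are used to keep the arithmetic exact.

```python
class ExtraStatusScheduler:
    """Decides in each control cycle whether an extra-status request is due."""

    def __init__(self, period_ms: int) -> None:
        self.period_ms = period_ms  # 0 or negative disables the extra requests
        self._next_due_ms = 0

    def due(self, now_ms: int) -> bool:
        if self.period_ms <= 0:
            return False
        if now_ms >= self._next_due_ms:
            self._next_due_ms = now_ms + self.period_ms
            return True
        return False


# 10 s of control cycles at an assumed 2 ms cycle time, extra status once per second
scheduler = ExtraStatusScheduler(period_ms=1000)
polls = sum(scheduler.due(cycle * 2) for cycle in range(5000))
print(polls)  # 10
```

With this scheme only 10 of the 5000 cycles carry the additional requests, so the added latency affects a small fraction of the loop iterations.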

@2b-t
Owner

2b-t commented Jun 10, 2024

Thanks a lot! The data looks much better than mine. 1.1ms is not too bad.
Yes, indeed, with a variable frequency/turning it off it should not be too much of an issue. Nonetheless I would like to wait for the PCAN-PCI card to arrive so that I can properly test it.

@ioio2995
Contributor Author

This afternoon, I conducted some additional tests to evaluate my CAN controller and observe latencies by increasing the number of controllers on the bus. I then realized that the data I had provided you did not exactly match the values you had measured.

My Current Test Protocol:

On my 500K CAN bus, I currently have:

  • Two X8Pro-H controllers
  • A CAN controller connected to an UP Squared Pro running ROS
  • A second CAN controller connected to my PC running Wireshark

Benchmarking Setup

To benchmark the setup, I started by using a Python script. Here is a brief description:

  • Multithreading: The script uses threads to send messages in parallel on multiple CAN IDs.
  • Commands Sent: The script sends the same command to different CAN IDs specified as parameters.
  • Average Response Time Calculation: The script calculates and displays the average response time in milliseconds for successful responses from each CAN ID, independently of the others.
  • Final Message: The script sends a final message with data 0x80.
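The structure described in the list above can be sketched as follows. This is not the author's bench_can.py (that script is linked later in the thread); it is a hardware-free illustration in which `transact` is a hypothetical stand-in for a blocking send-and-wait-for-reply call, and the final 0x80 message is left out.

```python
import threading
import time
from typing import Callable, Dict, Iterable


def bench_one_id(transact: Callable[[int, int], bool], can_id: int, command: int,
                 count: int, results: Dict[int, dict]) -> None:
    """Sends `count` requests to one CAN ID and records the average response time."""
    latencies = []
    failures = 0
    for _ in range(count):
        start = time.perf_counter()
        if transact(can_id, command):
            latencies.append(time.perf_counter() - start)
        else:
            failures += 1
    average_ms = 1000.0 * sum(latencies) / len(latencies) if latencies else float("nan")
    results[can_id] = {"sent": count, "failures": failures, "avg_ms": average_ms}


def run_benchmark(transact: Callable[[int, int], bool], can_ids: Iterable[int],
                  command: int, count: int) -> Dict[int, dict]:
    """Runs one thread per CAN ID so the IDs are queried in parallel."""
    results: Dict[int, dict] = {}
    threads = [threading.Thread(target=bench_one_id,
                                args=(transact, can_id, command, count, results))
               for can_id in can_ids]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def fake_transact(can_id: int, command: int) -> bool:
    """Stand-in for real hardware: pretends every reply arrives after about 0.5 ms."""
    time.sleep(0.0005)
    return True


stats = run_benchmark(fake_transact, [0x141, 0x142], 0x9A, 20)
for can_id, entry in sorted(stats.items()):
    print(f"CAN ID {can_id:#x}: failures={entry['failures']}, avg={entry['avg_ms']:.2f} ms")
```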

I tested this script for commands A2, B1, 9A, 9B, and 9C. Here are the statistics obtained:

S:~$ python3 bench_can.py can0 5000 A2 141 142 
CAN ID 0x141: Sent 5000 messages in 10.9831 seconds
Speed: 455.24 messages/second
Failures: 0
Average response time: 2.13 milliseconds
Final message sent.
SocketcanBus was not properly shut down
CAN ID 0x142: Sent 5000 messages in 10.9866 seconds
Speed: 455.10 messages/second
Failures: 0
Average response time: 2.13 milliseconds
Final message sent.
SocketcanBus was not properly shut down
All threads have finished execution.

:~$ python3 bench_can.py can0 5000 B1 141 142 
CAN ID 0x141: Sent 5000 messages in 9.3662 seconds
Speed: 533.83 messages/second
Failures: 0
Average response time: 1.81 milliseconds
Final message sent.
SocketcanBus was not properly shut down
CAN ID 0x142: Sent 5000 messages in 9.3670 seconds
Speed: 533.79 messages/second
Failures: 0
Average response time: 1.81 milliseconds
Final message sent.
SocketcanBus was not properly shut down
All threads have finished execution.

:~$ python3 bench_can.py can0 5000 9a 141 142 
CAN ID 0x141: Sent 5000 messages in 9.3859 seconds
Speed: 532.72 messages/second
Failures: 0
Average response time: 1.81 milliseconds
Final message sent.
SocketcanBus was not properly shut down
CAN ID 0x142: Sent 5000 messages in 9.3866 seconds
Speed: 532.67 messages/second
Failures: 0
Average response time: 1.82 milliseconds
Final message sent.
SocketcanBus was not properly shut down
All threads have finished execution.

:~$ python3 bench_can.py can0 5000 9b 141 142 
CAN ID 0x142: Sent 5000 messages in 9.4348 seconds
Speed: 529.95 messages/second
Failures: 0
Average response time: 1.82 milliseconds
Final message sent.
SocketcanBus was not properly shut down
CAN ID 0x141: Sent 5000 messages in 9.4358 seconds
Speed: 529.89 messages/second
Failures: 0
Average response time: 1.83 milliseconds
Final message sent.
SocketcanBus was not properly shut down
All threads have finished execution.

:~$ python3 bench_can.py can0 5000 9c 141 142 
CAN ID 0x142: Sent 5000 messages in 9.3875 seconds
Speed: 532.62 messages/second
Failures: 0
Average response time: 1.81 milliseconds
Final message sent.
SocketcanBus was not properly shut down
CAN ID 0x141: Sent 5000 messages in 9.3882 seconds
Speed: 532.58 messages/second
Failures: 0
Average response time: 1.82 milliseconds
Final message sent.
SocketcanBus was not properly shut down
All threads have finished execution.

Unfortunately, they are quite different from the statistics I had sent you yesterday.

Revised Wireshark Analysis

I therefore reviewed my Wireshark analysis and rethought my calculation method. Here is a brief description of the Python script:

  • Extraction of CAN IDs: The extract_id function uses a regular expression to extract CAN IDs from the Info columns.

  • Loading and Filtering Data: The script loads the CSV file and filters the rows corresponding to the specified CAN IDs.

  • Time Difference Calculation:

    • Slave (0x14 and 0x24): The script calculates the time differences between 0x14* frames and the first 0x24* frame following each 0x14* frame to obtain the slave processing time. The values are converted to milliseconds.
    • Master (0x14 and 0x14): The script calculates the time differences between each successive pair of 0x14* frames. The values are converted to milliseconds.
  • Calculation of Master Processing Time: The master processing time is calculated by subtracting the slave processing time from the average difference of successive 0x14* frames.
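The calculation described above can be sketched as follows. It is a simplified illustration rather than the linked average_time_difference2.py: it works on an in-memory list of (timestamp, CAN ID) pairs instead of a Wireshark CSV export, the function name is hypothetical, and the request/reply pairing (0x142 answered by 0x242) is taken from the dumps shown earlier.

```python
from statistics import mean
from typing import List, Tuple


def analyse(frames: List[Tuple[float, int]], request_id: int,
            reply_id: int) -> Tuple[float, float, float]:
    """Returns (slave time, request period, master time) in milliseconds.

    `frames` holds (timestamp in seconds, CAN ID) pairs sorted by time.
    """
    slave_ms = []
    pending = None  # timestamp of the most recent unanswered request
    for timestamp, can_id in frames:
        if can_id == request_id:
            pending = timestamp
        elif can_id == reply_id and pending is not None:
            # First reply following the request: slave processing time
            slave_ms.append((timestamp - pending) * 1000.0)
            pending = None

    request_times = [t for t, can_id in frames if can_id == request_id]
    period_ms = [(b - a) * 1000.0 for a, b in zip(request_times, request_times[1:])]

    slave = mean(slave_ms)
    period = mean(period_ms)
    return slave, period, period - slave  # master time = period minus slave time


# Tiny synthetic capture: a request every 2 ms, each answered after 1 ms
capture = [(0.000, 0x142), (0.001, 0x242),
           (0.002, 0x142), (0.003, 0x242),
           (0.004, 0x142), (0.005, 0x242)]
slave, period, master = analyse(capture, request_id=0x142, reply_id=0x242)
print(f"slave {slave:.3f} ms, period {period:.3f} ms, master {master:.3f} ms")
```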

Although far from perfect, the results show that the latency times I had communicated to you yesterday are actually only the motor processing times.

Initial Statistics

Here are the statistics extracted during the use of the benchmark scripts I used previously:

:~$ python3 average_time_difference2.py dump_141-142-A2.csv 2
...
Average time difference between CAN ID 0x142 and 0x242 (ms): 1.1827995892821421
Average time difference between successive CAN ID 0x142 messages (ms): 2.1972100702000055
Processing time of the maître (ms): 1.0144104809178633

:~$ python3 average_time_difference2.py dump_141-142-B1.csv 2
...
Average time difference between CAN ID 0x142 and 0x242 (ms): 0.8184356534693074
Average time difference between successive CAN ID 0x142 messages (ms): 1.8730909254000012
Processing time of the maître (ms): 1.0546552719306939

:~$ python3 average_time_difference2.py dump_141-142-9A.csv 2
...
Average time difference between CAN ID 0x142 and 0x242 (ms): 0.822098690261942
Average time difference between successive CAN ID 0x142 messages (ms): 1.8770660853999976
Processing time of the maître (ms): 1.0549673951380556


:~$ python3 average_time_difference2.py dump_141-142-9B.csv 2
...
Average time difference between CAN ID 0x142 and 0x242 (ms): 0.8214619678064333
Average time difference between successive CAN ID 0x142 messages (ms): 1.8864998700000017
Processing time of the maître (ms): 1.0650379021935685

:~$ python3 average_time_difference2.py dump_141-142-9C.csv 2
...
Average time difference between CAN ID 0x142 and 0x242 (ms): 0.8145388714257121
Average time difference between successive CAN ID 0x142 messages (ms): 1.876995691199999
Processing time of the maître (ms): 1.0624568197742867

Analysis of a ros_control Session

And here is a brief analysis of a ros_control session using the DiffDriveController:

:~$ python3 average_time_difference2.py dump_move-idle_Ros_control.csv 2
...
Average time difference between CAN ID 0x142 and 0x242 (ms): 1.184284166243108
Average time difference between successive CAN ID 0x142 messages (ms): 2.6757625363270385
Processing time of the maître (ms): 1.4914783700839305

You can find the Python scripts used for the benchmark and the analysis of the CSV exports here: https://github.com/ioio2995/rhacobot/tree/master/utils

Feel free to give me your feedback or modify the scripts so that we can use identical tools and methodologies in our tests.

@2b-t
Owner

2b-t commented Jun 11, 2024

Hmmm, I see. Interesting...
My conclusion from this would be that the propagation time and actuator's latency combined are around 0.8 ms. The USB-to-CAN adapter, Linux kernel and the code are responsible for the remaining part of the latency which seems to be around 1 ms.

My set-up is different: I run ros2_control on the very same computer as the packet sniffer. This way I measure the round-trip delay (including the network stack), similar to your first approach (just not inside the process but with Wireshark instead). As I said, my suspicion is that the main culprit for the latter is the USB-to-CAN adapter. I do not quite trust them for anything real-time related.

@ioio2995
Contributor Author

Although the X8-Pro V2 and V3 do not share the same controller board (MC-X-300-O / MC-X-500-O), their SIP is the same and is an SPD1188 from Spintrol. I am attaching an excerpt of the PGA motor specifications, which provides the manufacturer's settling times:
[Image: screenshot 2024-06-12 082959]

http://xuanzhi.webhh.net/index/index/product2/id/29.html?lang=en

2 participants