
openamp: add new ops wait_tx_buffer() support #347

Closed
wants to merge 1 commit into from

Conversation


@GUIDINGLI GUIDINGLI commented Feb 7, 2022

openamp: add new ops wait_tx_buffer() support

This avoids looping to check for a free TX buffer.

@edmooring edmooring (Contributor) left a comment:

This needs a lot more documentation explaining what the new operation must do, and what values it returns.

lib/include/openamp/remoteproc.h — 2 outdated review threads, resolved
@@ -333,6 +333,8 @@ static void *rpmsg_virtio_get_tx_payload_buffer(struct rpmsg_device *rdev,
metal_mutex_release(&rdev->lock);
Contributor:
Do we need to hold the lock, or re-lock the device during rpmsg_virtio_wait_tx_buffer()?

Contributor Author:

No need; you can assume that the simplest user implementation of wait_tx_buffer() is just metal_sleep_usec().
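For illustration, a minimal sketch of such a porting-layer callback. The callback name and `void *priv` signature are assumptions for this sketch (the op's exact signature evolved during review), and `metal_sleep_usec()` is stubbed with `usleep()` so the snippet stands alone:

```c
#include <unistd.h>

/* Stand-in for libmetal's metal_sleep_usec() (metal/sleep.h),
 * so this sketch is self-contained. */
static int metal_sleep_usec(unsigned int usec)
{
    return usleep(usec);
}

/* Simplest possible wait_tx_buffer()-style implementation: sleep for
 * one polling interval and return 0 (>= 0 means "waited; retry now"). */
static int simple_wait_tx_buffer(void *priv)
{
    (void)priv;                /* no per-device state needed here */
    metal_sleep_usec(1000);    /* one polling-interval-sized nap */
    return 0;
}
```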

@@ -333,6 +333,8 @@ static void *rpmsg_virtio_get_tx_payload_buffer(struct rpmsg_device *rdev,
metal_mutex_release(&rdev->lock);
if (rp_hdr || !tick_count)
break;
if (rpmsg_virtio_wait_tx_buffer(rvdev) >= 0)
Contributor:
This looks wrong. If it fails, the program waits twice, once in rpmsg_virtio_wait_tx_buffer(), and then in metal_sleep_usec().

Contributor Author:

If the wait fails, that means it hasn't actually waited, so we still need to call metal_sleep_usec(), or we will busy-loop.

Collaborator:

Could you please explain why you don't use the rpmsg_trysend API instead, which sets wait to false and therefore tick_count to 0?

Contributor Author:

The only difference is between a looping query mechanism and a waiting mechanism. Of course, a waiting mechanism is more suitable for low-power devices.

Collaborator:

My point here is that I don't understand why you don't use rpmsg_trysend in an application loop instead. You would build your loop without any waiting mechanism.

Collaborator:

But rpmsg_send/rpmsg_get_tx_payload_buffer are public APIs; we can't limit how other teams write their applications.

Contributor:

> If the wait fails, that means it hasn't actually waited, so we still need to call metal_sleep_usec(), or we will busy-loop.

So a function with "wait" in its name doesn't wait sometimes? This needs to be documented in the Doxygen comments. There needs to be enough documentation that a new user of the library can understand why they might need to implement this operation, and how to implement it properly.
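To make the reviewer's point concrete, here is a hedged sketch of what such Doxygen documentation might look like. The op name, signature, and return convention below follow this discussion, not the final merged API:

```c
/**
 * Wait until a TX buffer may be available, or give up.
 *
 * Optional operation. When the framework finds no free TX buffer, it
 * calls this op instead of sleeping for a fixed interval. An
 * implementation may block on a semaphore posted from the "buffer
 * returned" interrupt, sleep, or decline to wait at all.
 *
 * @param priv  Private data registered by the porting layer.
 *
 * @return 0 or a positive value if the op actually waited (the caller
 *         may retry immediately); a negative value if it could not
 *         wait, in which case the caller falls back to
 *         metal_sleep_usec() to avoid a busy loop.
 */
int (*wait_tx_buffer)(void *priv);
```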

Contributor:

> My point here is that I don't understand why you don't use rpmsg_trysend in an application loop instead. You would build your loop without any waiting mechanism.

I agree with @arnopo. I don't understand why this needs to be in the library rather than in application code.

Collaborator:

> If the wait fails, that means it hasn't actually waited, so we still need to call metal_sleep_usec(), or we will busy-loop.

Yes, a wait failure means the op couldn't finish the work in some special case, so the common code needs to perform the fallback action. In most cases, a successful wait means a buffer is available and the framework can retry immediately.

> So a function with "wait" in its name doesn't wait sometimes? This needs to be documented in the Doxygen comments. There needs to be enough documentation that a new user of the library can understand why they might need to implement this operation, and how to implement it properly.

Yes.

@arnopo arnopo (Collaborator) left a comment:

If rpmsg_trysend can do the job, please detail what is missing.

lib/include/openamp/remoteproc.h — outdated, resolved
@@ -333,6 +333,8 @@ static void *rpmsg_virtio_get_tx_payload_buffer(struct rpmsg_device *rdev,
metal_mutex_release(&rdev->lock);
if (rp_hdr || !tick_count)
break;
if (rpmsg_virtio_wait_tx_buffer(rvdev) >= 0)
Collaborator:

Could you please explain why you don't use the rpmsg_trysend API instead, which sets wait to false and therefore tick_count to 0?

@GUIDINGLI (Contributor Author):

The only difference is between a looping query mechanism and a waiting mechanism. Of course, a waiting mechanism is more suitable for low-power devices.

@edmooring (Contributor):

An operation that simply checked if there is a tx buffer available could be useful to application writers. For example, there could be a special thread that monitored buffer usage, and if there were no buffers available, did some sort of maintenance on the buffer pool, or restarted the application. The point is that the handling of the out of buffers situation is best done by the application programmer.

@xiaoxiang781216 (Collaborator) commented Feb 20, 2022:

> An operation that simply checked if there is a tx buffer available could be useful to application writers.

It's still a busy loop.

> For example, there could be a special thread that monitored buffer usage, and if there were no buffers available, did some sort of maintenance on the buffer pool, or restarted the application. The point is that the handling of the out-of-buffers situation is best done by the application programmer.

Are you sure this should be done by the application programmer, not the BSP programmer?!
The team writing an rpmsg service is different from the team porting OpenAMP to new hardware. Knowing when a buffer is available, just like knowing how to receive a remote request, is platform/hardware dependent. On the other hand, the service team could write a generic rpmsg service that works on any hardware platform with a correct OpenAMP port.

I remember there is a working group defining a generic rpmsg service; could you tell me how that working group can avoid busy polling in a platform-dependent way under your proposal?

@xiaoxiang781216 (Collaborator) commented Feb 20, 2022:

From an architecture perspective, busy polling should be replaced by waiting on a semaphore in the OpenAMP framework, just as the Linux side does; but this requires changing libmetal and open-amp. From our upstream experience, none of my team members want to make any big change to libmetal or open-amp.

@arnopo (Collaborator) commented Feb 21, 2022:

> From an architecture perspective, busy polling should be replaced by waiting on a semaphore in the OpenAMP framework, just as the Linux side does; but this requires changing libmetal and open-amp. From our upstream experience, none of my team members want to make any big change to libmetal or open-amp.

Behavior depends on:

  • OS: Linux offers the capability of an active wait by creating a work queue; bare metal and some OSes can do that too. On bare metal, using rpmsg_send is not a good idea, as it will block the whole system.
  • Hardware: polling vrings without any inter-processor signaling is an implementation that makes a semaphore unusable.
  • rpmsg services: in multi-service mode, some services may want to sleep until a TX buffer is available, some want to inform the application, some want to have priority, ...

Supporting all platforms, OSes, and service use cases in the library itself does not seem like a good idea: it is too complex.

As you mentioned, an rpmsg service layer is something that makes sense in OSes. But this layer is quite OS/project dependent, and making it generic would also introduce new design tradeoffs. Such a layer exists in Zephyr. It looks like implementing this kind of layer in NuttX would let you avoid these cross dependencies, handling this use case and #238 without impacting other OpenAMP lib users.

Focusing on the PR itself: what about exposing RPMSG_TICKS_PER_INTERVAL as a project config, as was done for RPMSG_BUFFER_SIZE? If set to 0, it would poll with no wait.
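The overridable-macro pattern referred to here (already used for RPMSG_BUFFER_SIZE in rpmsg_virtio.h) would look roughly like this. The helper names and the trivial buffer stub are hypothetical, simplified from the loop in rpmsg_virtio_get_tx_payload_buffer():

```c
#include <stddef.h>

/* A project config may override this before the header is included;
 * 0 would mean "poll once, no wait" (rpmsg_trysend-like behavior). */
#ifndef RPMSG_TICKS_PER_INTERVAL
#define RPMSG_TICKS_PER_INTERVAL 1000
#endif

/* Hypothetical stand-ins so the sketch runs on its own:
 * a buffer "appears" after free_after failed polls. */
static int free_after;
static void *try_get_buffer(void)
{
    static char buf[16];
    return (free_after-- <= 0) ? (void *)buf : NULL;
}
static void sleep_interval(void) { /* metal_sleep_usec(...) in real code */ }

/* Simplified shape of the get-TX-buffer retry loop. */
static void *get_tx_buffer(int tick_count)
{
    void *hdr;

    for (;;) {
        hdr = try_get_buffer();
        if (hdr || !tick_count)     /* success, or out of retries */
            break;
        sleep_interval();           /* wait one interval, then retry */
        tick_count--;
    }
    return hdr;
}
```

With tick_count forced to 0 the loop degenerates to a single non-blocking attempt, which is the "polling with no wait" case above.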

@xiaoxiang781216 (Collaborator) commented Feb 21, 2022:

> > From an architecture perspective, busy polling should be replaced by waiting on a semaphore in the OpenAMP framework, just as the Linux side does; but this requires changing libmetal and open-amp. From our upstream experience, none of my team members want to make any big change to libmetal or open-amp.
>
> Behavior depends on:
>
>   • OS: Linux offers the capability of an active wait by creating a work queue; bare metal and some OSes can do that too. On bare metal, using rpmsg_send is not a good idea, as it will block the whole system.
>   • Hardware: polling vrings without any inter-processor signaling is an implementation that makes a semaphore unusable.
>   • rpmsg services: in multi-service mode, some services may want to sleep until a TX buffer is available, some want to inform the application, some want to have priority, ...

All these policies can be achieved with the wait_tx_buffer callback, without modifying the framework code.

> Supporting all platforms, OSes, and service use cases in the library itself does not seem like a good idea: it is too complex.

Yes, that's why this patch provides the callback: to give the porting layer a chance to handle this situation in a better way than the poll loop.

> As you mentioned, an rpmsg service layer is something that makes sense in OSes. But this layer is quite OS/project dependent, and making it generic would also introduce new design tradeoffs. Such a layer exists in Zephyr. It looks like implementing this kind of layer in NuttX would let you avoid these cross dependencies, handling this use case and #238 without impacting other OpenAMP lib users.

Yes, but the framework shouldn't forbid the porting layer from doing some optimization if it wants to. "Programming Perl" gives good advice on this topic:
Easy things should be easy, and hard things should be possible.

> Focusing on the PR itself: what about exposing RPMSG_TICKS_PER_INTERVAL as a project config, as was done for RPMSG_BUFFER_SIZE? If set to 0, it would poll with no wait.

First, we want to remove all poll loops for battery-powered devices. Second, the callback offers many possibilities beyond simply sleeping (e.g. waiting on a semaphore, recursive dispatch, ...).
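A sketch of the semaphore idea mentioned above, using POSIX semaphores as a stand-in for whatever primitive the port provides. The callback and handler names are hypothetical, not part of this PR:

```c
#include <semaphore.h>

/* Posted from the IPI/mailbox handler when the remote side returns
 * a TX buffer, instead of having the sender poll the vring. */
static sem_t tx_buf_sem;

/* Hypothetical handler invoked from the "buffer reclaimed" interrupt. */
static void on_tx_buffer_returned(void)
{
    sem_post(&tx_buf_sem);
}

/* Hypothetical wait_tx_buffer()-style callback: block until the remote
 * kicks us, then return 0 so the framework retries immediately. */
static int sem_wait_tx_buffer(void *priv)
{
    (void)priv;
    return sem_wait(&tx_buf_sem); /* 0 on success, negative on error */
}
```

On a battery-powered device this puts the sending thread to sleep between kicks instead of burning cycles in a poll loop.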

@arnopo (Collaborator) commented Feb 21, 2022:

This is more a library-integration point than a feature point. From a feature point of view, being able to optimize the waiting is completely valid; that's why rpmsg_trysend was created.
The main advantage of your proposal is to simplify the support of different virtio backends: remoteproc_virtio is the default one, but others such as static vrings (Zephyr) or virtio-mmio can be used.

If @edmooring agrees, let's move forward on your proposed implementation and see how smoothly it can be inserted.

lib/include/openamp/remoteproc.h — outdated, resolved
lib/include/openamp/rpmsg_virtio.h — outdated, resolved
lib/include/openamp/virtio.h — outdated, resolved
lib/remoteproc/remoteproc_virtio.c — 2 outdated review threads, resolved
lib/rpmsg/rpmsg_virtio.c — outdated, resolved
@edmooring (Contributor):

> > An operation that simply checked if there is a tx buffer available could be useful to application writers.
>
> It's still a busy loop.
>
> > For example, there could be a special thread that monitored buffer usage, and if there were no buffers available, did some sort of maintenance on the buffer pool, or restarted the application. The point is that the handling of the out-of-buffers situation is best done by the application programmer.
>
> Are you sure this should be done by the application programmer, not the BSP programmer?! The team writing an rpmsg service is different from the team porting OpenAMP to new hardware. Knowing when a buffer is available, just like knowing how to receive a remote request, is platform/hardware dependent. On the other hand, the service team could write a generic rpmsg service that works on any hardware platform with a correct OpenAMP port.

Yes, because the needed behavior can change depending on the application. In many cases, the application/service developers work for a different company than the team that develops the BSP, and there are dozens of different teams from multiple organizations using the same BSP.

> I remember there is a working group defining a generic rpmsg service; could you tell me how that working group can avoid busy polling in a platform-dependent way under your proposal?

As @arnopo points out below, this can be done at a higher level than the OpenAMP transport layer. Attempting to handle application-level policy decisions at the transport layer risks causing more problems later.

1 similar comment

lib/include/openamp/remoteproc.h — 2 outdated review threads, resolved
lib/include/openamp/remoteproc_virtio.h — 2 outdated review threads, resolved
lib/include/openamp/virtio.h — outdated, resolved
@GUIDINGLI (Contributor Author):

@arnopo, following your suggestion I modified rproc_virtio_create_vdev(), but now the Zephyr build fails. What should I do?

lib/include/openamp/remoteproc_virtio.h — resolved
lib/include/openamp/rpmsg_virtio.h — resolved
lib/include/openamp/rpmsg_virtio.h — outdated, resolved
lib/include/openamp/virtio.h — resolved
lib/remoteproc/remoteproc_virtio.c — resolved
@@ -30,6 +30,19 @@ static void rproc_virtio_virtqueue_notify(struct virtqueue *vq)
rpvdev->notify(rpvdev->priv, vring_info->notifyid);
}

static int rproc_virtio_wait_notified(struct virtio_device *vdev,
struct virtqueue *vq)
Collaborator:
static int rproc_virtio_notify_wait(struct virtqueue *vq)

lib/remoteproc/remoteproc.c — resolved
*
* Return 0 when a notification is available, otherwise a negative value.
*/
int (*wait_notified)(struct remoteproc *rproc, uint32_t id);
Collaborator:

notify_wait: should we move this before line 414?

@GUIDINGLI (Contributor Author):

@edmooring @arnopo @jjmcdn
Could you help review this?


@GUIDINGLI GUIDINGLI force-pushed the mine branch 2 times, most recently from 056bdf3 to 16ed019 Compare July 29, 2022 17:18
@@ -347,6 +347,13 @@ static void *rpmsg_virtio_get_tx_payload_buffer(struct rpmsg_device *rdev,
metal_mutex_release(&rdev->lock);
if (rp_hdr || !tick_count)
break;

status = rpmsg_virtio_notify_wait(rvdev, rvdev->rvq);
Collaborator:

Create a more generic function that also implements the metal_sleep_usec(RPMSG_TICKS_PER_INTERVAL) fallback, something like:

static inline int
rpmsg_virtio_wait_for_buffer(struct rpmsg_virtio_device *rvdev,
			     struct virtqueue *vq, int timeout)
{
	if (rvdev->vdev->func->notify_wait)
		return rvdev->vdev->func->notify_wait(rvdev->vdev, vq, timeout);

	metal_sleep_usec(timeout);
	return RPMSG_SUCCESS;
}

Contributor Author:

We need to pass the timeout parameter to rpmsg_virtio_wait_for_buffer(), and we also need the tick_count-- bookkeeping, so I changed the caller instead.

lib/include/openamp/rpmsg_virtio.h — outdated, resolved
lib/include/openamp/remoteproc.h — outdated, resolved
lib/include/openamp/remoteproc_virtio.h — outdated, resolved
@@ -179,6 +192,7 @@ static const struct virtio_dispatch remoteproc_virtio_dispatch_funcs = {
.get_features = rproc_virtio_get_features,
.read_config = rproc_virtio_read_config,
.notify = rproc_virtio_virtqueue_notify,
.notify_wait = rproc_virtio_notify_wait,
Collaborator:

As mentioned, the feature is optional. It seems better to declare it in remoteproc_virtio (similar to rpvdev_notify_func) and add a new API to set it.

Contributor Author:

It is an element of remoteproc_ops; if users don't want this feature, they simply don't set remoteproc_ops->wait_notified.

Collaborator:

> It is an element of remoteproc_ops; if users don't want this feature, they simply don't set remoteproc_ops->wait_notified.

That's true only if remoteproc_ops is used. Here is an example of using rpmsg without remoteproc_ops:
https://github.com/zephyrproject-rtos/zephyr/blob/main/samples/subsys/ipc/openamp_rsc_table/src/main_remote.c#L228

@CV-Bowen (Contributor) commented Oct 12, 2023:

@arnopo Hi, I want to continue this PR. By "add a new API to set rpvdev->wait_notified", do you mean adding a new function called rproc_virtio_set_wait_notified(), or adding a new argument to rproc_virtio_create_vdev(), like this:

struct virtio_device *
rproc_virtio_create_vdev(unsigned int role, unsigned int notifyid,
			 void *rsc, struct metal_io_region *rsc_io,
			 void *priv,
			 rpvdev_notify_func notify,
			 rpvdev_wait_notify_func wait_notified,
			 virtio_dev_reset_cb rst_cb);

Both are fine with me, but I think we should set the notify and wait_notified functions in the same way.

Collaborator:

Hi @CV-Bowen,

I will probably need some time to recall all the discussions on this PR, but concerning your point:

We are trying not to modify the existing API. As wait_notified is optional, I prefer the rproc_virtio_set_wait_notified() API.

Please note also that we have since added more Doxygen documentation, so this PR needs a rebase and some refresh.
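The setter preferred here could be sketched as follows. The struct layout, type names, and setter signature are assumptions based on this discussion (not the API eventually merged in #518); the key idea is that remoteproc_virtio embeds the virtio_device, so a container_of-style cast recovers the outer struct:

```c
#include <stddef.h>

/* Minimal stand-ins for the open-amp types involved. */
struct virtio_device { int dummy; };

typedef int (*rpvdev_wait_notify_func)(struct virtio_device *vdev,
                                       unsigned int notifyid);

struct remoteproc_virtio {
    void *priv;
    rpvdev_wait_notify_func wait_notified;  /* optional, NULL by default */
    struct virtio_device vdev;              /* embedded device */
};

/* Stand-in for libmetal's metal_container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Trivial callback used only to demonstrate the setter. */
static int noop_wait_notified(struct virtio_device *vdev, unsigned int id)
{
    (void)vdev;
    (void)id;
    return 0;
}

/* Hypothetical setter: keeps rproc_virtio_create_vdev() unchanged and
 * lets ports opt in to the optional wait_notified callback afterwards. */
static void rproc_virtio_set_wait_notified(struct virtio_device *vdev,
                                           rpvdev_wait_notify_func fn)
{
    struct remoteproc_virtio *rpvdev =
        container_of(vdev, struct remoteproc_virtio, vdev);

    rpvdev->wait_notified = fn;
}
```

A setter avoids touching every existing rproc_virtio_create_vdev() caller, which is exactly the compatibility concern raised above.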

This can avoid looping check tx buffer

Signed-off-by: Guiding Li <liguiding1@xiaomi.com>
@CV-Bowen (Contributor):

@arnopo Hi, I added a new PR #518 for this feature and made some modifications based on your comments here. Could you take a look?

@arnopo (Collaborator) commented Oct 16, 2023:

I close this one to avoid confusion.

@arnopo arnopo closed this Oct 16, 2023