Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

driver: tee: optee: implement OCALLs #72

Open
wants to merge 37 commits into
base: optee
Choose a base branch
from

Conversation

HernanGatta
Copy link

@HernanGatta HernanGatta commented Mar 18, 2020

Overview

This PR modifies the TEE and OP-TEE drivers to support OCALLs.

The main change introduced by this PR is the addition of the TEE_IOC_ECALL IOCTL. This IOCTL mirrors and expands the functionality provided by TEE_IOC_INVOKE to support duplex communication. Where the latter provides a one-way command-and-reply transport from CA to TA, the former allows for two-way command-and-reply from CA to TA, and from TA to CA.

Additionally, a shared memory flag, TEE_IOCTL_SHM_OCALL, is added. When this flag is specified, shared memory is registered with the kernel, but not with OP-TEE. While the TEE Supplicant already can perform this operation by talking to a different device node with a different call table, introducing this flag enables all CAs to do it as well.

Lastly, the driver can now check for an additional OP-TEE capability, TEE_OPTEE_CAP_OCALL, which indicates that OP-TEE was built with and supports OCALLs.

Related PRs: OP-TEE, OP-TEE Client, OP-TEE Examples.

ECALL IOCTL

The fundamental working principle behind in-thread OCALL handling is the ability for the IOCTL that carries the initial function invocation request to return before the function invocation is complete. That is, given a CA that invokes a function on a TA, if the TA requests an OCALL, the IOCTL through which the original function invocation request was passed must return prematurely and indicate to the CA that it is not the original function invocation that completed. Rather, the latter is still pending, but has been suspended to relay the OCALL request to the CA. Once the CA handles the OCALL request, it invokes the IOCTL again, whose contents contain the result of the OCALL.

Since OCALL requests may require passing memory references, and given that TA memory must not be exposed to the CA, a shared memory allocation may be required ahead of the actual OCALL. When this occurs, the original IOCTL returns once to relay the shared memory allocation request to the CA, whereupon the CA calls the IOCTL again with the result of the allocation. Once the OCALL proper is complete, another IOCTL return and invocation cycle is required to free the previously allocated memory.

One may regard this duplex communication as a communication protocol. Each return of the IOCTL must indicate whether the original function invocation request is complete, and when not, it must indicate to the CA what it must do on behalf of the TA, and carry sufficient information to restart the original function invocation, whose secure thread is awaiting RPC resumption.

OCALL Contexts

During an call to TEE_IOC_INVOKE, several resources are allocated. These include memory shared with OP-TEE to carry the command arguments, any memory allocated to handle RPCs, as well as the call waiter. Seeing as an OCALL request is passed via an RPC, it is also necessary to keep RPC resumption information, such as the secure thread Id that sent the RPC.

All this information is kept in struct optee_ocall_ctx, which is allocated when the TA requests an OCALL. The resulting pointer to this structure is given an Id using the kernel's IDR mechanism, and is kept alive until the original function invocation is complete.

Values

To carry the information required by the new functionality, two additional values are necessary in the IOCTL parameters, in contrast with TEE_IOC_INVOKE. One value must carry a function ID that indicates the current state of the ECALL-OCALL cycle, and one other value must carry an identifier for the current OCALL context.

These two values are named func and ocall_id in struct tee_ioctl_ecall_arg. The element cmd_id is used to indicate the CA to TA command Id as well as the TA to CA OCALL Id in the opposite direction.

Typical Flows

There are two flows an OCALL can take: one that does not require passing memory references, and one which does.

Flow 1: No Shared Memory

CA            -[TEE_IOC_ECALL/Func Invoke]-> TEE Driver
TEE Driver    -[dispatch]->                  OP-TEE Driver
OP-TEE Driver -[SMC/Func Invoke]->           OP-TEE
OP-TEE        -[dispatch]->                  TA
TA            -[TEE_InvokeCACommand]->       OP-TEE
OP-TEE        -[dispatch]->                  OCALL PTA
OCALL PTA     -[RPC/OCALL Req']->            OP-TEE Driver
OP-TEE Driver -[ret]->                       TEE Driver
TEE Driver    -[ret]->                       CA

CA services OCALL

CA            -[TEE_IOC_ECALL/OCALL Complete]-> TEE Driver
TEE Driver    -[dispatch]->                     OP-TEE Driver
OP-TEE Driver -[SMC/RPC Resume]->               OP-TEE
OP-TEE        -[dispatch]->                     OCALL PTA
OCALL PTA     -[ret]->                          TA
TA            -[ret]->                          OP-TEE
OP-TEE        -[ret]->                          TEE Driver
TEE Driver     -[ret]->                         CA

Flow 2: Shared Memory Requested

CA            -[TEE_IOC_ECALL/Func Invoke]->      TEE Driver
TEE Driver    -[dispatch]->                       OP-TEE Driver
OP-TEE Driver -[SMC/Func Invoke]->                OP-TEE
OP-TEE        -[dispatch]->                       TA
TA            -[TEE_InvokeCACommand]->            OP-TEE
OP-TEE        -[dispatch]->                       OCALL PTA
OCALL PTA     -[RPC/SHM Alloc Req']->             OP-TEE Driver
OP-TEE Driver -[ret]->                            TEE Driver
TEE Driver    -[ret]->                            CA

CA allocates shared memory

CA            -[TEE_IOC_ECALL/SHM Alloc Compl']-> TEE Driver
TEE Driver    -[dispatch]->                       OP-TEE Driver
OP-TEE Driver -[SMC/RPC Resume]->                 OP-TEE
OP-TEE        -[dispatch]->                       OCALL PTA
OCALL PTA     -[RPC/OCALL Req']->                 OP-TEE Driver
OP-TEE Driver -[ret]->                            TEE Driver
TEE Driver    -[ret]->                            CA

CA services OCALL

CA            -[TEE_IOC_ECALL/OCALL Complete]->  TEE Driver
TEE Driver    -[dispatch]->                      OP-TEE Driver
OP-TEE Driver -[SMC/RPC Resume]->                OP-TEE
OP-TEE        -[dispatch]->                      OCALL PTA
OCALL PTA     -[RPC/SHM Free Req']->             OP-TEE Driver
OP-TEE Driver -[ret]->                           TEE Driver
TEE Driver    -[ret]->                           CA

CA frees shared memory

CA            -[TEE_IOC_ECALL/SHM Free Compl']-> TEE Driver
TEE Driver    -[dispatch]->                      OP-TEE Driver
OP-TEE Driver -[SMC/RPC Resume]->                OP-TEE
OP-TEE        -[dispatch]->                      OCALL PTA
OCALL PTA     -[ret]->                           TA
TA            -[ret]->                           OP-TEE
OP-TEE        -[ret]->                           TEE Driver
TEE Driver    -[ret]->                           CA

Abnormal Termination

Seeing as OCALL requests are transmitted to the normal world via RPCs, the requesting secure world thread remains suspended, and is not freed for reuse until the OCALL is complete one way or another. Should the CA terminate while handling an OCALL, or a shared memory allocation/free request, it is imperative that that OCALL be completed, albeit with an error, so that the secure world thread be resumed and allowed to run to completion. Failure to do this results in the eventual exhaustion of secure world threads, not mention that their corresponding TAs remain resident in memory.

If an OCALL does not require any shared memory, a CA that terminates during OCALL handling will see its TEE context released once its reference count reaches zero, which happens as that process' resources are released by Linux.

On the other hand, if an OCALL requires shared memory, a reference to it will be acquired by the OP-TEE driver, which in turn increases the reference count on the TEE context. If the CA terminates prematurely, the reference count on the TEE context will never reach zero.

To break this cycle, a new function callback is added in to the TEE driver function table, pre_release. This function is called when Linux releases the TEE context, yet memory references are still outstanding. The callback implementation in the OP-TEE driver cancels all pending OCALLs, returning an error code to the secure world, and reducing the reference count on outstanding shared memory allocations.

Signed-off-by: Hernan Gatta hegatta@microsoft.com

@@ -494,3 +512,296 @@ void optee_handle_rpc(struct tee_context *ctx, struct optee_rpc_param *param,

param->a0 = OPTEE_SMC_CALL_RETURN_FROM_RPC;
}

enum optee_ocall_status process_shm_alloc(struct optee_msg_arg *arg,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of repeated code in process_shm_alloc(), process_ocall() and process_shm_free(). Perhaps some common helper function can be used?

Copy link
Author

@HernanGatta HernanGatta Mar 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each function has its specific logic, and it seems to me that breaking them apart would add more complexity.

Is there anything specific you'd like refactored?

case OPTEE_OCALL_STATUS_ERROR:
return -EINVAL;
case OPTEE_OCALL_STATUS_OK:
break;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w is garbage from the stack when returning from an OCALL.

w is only needed while allocating a new thread in secure world. So during RPC and OCALL processing it's not needed any longer. We need to resume an eventual waiter once we're done with the secure world thread though.

Perhaps we can have one helper function which handles the initial call to secure world where the thread is allocated?

Copy link
Author

@HernanGatta HernanGatta Mar 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mm... As-is, w is indeed allocated on first entry to secure world and only let go of once the entire call is completed, regardless of how many OCALLs there may have been in between.

So, w is allocated and used upon function invocation, untouched during OCALL handling, then released only once the original function invocation completes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, and that's OK except that the pointer value in w is picked up from the stack by chance. So either you need to save w somewhere when doing the goto exit_no_finalize below or free it and find another way of waking up eventual waiters.

drivers/tee/tee_core.c Show resolved Hide resolved
@HernanGatta
Copy link
Author

HernanGatta commented Mar 26, 2020

@jenswi-linaro, cleaning up this PR is next up after doing that for the OP-TEE one. Thank you for looking at it.

Design question: We could re-use the existing TEE_IOC_INVOKE if we pass one more parameter, i.e., five instead of four.

The only purely technical reason for adding a new IOCTL is to be able to pass the cmd_id and ocall_id values back to the CA, then have those echoed right back into the kernel during the next invocation, and so on.

I'm thinking that if we pass an extra "meta"-type parameter in TEE_IOC_INVOKE, we could do away with the additional IOCTL. If we did the same for TEE_IOC_OPEN_SESSION, we could also support OCALLs during session open, which I believe is something you pointed out we can't currently do.

Since the TEE stack already supports a dynamic number of parameters via num_params in both cases, we don't risk breaking the ABI.

I added the new IOCTL because I figured back then that doing what I proposed above would be a misuse of parameters (and I was also prototyping, so starting with a brand new IOCTL allowed me change anything I wanted), but then I remembered that there are meta parameters being used for session open between Linux and OP-TEE, and between the TEE Supplicant and Linux, so we might as well do the same thing for CAs.

At this stage, I'm leaning toward the option of scrapping the new IOCTL and using an additional parameter. Also, if we ever need to extend the number of values passed back and forth, we can just use another parameter, or repurpose an existing parameter, instead of adding a new / modifying an existing IOCTL.

What do you think?

@jenswi-linaro
Copy link

I'm reviewing and looking at different options in this PR. It's a bit complicated so it may take a few days.

@HernanGatta
Copy link
Author

HernanGatta commented Mar 30, 2020

I'm reviewing and looking at different options in this PR. It's a bit complicated so it may take a few days.

Thanks, no rush.

On my end, I'm working on an implementation of this that uses meta parameters instead. If I manage to do it cleanly enough, this should help enable OCALLs during session open as well. Additionally, I'm simplifying the OCALL handling logic, and trying to do so in such a way so as to leave the door open for zero-copy OCALLs.

@jenswi-linaro
Copy link

Meta parameters sounds good, I have a feeling that they could simplify things a bit.
When you update in this PR, feel free to squash commits etc. Rebasing could also be a good idea.

@HernanGatta
Copy link
Author

HernanGatta commented Apr 6, 2020

Just a heads-up, I'm still working on this. I did rewrite the changes to use meta parameters instead of a new IOCTL and it works great.

However, I've been doing some more testing with increasingly complex scenarios and this in-thread approach is ripe for deadlocks. I must admit I didn't see this coming when suggesting this approach in the previous iteration of this PR; I hadn't studied the way in which registered shared memory is handled in great detail (explanation follows).

Consider OP-TEE built with two threads (CFG_NUM_THREADS=2) and dynamic shared memory. You start a CA, initialize a context, open a session, and invoke a function. This function has a tmpref parameter, so as part of TEEC_InvokeFunction, the TEE Client API registers shared memory. Then, it calls into the invoke IOCTL, and we go into the TA. The TA performs an OCALL, the CA gets it, and crashes.

What happens?

When the kernel begins releasing file descriptors, it'll release the DMA buf underlying the registered shared memory, which makes a call into OP-TEE. OP-TEE attempts to free the MOBJ, but a reference is held to it by the function invocation. When this happens, OP-TEE blocks on the MOBJ free until other users of the MOBJ free their handle, which in this case is held by the call stack leading to the OCALL.

The OCALL is stuck, because the client crashed, so the two secure threads are now waiting on each other (by proxy, anyway): one is stuck waiting for the OCALL reply, and the other is stuck waiting for the MOBJ to be freed by the OCALL thread.

At this stage, there is no more progress in normal world because the kernel is releasing file descriptors and blocks on releasing the FD corresponding to the registered shared memory, which is in turn blocked by secure world performing an RPC to wait for the associated MOBJ lock to be released. Hence, we are not guaranteed to ever get to the pre_release hook, which would get called when the FD corresponding to the TEE context gets released, and the OCALL is never cancelled.

I tried an approach where I export a file descriptor each time we return from the IOCTL with a pending OCALL (and close it when we resume it). The idea is that if the CA crashes in between, the FD gets released, and as part of the FD release, we cancel the OCALL. But we can't guarantee the order in which the FD release gets processed, so the DMA buf release may occur before the OCALL FD release, and we block anyway.

We can ask users to crank up the number of secure threads, but then if you have more registered shared memory elements, and those FD's get closed ahead of the OCALL FD, you can still exhaust the number of secure world threads and lock up. Of course, this is also a problem is you have CFG_NUM_THREADS=1.

We have to have a way to make progress in the non-secure kernel if the CA crashes, and cancel pending OCALLs. I'm trying to cook up some synchronization/signaling mechanism to avoid deadlocking, but I'm starting to think it might be impossible.

@jenswi-linaro
Copy link

There's problems that needs to be addressed when the CA crashes while processing an OCALL.

  1. The OCALL must be completed with an error from OP-TEE point of view
    • This allows OP-TEE to decrease the MOBJ refcount
  2. SHMs should not be attempted to be freed while in use in OP-TEE
    • Can be avoided by increasing refcount on passed SHMs at invoke and later decreased again after the invoke is complete

This should make sure that the OP-TEE thread is freed before before SHMs in use with that thread are freed.

What do you think, is this enough?

@HernanGatta
Copy link
Author

HernanGatta commented Apr 6, 2020

Can be avoided by increasing refcount on passed SHMs at invoke and later decreased again after the invoke is complete

Good idea. What you suggested in conjunction with creating a file descriptor while an OCALL is in progress should do the trick.

Just increasing the refcount on passed SHMs in the driver won't cut it, because then those refcounts will never reach zero, the context will never get released, and the pre_release hook will never get called. Additionally, since the CA isn't around anymore, there's nothing to prompt the driver to decrease the refcounts, so we're stuck again (though it's no longer a deadlock).

But, if we combine increasing the refcounts with exporting an FD whose release function cancels the OCALL, if necessary, and decreases the refcounts, it should work such that the SHMs don't get released until the OCALL FD gets released, at which point the OCALL secure thread is unblocked, thereby avoiding the deadlock.

I just threw together a proof-of-concept and I'm working through it. Thank you for your suggestion!

EDIT: It seems to work. I'll be doing some more testing, and if things check out, I'll have design-wise cleanup to do first. The exporting of an FD on a pending OCALL looks to me to be a good candidate for a TEE-agnostic framework, à la tee_shm (assuming, of course, that you all think this is even a good idea to begin with; we'll find out during review one way or another).

Thanks again for your suggestion: tweaking the refcounts to ensure release ordering appears to have been the missing piece of the puzzle. I'll report later on any further findings.

@jenswi-linaro
Copy link

teedev_close_context() is supposed to be called when the file descriptor of the context is closed. How can some SHMs not freed prevent that? It seems to me that we should be able to use the context FD to clean up instead of adding yet another FD.

If the freeing of FDs is blocked in secure world I see how we deadlock, but with refcounts increased on the SHMs which we could block on that shouldn't happen. Instead should those SHMs be freed later when the invoke is done and the SHM refcounts are decreased.

@HernanGatta
Copy link
Author

teedev_close_context() is supposed to be called when the file descriptor of the context is closed. How can some SHMs not freed prevent that?

My mistake; I was thinking of the context release function proper, which gets called once the context's refcount reaches zero. You are correct, teedev_close_context does get called, regardless of any outstanding SHMs that still hold a reference. I'll try moving the OCALL release logic back to the pre_release hook.

If the freeing of FDs is blocked in secure world I see how we deadlock, but with refcounts increased on the SHMs which we could block on that shouldn't happen. Instead should those SHMs be freed later when the invoke is done and the SHM refcounts are decreased.

Indeed, that's the behavior I'm observing after I implemented your idea.

I'm currently observing sporadic deadlocks elsewhere; I'm trying to figure out what's going on.

@jenswi-linaro
Copy link

mobj_reg_shm_release_by_cookie() is not supposed to block, but if the MOBJ still is in used we can't let normal world believe that it has managed to free the MOBJ. So we wait. If that happens it's because normal world tries to release the SHM a bit early. The reference counting in normal world should avoid this, but if it happens anyway it's not accurate enough.

@HernanGatta
Copy link
Author

HernanGatta commented Apr 7, 2020

Just to clarify, by "that's the behavior I'm observing", I meant that we don't block anymore in secure world after increasing the SHMs' refcounts in normal world. They now don't get released until the original invoke that requested one or more OCALLs has fully completed.

The deadlocks I mentioned I am seeing now are something different which I'm still working through understanding (these tend to occur when running multiple CAs concurrently, one of which performs OCALLs, and secure world blocks on closing a session, regardless of whether the CA that performs OCALLs crashed or not).

@jenswi-linaro
Copy link

OK, thanks.

@HernanGatta
Copy link
Author

HernanGatta commented Apr 10, 2020

@jenswi-linaro, I've a question, if you wouldn't mind.

Suppose you build the emulated test environment in the usual way via the corresponding repo manifests, ARM32 or ARM64, it does not matter. In this configuration, OP-TEE's RPC argument cache is automatically enabled (the global thread_prealloc_rpc_cache is set to true).

Now you are at the prompt on the emulated Linux system. You log in and run optee_example_hello_world.

During session open, multiple RPC's are performed to load the TA. Since the RPC argument for the secure thread handling the request has not yet been allocated, an RPC is performed to allocate an SHM for the RPC arguments. This allocation is charged to the TEE context under which executes the CA, in this case optee_example_hello_world.

Seeing as thread_prealloc_rpc_cache is true, the RPC arguments SHM is never freed. Therefore, the refcount corresponding to the TEE context of the CA never reaches zero, and the TEE context is leaked.

To verify this, I've added traces in teedev_ctx_get, teedev_ctx_put, and teedev_ctx_release. These traces display the action taken, the address of the TEE context on which the action was taken, and the resulting refcount. For example:

teedev_ctx_get: incr refcount 0xc0363b80/3

This says that teedev_ctx_get was called and that the refcount of the TEE context at address 0xc0363b80 was increased to 3. Immediately below such lines you'll find a stack dump.

Additionally, I've added traces to show which RPC commands OP-TEE sends to the driver.

Then, I started the emulator (ARM32 in this case [1], but this reproes on ARM64, too), and ran optee_example_hello_world twice, collecting the resulting traces each time. I've placed the traces in a GitHub gist.

Notice how on line 27 of the first run, OP-TEE sends an OPTEE_SMC_RPC_FUNC_ALLOC RPC, and notice how there's a call to teedev_ctx_get immediately following, where the stack shows that it was called as part of the RPC handler. This charges the allocation of the RPC arguments SHM to the TEE context of the current CA. If you scroll to the bottom of the trace, you'll see that the final TEE context's refcount is 1. If I take that address and dump it via GDB after the CA has terminated, I see:

(gdb) print *(struct tee_context *)0xc0363b80
$2 = {
  teedev = 0xc0156400,
  list_shm = {
    next = 0xc0363c88,
    prev = 0xc0363c88
  },
  data = 0xc0363bc0,
  refcount = {
    refcount = {
      refs = {
        counter = 1
      }
    }
  },
  releasing = false,
  supp_nowait = false,
  cap_memref_null = false
}

Notice how refcount->refcount.refs.counter is 1 and releasing is false. Additionally, in that trace, teedev_ctx_release is never called.

In contrast, in the second trace, the RPC to allocate the RPC arguments SHM is not present, the final refcount is 0, and teedev_ctx_release is called.

If I place a call to exit(1) in the Hello World example CA after the function invocation but before the call to close the session and to release the context (i.e., Line 100 in main.c), then run that modified example as the first CA after a cold boot, the session is leaked because the TEE context is never released, which would have closed the outstanding session.

Question: Is this intentional?

[1]: Used repo init -u https://github.com/OP-TEE/manifest.git -m default.xml -b master at 10/04@6:00AM UTC.

@jforissier
Copy link

@HernanGatta I don't want to derail the discussion here ;-) but I think you're raising a point I already made in OP-TEE/optee_os#1918 (well, actually @etienne-lms made it) so I do think we need to address this "leak" in some way or another.

@HernanGatta
Copy link
Author

@jforissier, thanks for digging that up. I don't think I even knew what OP-TEE was when you created that issue :) .

The reason I brought this up in the first place is because I was testing my changes and noticed that the session would always get left behind in OP-TEE on the first run, but never thereafter, and spent longer than I'd like to admit until I figured out that it wasn't my fault.

It seems to me that this behavior is not exactly as it should be, but I agree we needn't derail this PR.

Thanks again.

@HernanGatta HernanGatta force-pushed the optee-generic-rpc-support branch 3 times, most recently from 85a2a6a to b1317b0 Compare April 11, 2020 07:41
@HernanGatta
Copy link
Author

HernanGatta commented Apr 12, 2020

Just a heads-up, I've pushed what I have so far mainly to show you the new direction I'm taking these changes in, the idea being that if you really don't like what you see, I can stop early and reassess.

The biggest changes in the new version are:

  1. Removal of the new ECALL IOCTL and the usage of an extra parameter in the INVOKE IOCTL instead whose sole function is to pass OCALL-related values;
  2. Redesign of the way OCALLs are handled; it's now simpler and there's less duplication;
  3. It's now possible to use the same new functions to have OCALLs during session open (TBD);
  4. Cleaned up the commit history and made checkpatch happy;
  5. Rebased atop the latest lin/optee.

With respect to the overall completeness of this PR, I'm still finding new ways of locking the system up when I get creative with threads in the CA. As such, I'm currently reworking the threading model, which, it turns out, is far from trivial when it comes to OCALL cancellations. The way I've done it in the latest changes, I've discovered, is incorrect.

Thanks.

Copy link

@jenswi-linaro jenswi-linaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on "tee: rewrite bit masks using the BIT macro"

@@ -43,15 +43,15 @@
#define TEE_IOC_BASE 0

/* Flags relating to shared memory */
#define TEE_IOCTL_SHM_MAPPED 0x1 /* memory mapped in normal world */
#define TEE_IOCTL_SHM_DMA_BUF 0x2 /* dma-buf handle on shared memory */
#define TEE_IOCTL_SHM_MAPPED BIT(0)/* memory mapped in normal world */

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with this is that the BIT() macro isn't available for userspace and this file is shared with userspace (it's under include/uapi).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. I ran into this issue and was wondering what you'd suggest doing. There are a few (very few) other headers under include/uapi that use the BIT macro.

I also noticed that the version of tee.h in the OP-TEE Client repository is fairly different from the one in the kernel, so I defined the macro there if it's not already been defined elsewhere. This of course doesn't help any other users of this header in user-space.

I'd vote for making these all shifts; checkpatch will complain, but at least the header is consumable by everyone.

Copy link

@jenswi-linaro jenswi-linaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments for "tee: add support for supplicant-style memory registration"

@@ -168,18 +168,20 @@ tee_ioctl_shm_register(struct tee_context *ctx,
struct tee_ioctl_shm_register_data __user *udata)
{
long ret;
uint32_t flags = TEE_SHM_DMA_BUF | TEE_SHM_USER_MAPPED;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use u32 instead of uint32_t in the kernel.

@@ -48,11 +49,14 @@ static void tee_shm_release(struct tee_shm *shm)
poolm->ops->free(poolm, shm);
} else if (shm->flags & TEE_SHM_REGISTER) {
size_t n;
int rc = teedev->desc->ops->shm_unregister(shm->ctx, shm);
if (!(shm->flags & TEE_SHM_OCALL)) {
rc = teedev->desc->ops->shm_unregister(shm->ctx, shm);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about letting the callback teedev->desc->ops->shm_unregister() decide for itself what to do if TEE_SHM_OCALL is set?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair.

@@ -245,7 +249,7 @@ struct tee_shm *tee_shm_register(struct tee_context *ctx, unsigned long addr,
int num_pages;
unsigned long start;

if (flags != req_flags)
if (!(flags & req_flags))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be if ((flags & req_flags) != req_flags).

ret = ERR_PTR(rc);
goto err;
if (!(flags & TEE_SHM_OCALL)) {
rc = teedev->desc->ops->shm_register(ctx, shm, shm->pages,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about letting the callback teedev->desc->ops->shm_register() decide for itself what to do if TEE_SHM_OCALL is set?

Copy link

@jenswi-linaro jenswi-linaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments for "optee: refactor RPC memory allocation function"

arg->ret = TEEC_ERROR_BAD_PARAMETERS;
rc = optee_rpc_process_shm_alloc(shm, &arg->params[0], &pages_list);
if (rc) {
arg->ret = rc == -ENOMEM

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use an if .. else construct here instead of the ternary operator.

{
phys_addr_t pa;
size_t sz;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skip this empty line.

@@ -237,52 +284,16 @@ static void handle_rpc_func_cmd_shm_alloc(struct tee_context *ctx,
return;
}

if (tee_shm_get_pa(shm, 0, &pa)) {
arg->ret = TEEC_ERROR_BAD_PARAMETERS;
rc = optee_rpc_process_shm_alloc(shm, &arg->params[0], &pages_list);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&arg->params[0] looks like a complicated version of arg->params

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I wrote it that way on purpose because optee_rpc_process_shm_alloc only uses the first element and not the others; I wanted to convey that meaning. But I'll change it to arg->params for simplicity's sake.


msg_param->attr = OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT |
OPTEE_MSG_ATTR_NONCONTIG;
msg_param->u.tmem.buf_ptr =

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please restore the comment you just dropped.
This piece is a bit complicated, perhaps we could try to simplify it a bit while we're at it?
Something like this perhaps?

size_t offs = tee_shm_get_page_offset(shm);

if (offs >= OPTEE_MSG_NONCONTIG_PAGE_SIZE)
        return -EINVAL;
...
msg_param->u.tmem.buf_ptr = virt_to_phys(pages_list) | offs;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I didn't mean to the delete the comment.

I'll have a look at your suggestion.

virt_to_phys(pages_list) |
(tee_shm_get_page_offset(shm) &
(OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1));
msg_param->u.tmem.size = tee_shm_get_size(shm);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assign sz instead?

Copy link

@jenswi-linaro jenswi-linaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments for "tee: optee: implement OCALL support"

I'm only scratching a bit on the surface here, but I get the feeling that things are made a bit more complicated than necessary. Adding TEE_IOCTL_PARAM_ATTR_OCALL was a good idea.

* result in an OCALL request.
*/

call_ctx = optee_ocall_ctx_alloc(ctx, act_num_params,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit complicated. How about storing the ocall context as a substruct inside struct optee_session? If there's an active ocall on a session that ocall needs to be completed until another call can be started.

I'd prefer if we didn't need a "fast path".

Copy link
Author

@HernanGatta HernanGatta Apr 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your suggestion presumes that TAs will remain be single-threaded. It's true now, but we'd be interested in lifting that restriction at some point.

There is further trouble in that sessions aren't reference-counted. It would be possible for another thread to close and free the session while it's being used by a function invocation. I could of course add a reference count to sessions, assuming you'd be OK with that.

I added the fast path only to avoid the allocation of the context; it all works just the same without it, we don't need it per-se and I'm happy to do away with it.

I'd prefer leaving the call context separate from the session for future-proofing. Having said that, it's likely that if and when TAs gain the ability to support multiple concurrent threads, changes will be required to the driver, so the counter-argument would be to wait until then.

Thoughts?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the threading models I'm considering is using one session per thread so this would be another argument for that model.

Threads (in the CA) are supposed to cooperate, so if one closes a session which has an ongoing OCALL it's the wish of the CA to cancel the current invoke and get rid of the session. What we need to worry about here is that the session/OCALL always is in a consistent state. Perhaps that's what you mean by reference counting? Yes, it's fine to make sure that only one "invoke" at a time is using a specific session.

My main concern about future-proofing is the two ABIs between user space and the driver, and between the driver and secure world. So far it seems we have that covered.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Threads (in the CA) are supposed to cooperate

Would this argument also apply to threads created by the TA (assuming we provide e.g., pthread_create() to TAs)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The short answer is yes, since they share resources that must be expected.
I'm happy to discuss multithreaded TAs further, but perhaps outside of this PR?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course. In fact I think this OCALL mechanism could be a key enabler for MT TAs, so the discussion can wait until this is done. I just wanted to make sure no blocker would be introduced.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact I think this OCALL mechanism could be a key enabler for MT TAs...

Agree

static void optee_ocall_finalize(struct optee_context_data *ctxdata,
struct optee_call_ctx *call_ctx)
{
atomic_set(&call_ctx->attached, false);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have my doubts about using atomic_set() and friends. I'd prefer if we could use a mutex instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll see what I can do.

Given that this is a flag that tells the function whether to proceed, the effects of using atomic_* on it and protecting it with a mutex are the same as far as I can tell.

* present, and NULL otherwise. Hence, 'ocall' can still be NULL
* after this statement.
*/
ocall = tee_param_find_ocall(params, arg.num_params);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're looping through all params to check that stuff is in order. How about also returning: normal_params, num_normal_params, ocall_params, num_ocall_params and then pass that directly to ctx->teedev->desc->ops->invoke_func() below?

@@ -192,6 +196,7 @@ struct tee_shm {
struct tee_device *teedev;
struct tee_context *ctx;
struct list_head link;
struct list_head ocall_link;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if we didn't need this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Come to think of it, me too. This is an issue if the same SHM is used in multiple OCALLs. It's not possible now, but it may be in the future.

@HernanGatta
Copy link
Author

I believe I've addressed all the feedback.

I've pushed individual fix commits instead of squashing. While the contribution guidelines suggest doing individual commits for addressing PR feedback, if you find the changes are too difficult to review this way, let me know and I'll squash them (melding as appropriate).

Some outstanding issues:

  1. OCALL support during session open;
  2. OCALL support in the kernel TEE Client API implementation;
  3. I can't tell what's wrong with the IBART failure.

Lastly, the known issue I mentioned in the original PR description has gone away after re-implementing the changes. It would indeed appear that I had done something wrong.

@jenswi-linaro
Copy link

I'd like to propose leaving the implementation of kernel-mode OCALLs for a future PR.

Sure, no problem.

@jenswi-linaro
Copy link

I think we need to simplify the rules around how or where a shared memory object (SHM) is reference counted.

When we're entering in open_session/invoke all SHMs are increased with shm_from_user(). Just before returning from open_session/invoke those SHMs are decreased. This is what was done before this PR.

With OCALLs we're returning early from open_session/invoke so the original "normal" SHMs can't be decreased yet. Instead we must save them in a list for later processing, either when the function does a "normal" (as opposed
to a OCALL) return or if the session or file descriptor is closed.

With a OCALL reply we increase the SHMs and store them in a special list to be decreased next time open_session/invoke returns. In the case of a OCALL request we should only consume the "OCALL reply" list of SHMs, but for a "normal" return we should process the "normal" list too.

I think this should cover all cases and also allow us to reduce the special cases for OCALL processing the in common code at least. Does this make any sense? Am I missing something?

jforissier and others added 9 commits January 14, 2021 12:01
The ID for commit b83685b ("tee: amdtee: fix memory leak in
amdtee_open_session()") is wrong in upstream-tee-subsys-patches.txt.
Fix it.

Reported-by: Victor Chong <victor.chong@linaro.org>
Signed-off-by: Jerome Forissier <jerome@forissier.org>
Signed-off-by: Javier Almansa Sobrino <javier.almansasobrino@arm.com>
Acked-by: Joakim Bech <joakim.bech@linaro.org>
Link: linaro-swg#85
[jf: not currently intended for upstream; add link to PR]
Signed-off-by: Jerome Forissier <jerome@forissier.org>
The optee device status is disabled by default, change its status to 'okay'
in the dts of the EV1 board

Signed-off-by: Timothée Cercueil <timothee.cercueil@st.com>
Move the call queue handling logic off to a new file to allow for its use
in other translation units cleanly.

Signed-off-by: Hernan Gatta <hegatta@microsoft.com>
Acked-by: Etienne Carriere <etienne.carriere@linaro.org>
Supporting OCALLs, where a TA may request a function invocation on its
Client Application (CA), requires that the secure thread requesting the
OCALL be temporarily suspended as a result of performing an RPC to send the
OCALL request to normal world.

A possible OCALL request is one where the TA requests that the CA allocate
shared memory in order to pass large buffers as part of the function
invocation. When this happens, the request is routed to the CA. Normally,
this would result in another entry into secure world to register that
shared memory, if OP-TEE supports it. This re-entry would block, however,
if OP-TEE only has one thread available. To avoid this problem, the TEE
supplicant readily supports registering shared memory only with the driver,
and passing the necessary information back to OP-TEE via RPC parameters
instead.

This change introduces the TEE_IOCTL_SHM_OCALL flag. When used, the call
into OP-TEE to register shared memory is not performed, nor is its
corresponding free. It is up to the OCALL implementation to properly deal
with reference counting.

Signed-off-by: Hernan Gatta <hegatta@microsoft.com>
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>
Add a new generic capability to denote support for OCALLs in secure world:
TEE_GEN_CAP_OCALL. Additionally, support detection of this capability in
OP-TEE.

Signed-off-by: Hernan Gatta <hegatta@microsoft.com>
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>
Acked-by: Etienne Carriere <etienne.carriere@linaro.org>
At times, it may be useful to allow a caller to increase the reference
count on a shared memory (SHM) object at will. This patch adds tee_shm_get,
the increasing counterpart to tee_shm_put.

Signed-off-by: Hernan Gatta <hegatta@microsoft.com>
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>
Acked-by: Etienne Carriere <etienne.carriere@linaro.org>
Enable Trusted Applications (TAs) to invoke functions on their
corresponding Client Application (CA), both during session open and
function invocation. These function invocations from TA to CA are referred
to as "Out Calls", or OCALLs for short.

The fundamental mechanism is one whereby upon a function invocation from
the CA to the TA, the TEE returns prematurely from the invocation with an
RPC. This RPC is generated after a TA calls the TEEC_InvokeCommand
equivalent function in secure world. The RPC carries information describing
the OCALL as well as its parameters. When this happens, the driver saves
the state of the current call and returns to user-mode.

The TEE Client API will have invoked the TEE_IOC_INVOKE IOCTL with a
special parameter that carries OCALL information. When the IOCTL returns
prematurely, this parameter includes information about what the CA is
expected to do on behalf of the TA along with data to be used to reply to
the request. The TEE Client API dispatches the request accordingly to the
CA proper.

Once that is done, the TEE Client API calls the TEE_IOC_INVOKE IOCTL again
with the modified OCALL parameter and associated information (such as the
result of the OCALL, and the parameters, as requested by the TA). The
driver notices that this invocation is in fact a resumption as opposed to a
brand-new invocation, and resumes the secure world thread that sent the RPC
in the first place.

The same mechanism applies to OCALLs during session open.

This patch also minimally updates the OP-TEE and AMD TEE drivers to match
the new signatures for session open and invoke. If an OCALL is specified by
the CA, EOPNOTSUPP is returned.

Signed-off-by: Hernan Gatta <hegatta@microsoft.com>
Enable OCALL support specifically for OP-TEE.

Signed-off-by: Hernan Gatta <hegatta@microsoft.com>
@HernanGatta
Copy link
Author

HernanGatta commented Apr 8, 2021

Hello @jenswi-linaro, I'm back working on this. I'm sorry for the extended hiatus, I should have let you all know I'd be away for a while (I didn't quite plan it that way). I'm looking at your comments now; the latest push is merely a rebase and doubles as a sign of life :).

@jenswi-linaro
Copy link

OK, welcome back! :-)

@etienne-lms
Copy link

@HernanGatta, until now I reused you patches without modifications. No change on the Linux patches. I only split the optee_os patch in 2, splitting CFG_OCALL into CFG_CORE_OCALL and CFG_USER_OCALL. I've not used the optee_client patches aside applying them to run the tests and examples you provided.

I have no new comment on your patches. If i remember correctly there is somewhere I needed to use 5 params at least (2 meta + 3 or 4 std) in the Ocall context because of a test in a tee or optee ocall sequence. Not an issue so far. I'll check that once I'm ready with the SCMI PTA Ocall patches for Linux kernel.

abuzarra pushed a commit to digi-embedded/linux that referenced this pull request Nov 30, 2022
Move the call queue handling logic off to a new file to allow for its use
in other translation units cleanly.

Signed-off-by: Hernan Gatta <hegatta@microsoft.com>
Acked-by: Etienne Carriere <etienne.carriere@linaro.org>
[etienne: picked commit from linaro-swg/linux#72]
Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
Change-Id: Ifbd281a8be32921d8273ba8683c53d5b19334b66
abuzarra pushed a commit to digi-embedded/linux that referenced this pull request Nov 30, 2022
At times, it may be useful to allow a caller to increase the reference
count on a shared memory (SHM) object at will. This patch adds tee_shm_get,
the increasing counterpart to tee_shm_put.

Signed-off-by: Hernan Gatta <hegatta@microsoft.com>
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>
Acked-by: Etienne Carriere <etienne.carriere@linaro.org>
[etienne: picked commit from linaro-swg/linux#72]
Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
Change-Id: I4ec1ef1f3a44b9000ef26717a52853cecd3fcaa3
abuzarra pushed a commit to digi-embedded/linux that referenced this pull request Nov 30, 2022
Enable Trusted Applications (TAs) to invoke functions on their
corresponding client in Linux kernel driver during both both session
open and command invocation. These function invocations from TA to client
are referred to as "Out Calls", or OCALLs for short.

The fundamental mechanism is one whereby upon a function invocation from
the client to the TA, the TEE returns prematurely from the invocation
with an RPC. This RPC is generated after a TA calls a TEEC_InvokeCommand()
equivalent function in secure world. The RPC carries information
describing the OCALL as well as its parameters. When this happens, the
driver saves the state of the current call and returns to user-mode.

The TEE kernel client API has to call tee_client_open_session() or
tee_client_invoke_command() with a special parameter that carries
OCALL information. When the function returns prematurely, this parameter
includes information about what the client is expected to do on behalf
of the TA along with data to be used to reply to the request.

Once that is done, TEE kernel client API calls tee_client_open_session()
(respectively tee_client_invoke_command()) again with the modified
OCALL parameter and associated information (such as the result of the
OCALL and the output parameters as requested by the TA). The driver
notices that this invocation is in fact a resumption as opposed to a
brand-new invocation, and resumes the secure world thread that sent
the RPC in the first place.

The same mechanism applies to OCALLs during session open.

This patch also minimally updates the OP-TEE and AMD TEE drivers to match
the new signatures for session open and invoke. If an OCALL is specified
by the CA, EOPNOTSUPP is returned.

This change it based on the OCALL implementation proposal from Hernan
Gatta posted in [1] with few modifications to remove changes in shared
memory from/to sequence since OCALL is not yet available to user client
application, and to remove TEE drivers pre-release handler that are not
needed when supporting OCalls only in Linux kernel TEE client drivers.

Link: [1] linaro-swg/linux#72
Co-developed-by: Hernan Gatta <hegatta@microsoft.com>
Signed-off-by: Hernan Gatta <hegatta@microsoft.com>
Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
Change-Id: I95b35e2447bfb24b729d7bf1d3dec4cc620100e6
abuzarra pushed a commit to digi-embedded/linux that referenced this pull request Nov 30, 2022
Enable OCALL support specifically for OP-TEE without support for
Ocall specific shared memory allocation.

This change it fully based on the OCALL implementation proposal from
Hernan Gatta posted in [1] with modifications to remove OCall specific
shared memory allocation.

Link: [1] linaro-swg/linux#72
Co-developed-by: Hernan Gatta <hegatta@microsoft.com>
Signed-off-by: Hernan Gatta <hegatta@microsoft.com>
Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
Change-Id: I70e9a0c0730df96277a2f4c0619d844e26d125ec
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants