To add some context: at this time the native UI supports picture-in-picture, but the web UI Library does not.
The picture-in-picture we see on the web today is built around individual "video" tags.
An example is YouTube: it is just a single video playing, and it jumps back to the page the video element is being projected from. Video calls are a little more complex. They are a collection of video tags plus a set of custom controls, in this case the ability to mute/unmute your microphone. So instead of a single video, we need to project a mini web page, or a "document".
On the web we depend on browser APIs supporting "document" Picture-in-Picture, which is currently experimental and only available in some browsers: https://developer.mozilla.org/en-US/docs/Web/API/Document_Picture-in-Picture_API
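To make the "document" idea concrete, here is a rough sketch of how a call view could be moved into a Document Picture-in-Picture window. This is not something the web UI Library does today. `requestWindow` and the `pagehide` event come from the API linked above; `openCallInPip`, `callView`, `home`, the window size, and the small interfaces are hypothetical names used only for illustration.

```typescript
// Minimal structural types so the sketch compiles without DOM typings
// for this experimental API (these interfaces are illustrative only).
interface PipWindow {
  document: { body: { append(node: unknown): void } };
  addEventListener(type: "pagehide", listener: () => void): void;
}
interface DocumentPip {
  requestWindow(options?: { width?: number; height?: number }): Promise<PipWindow>;
}
interface Container {
  append(node: unknown): void;
}

// Moves `callView` (hypothetical: the element holding the video tiles and
// the mute button) into a Picture-in-Picture window, and moves it back
// into `home` when that window is closed.
async function openCallInPip(
  pip: DocumentPip,
  callView: unknown,
  home: Container
): Promise<PipWindow> {
  // Assumed size; the browser may adjust it.
  const pipWindow = await pip.requestWindow({ width: 360, height: 240 });
  pipWindow.document.body.append(callView);
  pipWindow.addEventListener("pagehide", () => home.append(callView));
  return pipWindow;
}
```

In a supporting browser this would be called from a user gesture such as a button click, passing `window.documentPictureInPicture` as `pip`. Styles are not copied into the new window automatically, so a real implementation would also need to bring its stylesheets across.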
The link lists the supported browsers; however, support is desktop-only at this time, with no WebView or mobile web support.
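Because support varies this much, an app would need to feature-detect before offering picture-in-picture. A minimal sketch, assuming the `documentPictureInPicture` property on `window` and the `pictureInPictureEnabled` flag on `document` as described on MDN; `detectPipSupport` and the `PipSupport` labels are hypothetical names, not part of any library:

```typescript
type PipSupport = "document" | "video-only" | "none";

// Structural stand-ins for `window` and `document`, so the check can also
// run outside a browser (for example in a unit test).
interface WindowLike {
  documentPictureInPicture?: unknown;
}
interface DocumentLike {
  pictureInPictureEnabled?: boolean;
}

function detectPipSupport(win: WindowLike, doc: DocumentLike): PipSupport {
  // Document PiP: can project arbitrary content, such as video tiles plus controls.
  if ("documentPictureInPicture" in win) {
    return "document";
  }
  // Classic PiP: a single <video> element only.
  if (doc.pictureInPictureEnabled === true) {
    return "video-only";
  }
  return "none";
}
```

In a browser this would be called as `detectPipSupport(window, document)`. On mobile browsers and in WebViews the result would currently be `"video-only"` or `"none"`.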
It would be great to understand more about your scenario and which browsers you need to support, so we can continue this discussion and work out how we can support you with this picture-in-picture experience!