Thursday, March 12, 2009

RESTful UDP: a Live Framework Feature Request

Yes, I admit, “RESTful UDP” sounds unnatural, maybe even unethical.  I also admit that I’m getting ahead of myself, talking about design before requirements.  So what motivates me to ask for such a feature?

I want real-time P2P messaging between users, apps, and devices.  Notifications get us most of the way there, but they aren’t enough.  Before I discuss the issues with notifications, let’s talk about scenarios that motivate this.

Scenarios

Imagine walking up to a large-screen Mesh-enabled device such as an Xbox, Microsoft Surface, or public kiosk.  You pair your Mesh-enabled smartphone and project its apps and data onto the big screen, with your smartphone acting as the data entry device.

Expanding on this scenario, imagine a game that takes advantage of the smartphone’s accelerometer and camera, turning your phone into a high-powered Wiimote with the entire touch screen used as a control surface.  You might want to attach a wrist strap…

Imagine the cool apps you could write if a group of people shares real-time GPS data from smartphones and carputers.

This feature would be useful for more than just extending the capabilities of smartphones.  You could remotely control media playback, chat with people, push real-time financial data, and build a variety of interesting distributed apps that are designed to run in real-time across a mesh of devices, aggregating specialized device capabilities into a single composite experience.

Cross-platform support greatly expands the possibilities and relevance of the mesh.  You can imagine special-purpose devices whose entire reason for existence is to be plugged into the mesh to supplement apps and user experiences.  This is true even without real-time messaging, but this capability is crucial for enabling the most seamless composite device experiences.

Who else wants this?

Back in December, Kevin Hoffman kicked off an extended discussion on this feature request.  Here’s a brief excerpt:

Low Latency: Other actions that people take within the application need to happen quickly. I need very low latency between when the action takes place and when the other client(s) are notified about the action. Think of these as instant messages, though with a domain-specific purpose. Some can be directed at an individual MEWAs, others can be broadcasts. I do not currently have a solution for the low latency.

I know that Silverlight applications cannot receive push messages because of their highly restricted network Sandbox. However, I'm wondering if they would be able to create an HTTP WCF service via .NET Services and host the proxy in the cloud that would allow near-real-time HTTP message posting between MEWAs... Is this possible?

Strages has a big wish list in the Mesh forum that includes:

as mobile phones and laptops will get close and closer together.. there will get a time that you just plug your phone to a local monitor and keyboard.. or even better bring your own flexible seized monitor etc etc.. you get it

John Macintyre’s PDC session had a demo of remotely controlling a Media Center.  Sync delays in the demo made for a less-than-seamless experience.

Warning: detailed discussion ahead

I could probably end the feature request here.  Please take everything that follows as optional food for thought.  If you like what you’ve heard so far and just want to vote for this, please visit this feature request on the Live Framework forum and vote it up.  If you have comments, please post them on the forum rather than here.

Why UDP?

First, I don’t intend UDP to be taken literally (well, maybe I do, but that’s an implementation detail).  Specifically, I’m interested in the following UDP characteristics:

  • One-way messaging
  • Low latency
  • Stateless (lossy, no sessions, unordered, etc.)
  • Multicasting

It shouldn’t be necessary to expose the concept of P2P session initiation, even if that is ultimately an implementation detail.

Why REST?

I do intend REST to be taken literally.  I’m interested in the following REST characteristics:

  • URI-addressable resources
  • Hyperlinks to other resources
  • Arbitrary user content
  • Transport-agnostic (yes, I think that’s RESTful)
  • POST and GET (PUT and DELETE don’t make sense here)

Links to arbitrary resources are an important tool for keeping message size small.

Being transport-agnostic is important both for performance and for constrained network environments.  Of course HTTP should be supported, but it should also be possible to use TCP, UDP, the Messenger Relay, or even SMS, as Mesh4x does.  Just like with Notifications, there should be no need for senders and receivers to use the same transport or representation.

Making this feature available to plain old DHTML Mesh apps would be pretty amazing.

Meshisms

I would also like support for some Mesh-specific features:

  • AtomPub feed-based model
  • Multiple representations (Atom, POX, JSON, Binary XML)
  • Expansions
  • “LINQ to REST”
  • Send messages from triggers (a fine-grained alternative to subscriptions)
  • Local LOE support

Why aren’t Notifications enough?

In my Notifications post I covered some of the issues that make Notifications less than ideal for real-time messaging, even when paired with Activities.

  • Three round-trips per notification cycle (can be reduced to two using expansions)
  • Watermarks are destructive
  • You can’t directly publish your own notifications
  • Notification queues are per-user and designed for single-threaded use (one per client)
  • Each app does its own polling rather than sharing one connection, unlike iPhone’s push notifications
  • Notifications are one-way from cloud to client (no client-to-cloud or P2P)
  • Notification polls don’t chain from local LOE to cloud LOE
  • While Notifications are usually instantaneous, they can become backlogged and delayed by several minutes
  • The local LOE doesn’t currently implement Activities
  • There is no way for multiple clients to efficiently receive only unseen Activities in a single round trip

A proposed solution

There may be better solutions, but I visualize this problem being solved by a REST front-end to a “transport-agnostic UDP” messaging system.  Under the hood, the system prefers direct UDP communication but can use Messenger Relay or HTTP if necessary.  The front-end seen by developers is an AtomPub interface very much like the Notifications interface.  When using HTTP, push messaging is enabled by parking requests for up to 30 seconds if no messages are waiting.
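
The request parking described above could look roughly like the following sketch.  It is an in-process model only, with no HTTP layer; the MessageQueue class and its post and poll methods are hypothetical names, and the 30-second window is the figure from the paragraph above.

```python
import queue

PARK_SECONDS = 30.0  # parking window proposed above


class MessageQueue:
    """Hypothetical server-side queue that parks a poll until a message arrives."""

    def __init__(self) -> None:
        self._pending = queue.Queue()

    def post(self, message: str) -> None:
        self._pending.put(message)

    def poll(self, park_seconds: float = PARK_SECONDS) -> list[str]:
        """Return waiting messages at once, or park until one arrives or time runs out."""
        try:
            first = self._pending.get(timeout=park_seconds)
        except queue.Empty:
            return []  # window elapsed with nothing to deliver; the client polls again
        messages = [first]
        while True:  # drain anything else that is already waiting
            try:
                messages.append(self._pending.get_nowait())
            except queue.Empty:
                return messages
```

A client that immediately re-issues poll() after each response sees new messages as soon as they are posted, without the server ever opening a connection to the client.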

It is preferable to program against the local LOE.  Polling the local LOE for messages causes it to poll the cloud LOE in turn.  Any reachable participating devices are also polled directly when no other push mechanism can be established with them.  It is important to be able to establish local P2P connections even if the cloud LOE isn’t reachable.  Care is needed to avoid round-robin message loops.
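
One common way to avoid such loops is to give every message an ID and have each node drop IDs it has already handled.  The sketch below assumes that approach; the Relay class is a hypothetical stand-in for an LOE, not something the Live Framework provides.

```python
class Relay:
    """Hypothetical stand-in for an LOE that forwards messages to its peers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.peers: list["Relay"] = []
        self.seen: set[str] = set()      # message IDs this node has already handled
        self.delivered: list[str] = []   # bodies handed to local apps

    def receive(self, message_id: str, body: str) -> None:
        if message_id in self.seen:
            return  # already handled: drop it so it cannot circulate forever
        self.seen.add(message_id)
        self.delivered.append(body)
        for peer in self.peers:
            peer.receive(message_id, body)
```

Even when three relays are all peers of one another, each one delivers a given message exactly once.  The seen set grows without bound in this sketch; in practice it could be pruned on the same schedule as message expiry.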

Messages would auto-generate a short MaxAge upon receipt by each LOE, similar to Activities.  It is probably most efficient to simply let Messages expire instead of explicitly deleting them.
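
Expiry on receipt might be modeled as in this sketch.  The ExpiringStore class, its method names, and the injectable clock are assumptions made for illustration; the only idea taken from the text is that each LOE starts its own short MaxAge timer when a message arrives and never deletes messages explicitly.

```python
import time


class ExpiringStore:
    """Hypothetical per-LOE message store whose entries lapse after max_age_seconds."""

    def __init__(self, max_age_seconds: float, clock=time.monotonic) -> None:
        self._max_age = max_age_seconds
        self._clock = clock
        self._items: list[tuple[float, str]] = []  # (received_at, body)

    def receive(self, body: str) -> None:
        # The timer starts on receipt by this store, not when the sender posted.
        self._items.append((self._clock(), body))

    def live_messages(self) -> list[str]:
        cutoff = self._clock() - self._max_age
        # Expired entries are simply dropped the next time anyone looks.
        self._items = [(t, b) for t, b in self._items if t > cutoff]
        return [b for _, b in self._items]
```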

In order to support multiple recipients of the same message, clients should be able to poll the queue with a nondestructive query string watermark.  An alternative might be to let each client create its own destructive NotificationQueue that receives full copies from the main message queue, but this assumes the 3-round-trips issue is solved.
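
A nondestructive watermark can be as simple as a sequence number that each client keeps for itself.  The following sketch assumes that design; WatermarkedQueue and read_since are hypothetical names, and in a real service the watermark would travel in the query string.

```python
class WatermarkedQueue:
    """Hypothetical queue where reading never removes anything."""

    def __init__(self) -> None:
        self._messages: list[str] = []

    def post(self, body: str) -> int:
        self._messages.append(body)
        return len(self._messages)  # sequence number of the new message

    def read_since(self, watermark: int = 0) -> tuple[list[str], int]:
        """Return messages after `watermark`, plus the watermark to send next time."""
        start = max(watermark, 0)
        return self._messages[start:], len(self._messages)
```

Because the server keeps no per-client state, a second client that starts from watermark 0 still sees every message the first client has already read.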

This can be a soft state service with no need for additional infrastructure recovery features.

If it is necessary to impose the UDP equivalent of Twitter’s 140-character limit, that’s fine.  65,507 bytes seems like plenty.  Users can always just send links to large resources.  If expansions are supported, the receiver can optionally inline the large data on demand without necessarily transmitting it across the network if it already exists in the local LOE.
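
The 65,507 figure is the largest UDP payload that fits in a single IPv4 packet: 65,535 bytes minus a 20-byte IPv4 header and an 8-byte UDP header.  The sketch below works that out and falls back to sending a link when the content is too large; build_message and the example URI are hypothetical.

```python
MAX_IPV4_TOTAL_LENGTH = 65_535  # IPv4 total-length field is 16 bits
IPV4_HEADER_BYTES = 20          # minimum header, no options
UDP_HEADER_BYTES = 8
MAX_UDP_PAYLOAD = MAX_IPV4_TOTAL_LENGTH - IPV4_HEADER_BYTES - UDP_HEADER_BYTES


def build_message(content: bytes, resource_uri: str) -> tuple[str, bytes]:
    """Send the content inline if it fits in one datagram, otherwise send a link."""
    if len(content) <= MAX_UDP_PAYLOAD:
        return ("inline", content)
    return ("link", resource_uri.encode("utf-8"))
```

For example, build_message(photo_bytes, "MeshObjects/123/DataFeeds/456") would return a "link" message whenever photo_bytes exceeds the limit, leaving the receiver free to fetch or expand the resource on demand.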

Addressability

Messages should be able to target one or more users and devices, and scope a broadcast by MeshObject/AppInstance.  This could be solved in a variety of ways.

You could scope all messages by MeshObject (MeshObjects/{id}/MessageQueue), with the option to subscope them further with links to Members and Mappings.  This is the option I prefer.  It doesn’t require directly addressing users and devices, it doesn’t require any changes to enable the “cloud device” to be addressed, and it should provide decent partitioning.  A downside is that it requires separate requests for separate MeshObjects.  However, if you program against a local LOE which multiplexes everything under the hood, this shouldn’t be a problem.

Or you could have one queue to rule them all at Mesh/MessageQueue.  Messages would have one or more links to users, devices, and/or MeshObjects.  Subscribers would poll the queue with the option of filtering on these links by query string.
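
Filtering a single queue by link could work along these lines.  The query parameter names (user, device), the message shape, and the rule that a message must carry every requested link are all assumptions for the sake of the sketch.

```python
from urllib.parse import parse_qs, urlsplit


def filter_queue(messages: list[dict], request_uri: str) -> list[dict]:
    """Keep messages that carry every link named in the request's query string."""
    query = parse_qs(urlsplit(request_uri).query)
    wanted = {(kind, value) for kind, values in query.items() for value in values}
    # An empty query string means no filter, so every message is returned.
    return [m for m in messages if wanted <= set(m["links"])]


# Hypothetical messages, each tagged with (kind, id) links
EXAMPLE_MESSAGES = [
    {"body": "gps", "links": [("user", "alice"), ("device", "phone1")]},
    {"body": "chat", "links": [("user", "bob")]},
]
```

With these examples, polling "Mesh/MessageQueue?user=alice" returns only the gps message, while polling "Mesh/MessageQueue" returns both.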

Or you could add /MessageQueue to various scoping contexts such as Devices/{id} and {id}/Profiles.  This would require exposing the cloud LOE as a device entry.

I’m not sure if devices or mappings from other users are currently discoverable.  This would need to be addressed.

Device connectivity

I suspect there may already be a way to access existing device connectivity information, but if not, it would be very helpful to see which devices (including the cloud device) are reachable for real-time messaging.

Issues with just exposing Mesh’s P2P support

At the PDC I heard talk of exposing Mesh’s peer-to-peer channel to enable developers to establish connections between devices for streaming data.  Although this is a great solution for some scenarios (and I would like to have such a feature), it’s not so great for other scenarios.

Streaming in this context is a reliable, sessionful feature.  In order to get the lowest latency possible, I want unreliable, sessionless communication.

You can optimize for latency or for throughput, but not both.  Streaming is optimized for high throughput.  I want low latency (UDP, or TCP with Nagle’s algorithm disabled).
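
For the TCP case, Nagle’s algorithm is turned off per socket with the standard TCP_NODELAY option.  A minimal sketch follows; the helper name is made up, and the socket is left unconnected.

```python
import socket


def make_low_latency_tcp_socket() -> socket.socket:
    """Create a TCP socket that sends small writes immediately."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY disables Nagle's algorithm, which otherwise holds back small
    # segments while earlier data is still unacknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

This removes the batching delay but keeps TCP’s ordering and retransmission, so a single lost packet still stalls everything queued behind it.  That is why unreliable, sessionless delivery remains the preferred option here.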

Bonus points: ad-hoc device discovery

This is a tangentially related issue that can be addressed independently.  Although apps can be written to select from a known list for pairing with other users and devices, I would love to have ad-hoc user/device pairing.

There are many, many ways to accomplish this.  I would like to use Microsoft Tag for pairing devices from unknown users and meshes.  One device displays a tag containing its mesh address, and the other device scans the tag.  The tag could be a printed tag, or displayed on a screen.  The tag exchange could be two-way for greater security.  For public kiosks, the tag could be changed every minute or generated on demand.

There are a whole host of additional issues raised by ad-hoc device pairing, but it sure sounds like the future to me.  Someone’s going to solve this problem in a generic way.  Please don’t let Apple get there first. ;-)

Conclusion

Hopefully the additional detail is a helpful starting point for discussion.  I am quite open to alternate solutions that can provide real-time P2P messaging.  As I noted earlier, if you like this idea, please visit this feature request on the Live Framework forum and vote it up.  If you have any comments, please post them on the forum thread.  Thanks!
