Thursday, March 12, 2009

RESTful UDP: a Live Framework Feature Request

Yes, I admit, “RESTful UDP” sounds unnatural, maybe even unethical.  I also admit that I’m getting ahead of myself, talking about design before requirements.  So what motivates me to ask for such a feature?

I want real-time P2P messaging between users, apps, and devices.  Notifications get us most of the way there, but they aren’t enough.  Before I discuss the issues with notifications, let’s talk about scenarios that motivate this.


Imagine walking up to a large-screen Mesh-enabled device such as an Xbox, Microsoft Surface, or public kiosk.  You pair your Mesh-enabled smartphone and project its apps and data onto the big screen, with your smartphone acting as the data entry device.

Expanding on this scenario, imagine a game that takes advantage of the smartphone’s accelerometer and camera, turning your phone into a high-powered Wiimote with the entire touch screen used as a control surface.  You might want to attach a wrist strap…

Imagine the cool apps you could write if a group of people shares real-time GPS data from smartphones and carputers.

This feature would be useful for more than just extending the capabilities of smartphones.  You could remotely control media playback, chat with people, push real-time financial data, and build a variety of interesting distributed apps that are designed to run in real-time across a mesh of devices, aggregating specialized device capabilities into a single composite experience.

Cross-platform support exponentially increases the possibilities and relevance of the mesh.  You can imagine special-purpose devices whose entire reason for existence is to be plugged into the mesh to supplement apps and user experiences.  This is true even without real-time messaging, but this capability is crucial for enabling the most seamless composite device experiences.

Who else wants this?

Back in December, Kevin Hoffman kicked off an extended discussion on this feature request.  Here’s a brief excerpt:

Low Latency: Other actions that people take within the application need to happen quickly. I need very low latency between when the action takes place and when the other client(s) are notified about the action. Think of these as instant messages, though with a domain-specific purpose. Some can be directed at an individual MEWAs, others can be broadcasts. I do not currently have a solution for the low latency.

I know that Silverlight applications cannot receive push messages because of their highly restricted network Sandbox. However, I'm wondering if they would be able to create an HTTP WCF service via .NET Services and host the proxy in the cloud that would allow near-real-time HTTP message posting between MEWAs... Is this possible?

Strages has a big wish list in the Mesh forum that includes:

as mobile phones and laptops will get close and closer together.. there will get a time that you just plug your phone to a local monitor and keyboard.. or even better bring your own flexible seized monitor etc etc.. you get it

John Macintyre’s PDC session had a demo of remotely controlling a Media Center.  Sync delays in the demo made for a not so seamless experience.

Warning: detailed discussion ahead

I could probably end the feature request here.  Please take everything that follows as optional food for thought.  If you like what you’ve heard so far and just want to vote for this, please visit this feature request on the Live Framework forum and vote it up.  If you have comments, please post them on the forum rather than here.

Why UDP?

First, I don’t intend UDP to be taken literally (well maybe I do, but that’s an implementation detail).  Specifically, I’m interested in the following UDP characteristics:

  • One-way messaging
  • Low latency
  • Stateless (lossy, no sessions, unordered, etc.)
  • Multicasting

It shouldn’t be necessary to expose the concept of P2P session initiation, even if that is ultimately an implementation detail.


I do intend REST to be taken literally.  I’m interested in the following REST characteristics:

  • URI-addressable resources
  • Hyperlinks to other resources
  • Arbitrary user content
  • Transport-agnostic (yes, I think that’s RESTful)
  • POST and GET (PUT and DELETE don’t make sense here)

Links to arbitrary resources are an important tool for keeping message size small.

Being transport-agnostic is important, for performance and for constrained network environments.  Of course HTTP should be supported, but it should also be possible to use TCP, UDP, the Messenger Relay, or even use SMS like Mesh4x does.  Just like with Notifications, there should be no need for senders and receivers to use the same transport or representation.

Making this feature available to plain old DHTML Mesh apps would be pretty amazing.


I would also like support for some Mesh-specific features:

  • AtomPub feed-based model
  • Multiple representations (ATOM, POX, JSON, Binary XML)
  • Expansions
  • “LINQ to REST”
  • Send messages from triggers (a fine-grained alternative to subscriptions)
  • Local LOE support

Why aren’t Notifications enough?

In my Notifications post I covered some of the issues that make Notifications less than ideal for real-time messaging, even when paired with Activities.

  • Three round-trips per notification cycle (can be reduced to two using expansions)
  • Watermarks are destructive
  • You can’t directly publish your own notifications
  • Notification queues are per-user and designed for single-threaded use (one per client)
  • Each app does its own polling rather than sharing one connection, unlike iPhone’s push notifications
  • Notifications are one-way from cloud to client (no client-to-cloud or P2P)
  • Notification polls don’t chain from local LOE to cloud LOE
  • While Notifications are usually instantaneous, they can become backlogged and delayed by several minutes
  • The local LOE doesn’t currently implement Activities
  • There is no way for multiple clients to efficiently receive only unseen Activities in a single round trip

A proposed solution

There may be better solutions, but I visualize this problem being solved by a REST front-end to a “transport-agnostic UDP” messaging system.  Under the hood, the system prefers direct UDP communication but can use Messenger Relay or HTTP if necessary.  The front-end seen by developers is an AtomPub interface very much like the Notifications interface.  When using HTTP, push messaging is enabled by parking requests for up to 30 seconds if no messages are waiting.

It is preferable to program against the local LOE.  Polling the local LOE for messages causes the local LOE to poll the cloud LOE, and if any participating devices are reachable, they are also polled if another push mechanism can’t be established.  It is important to be able to establish local P2P connections even if the cloud LOE isn’t reachable.  Care is needed to avoid round-robin message loops.

Messages would auto-generate a short MaxAge upon receipt by each LOE, similar to Activities.  It is probably most efficient to simply let Messages expire instead of explicitly deleting them.

In order to support multiple recipients of the same message, clients should be able to poll the queue with a nondestructive query string watermark.  An alternative might be to let each client create its own destructive NotificationQueue that receives full copies from the main message queue, but this assumes the 3-round-trips issue is solved.
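To make the nondestructive watermark idea concrete, here is a minimal in-memory sketch in Python.  Everything in it is hypothetical: the MessageQueue class, its method names, the integer watermarks, and the 30-second MaxAge are illustrations of the proposal, not part of the Live Framework.

```python
import itertools


class MessageQueue:
    """Hypothetical queue: each reader supplies its own watermark and reads delete nothing."""

    def __init__(self, max_age_seconds):
        self.max_age = max_age_seconds
        self._counter = itertools.count(1)
        self._messages = []  # (watermark, received_at, body)

    def post(self, body, now):
        watermark = next(self._counter)
        self._messages.append((watermark, now, body))
        return watermark

    def poll(self, after_watermark, now):
        """Return (watermark, body) pairs newer than after_watermark; expired messages are dropped."""
        self._messages = [m for m in self._messages if now - m[1] < self.max_age]
        return [(w, body) for (w, _, body) in self._messages if w > after_watermark]


queue = MessageQueue(max_age_seconds=30)
queue.post("left", now=0)
queue.post("right", now=1)

# Two clients read the same queue independently, each tracking its own watermark.
client_a = queue.poll(after_watermark=0, now=2)     # sees both messages
client_b = queue.poll(after_watermark=1, now=2)     # has already seen the first
later = queue.poll(after_watermark=0, now=30.5)     # "left" has expired by now
```

Because the watermark lives in the query rather than in the queue, any number of clients can read the same messages, and cleanup is handled entirely by expiry.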

This can be a soft state service with no need for additional infrastructure recovery features.

If it is necessary to impose the UDP equivalent of Twitter’s 140-character limit, that’s fine.  65,507 bytes seems like plenty.  Users can always just send links to large resources.  If expansions are supported, the receiver can optionally inline the large data on demand without necessarily transmitting it across the network if it already exists in the local LOE.
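For reference, the 65,507-byte figure is the largest UDP payload that fits in a single IPv4 packet.  The arithmetic, sketched in Python:

```python
MAX_IP_PACKET = 65_535   # largest value of the 16-bit total-length field in the IPv4 header
IPV4_HEADER = 20         # minimum IPv4 header size (no options)
UDP_HEADER = 8           # fixed UDP header size

max_udp_payload = MAX_IP_PACKET - IPV4_HEADER - UDP_HEADER
print(max_udp_payload)   # 65507
```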


Messages should be able to target one or more users and devices, and scope a broadcast by MeshObject/AppInstance.  This could be solved in a variety of ways.

You could scope all messages by MeshObject (MeshObjects/{id}/MessageQueue), with the option to subscope them further with links to Members and Mappings.  This is the option I prefer.  It doesn’t require directly addressing users and devices, it doesn’t require any changes to enable the “cloud device” to be addressed, and it should provide decent partitioning.  A downside is that it requires separate requests for separate MeshObjects.  However, if you program against a local LOE which multiplexes everything under the hood, this shouldn’t be a problem.

Or you could have one queue to rule them all at Mesh/MessageQueue.  Messages would have one or more links to users, devices, and/or MeshObjects.  Subscribers would poll the queue with the option of filtering on these links by query string.

Or you could add /MessageQueue to various scoping contexts such as Devices/{id} and {id}/Profiles.  This would require exposing the cloud LOE as a device entry.

I’m not sure if devices or mappings from other users are currently discoverable.  This would need to be addressed.

Device connectivity

I suspect there may already be a way to access existing device connectivity information, but if not, it would be very helpful to see which devices (including the cloud device) are reachable for real-time messaging.

Issues with just exposing Mesh’s P2P support

At the PDC I heard talk of exposing Mesh’s peer-to-peer channel to enable developers to establish connections between devices for streaming data.  Although this is a great solution for some scenarios (and I would like to have such a feature), it’s not so great for other scenarios.

Streaming in this context is a reliable, sessionful feature.  In order to get the lowest latency possible, I want unreliable, sessionless communication.

You can optimize for latency or for throughput, but not both.  Streaming is optimized for high throughput.  I want low latency (UDP, or disable TCP’s Nagle algorithm).

Bonus points: ad-hoc device discovery

This is a tangentially related issue that can be addressed independently.  Although apps can be written to select from a known list for pairing with other users and devices, I would love to have ad-hoc user/device pairing.

There are many, many ways to accomplish this.  I would like to use Microsoft Tag for pairing devices from unknown users and meshes.  One device displays a tag containing its mesh address, and the other device scans the tag.  The tag could be a printed tag, or displayed on a screen.  The tag exchange could be two-way for greater security.  For public kiosks, the tag could be changed every minute or generated on demand.

There are a whole host of additional issues raised by ad-hoc device pairing, but it sure sounds like the future to me.  Someone’s going to solve this problem in a generic way; please don’t let Apple get there first. ;-)


Hopefully the additional detail is a helpful starting point for discussion.  I am quite open to alternate solutions that can provide real-time P2P messaging.  As I noted earlier, if you like this idea, please visit this feature request on the Live Framework forum and vote it up.  If you have any comments, please post them on the forum thread.  Thanks!

Sunday, March 08, 2009

Exploring Live Framework Notifications

In order to tune in to the data that matters to you, Live Framework lets you receive near-real-time notifications of changes by subscribing to objects and feeds.  Although the SDK documentation is fairly sparse at the moment, Viraj Mody wrote an excellent blog post describing how the notification system works, and John Macintyre also gave a great PDC session on notifications.  Being the curious sort, I still had many questions, so I dug in further and this blog post and ResourceClient library are the result.


Notifications are exposed in the high-level programming model through the ChangeNotificationReceived event.  Under the hood, this is made possible by queues, subscriptions, and notifications working together.  These three building blocks can be accessed through the resource-oriented programming model, enabling even more interesting scenarios.  Even if you have no desire to program at this level, understanding how notifications work can help you take fuller advantage of the high-level model.

Slide 9 from John Macintyre’s presentation shows how the pieces tie together.


The typical usage pattern looks like this:

  • Client creates a queue
  • Client subscribes its queue to one or more resources or feeds
  • Client polls the queue’s Notifications feed
  • A resource or feed changes
  • Subscription Service posts a notification to each of the queues that are subscribed to the resource or feed
  • Client’s poll returns with one or more notifications
  • Client takes the watermark of the most recent notification and posts it to the queue
  • Client takes some action based on the notifications such as requesting the newest version of the changed resource or feed
  • Client polls the queue’s Notifications feed…

After the initial queue creation, a minimum of three round-trips are necessary to complete each cycle.  Later we will see a trick for doing this in two round-trips.
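The cycle above can be simulated in a few lines of Python.  This is only an in-memory model for following the steps; the class and method names are made up and no HTTP is involved.

```python
class NotificationQueue:
    def __init__(self):
        self.notifications = []   # (watermark, resource_uri), oldest first
        self._next = 1

    def add(self, resource_uri):
        self.notifications.append((self._next, resource_uri))
        self._next += 1

    def poll(self):
        return list(self.notifications)

    def put_watermark(self, watermark):
        # Destructive: drops everything at or below the watermark.
        self.notifications = [n for n in self.notifications if n[0] > watermark]


class SubscriptionService:
    def __init__(self):
        self.subscribers = {}     # resource_uri -> queues subscribed to it

    def subscribe(self, resource_uri, queue):
        self.subscribers.setdefault(resource_uri, []).append(queue)

    def resource_changed(self, resource_uri):
        for queue in self.subscribers.get(resource_uri, []):
            queue.add(resource_uri)


service = SubscriptionService()
queue = NotificationQueue()                     # client creates a queue
service.subscribe("Mesh/MeshObjects", queue)    # client subscribes the queue to a feed
service.resource_changed("Mesh/MeshObjects")    # the feed changes; a notification is posted
received = queue.poll()                         # round-trip 1: poll returns the notification
queue.put_watermark(received[-1][0])            # round-trip 2: send the latest watermark
# round-trip 3 would be a GET of the changed feed itself
remaining = queue.poll()
```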

Although the client has to poll the queue, this is actually more of a push than a pull.  The HTTP request stays “parked” at the server until a notification arrives or a timeout expires (just like the relay binding’s HTTP “parked requests” in the .NET Service Bus, formerly BizTalk Services).  This means that when a notification arrives, the server can immediately push it to the client on the HTTP response of the parked request, saving half a round-trip of latency.  Jeremy Mazner has more to say on the subject in this Channel9 thread.

So far I have described the behavior of the cloud LOE.  We will cover the client LOE’s slightly different behavior later.

Recovery from failures

In John’s diagram above, the Queue Service and Subscription Service are both in-memory soft state services.  This means it is possible for a Queue Service instance to go down resulting in queue loss (including all its notifications).  A Subscription Service instance can also go down, resulting in loss of subscriptions.  In both cases, the client will receive a specific error notification the next time it tries to poll the queue.

In the case of queue loss, it is the responsibility of the client to create a new queue and resubscribe to each resource.  In the case of subscription loss, the client is told “resources from these addresses lost subscriptions” and it just needs to resubscribe to each one using its existing queue.

As Viraj notes in his blog post,

A short summary of the solution is that in cases where one or several Queue and/or PubSub Servers go down, the system is able to detect exactly what happened and take remedial action to restore state in the cloud in cooperation with clients (because clients were the original source for all the transient data that was resident on those servers before they lost state).

What can you subscribe to?

From the high-level object model, only the following types are subscribable via ChangeNotificationReceived:

  • MeshObject
  • MeshDevice
  • LiveItemCollection (feeds)
    • Mesh.Devices
    • Mesh.MeshObjects
    • Mesh.News
    • MeshObject.DataFeeds
    • MeshObject.Mappings
    • MeshObject.Members
    • MeshObject.News
    • These may not work yet:
      • Contact.Profiles
      • LOE.Contacts
      • LOE.Profiles
      • Member.Profiles

If you use the resource-oriented programming model, you can subscribe to all of the resources behind the high-level objects as well as:

  • ApplicationResource
  • ApplicationInstanceResource

In addition to the high-level feeds, you can also subscribe to feeds for:

  • MeshObject.Activities
  • Applications
  • InstalledApplications

One quirk is that MeshObject isn’t subscribable from the local LOE, although you can still subscribe to the local MeshObjects feed.

It is useful to consider the MeshObject->DataFeed->DataEntry hierarchy in terms of what is and isn’t subscribable.

  • Mesh/MeshObjects/Subscriptions
  • Mesh/MeshObjects/{id}/Subscriptions
  • Mesh/MeshObjects/{id}/DataFeeds/Subscriptions
    • this is a “feed of feeds”
  • Mesh/MeshObjects/{id}/DataFeeds/{id}/Subscriptions
    • this doesn’t exist
  • Mesh/MeshObjects/{id}/DataFeeds/{id}/Entries/Subscriptions
  • Mesh/MeshObjects/{id}/DataFeeds/{id}/Entries/{id}/Subscriptions
    • this doesn’t exist

What notifications do you receive?

You would expect all resources and feeds to notify you when resources are created, updated, or deleted, but that isn’t the case.  Some resources and feeds don’t notify you when entries are updated, some aren’t subscribable, and some are read-only.  This varies between the cloud LOE and the local LOE and between various objects.  Here’s an incomplete listing of which notification triggers work based on my experimentation:

Resource          | Local LOE                                                    | Cloud LOE
MeshObject        | Not subscribable                                             | Update, Delete
MeshDevice        | Not subscribable                                             | Nothing? at least not Update
MeshObjects feed  | Create, Update, Delete                                       | Create, Delete (no Update)
Devices feed      | Can’t update locally                                         | Update, maybe others
DataFeeds         | Create, Update, Delete                                       | Create, Update, Delete
DataEntries       | Create, Update, Delete (with double notifications for each)  | Create, Update, Delete

I’m not sure why, but a subscription to DataEntries creates two notifications for each change on the local LOE.

How subscriptions work

As you have seen, you can subscribe to any resource or feed that has a Subscriptions URL (ex: Mesh/MeshObjects/{id}/Subscriptions).  This Subscriptions feed only supports HTTP POST (no GET).  Individual subscriptions only support PUT (no DELETE).

Subscriptions have the following interesting properties:

  • NotificationQueueLink
  • ExpirationDuration
  • ResourceEntityTag

NotificationQueueLink is the only thing you need to include when you create your subscription.  This link should point to the SelfLink of the NotificationQueue you have created, not its NotificationsLink.

ExpirationDuration is expressed in seconds and is assigned a random number between 2700 and 4500 (45 min to 75 min) from the cloud LOE and a fixed value of 3600 (60 min) from the local LOE.  The internal class NotificationManager that is used to implement ChangeNotificationReceived has a hard-coded subscriptionRenewalInterval of 60 minutes, so it seems there’s a chance of cloud subscriptions being in an unrenewed state for up to 15 minutes, but I could be wrong.
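The 15-minute figure falls out of simple arithmetic, sketched here in Python.  The function name is mine, and the last line additionally assumes the cloud picks expirations uniformly across the range, which I haven’t verified.

```python
RENEWAL_INTERVAL = 60 * 60              # subscriptionRenewalInterval, in seconds
CLOUD_EXPIRATION_RANGE = (2700, 4500)   # ExpirationDuration range seen on the cloud LOE


def unrenewed_gap(expiration_seconds, renewal_interval=RENEWAL_INTERVAL):
    """Seconds a subscription sits expired before the next renewal (0 if renewed in time)."""
    return max(0, renewal_interval - expiration_seconds)


worst_case = unrenewed_gap(CLOUD_EXPIRATION_RANGE[0])   # shortest expiration: 900 s = 15 min
best_case = unrenewed_gap(CLOUD_EXPIRATION_RANGE[1])    # longest expiration: no gap

# Assumption: if expirations are uniform over the range, this is the share that lapse early.
low, high = CLOUD_EXPIRATION_RANGE
at_risk_fraction = (RENEWAL_INTERVAL - low) / (high - low)
```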

ResourceEntityTag is the ETag of the resource when the subscription was created.  I believe a notification is sent each time the resource’s ETag changes.

Note that the subscription doesn’t have a link to the resource or feed you subscribed to.  It is important for you to maintain your own copy of the URI to the resource you subscribed to.  First, if you want to imitate ChangeNotificationReceived and execute different event handlers for different subscriptions, you will need this URI to demux from notifications to event handlers.  Second, you will need this URI to create new subscriptions when you receive an AllSubscriptionsLost notification.

Clients should track their subscriptions for a variety of reasons:

  • You can’t GET the subscription feed
  • You can’t GET a subscription via its SelfLink
  • For subscription renewal
  • To recover from queue or subscription loss
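One way to do this bookkeeping is a small client-side tracker keyed by resource URI, sketched below in Python.  The SubscriptionTracker class is hypothetical, and the `subscribe` callable stands in for whatever code actually POSTs a subscription.

```python
class SubscriptionTracker:
    """Client-side record of subscriptions, since the server won't list them for us."""

    def __init__(self, subscribe):
        self._subscribe = subscribe   # callable(resource_uri, queue_uri) that creates a subscription
        self.handlers = {}            # resource_uri -> handler to call on change

    def add(self, resource_uri, queue_uri, handler):
        self.handlers[resource_uri] = handler
        self._subscribe(resource_uri, queue_uri)

    def dispatch(self, notification_type, resource_uri):
        # Demux: route a ResourceChanged notification to the handler for that resource.
        if notification_type == "ResourceChanged" and resource_uri in self.handlers:
            self.handlers[resource_uri](resource_uri)

    def resubscribe_all(self, new_queue_uri):
        """After AllSubscriptionsLost: re-create every subscription against a fresh queue."""
        for resource_uri in self.handlers:
            self._subscribe(resource_uri, new_queue_uri)


posted = []    # records the subscriptions a real client would POST
changed = []   # records which resources triggered a handler
tracker = SubscriptionTracker(lambda resource, queue: posted.append((resource, queue)))
tracker.add("Mesh/MeshObjects", "queues/1", changed.append)
tracker.add("Mesh/Devices", "queues/1", changed.append)
tracker.dispatch("ResourceChanged", "Mesh/Devices")
tracker.resubscribe_all("queues/2")
```

The same table serves renewal as well: a timer can walk `handlers` and re-POST each subscription before it expires.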

How notification queues work

Notification queues are created by posting to Mesh/NotificationQueues.  If using AtomPub, you can simply post an empty entry:

<entry xmlns="http://www.w3.org/2005/Atom"/>

You can’t GET Mesh/NotificationQueues to see which queues exist, so you will need to maintain a reference to the queue you get back.

Queues are intended to only have a single consumer (one queue per client), and that consumer’s queue usage should effectively be single-threaded.

There are a few important properties on queues:

  • ExpirationDuration
  • Watermark
  • SelfLink
  • NotificationsLink

ExpirationDuration is expressed in seconds and is 300 (5 min) from the cloud LOE and 600 (10 min) from the local LOE.  ExpirationDuration is the duration after which the notification queue expires if its Notifications feed isn’t polled.

Watermark is used together with PUT to remove all notifications in the queue with watermarks less than or equal to the watermark you sent.  Updating a watermark is destructive.  You can’t PUT an earlier watermark to roll back the queue, and that’s fine with me.

SelfLink is the URI you use for a new subscription’s NotificationQueueLink.  You also PUT to the SelfLink URI when updating the queue’s watermark.  SelfLink is PUT only.  You can’t GET or DELETE a queue.

NotificationsLink is the feed where new notifications appear.  This feed is not your typical feed.  As I mentioned earlier, polling the NotificationsLink “parks” the HTTP request on the server until a notification appears or a timeout expires.  This timeout varies between 25 and 30 seconds for the cloud LOE and is hopefully low enough so that intermediary proxies don’t prematurely close the connection.  Requests will return immediately if the queue already contains notifications.  Unfortunately, requests to the NotificationsLink on the local LOE always return immediately whether or not the queue is empty.  This is a significant problem not only because the programming model is inconsistent, but because it means that in order to get the same low latency as the cloud LOE you have to hammer the snot out of the local LOE, pegging the CPU in the process.  Hopefully this gets fixed.  The SDK sidesteps this issue by only polling every 5 seconds, but that loses much of the benefit of push notifications.
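The “parked request” behavior is easy to model with a blocking queue.  The Python sketch below is an analogy only: `parked_poll` is a made-up name, and the timeouts are shortened from the 25 to 30 seconds the cloud LOE uses so that it runs quickly.

```python
import queue
import threading
import time


def parked_poll(notifications, timeout_seconds):
    """Return a waiting notification at once, or block until one arrives or the timeout expires."""
    try:
        return notifications.get(timeout=timeout_seconds)
    except queue.Empty:
        return None   # the parked request timed out with nothing to deliver


notifications = queue.Queue()

# Nothing arrives: the poll comes back empty-handed after the timeout.
timed_out = parked_poll(notifications, timeout_seconds=0.2)

# A notification arrives while the request is parked: the poll returns as soon as it lands.
threading.Timer(0.1, notifications.put, args=["ResourceChanged"]).start()
started = time.monotonic()
pushed = parked_poll(notifications, timeout_seconds=5)
waited = time.monotonic() - started
```

A local LOE that returns immediately is the equivalent of calling this with a timeout of zero in a loop, which is where the CPU cost comes from.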

The notifications feed behaves strangely when it contains more than one notification.  Sometimes it displays all of the notifications (up to 10), and sometimes it only displays the first one.  For example, you can poll the queue and see 4 notifications.  You can poll it again and perhaps only see one.  Polling a third time might show all 4 again.  At least it always displays notifications in order (oldest first).  This means you shouldn’t count on seeing more than one notification at once, but be aware that it is possible.  Simply act on all of the notifications you receive, PUT the watermark of the last one, and poll the queue again.  Using this technique you will eventually see all of the notifications in the queue, even if it appears there is only one or if the queue appears to be clipping the results at 10 notifications.
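Here is that technique as a Python sketch against a stand-in feed.  FlakyFeed is invented for illustration: it deterministically alternates between showing up to 10 entries and showing only one, which is just a rough imitation of the behavior I observed.

```python
class FlakyFeed:
    """Stand-in for a notifications feed that sometimes shows only the oldest entry."""

    def __init__(self, watermarks):
        self.pending = list(watermarks)   # oldest first
        self._polls = 0

    def poll(self):
        self._polls += 1
        limit = 1 if self._polls % 2 == 0 else 10   # every other poll shows a single entry
        return self.pending[:limit]

    def put_watermark(self, watermark):
        self.pending = [w for w in self.pending if w > watermark]


def drain(feed):
    seen = []
    while True:
        batch = feed.poll()
        if not batch:
            return seen                   # a real poll would park here instead of returning
        seen.extend(batch)                # act on every notification we were shown
        feed.put_watermark(batch[-1])     # acknowledge the last one, then poll again


seen = drain(FlakyFeed(range(1, 26)))
```

Every notification is seen exactly once and in order, regardless of how many the feed chooses to show per poll.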

Queues are semi-private.  They can’t be seen by other user accounts and can’t be shared.  Since you can’t enumerate queues with GET and their URLs are randomly generated, the only way another app or client within the same user account can see your queue is if you choose to share the queue’s URI, although I can’t think of any good reason to do that other than for testing.

Queue loss

Queues can be lost either by failing to poll them within their ExpirationDuration or by the unexpected death of a queue manager in the LOE.  You will find out the queue is gone when you poll the queue’s notifications feed and receive an AllSubscriptionsLost notification.

You might be wondering, if the queue is gone, then how can I poll its notifications feed?  It turns out that if a queue doesn’t exist at a particular URI, a new “dead” queue will be automatically created at that URI.  You can see this by connecting to the cloud LOE with the LiveFX Resource Browser and visiting the following URI:


Note that this only works on the cloud LOE.  The local LOE will return an error.  Apparently you shouldn’t expect to lose a queue on the local LOE.

A weird bit of trivia is that you can create two different queues with identical URIs in two different user accounts.  I’m guessing this is because queue managers are allocated per-user.

Another weird bit of trivia is that the AllSubscriptionsLost notification has a watermark that increments each time a subscription attempts (and fails) to post a notification to the queue.  This is one reason why the name AllSubscriptionsLost is misleading and ought to be renamed to something like QueueLost.  The subscriptions are most certainly still alive.

How notifications work

You can’t create notifications directly.  In other words, you can’t POST a new notification to a queue’s Notifications feed.  Notifications are only created by subscriptions (and queue or subscription loss).

Notifications have the following interesting properties:

  • NotificationType
  • ResourceLink
  • Watermark

NotificationType can be one of the following:

  • ResourceChanged
  • SubscriptionLost
  • AllSubscriptionsLost
  • System

ResourceChanged is what you will see as the result of subscribing to a resource.  SubscriptionLost is received when a Subscription Service unexpectedly dies on the cloud LOE.  SubscriptionLost is not received when a subscription expires due to its ExpirationDuration reaching zero.  As mentioned earlier, AllSubscriptionsLost would probably be better named QueueLost.  I’m not sure if System is used anywhere in the public API.  Perhaps it’s used for device connectivity.

ResourceLink points to the feed or object that changed, or in the case of SubscriptionLost, I believe it points to the resource whose subscription was lost.

Watermark is a counter string that increases with each new entry.  On the cloud LOE these look like “1.248.0”, “2.248.0”, “3.248.0”.  On the client LOE they are simply “1”, “2”, “3”.

Notification SelfLinks are incrementing integers on the cloud LOE and GUID-like strings on the local LOE.  You can’t GET a notification using its SelfLink.  You can only see notifications by polling the notifications feed.  You also can’t do queries on notification feeds.  If you try to use something like $skip or $top, you will get an AllSubscriptionsLost notification and every notification in the queue gets discarded without you ever seeing them.  However, the subscriptions still work and the queue continues to function normally afterward (other than the data loss).

Notifications and expansions

A cool trick for saving a round-trip is to use expansions to return changed resources and feeds inline in the notification results.  A not-so-cool potential side-effect is that if the notifications feed returns more than one notification for the same resource or feed, the expansion will result in duplicate expanded data, wasting bandwidth.

If you could combine watermark updates with the next poll request, you could get the poll-watermark-update cycle from 3 round-trips down to just 1 round-trip.

Notifications and activities

When you combine notifications, expansions, activities, and the cloud LOE, this enables near-real-time messaging between clients.  By subscribing to an Activities feed, clients can poll the Notifications feed using $expand and receive complete activity entries as soon as someone posts a new activity.  The latency in this scenario is half a round-trip to post the activity plus half a round-trip to receive the activity through your parked HTTP request to the notifications feed.  Since notifications, subscriptions, and activities all use in-memory stores, this should have quite good performance.  There are a number of issues with this technique that I’ll cover in a future blog post, but it is promising for near-real-time communications.

Sync bypasses update notifications

Sync appears to bypass notification of updated feed entries.  Specifically, if I subscribe to the MeshObjects feed on the local LOE, I am notified when sync from the cloud causes a MeshObject to be added or removed from the feed, but I am not notified if sync causes a MeshObject to be updated.  I haven’t experimented with other feed types, but the same issue might exist with feeds such as DataEntries.

Trigger support

You can use Create and Update triggers on subscriptions and queues.  Delete triggers aren’t persisted and therefore won’t work.  You might use this to POST a queue and create subscriptions for it in its PostCreateTrigger, saving a round trip.  Of course these subscriptions wouldn’t be tracked and managed by the SDK.  You could also create a MeshObject and add a subscription to it in the MeshObject’s PostCreateTrigger.

Updating resources and feeds through Resource Scripts or triggers doesn’t bypass notifications.

Client vs. Cloud

The client LOE waits for data to sync to it from the cloud before notifying you of any changes.  This can take a while, so if you want quick notifications, be sure to subscribe to the cloud LOE, not the local LOE.  It would sure be nice if local subscriptions caused the local LOE to subscribe to the same resource in the cloud (if connected), taking advantage of push notifications to achieve the same latency for local subscriptions as if you were subscribed directly to the cloud LOE.

It is probably stating the obvious at this point, but I think it’s worth noting that queues are one-way from cloud to client.

Notification delays

Although notifications are normally posted to queues immediately, it is possible for notifications to be delayed if there is a large backlog generated by lots of rapid updates to a subscribed resource or feed.  You might see a few updates trickle in, then 15 seconds later another batch of updates appears, and so on, for several minutes.

Device connectivity

Supposedly device connectivity uses the subscription and notification services under the hood as a signal channel for P2P session establishment.  It may be possible to see this in action and even participate in the process, but I haven’t explored this.  There must be a reason you can subscribe directly to individual MeshDevices, but I haven’t seen anything interesting pop up yet.  Check out George Moore’s description of P2P notifications and file sync for a fascinating scenario that I’m not sure is possible with the current CTP.

Other transports

Supposedly there is a TCP transport for receiving push notifications, but it doesn’t appear to be used or available at the moment.  This could be useful for chaining subscriptions from the local LOE to the cloud LOE (or other clients) while preserving an HTTP programming experience for developers.

The high-level programming model

If you have managed to read this far, you should now have a healthy appreciation for the services provided by the high-level programming model.  At this level, all we see is the ChangeNotificationReceived event on feeds, MeshObject, and MeshDevice.  You simply subscribe to this event on the appropriate object and your event handler will be called when entries are created (on feeds only of course), updated, and deleted.

Here are some of the gory details it takes care of for you:

  • Queue creation
  • Queue polling
  • Updating the watermark
  • Subscription creation
  • Subscription renewal
  • Recovering from queue loss
  • Recovering from subscription loss
  • Demuxing from notifications to event handlers

Let’s dig into that last one a bit deeper.  If you use Reflector to examine how ChangeNotificationReceived works, you will see that each object subscribes to receive any and all notifications that arrive on the queue.  When these notifications are received, each object iterates through all notification entries, checking to see if the notification’s ResourceLink matches its own SelfLink.  If there is a match, the object raises the ChangeNotificationReceived event and stops iterating the notification list, effectively filtering out multiple notifications for itself that might have been received in a single response.  It appears that if the client’s LiveOperatingEnvironment is configured with AutoLoadRelationships, the object will then be reloaded, after the event is raised, meaning that if you examine the object in your event handler, it may not contain the latest changes.  This is reported in the forums here and here and is supposed to be fixed in the next release.  Those threads also mention a performance issue with notifications when using the Silverlight library that will be fixed in the next release.
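The matching logic described above boils down to something like the following Python sketch.  It is a paraphrase of the behavior, not the SDK’s code; WatchedObject and its members are invented names.

```python
class WatchedObject:
    def __init__(self, self_link):
        self.self_link = self_link
        self.events_raised = 0

    def on_notifications(self, resource_links):
        # Every object sees every notification and looks for its own SelfLink.
        for resource_link in resource_links:
            if resource_link == self.self_link:
                self.events_raised += 1   # raise ChangeNotificationReceived once...
                break                     # ...and ignore further matches in this response


mesh_object = WatchedObject("Mesh/MeshObjects/A")
feed = WatchedObject("Mesh/MeshObjects")

# One poll response carrying two notifications for the same object and one for the feed.
response = ["Mesh/MeshObjects/A", "Mesh/MeshObjects", "Mesh/MeshObjects/A"]
for obj in (mesh_object, feed):
    obj.on_notifications(response)
```

Note that the cost is proportional to objects times notifications, since each subscribed object scans the whole response.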

Programming at the Resource level

As part of my explorations, I wrote a library called ResourceClient that makes it easier to work directly with Resources such as queues, subscriptions, and notifications.  Here’s a brief example of the syntax it enables.

using (new ResourceClientContext(username, password))
{
    var queue = Uris.Cloud.NotificationQueues.Post(new NotificationQueueResource());
    queue.StartAutoPoll((notifications, context) =>
    {
        if (notifications.Entries.Count() > 0)
            Console.WriteLine(notifications); // assumed: print the batch (the exact call is not shown)
    });
    var mo = Uris.Cloud.MeshObjects.Post(new MeshObjectResource("My Object"));
    var feed = mo.DataFeedsLink.Post(new DataFeedResource("My Feed"));
}

This code writes Atom-formatted notifications to the console for the new MeshObject and the new DataFeed.  I will cover the details of this library in another blog post.  For now I will just mention that in addition to being a generic resource-oriented API, it contains notification-specific helpers for automatic queue polling, manual polling, sending watermarks, subscribing, and dispatching to different event handlers for each subscription.

MeshNotificationPlayground sample app

I have written a little WPF app that demonstrates queues, subscriptions, notifications, and activities in action.


You can create a queue, poll it, copy the queue’s URL to the clipboard for pasting into Resource Browser, select a notification and send its watermark, select a MeshObject and subscribe to its Activities feed, and create new Activities for the selected object.

Here are some things to try:

  • Poll an empty queue and see that the request returns after 25 to 30 seconds
  • Poll the queue with the WPF app and Resource Browser at the same time and notice that they both wait; then cause a new notification and see that both requests return immediately
  • Subscribe to multiple Activities feeds, create activities in each of them, and notice the resulting notifications have different ResourceLinks
  • Send a watermark that isn’t the last watermark and see that the queue only empties up to the watermark you sent
  • Wait 5 minutes and poll the queue to see an AllSubscriptionsLost notification
  • After receiving AllSubscriptionsLost, create more activities for subscribed feeds and see that AllSubscriptionsLost’s watermark increases each time
  • Click Refresh Activities List and watch MaxAge count down for each activity.  Notice the random MaxAges.  When a MaxAge reaches zero, the activity disappears.  See that this causes a new notification.
  • Create more than 10 notifications and see that no more than 10 are returned.  Send the latest watermark, poll again, and see that the remaining notifications appear.
  • Poll the queue repeatedly when it has multiple notifications and see that sometimes only 1 notification appears.
  • Click Create Activity many times in a row and see that notifications for that Activities feed continue to trickle in several minutes later.
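
The paging and watermark behavior in the list above can be modeled with a small in-memory simulation.  This is only a sketch of what I observed, not the service's implementation: SimulatedQueue is a made-up class, and it assumes watermarks are increasing integers, which is a simplification.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory model of a notification queue.
public sealed class SimulatedQueue
{
    public const int MaxPerPoll = 10; // cap observed when polling a full queue

    // Pending watermarks in ascending order (assumption: watermarks are increasing integers).
    private readonly List<long> _pending = new List<long>();
    private long _next = 1;

    // Adds a notification and returns its watermark.
    public long Enqueue()
    {
        _pending.Add(_next);
        return _next++;
    }

    // Polling removes nothing; it returns the oldest entries, capped at MaxPerPoll.
    public IReadOnlyList<long> Poll()
    {
        return _pending.Take(MaxPerPoll).ToList();
    }

    // Sending a watermark discards every entry at or below it and nothing newer.
    public void SendWatermark(long watermark)
    {
        _pending.RemoveAll(w => w <= watermark);
    }
}
```

With 12 notifications queued, the first poll returns 10.  Sending the tenth watermark and polling again returns the remaining two, and sending an earlier watermark would have emptied the queue only up to that point.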

You can download the code here.  It includes and uses the ResourceClient library.

Comparison to iPhone’s Push Notifications

If for some reason your eyes haven’t glazed over yet, check out Viraj’s analysis of the iPhone Push Notification Service.  If you read between the lines, this is a fascinating compare-and-contrast to Live Mesh’s notification solution, with Live Mesh being better in many ways.  It hints that Microsoft might imitate Apple and use notifications to mine usage metrics for apps.

Apple’s solution is slightly more efficient in that it creates a single channel from the cloud to each device, whereas Live Framework currently requires each app to establish its own channel.  This could be solved by having apps subscribe to the local LOE using local queues, which would then transparently chain the subscriptions and queues through a single channel to the cloud LOE.
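
Here is a rough sketch of what that chaining could look like inside a local hub.  Everything in it is hypothetical: LocalHub is not part of the Live Framework, the cloud side is reduced to a single callback, and payloads are plain strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical local hub: many app queues, one upstream subscription per resource.
public sealed class LocalHub
{
    // Resource link -> local app queues interested in that resource.
    private readonly Dictionary<string, List<Queue<string>>> _subscribers =
        new Dictionary<string, List<Queue<string>>>();

    // Number of distinct resources, i.e. subscriptions the hub would need in the cloud.
    public int UpstreamSubscriptionCount
    {
        get { return _subscribers.Count; }
    }

    // An app subscribes locally and gets its own queue back.
    public Queue<string> Subscribe(string resourceLink)
    {
        List<Queue<string>> queues;
        if (!_subscribers.TryGetValue(resourceLink, out queues))
        {
            // A real hub would create the single cloud subscription here.
            queues = new List<Queue<string>>();
            _subscribers[resourceLink] = queues;
        }
        var local = new Queue<string>();
        queues.Add(local);
        return local;
    }

    // Called when the one cloud channel delivers a notification.
    public void OnCloudNotification(string resourceLink, string payload)
    {
        List<Queue<string>> queues;
        if (_subscribers.TryGetValue(resourceLink, out queues))
        {
            foreach (var queue in queues) queue.Enqueue(payload);
        }
    }
}
```

Two apps subscribing to the same feed would share one upstream subscription, and each would still see the notification in its own local queue.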


Hopefully this helps clarify how you should expect notifications to behave in your apps, and gives you ideas for more creative uses of the building block features of queues, notifications, and subscriptions.  I plan to follow up with more details on my ResourceClient library, and write up a feature request that combines the best parts of notifications and activities to enable better real-time communication between clients.