The Case for Reliable Messaging
The success of an interaction between a service consumer and a service provider clearly depends on how reliable the underlying transport protocol is. However, when it comes to networks, we know that even in the most reliable setups, all sorts of things can go wrong.
Let’s consider some scenarios of common problems that might occur during communication between a client and a service.
Let’s start with the simplest scenario: a client tries to send a message to a service. However, during transmission, something goes wrong on the wire, and the message cannot be delivered.
In this case, there should be a way for the client to be notified about the transmission failure, so that it can send the message again.
In the second scenario, the message is actually received by the service. However, before the service can acknowledge receipt to the client, there is again a problem in the network, and the client does not receive the acknowledgment. In this case, the client risks sending the same message again, which results in message duplication on the service side.
So, there should be a way either to detect and ignore these duplicate messages on the service side, or to eventually get the acknowledgment to the client so that it knows not to send the message again.
A more complicated scenario arises when we consider the transmission of a sequence of related messages. In this type of scenario, the client sends the service a series of messages that, for some business reason, the service must process in the same order in which the client sent them.
For example, assume the client sends Message 1 of the sequence, which is received by the service. Now the client sends Message 2, which is delayed because of network latency on a specific route. The client, unaware of this delay, then sends Message 3, which is received by the service. Moments later, the latency on that route is resolved, and Message 2 is received by the service.
The problem here is that, from the perspective of the service, it has received all the messages, but it cannot tell whether the order is correct, because it has no way of knowing that Message 2, although delivered after Message 3, was sent before it. So, there should be a way for the service to know exactly what order the client intended.
Of course, keep in mind that what I showed you here is a simplification of what would happen in the real world. The service would actually be communicating with multiple instances of the same client, so not only does the service have to detect the correct sequence of messages, it also has to correlate each sequence with the correct client instance.
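To make the ordering and correlation problem concrete, here is a minimal Python sketch. It is not how any real stack works; it only assumes that every message carries a client identifier and a message number (both hypothetical fields chosen for illustration), and shows that with those two pieces of information the service can recover the order each client intended.

```python
from collections import defaultdict

def intended_order(arrivals):
    """Group arrivals by client, then sort each group by message number."""
    per_client = defaultdict(list)
    for client_id, number, body in arrivals:
        per_client[client_id].append((number, body))
    return {
        client: [body for _, body in sorted(messages)]
        for client, messages in per_client.items()
    }

# Arrival order as seen by the service: client A's Message 2 arrives late,
# and messages from client B are interleaved with those from client A.
arrivals = [
    ("A", 1, "A-first"),
    ("B", 1, "B-first"),
    ("A", 3, "A-third"),
    ("A", 2, "A-second"),
    ("B", 2, "B-second"),
]

ordered = intended_order(arrivals)
print(ordered["A"])
print(ordered["B"])
```

Without the number, the service would only know the arrival order; without the client identifier, it could not tell the two interleaved sequences apart.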
A Reliable Protocol is Needed
These scenarios highlight the importance of having a protocol that assures reliable message transmission.
But wait: SOAP messages are most commonly transferred over HTTP, which itself uses TCP as a transport protocol. So:
- Isn’t TCP a reliable transport protocol?
- And isn’t it enough to provide the required reliable transmission?
Well, the answer to the first question is yes; the answer to the second is that it depends on the case. Let’s discuss this in detail in the next section.
In the OSI model, TCP functions in a layer below HTTP. TCP is a transport protocol, while HTTP is an application protocol. TCP works at the packet level: its job is to make sure network packets are delivered reliably and in the required order.
However, SOAP-based services need a reliable messaging mechanism at the SOAP message level, not at the network packet level offered by TCP. There are three core differences between WS-RM reliability and TCP reliability:
- Although SOAP is most commonly used on top of HTTP, HTTP is not mandatory. WS-RM provides reliability in a transport-neutral manner, not just for TCP.
- Using WS-RM, we can achieve reliability even if we are not using a single TCP connection (for example, when keep-alive is off). Multiple messages sent over different TCP connections are still part of the same session.
- TCP reliability works only between the current node and the immediate next node in the transport (that is, between the two endpoints, or sockets, of a TCP connection). WS-RM provides end-to-end reliability across multiple nodes. After all, as we will see later, the reliability information is part of the message itself, so it travels with the message to its final destination.
WS-ReliableMessaging (WS-RM) Specification Overview
End-to-End Transport Independence
The previous two sections have provided the right build-up for you to understand the need for WS-ReliableMessaging. This standard provides reliable messaging at the SOAP message level, regardless of the protocol used.
This means that even if a message goes through multiple intermediaries en route to its final destination, it does not matter if it travels over multiple protocols, because reliability is achieved at the message level.
For example, a message can be carried over HTTP on the first hop, then over SMTP on the second hop, then over any other SOAP transport binding, such as FTP, on the final hop. Reliability is built into the SOAP message, and it doesn’t matter what the underlying transport protocols are.
The participants in RM are logically grouped as follows.
- The Application Source represents the service consumer.
- The Application Destination represents the service provider.
- The Reliable Messaging Source and Reliable Messaging Destination represent the layers that take care of RM processing. These layers are implemented by each vendor wishing to add support for RM; later I will discuss how WCF implements these layers through the channel stack.
WS-ReliableMessaging, acting in the Reliable Messaging Source and Reliable Messaging Destination layers, guarantees that:
- A message sent from the Application Source to the Application Destination is able to survive connectivity issues
- The Application Destination notifies the Application Source about message arrival, so that the source does not send duplicate messages
- Even if a duplicate message is sent (for example, because the acknowledgment was delayed or never received), the duplicate is discarded
- Finally, messages are processed at the Application Destination in the same order in which they were sent by the Application Source, even if they were not received in that order
Later I will explain how these guarantees are built into WCF.
The key to the functioning of WS-RM is the concept of a sequence. All interactions between a client and a service take place within this sequence.
Through an example, let’s see how the sequence works, and the message types that make up the Reliable Messaging standard.
- A client wishing to engage in a reliable messaging interaction with a service starts the interaction by sending a CreateSequence request message
- The service replies with a CreateSequenceResponse message, which contains a new sequence identifier
- The client now has the sequence id that will be used to transmit a series of messages.
- So, the client sends the first message of the sequence. The message has a sequence number of 1
- Now, before receiving any acknowledgment from the service, the client sends the second message of the sequence. In this case, however, the message is never delivered
- The client goes ahead and sends the third message of the sequence. Inside this message, the client informs the service that this is the final message in the sequence
- At this stage, the service has received the first and the third messages, but it has not received the second one. So, the service sends the client a SequenceAcknowledgement message confirming that it received messages 1 and 3
- The client now knows that message 2 was not delivered to the service, because its number was not part of the acknowledgment. So the client resends message 2 and asks the service to send an acknowledgment once the message is received
- The service receives the message and sends a new SequenceAcknowledgement message containing the numbers of all 3 messages.
- The client is now assured that all 3 messages of the sequence have been delivered, so it sends a TerminateSequence message
Of course, because all messages have been assigned a sequence number, the service knows to process the messages in the order originally intended by the client, even though in this case the 3rd message was delivered before the 2nd.
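The exchange above can be simulated in a few lines of Python. This is only a toy model under stated assumptions: the class and function names are hypothetical, the "network" is a direct method call, and a lost message is modelled by simply not calling the destination on the first attempt. It is not the WS-RM wire format.

```python
import uuid

class Destination:
    """Toy RM destination: remembers received message numbers per sequence."""

    def __init__(self):
        self.sequences = {}

    def create_sequence(self):
        # Corresponds to CreateSequence / CreateSequenceResponse.
        sequence_id = f"urn:uuid:{uuid.uuid4()}"
        self.sequences[sequence_id] = {}
        return sequence_id

    def receive(self, sequence_id, number, body):
        self.sequences[sequence_id][number] = body

    def acknowledge(self, sequence_id):
        # Corresponds to a sequence acknowledgment: which numbers arrived.
        return sorted(self.sequences[sequence_id])

    def terminate(self, sequence_id):
        # Corresponds to TerminateSequence; returns bodies in intended order.
        received = self.sequences.pop(sequence_id)
        return [received[n] for n in sorted(received)]


def run_sequence(destination, bodies, lost_on_first_try):
    sequence_id = destination.create_sequence()
    outbox = dict(enumerate(bodies, start=1))  # message number -> body

    # First pass: some messages are lost on the wire.
    for number, body in outbox.items():
        if number not in lost_on_first_try:
            destination.receive(sequence_id, number, body)
    first_ack = destination.acknowledge(sequence_id)

    # The client resends whatever was not acknowledged.
    for number, body in outbox.items():
        if number not in first_ack:
            destination.receive(sequence_id, number, body)
    final_ack = destination.acknowledge(sequence_id)

    processed = destination.terminate(sequence_id)
    return first_ack, final_ack, processed


first_ack, final_ack, processed = run_sequence(
    Destination(), ["first", "second", "third"], lost_on_first_try={2}
)
print(first_ack, final_ack, processed)
```

The first acknowledgment lists only messages 1 and 3, which is how the client learns that message 2 must be resent.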
CreateSequence / CreateSequenceResponse
In this display, you can see a request message with the Action header set to CreateSequence. This is the request by which the client asks the service to initialize a new sequence and return a sequence identifier.
In the response, the service replies with a CreateSequenceResponse message. Inside this message, you can see the randomly generated sequence identifier. From this moment on, the client uses this identifier when sending messages that are part of the same sequence.
Sequence / MessageNumber
This display shows the client calling the service operation GetData. The important thing to notice is the Sequence header, which indicates that this message is part of a WS-ReliableMessaging sequence.
The MessageNumber is 1, which tells the service that this is the first message of the sequence. Recall that this is important so that the service knows in which order to process messages if they are not received in the original order.
Finally, the Identifier is the sequence identifier generated by the service in the previous step.
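As a rough illustration of the header just described, the following Python sketch builds a Sequence element with an Identifier and a MessageNumber using the standard library. It assumes the February 2005 version of the specification for the namespace URI, and the identifier value is a made-up placeholder rather than one taken from a real trace.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the February 2005 WS-ReliableMessaging version.
WSRM_NS = "http://schemas.xmlsoap.org/ws/2005/02/rm"

def build_sequence_header(sequence_id, message_number):
    """Build a Sequence header element carrying Identifier and MessageNumber."""
    ET.register_namespace("r", WSRM_NS)
    sequence = ET.Element(f"{{{WSRM_NS}}}Sequence")
    ET.SubElement(sequence, f"{{{WSRM_NS}}}Identifier").text = sequence_id
    ET.SubElement(sequence, f"{{{WSRM_NS}}}MessageNumber").text = str(message_number)
    return sequence

# Hypothetical identifier; a real one is generated by the service.
header = build_sequence_header("urn:uuid:hypothetical-sequence-id", 1)
xml_text = ET.tostring(header, encoding="unicode")
print(xml_text)
```

In a real message this element sits inside the SOAP header next to the addressing headers; the sketch leaves the envelope out.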
This display shows a sequence acknowledgment sent from the service to the client. Remember that the acknowledgment is the mechanism by which the service tells the client which messages were successfully received.
The AcknowledgementRange element sets the range of the messages received. In this case, the service is acknowledging messages 1 and 2.
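To see how acknowledgment ranges relate to individual message numbers, here is a small Python sketch. The function names are hypothetical; the sketch only assumes that a range is a (lower, upper) pair of inclusive message numbers.

```python
def to_ack_ranges(received_numbers):
    """Collapse received message numbers into inclusive (lower, upper) ranges."""
    ranges = []
    for number in sorted(set(received_numbers)):
        if ranges and number == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], number)  # extend the current range
        else:
            ranges.append((number, number))       # start a new range
    return ranges

def missing_numbers(ranges, highest_sent):
    """Message numbers up to highest_sent that no range covers."""
    acked = {n for lower, upper in ranges for n in range(lower, upper + 1)}
    return [n for n in range(1, highest_sent + 1) if n not in acked]

print(to_ack_ranges([1, 2]))
print(to_ack_ranges([1, 3]))
print(missing_numbers(to_ack_ranges([1, 3]), highest_sent=3))
```

Messages 1 and 2 collapse into a single range, while a gap (such as a lost message 2) shows up as two separate ranges, which is what lets the client work out what to resend.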
This message from the client to the service has the Action header set to LastMessage, which tells the service that this is the last message in the sequence.
Finally, this is the TerminateSequence message, with which the client tells the service that it has completed transmission and that the sequence can be terminated.
Reliable Messaging in WCF
The WCF Runtime
As I explained before, reliable messaging participants are logically divided into two groups:
- The first is either an Application Source, representing a client, or an Application Destination, representing a service.
- The second is either a Reliable Messaging Source, which takes care of reliable messaging processing for the client, or a Reliable Messaging Destination, which takes care of reliable messaging processing for the service.
I also explained previously that the WCF runtime is divided into:
- a service layer, which developers typically code against, and which is known as the proxy on the client side and the dispatcher on the service side
- a channel layer, which contains a set of binding elements. One of these binding elements is the ReliableSessionBindingElement
The Application Source or Destination maps to the WCF service layer, while the Reliable Messaging Source or Destination maps to the ReliableSessionBindingElement.
Now let’s see an illustration of how the whole interaction is implemented in WCF.
- The client service layer sends a message down the channel stack. The RM channel picks up the request.
- The channel holds the message in an internal cache. The reason it stores the message in the cache rather than sending it to the service right away is that it first needs to initiate a new sequence.
- So the client RM channel sends a CreateSequence request to the service RM channel.
- The service RM channel sends back the CreateSequenceResponse with the sequence id.
- Now the client channel pulls the message from the cache and sends it to the service channel.
- Assuming a Request/Response operation, the message is first passed to the service layer, where the request is processed and a response is generated. The response is then handed to the channel, which embeds a sequence acknowledgment as part of the response message. The response is then sent to the client channel, which hands it over to the client service layer.
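The flow above can be sketched as a pair of toy channel classes in Python. This is a conceptual model only, not WCF's actual channel implementation: the class names, the `trace` list, and the fixed sequence identifier are all hypothetical, and the network is replaced by direct method calls.

```python
class ServiceRmChannel:
    """Toy service-side RM channel sitting in front of the service layer."""

    def __init__(self, service_layer):
        self.service_layer = service_layer  # callable(request) -> response
        self.received = set()

    def create_sequence(self):
        return "urn:uuid:hypothetical-sequence-id"

    def handle(self, sequence_id, number, body):
        self.received.add(number)
        response = self.service_layer(body)
        # The acknowledgment is piggybacked on the response.
        return response, sorted(self.received)


class ClientRmChannel:
    """Toy client-side RM channel: caches messages, creates the sequence lazily."""

    def __init__(self, service_channel):
        self.service_channel = service_channel
        self.sequence_id = None
        self.next_number = 1
        self.cache = {}
        self.trace = []

    def send(self, body):
        number = self.next_number
        self.next_number += 1
        self.cache[number] = body  # hold the message before touching the wire

        if self.sequence_id is None:
            self.trace.append("CreateSequence")
            self.sequence_id = self.service_channel.create_sequence()

        self.trace.append(f"Sequence message {number}")
        response, acked = self.service_channel.handle(
            self.sequence_id, number, self.cache[number]
        )
        for acked_number in acked:
            self.cache.pop(acked_number, None)  # acknowledged: drop from cache
        return response


# The "service layer" here is just str.upper, standing in for a real operation.
client = ClientRmChannel(ServiceRmChannel(str.upper))
reply = client.send("get data")
print(reply, client.trace, client.cache)
```

Only the first call triggers the CreateSequence exchange; later calls reuse the same sequence identifier and simply take the next message number.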
WCF Implementation: Ordered Delivery
Now, following the same protocol, assume that ordered delivery is enabled and that the client channel has sent two messages, M2 and M3.
- For some reason, M3 is received by the service channel before M2.
- The service channel holds a cache. When it receives M3, it sees that the sequence number is 3, so it knows that the message with sequence number 2 has not been received yet. Therefore, it holds M3 in the cache.
- It then waits until M2 is received, and only then delivers M2, followed by M3, to the service layer.
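The hold-back behaviour described in these steps can be sketched with a small buffer in Python. The class name and its attributes are hypothetical, and the sketch models only the ordering logic, not WCF's actual channel code.

```python
class OrderedDeliveryBuffer:
    """Holds out-of-order messages until every earlier number has arrived."""

    def __init__(self, deliver):
        self.deliver = deliver   # callable that hands a body to the service layer
        self.next_expected = 1
        self.held = {}           # message number -> body, waiting for a gap to close

    def receive(self, number, body):
        if number < self.next_expected:
            return               # already delivered earlier; nothing to do
        self.held[number] = body
        # Release every message that is now contiguous with what was delivered.
        while self.next_expected in self.held:
            self.deliver(self.held.pop(self.next_expected))
            self.next_expected += 1


delivered = []
buffer = OrderedDeliveryBuffer(delivered.append)
buffer.receive(1, "M1")
buffer.receive(3, "M3")             # M2 is missing, so M3 is held back
held_after_m3 = dict(buffer.held)
buffer.receive(2, "M2")             # closes the gap: M2 then M3 are delivered
print(delivered)
```

After M3 arrives, the service layer has still only seen M1; both M2 and M3 are released as soon as M2 shows up.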
WCF Implementation: Resending Messages
Now let’s examine the following scenario:
- The client sends message M4. During transmission, the network goes down and M4 is lost.
- The client waits for the acknowledgment from the service. Once the configured time has elapsed without an acknowledgment, it resends M4.
- The key thing to notice here is that the message is resent from the client channel cache, without asking the service layer to resubmit the message.
- This time, the message is delivered and the acknowledgment is received, so the client channel removes the message from the cache.
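A minimal Python sketch of this retransmission loop follows. It makes several simplifying assumptions: the names are hypothetical, the retry limit of 3 is arbitrary, and the wait for the acknowledgment timeout is replaced by a transmit function that simply returns whether an acknowledgment came back.

```python
class RetransmittingSender:
    """Toy client channel: resends from its own cache until acknowledged."""

    def __init__(self, transmit, max_retry_count=3):
        self.transmit = transmit              # callable(number, body) -> bool (acked?)
        self.max_retry_count = max_retry_count
        self.cache = {}

    def send(self, number, body):
        self.cache[number] = body             # the service layer hands over the body once
        for attempt in range(1, self.max_retry_count + 2):
            if self.transmit(number, self.cache[number]):
                del self.cache[number]        # acknowledged: safe to forget
                return attempt
        raise TimeoutError(f"message {number} was never acknowledged")


def flaky_network(failures):
    """Simulated wire that loses the first `failures` transmissions."""
    state = {"remaining": failures}
    attempts = []

    def transmit(number, body):
        attempts.append((number, body))
        if state["remaining"] > 0:
            state["remaining"] -= 1
            return False                      # lost: no acknowledgment comes back
        return True

    return transmit, attempts


transmit, attempts = flaky_network(failures=1)
sender = RetransmittingSender(transmit)
tries = sender.send(4, "M4")
print(tries, attempts, sender.cache)
```

Note that `send` is called only once by the caller; the second transmission is driven entirely by the channel from its cache.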
WCF Implementation: Removing Duplicates
The final scenario I want to discuss is that of duplicate messages.
- In this case, the client sends M5. M5 is actually delivered to the service.
- However, while the acknowledgment is being sent back, the network goes down.
- So the client, having not received the acknowledgment, is led to believe that M5 was not delivered, and it resends M5.
- The service receives M5 a second time. However, the service channel keeps track of all message sequence numbers in its cache, so it detects that M5 is a duplicate. It silently discards the message and sends the acknowledgment back to the client so that it stops retrying.
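The duplicate handling on the service side can be sketched as follows in Python. Again, this is a conceptual model with hypothetical names, not WCF's implementation: the point is only that a repeated message number is not delivered twice, yet it is still acknowledged.

```python
class DuplicateFilter:
    """Toy service channel: delivers each message number at most once."""

    def __init__(self, deliver):
        self.deliver = deliver   # callable that hands a body to the service layer
        self.seen = set()        # message numbers already received

    def receive(self, number, body):
        if number not in self.seen:
            self.seen.add(number)
            self.deliver(body)
        # Acknowledge in both cases, so the client stops retrying.
        return sorted(self.seen)


delivered = []
service_channel = DuplicateFilter(delivered.append)
first_ack = service_channel.receive(5, "M5")    # delivered; this ack is lost on the wire
second_ack = service_channel.receive(5, "M5")   # retry: discarded, but acked again
print(delivered, first_ack, second_ack)
```

The service layer sees M5 exactly once, while the client gets the acknowledgment it was missing.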
Bindings Session Support
<reliableSession> Binding Element
Sessions are enabled on bindings that support RM by using the <reliableSession> binding element. This element has two attributes:
- “enabled” turns on reliable message delivery
- “ordered” turns on in-order processing by the service model
When defining a custom binding, you can set more properties for the <reliableSession> binding element:
For example, here I am setting acknowledgementInterval, which is the maximum time the channel waits before sending an acknowledgment for the messages received up to that point, and maxRetryCount, which specifies the maximum number of times a reliable channel attempts to retransmit a message for which it has not received an acknowledgment.
Overall you can check all attributes that you can specify using this link:
Out of the Box Bindings
The out-of-the-box bindings vary in their support for RM:
- It should come as no surprise that BasicHttpBinding does not support RM
- WSHttpBinding is built to support the WS-* specifications, so it supports RM as long as it is enabled using the <reliableSession> binding element
- WSDualHttpBinding also supports RM, and it does so implicitly, because it relies on RM for callbacks. You saw this binding in action in the previous module
- NetTcpBinding also supports RM when it is enabled. Recall, though, that this is not an interoperable binding.
- NetNamedPipeBinding relies on Windows support for reliable message delivery and reliable streams through named pipes. Therefore, RM is not needed.
- Similarly, NetMsmqBinding does not need RM, as it relies on the reliability guarantees offered by MSMQ.