Monday, April 04, 2011

WCF streaming inside data contracts

To support transferring large messages (e.g., uploading or downloading a large file), WCF has supported streamed message transfer since its first version. This is unlike the default behavior, which is to buffer the entire message before sending it to the wire (or before delivering it to the application layer, on the receiving end). Buffering has many advantages: it is faster, and it allows for operations such as signing the body (which needs the whole content) and reliable messaging (which may need to retransmit a message, while streams can often be read only once). However, there are situations, such as uploading a 1GB file to a server, in which buffering the message would be too memory-intensive, if possible at all. For these cases, streaming can be used.

Enabling streaming is simply a matter of setting the appropriate transfer mode on the binding. BasicHttpBinding, NetTcpBinding and NetNamedPipeBinding expose the TransferMode property directly. For other binding types, you’ll need to convert them to a custom binding and set that property on the transport binding element itself. But to really get the benefits of streamed transfer, you need to use data types that don’t need to buffer all the information before being serialized. Using a Message object directly (untyped message programming) certainly works, since you control the whole message layout and the message can be created from classes that emit the message parts on the fly (such as an XmlReader or a BodyWriter), but that’s too low-level for most applications (you essentially need to create the request / parse the response almost “from scratch”, just a little above dealing with raw bytes).
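
As a rough illustration of the two ways of turning streaming on, here is a minimal sketch. The class and method names (StreamingBindings, CreateStreamedBasicHttp, CreateStreamedFrom) and the 2GB quota are assumptions made for this example only; the quota is raised because the default MaxReceivedMessageSize (64KB) would reject any large message on the receiving side.

```csharp
using System.ServiceModel;
using System.ServiceModel.Channels;

// Hypothetical helper class, for illustration only.
public static class StreamingBindings
{
    // Bindings such as BasicHttpBinding expose TransferMode directly.
    public static BasicHttpBinding CreateStreamedBasicHttp()
    {
        return new BasicHttpBinding
        {
            TransferMode = TransferMode.Streamed,
            // Illustrative value; pick a quota that fits your scenario.
            MaxReceivedMessageSize = 2L * 1024 * 1024 * 1024
        };
    }

    // For HTTP-based bindings without a TransferMode property: convert to a
    // custom binding and set the mode on the transport binding element.
    public static CustomBinding CreateStreamedFrom(Binding original)
    {
        CustomBinding custom = new CustomBinding(original);
        HttpTransportBindingElement http =
            custom.Elements.Find<HttpTransportBindingElement>();
        if (http != null)
        {
            http.TransferMode = TransferMode.Streamed;
        }
        return custom;
    }
}
```

For instance, CreateStreamedFrom(new WSHttpBinding(SecurityMode.None)) would yield a streamed variant of that binding. Message security is turned off in this usage because, as noted earlier, signing the body needs the whole content.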

Other parameter types naturally “suited” to streaming scenarios are those that implement IXmlSerializable. The contract for such types is that they essentially control their whole serialization / deserialization. On serialization, the class’s WriteXml method is called, receiving an XmlWriter positioned at the wrapping element. At that point, the class can write as much information as needed, without buffering anything in memory. On deserialization, the ReadXml method receives an XmlReader positioned again at the wrapping element, and the class can read as much information as it needs, consuming it without having to buffer it all (but it should only read information pertaining to itself). Below is a simple example of an IXmlSerializable type which produces / consumes 10000 elements in the message, without having to buffer them.

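using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class MyXmlSerializable : IXmlSerializable
{
    // Running sum of the values consumed by ReadXml.
    int total;

    public XmlSchema GetSchema()
    {
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        // Move past the wrapping element, then consume the items one by one.
        reader.ReadStartElement();
        for (int i = 0; i < 10000; i++)
        {
            this.total += reader.ReadElementContentAsInt();
        }
        reader.ReadEndElement();
    }

    public void WriteXml(XmlWriter writer)
    {
        // The wrapping element is already open; emit the items directly.
        for (int i = 0; i < 10000; i++)
        {
            writer.WriteStartElement("item_" + i);
            writer.WriteValue(i);
            writer.WriteEndElement();
        }
    }
}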
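
To see the WriteXml / ReadXml contract in action outside of a service, the sketch below round-trips a similar type through DataContractSerializer, which supports IXmlSerializable types. The names CountingXmlSerializable, Total and RoundTripDemo are assumptions introduced for this example; the type is a variant of the one above that exposes its total so it can be inspected. The MemoryStream is used only to keep the demo self-contained; in a streamed WCF transfer the reader and writer would be attached to the transport instead.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical variant of the type above, exposing the total for inspection.
public class CountingXmlSerializable : IXmlSerializable
{
    public int Total { get; private set; }

    public XmlSchema GetSchema()
    {
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        reader.ReadStartElement(); // skip the wrapping element
        for (int i = 0; i < 10000; i++)
        {
            Total += reader.ReadElementContentAsInt();
        }
        reader.ReadEndElement();
    }

    public void WriteXml(XmlWriter writer)
    {
        for (int i = 0; i < 10000; i++)
        {
            writer.WriteStartElement("item_" + i);
            writer.WriteValue(i);
            writer.WriteEndElement();
        }
    }
}

public static class RoundTripDemo
{
    // Serializes a fresh instance, reads it back, and returns the total
    // accumulated by ReadXml.
    public static int RoundTrip()
    {
        DataContractSerializer serializer =
            new DataContractSerializer(typeof(CountingXmlSerializable));
        using (MemoryStream buffer = new MemoryStream())
        {
            serializer.WriteObject(buffer, new CountingXmlSerializable());
            buffer.Position = 0;
            CountingXmlSerializable copy =
                (CountingXmlSerializable)serializer.ReadObject(buffer);
            return copy.Total;
        }
    }
}
```

RoundTripDemo.RoundTrip() should return the sum of the integers 0 through 9999, since ReadXml adds up every value that WriteXml emitted.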

IXmlSerializable types aren’t very friendly for simple operations, as the user still has to write code to handle all the serialization. For simple scenarios such as uploading / downloading files, it would be cumbersome to have to write the code that reads from / writes to the stream and converts the data to and from XML. To make those scenarios easier to implement, WCF exposes a few capabilities in the service model to help with streaming. For operation or message contracts, if you define a single body parameter of type System.IO.Stream, WCF will map it to the whole message body, and the operation can be defined fairly simply, as in the example below. As far as a WCF operation is concerned, a Stream parameter is equivalent to a byte[] parameter (they both map directly to the XML schema type xs:base64Binary). Notice that the type needs to be Stream itself, not any of its subclasses (MemoryStream, FileStream, etc). When writing the message to the wire, WCF will read the stream passed by the user and write its bytes out. When reading the message, WCF will create its own read-only stream and pass it to the user.

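using System.IO;
using System.ServiceModel;

[MessageContract]
public class UploadFileRequest
{
    // Sent as a header, so the body can hold only the stream.
    [MessageHeader]
    public string fileName;

    // The single body member: mapped to the whole message body.
    [MessageBodyMember]
    public Stream fileContents;
}

[ServiceContract]
public interface IFileDownloader
{
    // The returned Stream is the whole body of the response.
    [OperationContract]
    Stream DownloadFile(string fileName);

    [OperationContract]
    void UploadFile(UploadFileRequest request);
}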
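
On the receiving side, the stream handed over by WCF should be consumed in chunks; copying it into a byte[] or a MemoryStream first would defeat the purpose of streaming. The helper below is a minimal sketch of that idea. StreamedUploadHelper, SaveTo and the 64KB buffer size are assumptions made for this example, not part of WCF.

```csharp
using System.IO;

// Hypothetical helper, for illustration only.
public static class StreamedUploadHelper
{
    // Copies source to destination one chunk at a time, so that at most
    // bufferSize bytes are held in memory. Returns the number of bytes copied.
    public static long SaveTo(Stream source, Stream destination, int bufferSize = 64 * 1024)
    {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

An implementation of UploadFile could then open a FileStream for the target file and call StreamedUploadHelper.SaveTo(request.fileContents, file). Where the file is stored and how request.fileName is validated are left out of this sketch.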

Read more: Carlos' blog