Thursday, May 7, 2015

Offline data access and synchronization in a mobile application with Couchbase Lite

This is the translation of my article that first appeared on Habrahabr.

Couchbase and Couchbase Lite

When developing data-driven mobile applications, we often find that customers want every feature of the app, including editing data, to work while the device is offline. Those changes then have to sync with the backend once the device goes online. The backend is concurrently accessed by desktop and web frontend applications, which may also modify the data.

Public cloud synchronization is not always viable, especially when security is a concern and the customer wishes to keep all of their data on private servers. In this article I’ll describe my experience of solving this task using the Couchbase database on the server and the Couchbase Lite database in the mobile application, with two-way replication between them.

Couchbase is a distributed, document-oriented NoSQL database that achieves high performance by writing data to memory first and persisting it to disk afterwards. In a cluster, all Couchbase nodes are independent and equal, and each document is bound to a particular node, which is how Couchbase provides strong consistency between the nodes. Couchbase is queried through indexed views that implement the MapReduce pattern.

Couchbase Lite is a lightweight version of Couchbase that is intended for desktop and mobile applications and is able to replicate with Couchbase Server. It is implemented for the iOS, Android, Java and .NET platforms. It’s worth mentioning that the iOS version of Couchbase Lite currently has several advantages over the other platforms, for instance full-text search and automatic mapping of documents to Objective-C and Swift objects.

Couchbase and Couchbase Lite synchronize over a replication protocol that is almost compatible with CouchDB. Almost, because the authors don’t guarantee complete compatibility: the CouchDB protocol is so poorly documented that they had to partly reverse-engineer it. This protocol is implemented in Sync Gateway, a REST-based replication service. All clients that wish to sync data connect to the central database through this service.

Couchbase Server installation and setup

Couchbase installation

The installation process of Couchbase differs between platforms and is described in the documentation. Let’s assume the database is already installed on localhost. By default, the admin console is located at http://localhost:8091/. Let’s go there and create a bucket named "demo", which we’ll use for storing our documents. To do that, open the Data Buckets tab and click the Create New Data Bucket button.

Enter the bucket name "demo" and limit its memory quota to 100 MB.

When all is done, a new bucket named demo will appear in the list of buckets, with a green circle beside it that indicates its normal activity.

Click the Documents button and observe that the newly created bucket is empty.

Sync Gateway setup

Sync Gateway installation and setup are described in the documentation. Here I’ll provide a sync-gateway-config.json file that will allow you to run the sample application that we’ll develop in this article:

{
     "interface":":4984",
     "adminInterface":"0.0.0.0:4985",
     "log": ["CRUD+", "REST+", "Changes+", "Attach+"],
     "databases":{
          "demo":{
               "bucket":"demo",
               "server":"http://localhost:8091",
               "users": {
                    "GUEST": {"disabled": false, "admin_channels": ["*"]}
               },
               "sync":`function(doc) {channel(doc.channels);}`
          }
     }
}


After starting Sync Gateway with this config file, you should see the following log, which shows that the demo bucket is ready to act as our central data synchronization storage:

23:27:02.411961 Enabling logging: [CRUD+ REST+ Changes+ Attach+]
23:27:02.412547 ==== Couchbase Sync Gateway/1.0.3(81;fa9a6e7) ====
23:27:02.412559 Configured Go to use all 8 CPUs; setenv GOMAXPROCS to override this
23:27:02.412604 Opening db /demo as bucket "demo", pool "default", server 
23:27:02.413160 Opening Couchbase database demo on 
23:27:02.601456 Reset guest user to config
23:27:02.601467 Starting admin server on 0.0.0.0:4985
23:27:02.603461 Changes+: Notifying that "demo" changed (keys="{_sync:user:}") count=2
23:27:02.604248 Starting server on :4984 ...


Refresh the page with the bucket document list, and you should see some internal Sync Gateway documents there, whose IDs start with _sync:

Console application

The code of the console application is available on GitHub together with the mobile application. It is mainly intended for demonstrating and testing the interaction of the mobile and desktop databases. It is a simple Java application that connects to an embedded Couchbase Lite database, which is also implemented in Java. The application is able to create a local document with an image attachment and a timestamp_added attribute. It also initiates replication of local changes to Couchbase Server.

Mobile application

The mobile application will show thumbnails of the pictures that were added in the console application, persisted to its local database and replicated to the mobile database via the server database. The process of creating this mobile application is described here in full. I chose the iOS platform for the mobile application as it has better support for the Couchbase Lite API. The language used here is Swift.

Creating a project and adding dependencies

First let’s create a simple Single View Application:

To attach the couchbase-lite-ios library to the project, let's use the CocoaPods dependency manager. The CocoaPods installation is described in its documentation. Let’s initialize CocoaPods in the project directory:

pod init


Add the couchbase-lite-ios dependency to Podfile:

target 'CouchbaseSyncDemo' do
     pod 'couchbase-lite-ios', '~> 1.0'
end


Install the specified library into the project:

pod install


Now reopen the project as a workspace (CouchbaseSyncDemo.xcworkspace). Then add a bridging header file so that you can use the Objective-C libraries installed by CocoaPods in your Swift classes. To do that, add the following header file to the project, naming it CouchbaseSyncDemo-Bridging-Header.h:

#ifndef CouchbaseSyncDemo_CouchbaseSyncDemo_Bridging_Header_h
#define CouchbaseSyncDemo_CouchbaseSyncDemo_Bridging_Header_h
#import "CouchbaseLite/CouchbaseLite.h"
#endif


Specify this file in your Build Settings:

UI stub

Inherit the automatically generated ViewController class from the UICollectionViewController:

class ViewController: UICollectionViewController {


Open Main.storyboard and replace the default ViewController with a Collection View Controller by dragging it from the Object Library and redirecting the Storyboard Entry Point to it. In the Custom Class section of the Identity Inspector, specify the generated ViewController. Also select the Collection View Cell and, in its Attributes Inspector, specify "cell" as its Reuse Identifier. The result is shown in the following screenshot:

Initializing and starting the replication

Create a class CouchbaseService that will encapsulate the database-related functionality, and implement it as a singleton:

private let CouchbaseServiceInstance = CouchbaseService()

class CouchbaseService {

     class var instance: CouchbaseService {
          return CouchbaseServiceInstance
     }

}


Now open the demo database in the initializer of this class and start continuous pull replication. If the application runs inside the simulator, and Couchbase Server is running on the same machine, then we can use localhost as the address for replication. The continuous flag ensures that the replication keeps running via a long-polling mechanism. You should also create the "images" view for extracting the list of all images:

private let pull: CBLReplication
private let database: CBLDatabase

private init() {

     // create or open the database
     database = CBLManager.sharedInstance().databaseNamed("demo", error: nil)

     // initiate pull replication
     let syncGatewayUrl = NSURL(string: "http://localhost:4984/demo/")
     pull = database.createPullReplication(syncGatewayUrl)
     pull.continuous = true;
     pull.start()

     // create a view of all documents in the database
     database.viewNamed("images").setMapBlock({(doc: [NSObject : AnyObject]!, emit: CBLMapEmitBlock!) -> Void in
          emit(doc["timestamp_added"], nil)
     }, version: "1")
}


Couchbase Lite views

A Couchbase view is an indexed, automatically refreshed result of running a pair of functions, map and (optionally) reduce, over all of the documents in the bucket. Here the view is defined by its map function alone, which emits each document's creation timestamp as the key. View results are sorted by key, so the images will always be ordered by the time they were added. The version parameter identifies the version of the view’s code and has to be changed every time that code changes: a new version signals Couchbase to rebuild the view with the new code.
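
To make the "map function emits a key, results are sorted by key" idea concrete, here is a toy model in plain Java. It is only a sketch and not the Couchbase Lite API: the class ViewSketch, its method names, the sample document IDs and the ISO-style timestamp strings are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a map-only view. This is NOT the Couchbase Lite API.
class ViewSketch {

    // Stand-in for the map block: emits timestamp_added as the key (null = emit nothing).
    static String mapKey(Map<String, Object> doc) {
        Object timestamp = doc.get("timestamp_added");
        return timestamp == null ? null : timestamp.toString();
    }

    // Builds the index and returns the document IDs ordered by the emitted key.
    static List<String> query(Map<String, Map<String, Object>> docsById) {
        TreeMap<String, List<String>> index = new TreeMap<>();
        for (Map.Entry<String, Map<String, Object>> entry : docsById.entrySet()) {
            String key = mapKey(entry.getValue());
            if (key != null) {
                index.computeIfAbsent(key, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        List<String> ordered = new ArrayList<>();
        for (List<String> ids : index.values()) {
            ordered.addAll(ids);
        }
        return ordered;
    }

    // Hypothetical helper that creates a document with a timestamp_added attribute.
    static Map<String, Object> doc(String timestamp) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("timestamp_added", timestamp);
        return doc;
    }

    // Hypothetical sample data: three image documents and one document without a timestamp.
    static List<String> demo() {
        Map<String, Map<String, Object>> docs = new HashMap<>();
        docs.put("img-2", doc("2015-04-15T23:41:15"));
        docs.put("img-3", doc("2015-04-16T08:00:00"));
        docs.put("img-1", doc("2015-04-15T23:41:14"));
        docs.put("other", new HashMap<>());
        return query(docs);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [img-1, img-2, img-3]
    }
}
```

The sorted map plays the role of the index: whatever order the documents arrive in, the query returns them ordered by the emitted key. A real view also persists the index and updates it incrementally, which this sketch does not attempt.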

Views in Couchbase can be queried. One special kind of query is the live query, whose result is an automatically updated array of documents. Thanks to the KVO feature of Objective-C and Swift, we can observe changes to this array and update the interface of our application when new data arrives via replication.

Admittedly, this way of tracking changes only tells us that the query results changed, not which specific records were added or deleted. Such information would allow us to minimize interface updates, and fortunately Couchbase Lite provides it via the kCBLDatabaseChangeNotification notification, which reports every new revision added to the database. For this example, though, I decided to use the simpler live query mechanism.
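
To illustrate what "knowing the concrete added or deleted records" buys us, here is a small plain-Java sketch that compares two snapshots of query results and reports the difference. It does not use Couchbase Lite at all; the class SnapshotDiffSketch and the document IDs are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Compares two snapshots of document IDs; not part of any Couchbase API.
class SnapshotDiffSketch {

    // IDs present in the new snapshot but not in the old one.
    static Set<String> added(List<String> before, List<String> after) {
        Set<String> result = new LinkedHashSet<>(after);
        result.removeAll(before);
        return result;
    }

    // IDs present in the old snapshot but missing from the new one.
    static Set<String> removed(List<String> before, List<String> after) {
        Set<String> result = new LinkedHashSet<>(before);
        result.removeAll(after);
        return result;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("img-1", "img-2");
        List<String> after = Arrays.asList("img-2", "img-3");
        System.out.println("added:   " + added(before, after));   // added:   [img-3]
        System.out.println("removed: " + removed(before, after)); // removed: [img-1]
    }
}
```

With this information a collection view could insert or delete individual cells instead of reloading everything, which is the optimization the simple live query approach gives up.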

Dealing with the data

Let’s add a function to the CouchbaseService class that runs a live query against our images view:

func getImagesLiveQuery() -> CBLLiveQuery {
     return database.viewNamed("images").createQuery().asLiveQuery()
}


The iOS implementation of Couchbase Lite stands out from the other platforms with its automatic bi-directional mapping of documents to object models. This mapping leverages the dynamic features of Objective-C. A Swift implementation of this mapping is as follows:

@objc
class ImageModel: CBLModel {

     @NSManaged var timestamp_added: NSString

     var imageInternal: UIImage?

     var image: UIImage? {
          if (imageInternal == nil) {
               imageInternal = UIImage(data: self.attachmentNamed("image").content)
          }
          return imageInternal
     }

}


The timestamp_added attribute is dynamically linked to the corresponding field in the document, and the attachmentNamed: function allows us to retrieve the binary data attached to the document. To convert a document to its object model, we can use the ImageModel constructor.

Binding interface and data

All that’s left to do is to subscribe ViewController to live query refresh and process this refresh by reloading the collection view. The images attribute keeps the list of documents converted to object models.

private var images: [ImageModel] = []

private var query: CBLLiveQuery?

override func viewDidAppear(animated: Bool) {
     query = CouchbaseService.instance.getImagesLiveQuery()
     query!.addObserver(self, forKeyPath: "rows", options: nil, context: nil)
}

override func observeValueForKeyPath(keyPath: String, ofObject object: AnyObject, change: [NSObject : AnyObject], context: UnsafeMutablePointer<Void>) {
     if object as? NSObject == query {
          images.removeAll()
          var rows = query!.rows
          while let row = rows.nextRow() {
               images.append(ImageModel(forDocument: row.document))
          }
          collectionView?.reloadData()
     }
}


The UICollectionViewDataSource protocol methods are quite typical and self-explanatory, except that we use the "cell" reuse identifier that we specified for the collection view cell in the storyboard earlier.

override func collectionView(collectionView: UICollectionView, numberOfItemsInSection section: Int) -> Int {
     return images.count
}

override func collectionView(collectionView: UICollectionView, cellForItemAtIndexPath indexPath: NSIndexPath) -> UICollectionViewCell {
     let cell = collectionView.dequeueReusableCellWithReuseIdentifier("cell", forIndexPath: indexPath) as! UICollectionViewCell
     cell.backgroundView = UIImageView(image:images[indexPath.item].image)
     return cell
}


Running the application

Now let’s see what we’ve achieved. Let’s run the console application. By issuing the start command inside the console application, we’re starting the replication; with the attach command we can create several documents with images.

    start

CBL started
апр 15, 2015 11:41:14 PM com.github.oxo42.stateless4j.StateMachine publicFire
INFO: Firing START
push event: PUSH replication event. Source: com.couchbase.lite.replicator.Replication@144c1e50 Transition: INITIAL -> RUNNING Total changes: 0 Completed changes: 0
апр 15, 2015 11:41:15 PM com.github.oxo42.stateless4j.StateMachine publicFire
push event: PUSH replication event. Source: com.couchbase.lite.replicator.Replication@144c1e50 Transition: RUNNING -> IDLE Total changes: 0 Completed changes: 0
INFO: Firing WAITING_FOR_CHANGES

    attach http://upload.wikimedia.org/wikipedia/commons/4/41/Harry_Whittier_Frees_-_What%27s_Delaying_My_Dinner.jpg

Saved image with id = 8e357b3c-1c7f-4432-b91d-321dc1c9fd9d
push event: PUSH replication event. Source: com.couchbase.lite.replicator.Replication@144c1e50 Total changes: 1 Completed changes: 0
push event: PUSH replication event. Source: com.couchbase.lite.replicator.Replication@144c1e50 Total changes: 1 Completed changes: 1


The data is replicated to the mobile device and gets displayed right away:

Summary

In this article I demonstrated how to synchronize data between the server side and a mobile application by means of Couchbase and Couchbase Lite. This allows us to create a mobile application that remains fully functional while the device is offline. In future articles I’ll explore the document revisions and replication protocol of Couchbase Lite more closely and test them against bad connectivity, sudden backgrounding of the application and other perils of mobile app development.

Links

Sources of sample applications on GitHub
Couchbase
Couchbase Lite
Sync Gateway
MapReduce computation model
Couchbase installation
Installing and running Sync Gateway
Installing CocoaPods

Friday, February 6, 2015

Solving issues with installing the "Real World OCaml" prerequisite libraries under Mac OS X

    I ran into some trouble installing the "Real World OCaml" prerequisites under Mac OS X Yosemite. The installation process is described on the book's wiki here. Googling the error messages turned up little help, which possibly means they are specific to my installation. So I decided to share my experience in case someone runs into similar issues.
    At one point, the instructions advise you to install some libraries that are used throughout the book by issuing the following command:
opam install \
   async yojson core_extended core_bench \
   cohttp async_graphics cryptokit menhir

    At first I failed to install the cohttp and async_graphics packages. Possibly that was because I already had objective-caml installed, and my installation process deviated a bit from the one prescribed by the instructions.
    The cohttp package depends on the ctypes package, which was failing with the following error:
# fatal error: 'ffi.h' file not found
# #include <ffi.h>

    To solve it, just install libffi and add it to the LDFLAGS environment variable for the duration of the build:
brew install libffi
export LDFLAGS=-L/usr/local/opt/libffi/lib
opam install cohttp

    If you receive the following error during the installation of the async_graphics package:
# Error: Unbound module Graphics

    then you have probably installed objective-caml without the Graphics module, which you can verify with the following listing:
ls /usr/local/opt/objective-caml/lib/ocaml/graphics*

    If the listing is empty, then you should reinstall objective-caml with graphics:
brew uninstall objective-caml
brew install objective-caml --with-x11

Now the process of installing the async_graphics package should work just fine:
opam install async_graphics

Installing Ocsigen web framework under Mac OS X and CentOS and creating a simple web application

        I recently wanted to play around with OCaml and create a web application. It appears that the Ocsigen framework is the only (or the most popular) choice for building web applications with OCaml, so here’s how to install it on Mac OS X and CentOS 6 and create a simple web application.

Installing on Mac OS X


        This process was tested on Yosemite. First, install brew, if you haven't got it already:
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

        Now install ocaml, opam package manager and all the prerequisites for the Eliom (web application framework) and Macaque (database framework). Macaque is not really needed to run a simple example, but you're going to need it soon enough if you're about to develop a database-backed web application.
brew install ocaml opam libev gdbm pcre openssl pkg-config sqlite3

        Create a symbolic link for pkg-config that points to the sqlite package, substituting your currently installed version of sqlite for 3.8.8.2 in the path. This is not a very clean solution, as it will break when the sqlite package is updated via brew. If you know how to reference the latest sqlite version, please let me know.
ln -s /usr/local/Cellar/sqlite/3.8.8.2/lib/pkgconfig/sqlite3.pc /usr/local/lib/pkgconfig/sqlite3.pc

        Now initialize the opam package manager. It will create the ~/.opam directory, where it keeps all of its data, including installed packages.
opam init

        Now edit the ~/.profile file and add this line:
eval `opam config env`

        Restart the terminal shell to pick up the environment variables, and then check that the scaffolding tool is available:
eliom-distillery

Installing on CentOS 6


        First add the OCaml repository to yum:
cd /etc/yum.repos.d/
wget http://download.opensuse.org/repositories/home:ocaml/CentOS_6/home:ocaml.repo

        Now install OCaml, opam and all the prerequisites:
yum install ocaml opam ocaml-camlp4 ocaml-camlp4-devel ocaml-ocamldoc openssl-devel pcre-devel sqlite-devel

        Initialize the opam repository and install the eliom web framework and macaque database framework:
opam init
opam install eliom macaque

        If for some reason you encounter an error during installation:
# ocamlfind: Package `camlp4' not found

        Then try to reinstall the ocamlfind package and run the installation again:
opam reinstall ocamlfind
opam install eliom macaque

Creating and running your first Ocsigen web application


        Create a barebones application using the generator:
eliom-distillery -name mysite -template basic -target-directory mysite

        Run it:
cd mysite
make test.byte

        Open http://localhost:8080/ in your browser. You should see the "Welcome from Eliom’s distillery!" greeting message.

Friday, January 31, 2014

Creating the simplest HTTP server with basic authentication using node.js

        In this article I will show you how to create the simplest possible HTTP server with basic authentication in node.js. I have to warn you though, I needed a quick and dirty solution for testing purposes, so this is definitely not for production use. You should at least keep hashes of user passwords, not plaintext passwords themselves, and use digest authentication as a more secure method.
        First, install the htpasswd module globally:
npm install -g htpasswd
        Create a directory for your project and install http-auth module locally:
npm install http-auth
        Create a file auth-server.js with your editor of choice. Put the following lines into it:
var http = require("http");
var auth = require("http-auth");

var basic = auth.basic({
    file: __dirname + '/htpasswd'
});

http.createServer(basic, function(req, res) {
    console.log('Received request: ' + req.url);
    res.end('User successfully authenticated: ' + req.user);
}).listen(8080);
        Now create a file htpasswd in the same directory and populate it with a user name and a password separated by a colon:
forketyfork:mypassword
        Now run the node server:
node auth-server.js
        Go to http://localhost:8080 in your browser. It will greet you with a standard basic-auth dialog asking for your username and password. After successful authentication, you will see the message from the server.
        For more info on how to use the http-auth package for basic and digest authentication, see its page on GitHub: https://github.com/gevorg/http-auth. For more info on the htpasswd module, including the use of different types of hashes instead of plaintext passwords, see https://github.com/gevorg/htpasswd.
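
It may help to see what the browser actually sends after you fill in that dialog. In the Basic scheme, the client joins the user name and password with a colon, Base64-encodes the pair and puts it into the Authorization header. The following plain-Java sketch shows both directions; the class BasicAuthSketch and its method names are made up for illustration and have nothing to do with the http-auth module.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Builds and parses a Basic Authorization header value; illustration only.
class BasicAuthSketch {

    // What the client sends: "Basic " + base64("user:password").
    static String header(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // What the server does first: decode the pair and split it on the first colon.
    static String[] parse(String header) {
        if (header == null || !header.startsWith("Basic ")) {
            throw new IllegalArgumentException("Not a Basic Authorization header");
        }
        byte[] raw = Base64.getDecoder().decode(header.substring("Basic ".length()));
        String pair = new String(raw, StandardCharsets.UTF_8);
        int colon = pair.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Missing colon separator");
        }
        return new String[] { pair.substring(0, colon), pair.substring(colon + 1) };
    }

    public static void main(String[] args) {
        String value = header("forketyfork", "mypassword");
        String[] credentials = parse(value);
        System.out.println(credentials[0] + " / " + credentials[1]); // forketyfork / mypassword
    }
}
```

Note that Base64 is an encoding, not encryption: anyone who can read the header can recover the password, which is one more reason to keep this setup to testing.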

Wednesday, January 29, 2014

T-SQL: Unicode-escaping characters in a string

I am in no way a T-SQL pro, but today I needed to escape a varchar field value to create a valid JSON string, while being limited to Microsoft T-SQL features only.
The JSON RFC states that:
All Unicode characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).
As it turns out, the way to iterate through a string in T-SQL is as follows:
set @index = 1
-- len() ignores trailing spaces, so pad the string before measuring it
set @len = len(@string + 'x') - 1

while @index <= @len
begin
  set @char = substring(@string, @index, 1)
  /* do something with @char */
  set @index += 1
end
To escape a quote or a backslash, we just prefix it with a backslash. As for the control characters, this gets a bit trickier, as we need to convert them to \u-notation that is used in JSON. We can use the built-in unicode function to get the ordinal value of a char and determine that it needs to be escaped.
when unicode(@char) < 32
Then we take advantage of the fn_varbintohexstr system function to convert a char value through varbinary type to a hex string.
sys.fn_varbintohexstr(cast(@char as varbinary))
Finally, after some string chopping and concatenating, we get what we want:
'\u00' + right(sys.fn_varbintohexstr(cast(@char as varbinary)), 2)
Here's the code of the function json_escape in its entirety.
if object_id(N'dbo.json_escape', N'FN') is not null
    drop function dbo.json_escape
go

create function dbo.json_escape (@string varchar(max)) returns varchar(max)
as
begin
    declare @index int, @len int, @char char, @escaped_string varchar(max)

    set @escaped_string = ''
    set @index = 1
    -- len() ignores trailing spaces, so pad the string before measuring it
    set @len = len(@string + 'x') - 1

    while @index <= @len
    begin
        set @char = substring(@string, @index, 1)
        set @escaped_string += 
        case
            when @char = '"' then '\"'
            when @char = '\' then '\\'
            when unicode(@char) < 32 then '\u00' + right(sys.fn_varbintohexstr(cast(@char as varbinary)), 2)
            else @char
        end
        set @index += 1
    end
    return(@escaped_string)
end

go
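
If you want to cross-check the output of the T-SQL function, the same three rules (quote, backslash, control characters below U+0020) are easy to reproduce elsewhere. Below is a minimal Java sketch of the same logic; the class name JsonEscapeSketch is arbitrary, and like the T-SQL version it always uses the \u00XX form for control characters rather than short escapes such as \n.

```java
// Java counterpart of the json_escape logic above, for comparison purposes only.
class JsonEscapeSketch {

    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                out.append("\\\"");                              // quotation mark
            } else if (c == '\\') {
                out.append("\\\\");                              // reverse solidus
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));   // control characters
            } else {
                out.append(c);                                   // everything else is kept as-is
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("say \"hi\"\n")); // say \"hi\"\u000a
    }
}
```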

Wednesday, November 27, 2013

Date formatting in Velocity templates

Here's how to format a date inside a Velocity template. Add the velocity-tools library to your dependencies:

<dependency>
     <groupId>org.apache.velocity</groupId>
     <artifactId>velocity-tools</artifactId>
     <version>2.0</version>
</dependency>
Import the DateTool class:
import org.apache.velocity.tools.generic.DateTool;
Add an instance of this class to the VelocityContext:
VelocityContext context = new VelocityContext();
context.put("date", new DateTool());
Add your date object to the context:
context.put("some_date", new Date());
Use the DateTool instance in the template to format the date:
$date.format('dd.MM.yyyy', $some_date)
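
To preview what a pattern such as 'dd.MM.yyyy' produces without rendering a template, you can try it in plain Java. This sketch assumes that DateTool interprets custom patterns the same way java.text.SimpleDateFormat does; the class name DatePatternSketch and the fixed sample date are only for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Tries out a date pattern with SimpleDateFormat; no Velocity classes involved.
class DatePatternSketch {

    static String format(Date date) {
        // Locale.ROOT keeps the output independent of the machine's default locale.
        return new SimpleDateFormat("dd.MM.yyyy", Locale.ROOT).format(date);
    }

    public static void main(String[] args) {
        Date sample = new GregorianCalendar(2013, Calendar.NOVEMBER, 27).getTime();
        System.out.println(format(sample)); // 27.11.2013
    }
}
```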

Thursday, July 4, 2013

Sending large attachments via SOAP and MTOM in Java

     Sometimes you need to pass a large chunk of unstructured (possibly even binary) data via the SOAP protocol, for instance when you wish to attach a file to a message. The default way to do this is to pass the data in an XML element of the base64Binary type. In effect, your data will be Base64-encoded and passed inside the message body. Not only does your data grow by about a third, but any client or server that sends or receives such a message also has to parse it in its entirety, which may consume a lot of time and memory on large volumes of data.
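
The size overhead is easy to verify: Base64 turns every 3 bytes of input into 4 output characters. Here is a minimal Java sketch that measures it with the standard java.util.Base64 encoder; the class name Base64OverheadSketch and the sample sizes are arbitrary.

```java
import java.util.Base64;

// Measures how much Base64 encoding inflates a payload of a given size.
class Base64OverheadSketch {

    // Length in bytes of the Base64 form of rawBytes bytes of data.
    static int encodedLength(int rawBytes) {
        return Base64.getEncoder().encode(new byte[rawBytes]).length;
    }

    public static void main(String[] args) {
        System.out.println(encodedLength(3_000_000)); // 4000000
    }
}
```

A 3 MB attachment thus becomes 4 MB of text inside the message body, before counting any XML parsing costs.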

     To solve this problem, the MTOM standard was defined. Basically, it allows you to pass the content of a base64Binary block outside of the SOAP message, leaving a simple reference element in its place. As for the corresponding HTTP binding, the message is transferred as SOAP with attachments, with a multipart/related content type. I won't go into the details here; you can learn it all straight from the standards and RFCs mentioned above.

     The tricky part is that, although we've gotten rid of the roughly one-third volume overhead by passing the data outside of the message, the standards themselves don't specify how client and server implementations should process the messages: whether a message should be read completely into memory with all of its attachments while being sent or received, or offloaded to external storage. By default, implementations (including Java's SAAJ) usually read the attachments completely into memory, which creates a risk of running out of memory on large files or heavily loaded systems. In Java, this usually shows up as a "java.lang.OutOfMemoryError: Java heap space" error.

     In this post I will demonstrate a simple client-server application that can transfer SOAP attachments of arbitrary volume with disk offloading, using Apache CXF on the client and Oracle's SAAJ implementation (a part of JDK 6+) on the server. This will require some tuning for the mentioned frameworks. The complete code of the application is available on GitHub.

     First, we will place the common files (XSD and WSDL) in a separate project, as they will be used by both the client and the server. The WSDL schema of the service is relatively straightforward: we have a port with a single operation that consists of a SampleRequest request and a SampleResponse response from the server. The file is transferred in the request to the server. The XSD schema of the request and response is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<s:schema elementFormDefault="qualified"
          targetNamespace="http://forketyfork.ru/mtomsoap/schema"
          xmlns:s="http://www.w3.org/2001/XMLSchema"
          xmlns:xmime="http://www.w3.org/2005/05/xmlmime">

    <s:element name="SampleRequest">
        <s:annotation>
            <s:documentation>Service request</s:documentation>
        </s:annotation>
        <s:complexType>
            <s:sequence>
                <s:element name="text" type="s:string" />
                <s:element name="file" type="s:base64Binary" xmime:expectedContentTypes="*/*" />
            </s:sequence>
        </s:complexType>
    </s:element>

    <s:element name="SampleResponse">
        <s:annotation>
            <s:documentation>Service response</s:documentation>
        </s:annotation>
        <s:complexType>
            <s:attribute name="text" type="s:string" />
        </s:complexType>
    </s:element>

</s:schema>
     Take note of the imported xmime schema and the use of the xmime:expectedContentTypes="*/*" attribute on the binary data element. This enables us to generate correct JAXB code from this schema, because by default the base64Binary element corresponds to a byte[] array field in the JAXB-mapped class. But as we'll see, the expectedContentTypes attribute alters the generation of the class:

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "", propOrder = {
    "text",
    "file"
})
@XmlRootElement(name = "SampleRequest")
public class SampleRequest {

    @XmlElement(required = true)
    protected String text;
    @XmlElement(required = true)
    @XmlMimeType("*/*")
    protected DataHandler file;

    ...
     Note that the file field is of type DataHandler, which allows for streaming processing of the data.

     We shall generate the JAXB classes for both client and server, and a service class for the client, using Apache CXF cxf-codegen-plugin for Maven during build-time. The configuration is as follows:

<plugin>
    <groupId>org.apache.cxf</groupId>
    <artifactId>cxf-codegen-plugin</artifactId>
    <version>${cxf.version}</version>
    <executions>
        <execution>
            <id>generate-sources</id>
            <phase>generate-sources</phase>
            <configuration>
                <sourceRoot>${project.build.directory}/generated-sources/cxf</sourceRoot>
                <wsdlOptions>
                    <wsdlOption>
                        <wsdl>${basedir}/src/main/resources/service.wsdl</wsdl>
                        <wsdlLocation>classpath:service.wsdl</wsdlLocation>
                    </wsdlOption>
                </wsdlOptions>
            </configuration>
            <goals>
                <goal>wsdl2java</goal>
            </goals>
        </execution>
    </executions>
</plugin>
     In this Maven plugin configuration we explicitly specify the wsdlLocation property that will be included into the generated service class. Without it, the generated path to the WSDL file will be a local path on the developer's machine, which we obviously don't want.

     The client (module mtom-soap-client) is quite simple, as it is based on Apache CXF and a generated SampleService class. Here we only enable MTOM for the underlying SOAP binding and specify an infinite timeout, as the transfer of large files may take time:


        // Creating a CXF-generated service
        Sample sampleClient = new SampleService().getSampleSoap12();

        // Setting infinite HTTP timeouts
        HTTPClientPolicy httpClientPolicy = new HTTPClientPolicy();
        httpClientPolicy.setConnectionTimeout(0);
        httpClientPolicy.setReceiveTimeout(0);
        HTTPConduit httpConduit = (HTTPConduit) ClientProxy.getClient(sampleClient).getConduit();
        httpConduit.setClient(httpClientPolicy);

        // Enabling MTOM for the SOAP binding provider
        BindingProvider bindingProvider = (BindingProvider) sampleClient;
        SOAPBinding binding = (SOAPBinding) bindingProvider.getBinding();
        binding.setMTOMEnabled(true);

        // Creating request object
        SampleRequest request = new SampleRequest();
        request.setText("Hello");
        request.setFile(new DataHandler(new FileDataSource(args[0])));

        // Sending request
        SampleResponse response = sampleClient.sample(request);

        System.out.println(String.format("Server responded: \"%s\"", response.getText()));

     The server is based on the Spring WS framework. However, we won't use the typical default <annotation-config /> configuration here; instead we specify a custom DefaultMethodEndpointAdapter configuration, because we need Spring WS to use our custom-configured jaxb2Marshaller bean:

<!-- The service bean -->
<bean class="ru.forketyfork.mtomsoap.server.SampleServiceEndpoint" p:uploadPath="/tmp"/>

<!-- SAAJ message factory configured for SOAP v1.2 -->
<bean id="messageFactory" class="org.springframework.ws.soap.saaj.SaajSoapMessageFactory"
      p:soapVersion="#{T(org.springframework.ws.soap.SoapVersion).SOAP_12}"/>

<!-- JAXB2 Marshaller configured for MTOM -->
<bean id="jaxb2Marshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller"
      p:contextPath="ru.forketyfork.mtomsoap.schema"
      p:mtomEnabled="true"/>

<!-- Endpoint mapping for the @PayloadRoot annotation -->
<bean class="org.springframework.ws.server.endpoint.mapping.PayloadRootAnnotationMethodEndpointMapping" />

<!-- Endpoint adapter to marshal endpoint method arguments and return values as JAXB2 objects -->
<bean class="org.springframework.ws.server.endpoint.adapter.DefaultMethodEndpointAdapter">
    <property name="methodArgumentResolvers">
        <list>
            <ref bean="marshallingPayloadMethodProcessor" />
        </list>
    </property>
    <property name="methodReturnValueHandlers">
        <list>
            <ref bean="marshallingPayloadMethodProcessor" />
        </list>
    </property>
</bean>

<!-- JAXB2 Marshaller/Unmarshaller for method arguments and return values -->
<bean id="marshallingPayloadMethodProcessor" class="org.springframework.ws.server.endpoint.adapter.method.MarshallingPayloadMethodProcessor">
    <constructor-arg ref="jaxb2Marshaller" />
</bean>
     The important thing to notice here is the mtomEnabled property of jaxb2Marshaller; the rest of the configuration is quite typical.

     The SampleServiceEndpoint class is a service that is bound via the @PayloadRoot annotation to process our SampleRequest requests:

    @PayloadRoot(namespace = "http://forketyfork.ru/mtomsoap/schema", localPart = "SampleRequest")
    @ResponsePayload
    public SampleResponse serve(@RequestPayload SampleRequest request) throws IOException {

        // randomly generating file name as a UUID
        String fileName = UUID.randomUUID().toString();
        File file = new File(uploadPath + File.separator + fileName);

        // writing attachment to file
        try(FileOutputStream fos = new FileOutputStream(file)) {
            request.getFile().writeTo(fos);
        }

        // constructing the response
        SampleResponse response = new SampleResponse();
        response.setText(String.format("Hi, just received a %d byte file from ya, saved with id = %s",
                file.length(), fileName));

        return response;
    }

     Notice how we work with the value returned by request.getFile(). Remember, the type of the field is DataHandler. What actually happens is that the DataHandler wraps an InputStream that points to the attachment, which SAAJ offloaded to disk when the request was received. So we may copy this file to another location or process it in any other way without loading it completely into memory.
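
     The reason this stays cheap in memory is that a stream can be copied through a small fixed-size buffer, so memory use does not depend on the size of the attachment. The sketch below shows such a copy loop in plain Java. It is an illustration of the general technique, not the code that DataHandler.writeTo actually runs; the class StreamCopySketch, the 8 KB buffer size and the roundTrip helper are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Copies a stream chunk by chunk; memory use is bounded by the buffer size.
class StreamCopySketch {

    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // only this much data is held in memory at a time
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Hypothetical helper for trying the loop out with in-memory streams.
    static byte[] roundTrip(byte[] data) {
        try (InputStream in = new ByteArrayInputStream(data);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            copy(in, out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new byte[20_000]).length); // 20000
    }
}
```

     In the endpoint, the input would be the attachment stream and the output a FileOutputStream, so a multi-gigabyte upload would still need only one buffer's worth of heap for the copy itself.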

     A final trick is to enable attachment offloading in Oracle's SAAJ implementation, which is bundled with Oracle's JDK starting from version 6. To do that, we must run our server with the -Dsaaj.use.mimepull=true JVM argument.

     Once again, the complete code for the article is available on GitHub.