Hot questions for Using Transmission Control Protocol in nio

Question:

We are currently using a multithreaded solution for a high-performance TCP server handling 20 simultaneous connections. Our average latencies run around 200 microseconds per message, and we have been struggling to tame the GC activity that can produce 1 ms+ outliers. Low latency is our utmost goal for this server and we are aware that our current numbers are bad. We are evaluating single-threaded approaches so that all 20 connections can be handled by a single thread.

What is the current floor for TCP latency in Java, in other words, how fast can two machines exchange messages in Java through a TCP socket over a 10Gb network?


Answer:

A multithreaded server is not the way to go for latency, and 200 micros is indeed too high. For ultra-low-latency network applications it is mandatory to use a single-threaded, asynchronous, non-blocking network library. You can easily handle these 20 socket connections inside the same reactor thread (i.e. network selector), which can be pinned to a dedicated and isolated CPU core. Moreover, if you are using Java, the library must leave zero garbage behind, since GC activity will most likely stall the critical reactor thread and introduce the outliers you are observing.
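For reference, here is a minimal sketch of that pattern using plain java.nio rather than CoralReactor (the port and buffer size are arbitrary). It shows one reactor thread servicing every connection through a single Selector; core pinning and garbage-free buffering are outside what JDK-only code can show:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class SingleThreadedTcpServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));            // arbitrary port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buffer = ByteBuffer.allocateDirect(4096); // reused for every read

        while (true) {                                       // the single reactor loop
            selector.select();
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.socket().setTcpNoDelay(true);     // avoid Nagle-induced latency
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    int n = client.read(buffer);
                    if (n == -1) { key.cancel(); client.close(); continue; }
                    buffer.flip();
                    client.write(buffer);                    // echo back; real code must handle short writes
                }
            }
        }
    }
}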

To give you an idea of TCP latencies, you can take a look at these benchmarks using CoralReactor, which is an ultra-low-latency and garbage-free network library implemented in Java.

Messages: 1,000,000 (size 256 bytes)
Avg Time: 2.15 micros
Min Time: 1.976 micros
Max Time: 64.432 micros
Garbage created: zero   
75% = [avg: 2.12 micros, max: 2.17 micros]
90% = [avg: 2.131 micros, max: 2.204 micros]
99% = [avg: 2.142 micros, max: 2.679 micros]
99.9% = [avg: 2.147 micros, max: 3.022 micros]
99.99% = [avg: 2.149 micros, max: 5.604 micros]
99.999% = [avg: 2.149 micros, max: 7.072 micros]

Keep in mind that 2.15 micros is over loopback, so I am not considering network and OS/kernel latencies. For a 10Gb network, the over-the-wire latency for a 256-byte message will be at least 382 nanoseconds from NIC to NIC. If you are using a network card that supports kernel bypass (e.g. Solarflare's OpenOnload) then the OS/kernel latency should be very low.

Disclaimer: I am one of the developers of CoralReactor.

Question:

In Java's blocking IO API accept() blocks until a connection is available and read() blocks until input is available.

But does write() also block for the different blocking IO OutputStreams? And how long does write() block for the FileOutputStream and SocketOutputStream (TCP) classes?

Edit: or to ask even more generally: Does nonblocking IO have any advantages for write operations?


Answer:

But does write() also block for the different blocking IO OutputStreams? And how long does write() block for the FileOutputStream and SocketOutputStream (TCP) classes?

It blocks until all data that you have asked to be written is delivered to the OS. There is no theoretical upper bound on this. There is no theoretical difference between the different kinds of stream classes.

In practice the length of time blocked depends on how fast the data can be delivered. Writing to local files is generally fastest. Anything involving networks depends on network (and NIC) bandwidth, latency and congestion. (That includes cases where you are using file streams to read / write files to locally mounted remote file systems.)
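To illustrate (the host and port below are placeholders), a plain blocking socket write returns as soon as the bytes have been handed to the OS send buffer, and only stalls once that buffer fills up because the peer is reading slowly:

import java.io.OutputStream;
import java.net.Socket;

public class BlockingWriteDemo {
    public static void main(String[] args) throws Exception {
        byte[] chunk = new byte[64 * 1024];
        try (Socket socket = new Socket("example.com", 9000)) {  // placeholder endpoint
            OutputStream out = socket.getOutputStream();
            for (int i = 0; i < 1000; i++) {
                // Returns once the OS has buffered the data; blocks only when
                // the socket send buffer is full because the peer reads slowly.
                out.write(chunk);
            }
            out.flush();
        }
    }
}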


Does this mean that there is no performance improvement for blocking vs nonblocking writes? If there are such, what are they?

No it doesn't mean that ... exactly. The potential performance improvement is not a direct one.

The performance improvement comes about if you have lots of connections that you are reading and / or writing.

  • With blocking I/O, you effectively need one thread for each connection. Each thread has significant resources (e.g. thread stack memory) associated with it, and there are overheads whenever you make a thread context switch. Having lots of threads also tends to increase other costs: lock contention, heap space usage, GC overheads, virtual memory footprint and paging activity.

  • With non-blocking I/O (and selectors), you can use one thread to service multiple connections, either directly or by passing work off to worker threads via queues.


Does nonblocking IO have any advantages for write operations?

Non-blocking I/O, when used correctly, allows you to support more simultaneous clients with fewer resources. But for a single connection / client there is no speedup from using non-blocking I/O.

Question:

I want my clients to continuously read/write to a log file at a remote server. The way I am doing it is by passing the output of tail -f /root/log.txt from my remote server to my clients.

There are 2 problems I faced

  • My Server is executing the command but my client is not receiving the output.
  • Only one client can connect to the server even though I used threading

Client.java

import java.io.*;
import java.net.*;

public class Client
{
  Socket sock;
  String server = "XXX.XXX.XX.XX";
  int port = 5550;
  String filename = "/root/log.txt";
  String command = "tail -f "+filename+"\n";

  public static void main(String[] args)
  {
    new Client();
  }

  public Client()
  {
    openSocket();
    try
    {
      // write to socket
      BufferedWriter wr = new BufferedWriter(new OutputStreamWriter(sock.getOutputStream()));
      wr.write(command);
      wr.flush();

      // read from socket
      BufferedReader rd = new BufferedReader(new InputStreamReader(sock.getInputStream()));
      String str;
      while ((str = rd.readLine()) != null)
      {
        System.out.println(str);
      }
      rd.close();
    } 
    catch (IOException e) 
    {
      System.err.println(e);
    }
  }

  private void openSocket()
  {
    // open a socket and connect with a timeout limit
    try
    {
      InetAddress addr = InetAddress.getByName(server);
      SocketAddress sockaddr = new InetSocketAddress(addr, port);
      sock = new Socket();

      // this method will block for the defined number of milliseconds
      int timeout = 2000;
      sock.connect(sockaddr, timeout);
    } 
    catch (UnknownHostException e) 
    {
      e.printStackTrace();
    }
    catch (SocketTimeoutException e) 
    {
      e.printStackTrace();
    }
    catch (IOException e) 
    {
      e.printStackTrace();
    }
  }
}

Server.java

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class Server  {
    private int portNo = 0;
    private Socket socket = null;

    public Server(int portNo) {
        this.portNo = portNo;
        Thread t = new Thread(new acceptClient());
        t.start();
    }

    class acceptClient implements Runnable {
        public void run() {
            //while(true) {
                try {
                    ServerSocket sSocket = new ServerSocket(portNo);
                    socket = sSocket.accept();
                    System.out.println("A client has connected!");
                    BufferedWriter wr = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()));

                    BufferedReader rd = new BufferedReader(new InputStreamReader(socket.getInputStream()));
                    System.out.println(rd.readLine());
                    rd.close();
                    Process p = null;

                    p = Runtime.getRuntime().exec("tail -f /root/log.txt");
                    BufferedReader rd2 = new BufferedReader(new InputStreamReader(p.getInputStream()));
                    String s = null;
                    while ((s = rd2.readLine()) != null) {
                        System.out.println(s);
                        wr.write(s);
                    }
                    rd2.close();
                    wr.close();
                    /*try {
                            p.waitFor();
                        } catch (InterruptedException e) {
                            // TODO Auto-generated catch block
                            e.printStackTrace();
                        }*/

//                  sSocket.close();
                } catch(IOException exception) {
                    System.out.println("Error: " + exception);
                }
            //}
        }
    }

    public static void main(String[] args) {
        int portNo = 5550;
        new Server(portNo);
    }
}

Answer:

My Server is executing the command but my client is not receiving the output.

That's because your tail -f command never terminates (if I am not wrong).

  • Hence rd2.readLine() will never return null in Server.java.
  • Which means your while loop will never exit.
  • Which means wr.write(s) keeps writing to the stream but never gets a chance to flush() or close() it.
  • Hence, the output doesn't reach the client.

To Fix: Just add flush() below your write().

wr.write(s);
wr.flush();
// While loop close.

Only one client can connect to the server even though I used threading

That's because you are accepting a connection only once in Server.java.

Creating a new thread does not by itself accept multiple connections; you need to call accept() repeatedly, in a loop.

I would suggest calling sSocket.accept() in a loop and creating a separate thread for each accepted connection, as sketched below.
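A minimal sketch of that structure (error handling and the tail -f plumbing from the code above are omitted):

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class MultiClientServer {
    public static void main(String[] args) throws IOException {
        try (ServerSocket serverSocket = new ServerSocket(5550)) {
            while (true) {                                  // keep accepting forever
                Socket client = serverSocket.accept();      // one accept() per client
                new Thread(() -> handle(client)).start();   // one thread per client
            }
        }
    }

    private static void handle(Socket client) {
        System.out.println("A client has connected: " + client.getRemoteSocketAddress());
        // Read the command, start the tail process, and stream (write + flush)
        // its output back to this particular client, as described above.
    }
}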

Question:

I have an Android app that acts as server and feeds over TCP some data from sensors with arbitrary intervals (within 5-60 seconds). Client apps occasionally send small chunks of data over the same connection. Data must be sent and received without any delays.

All samples and tutorials (like this one http://adblogcat.com/asynchronous-java-nio-for-dummies/ ) show more or less the same scenario: when reading is finished, switch to OP_WRITE; when writing is done, switch to OP_READ, and so on. Obviously it won't work for my case. I tried to enable both reads and writes at once like this

serverChannel.register(selector, SelectionKey.OP_READ|SelectionKey.OP_WRITE);

But it makes the selector cycle constantly, which puts quite a heavy load on the CPU.

I'm sure this question is not really phrased correctly, so I'll be glad even if someone gives me a totally different but working idea or points out where I'm wrong. I did not post the code since it is nearly identical to that of the aforementioned tutorial.


Answer:

Your question and the samples you cite are based on a fallacy. There is no such thing as a 'mode' in NIO. You can read and write whenever you like, but they can both do nothing if done at the wrong time.

  • OP_READ firing means that a read will return with data or end of stream, i.e. that there is data or a FIN in the socket receive buffer. This is normally false, except when the peer has sent some data or closed his end of the connection.
  • OP_WRITE firing means that a write will transfer some data, i.e. that there is room in the socket send buffer. This is normally true, and this in turn is why selecting on it normally smokes the CPU.
  • Normally a channel should only be registered for OP_READ.
  • When you have something to write, write it.
  • If and only if write() returns zero, register the channel for OP_WRITE, remember the buffer you were writing, and return to the select loop.
  • When OP_WRITE fires for this channel, repeat the write, and if it completes, deregister the channel for OP_WRITE.
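A minimal sketch of that write pattern, assuming a select loop similar to the one in the cited tutorial. This version registers OP_WRITE whenever the buffer was not fully drained, a slightly more conservative variant of the "write returned zero" rule above, and uses the key attachment purely as an illustrative place to stash the pending buffer:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

final class WriteHelper {

    // Call this when you have something to send on the channel behind 'key'.
    static void send(SelectionKey key, ByteBuffer data) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        channel.write(data);                                    // just write it
        if (data.hasRemaining()) {
            // The socket send buffer is full: remember the leftover bytes
            // and ask the selector to tell us when there is room again.
            key.attach(data);
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
        }
    }

    // Call this from the select loop when the key reports isWritable().
    static void onWritable(SelectionKey key) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        ByteBuffer pending = (ByteBuffer) key.attachment();
        channel.write(pending);                                 // retry the write
        if (!pending.hasRemaining()) {
            key.attach(null);
            key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE); // back to OP_READ only
        }
    }
}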

There is a lot of junk on the Internet and this is particularly true where NIO is concerned. Among the numerous problems in your citation (seen over and over again in this kind of material):

  • select() is not asynchronous;
  • 'only one of those can happen at a time' is false;
  • you do not need the Selector's 'permission' to write;
  • registering OP_WRITE and waiting for it to fire when you have something to write and don't already know that the socket send buffer is full is just an elaborate and pointless waste of time;
  • you can't change a channel that has been registered for OP_ACCEPT to read/write;
  • closing a channel cancels the key;
  • closing either the channel or the socket closes both;
  • finishConnect() can return false;
  • write() can return zero, or less than the amount of data that was supplied;
  • OP_CONNECT can only fire if isConnectionPending() is true.
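As an example of the finishConnect() and OP_CONNECT points, a non-blocking client connect has to be completed through the selector, because connect() and finishConnect() may both legitimately return false at first (the endpoint below is a placeholder):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class NonBlockingConnect {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        SocketChannel channel = SocketChannel.open();
        channel.configureBlocking(false);

        // connect() usually returns false here: the connection is now pending.
        boolean connected = channel.connect(new InetSocketAddress("example.com", 9000));
        if (!connected) {
            channel.register(selector, SelectionKey.OP_CONNECT);
            while (!connected) {
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isConnectable() && channel.finishConnect()) {
                        key.interestOps(SelectionKey.OP_READ);  // connected: now read as usual
                        connected = true;
                    }
                }
            }
        }
        System.out.println("Connected to " + channel.getRemoteAddress());
    }
}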

Question:

In Java NIO, does Selector.select() guarantee that at least one entire UDP datagram's content is available at the socket channel, or in theory could the Selector wake when there is less than a datagram, say a couple of bytes?

What happens if the transport protocol is TCP? With regard to Selector.select(), is there a difference from UDP?

From the API: Selects a set of keys whose corresponding channels are ready for I/O operations.

It doesn't however specify what ready means.

So my questions:

  • how incoming packages/streams go from the hardware to the Java application's Socket (Channels).

  • when using a UDP or TCP client, should one assume that at least one package is received, or could the Selector wake when only a part of a package is available?


Answer:

It doesn't however specify what ready means.

So my questions:

  • how incoming packages/streams go from hardware to Java application Socket (Channels).

They arrive at the NIC where they are buffered and then passed to the network protocol stack and from there to the socket receive buffer. From there they are retrieved when you call read().

  • when using UDP or TCP client, should one assume that at least one package is received

You mean packet. Actually in the case of UDP you mean datagram. You can assume that an entire datagram has been received in the case of UDP.

or Selector could wake when there is only a part of [packet] available?

In the case of TCP you can assume that either at least one byte or end of stream is available. There is no such thing as a 'package' or 'packet' or 'message' at the TCP level.
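In practice that means TCP code has to do its own framing on top of the byte stream. Here is a sketch assuming a simple length-prefixed message format; the 4-byte length prefix is an application-level convention chosen for illustration, not something TCP provides:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

final class LengthPrefixedReader {
    private final ByteBuffer buffer = ByteBuffer.allocate(64 * 1024); // accumulates partial data

    // Call on each OP_READ; it delivers complete messages only, however the bytes arrived.
    void onReadable(SocketChannel channel) throws IOException {
        if (channel.read(buffer) == -1) {        // select() also fires for end of stream
            channel.close();
            return;
        }
        buffer.flip();
        while (buffer.remaining() >= 4) {        // enough bytes for the length prefix?
            buffer.mark();
            int length = buffer.getInt();
            if (buffer.remaining() < length) {   // message still incomplete: wait for more bytes
                buffer.reset();
                break;
            }
            byte[] message = new byte[length];
            buffer.get(message);
            handle(message);
        }
        buffer.compact();                        // keep any partial message for the next read
    }

    private void handle(byte[] message) {
        System.out.println("got a " + message.length + "-byte message");
    }
}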

Question:

When transmitting a file through socket in blocking mode

bytesTransferred = fileIChannel.transferTo(0, fileIChannel.size(), socketChannel);
// or using a buffer
ByteBuffer byteBuffer = ByteBuffer.allocateDirect(1024 * 8);
while (fileIChannel.read(byteBuffer) != -1) {
    byteBuffer.flip();
    bytesTransferred += socketChannel.write(byteBuffer);
    byteBuffer.clear();
}
In the event of a connection failure, I need to keep track of the number of bytes transferred. I can do this either by waiting for a response from the server every time it receives a particular number of bytes, or, when the connection is restored, by sending a request for the number of bytes received. Which of the two options is more correct? How is this problem usually solved?

And the second question: is data integrity guaranteed when sending large amounts of data through a socket?


Answer:

With this code you cannot tell. If you want to know that the peer application has received and processed the data you sent, the peer application has to tell you. TCP does buffering at both ends, so the API alone cannot tell you.

NB Your copy loop is wrong. It should be:

while (fileIChannel.read(byteBuffer) != -1 || byteBuffer.position() > 0)
{
    byteBuffer.flip();
    bytesTransferred += socketChannel.write(byteBuffer);
    byteBuffer.compact();
}

and there should also be an error test on the write. At present you are assuming that everything got written to the SocketChannel on every write, which isn't guaranteed in non-blocking mode.

The code using transferTo() is also wrong, as transferTo() isn't guaranteed to perform the entire transfer: that's why it returns a count. You have to loop.
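A loop along those lines, assuming a blocking SocketChannel (with a non-blocking channel transferTo() can also return 0, and you would have to fall back to the selector before retrying):

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;

public class FileSender {
    // Sends the whole file, looping because transferTo() may move fewer bytes than requested.
    static long sendFile(FileChannel fileChannel, SocketChannel socketChannel) throws IOException {
        long position = 0;
        long size = fileChannel.size();
        while (position < size) {
            position += fileChannel.transferTo(position, size - position, socketChannel);
            // Note: 'position' counts bytes handed to the socket, not bytes the
            // peer application has received; only the peer can tell you that.
        }
        return position;
    }
}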

Question:

I'm writing a Java TCP/HTTP server that needs to handle thousands of connections through a non-blocking I/O selector. So I'm trying to handle all connections inside the same selector thread, but some requests may take a long time to complete. What can I do in that situation? Go back to using threads?


Answer:

There are three ways of doing this, not counting of course the old school one-thread-per-connection way, which as you know does not scale:

  1. You basically use a concurrent queue (e.g. CoralQueue) to distribute the requests’ work (not the requests themselves) to a fixed number of threads that will execute in parallel. Let’s say you have 1000 simultaneous connections. Instead of having 1000 threads, you can check how many CPU cores your machine has available and choose a much smaller number of threads. The flow would be:

    request -> selector -> demux -> worker threads -> mux -> selector -> response
  2. Like @David Schwartz said, you can make your outbound network calls through the same selector (and thread) that's receiving requests. They will be asynchronous network calls that would never block the selector main thread. You can see some source code for this solution in this article.

  3. You can use a distributed system architecture, where the blocking operation is performed in a separate node. So your server would just pass asynchronous messages to nodes responsible for the heavy duty task, wait but never block. For more information about how asynchronous message queues work you can check this article.

The bottom line is that if you are doing I/O you should most certainly go with option #2. If you are doing CPU-bound computation, you should most certainly go with option #1. If you would rather not worry about that distinction at all, think in terms of a distributed system and go with option #3.
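A minimal sketch of option #1 using only the JDK: an ExecutorService plays the role of the worker pool and a plain concurrent queue plays the mux back to the selector thread (CoralQueue would replace both in a garbage-free way; the class and method names below are illustrative):

import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class RequestDispatcher {
    private final ExecutorService workers =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors()); // sized to the CPU cores
    private final Queue<Runnable> completions = new ConcurrentLinkedQueue<>();        // "mux" back to the selector
    private final Selector selector;

    public RequestDispatcher(Selector selector) {
        this.selector = selector;
    }

    // Called from the selector thread once a full request has been read.
    public void dispatch(SelectionKey key, byte[] request) {
        workers.submit(() -> {
            byte[] response = process(request);                  // the slow work, off the selector thread
            completions.add(() -> writeResponse(key, response));
            selector.wakeup();                                   // let the selector thread pick it up
        });
    }

    // Called by the selector thread after each select() to drain finished work.
    public void runCompletions() {
        Runnable task;
        while ((task = completions.poll()) != null) {
            task.run();
        }
    }

    private byte[] process(byte[] request) { return request; }   // placeholder for the real work

    private void writeResponse(SelectionKey key, byte[] response) {
        // Write via the channel in 'key', handling short writes as usual.
    }
}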

Disclaimer: I'm one of the developers of CoralQueue.