JAVA NIO – FileLock

File locking was introduced with the java.nio package and is implemented by FileChannel. A FileLock is one of two types: exclusive or shared. A shared lock allows other concurrently running programs to acquire overlapping shared locks, but not an overlapping exclusive lock. An exclusive lock doesn't allow other programs to acquire an overlapping lock of either type.

Shared locks are granted when the operating system supports them; if it does not, a request for a shared lock is automatically converted into a request for an exclusive lock.

Locks are associated with a file, not with individual file handles, channels, or threads, and they are held on behalf of the entire JVM. If a channel in one JVM obtains an exclusive lock, a channel in another JVM that requests an overlapping lock blocks until the first one releases it. Because the lock belongs to the whole JVM, file locks are not suitable for coordinating threads within a single JVM: an attempt to lock a region that overlaps a lock already held by the same JVM throws OverlappingFileLockException.

Below is the locking API in the FileChannel class (other methods omitted):

public abstract class FileChannel extends AbstractInterruptibleChannel
        implements SeekableByteChannel, GatheringByteChannel, ScatteringByteChannel {

    public abstract FileLock lock(long position, long size, boolean shared) throws IOException;

    public final FileLock lock() throws IOException {
        return lock(0L, Long.MAX_VALUE, false);
    }

    public abstract FileLock tryLock(long position, long size, boolean shared) throws IOException;

    public final FileLock tryLock() throws IOException {
        return tryLock(0L, Long.MAX_VALUE, false);
    }
}

The lock(long position, long size, boolean shared) method acquires a lock on the specified region of a file. The region is specified by its beginning position and its size, and the last argument specifies whether the lock is shared (true) or exclusive (false). To obtain a shared lock the channel must be open for reading, whereas an exclusive lock requires a channel that is open for writing.

The no-argument lock() method acquires an exclusive lock on the whole file rather than on a specified region. A file can grow, so a lock on an explicit region does not cover bytes added beyond that region; lock() avoids this by locking from position 0 with a size of Long.MAX_VALUE.

The tryLock method, on a specified region or on the whole file, does not block while acquiring a lock: it returns immediately instead of waiting. If another program already holds an overlapping lock on the file, it immediately returns null.
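The non-blocking behaviour of tryLock can be sketched as follows. This is only an illustration under assumptions: the class TryLockSketch, the method tryExclusiveRegion, and the temporary file are hypothetical names introduced here, and the channel is opened with FileChannel.open (available since Java 7) rather than through a stream.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class for illustration; not part of the JDK.
final class TryLockSketch {

    // Tries to lock the first `size` bytes exclusively without blocking.
    // Returns true if the lock was obtained (it is released before returning),
    // false if another program already holds an overlapping lock.
    static boolean tryExclusiveRegion(Path file, long size) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock(0L, size, false); // false = exclusive
            if (lock == null) {
                return false; // someone else holds the region; we did not wait
            }
            try {
                return lock.isValid() && !lock.isShared();
            } finally {
                lock.release();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("trylock-sketch", ".dat");
        try {
            System.out.println("locked: " + tryExclusiveRegion(file, 16L));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A caller that gets false can retry later or skip the work, which is the main reason to prefer tryLock over the blocking lock method.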

The FileLock API is specified as below.

public abstract class FileLock {
    public final FileChannel channel();
    public final long position();
    public final long size();
    public final boolean isShared();
    public final boolean overlaps(long position, long size);
    public abstract boolean isValid();
    public abstract void release() throws IOException;
}

A FileLock encapsulates a specific region of a file that has been locked through a FileChannel. Each FileLock is associated with a specific FileChannel instance and keeps a reference to it, which can be obtained with the channel() method.

The FileLock lifecycle starts when lock or tryLock is invoked on the FileChannel and ends when the release() method is called. A FileLock also becomes invalid when the JVM shuts down or the associated channel is closed.

The isShared() method returns whether the lock is shared or exclusive. If shared locks are not supported by the operating system, this method always returns false.

FileLock objects are thread-safe; multiple threads may access a lock object concurrently.

Finally, overlaps(long position, long size) returns whether or not the lock overlaps the given byte range.
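As a rough illustration of these accessor methods, the sketch below locks bytes 10 to 29 of a temporary file and queries the resulting lock. The class LockInspectionSketch and its inspect method are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class for illustration; not part of the JDK.
final class LockInspectionSketch {

    // Locks bytes 10..29 exclusively and reports what the FileLock says about itself.
    static String inspect(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            FileLock lock = channel.lock(10L, 20L, false);
            try {
                return lock.overlaps(0L, 10L)          // bytes 0..9 end just before the lock
                        + " " + lock.overlaps(0L, 11L) // bytes 0..10 touch byte 10
                        + " " + lock.overlaps(30L, 5L) // bytes 30..34 start just after the lock
                        + " " + (lock.channel() == channel)
                        + " " + lock.isShared()
                        + " " + lock.isValid();
            } finally {
                lock.release();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("lock-inspect", ".dat");
        try {
            System.out.println(inspect(file)); // false true false true false true
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```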

A FileLock object is associated with an underlying file, so a lock that is never released can leave other programs blocked indefinitely. It is therefore advisable to always release the lock, as shown below.

FileLock lock = fileChannel.lock();
try {
    // work with the locked file
} catch (IOException e) {
    // handle the I/O failure
} finally {
    // always release, even if the work above failed
    lock.release();
}
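Since Java 7, FileLock implements AutoCloseable, so the same release guarantee can also be expressed with try-with-resources. The following is a minimal sketch under that assumption; the class LockedAppendSketch and the method appendLocked are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class for illustration; not part of the JDK.
final class LockedAppendSketch {

    // Appends `text` while holding an exclusive lock on the whole file.
    // Resources close in reverse order: the lock is released, then the channel closes.
    static int appendLocked(Path file, String text) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
             FileLock lock = channel.lock()) {
            ByteBuffer data = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            int written = 0;
            while (data.hasRemaining()) {
                written += channel.write(data);
            }
            return written;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("locked-append", ".txt");
        try {
            appendLocked(file, "first\n");
            appendLocked(file, "second\n");
            System.out.print(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```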

JAVA NIO – Memory-Mapped File

Earlier versions of Java access files through the conventional stream-based file API. Behind the scenes the JVM makes read() and write() system calls to transfer data between the OS kernel and the JVM. The JVM uses its own memory space to load and process the file, which causes trouble when processing large data files. The file data also has to be converted before it is processed, because the OS handles files in pages whereas the JVM works with byte streams, which are not page-oriented. Since JDK 1.4, Java provides MappedByteBuffer, which establishes a virtual memory mapping between the JVM's address space and the filesystem pages. This removes the overhead of transferring and copying the file's content from OS kernel space to JVM space. The OS uses virtual memory to cache the file's pages, and those pages can be shared with other user-space processes. Java maps the file pages directly to a MappedByteBuffer and processes the file without loading it into the JVM heap.

The diagram below shows how the JVM maps a file through a MappedByteBuffer and processes it without loading the file into the JVM.

A MappedByteBuffer is mapped directly onto an open file in virtual memory by calling the map method of FileChannel. The MappedByteBuffer object works like any other buffer, but its data is stored in a file backed by virtual memory. The get() method on a MappedByteBuffer fetches data from the file and reflects the current file data stored on disk. In a similar way, in a read/write mapping the put() method updates the content of the file, and the modified content becomes visible to other readers of the file. Processing a file through a MappedByteBuffer has a big advantage: no explicit read or write system call is needed for each access, which improves latency. Apart from that, the OS caches the file's memory pages in virtual memory, where the MappedByteBuffer accesses them directly without consuming JVM heap space. The main drawback is that an access incurs a page fault if the requested page is not in memory.

Below are some of the key benefits of MappedByteBuffer:

The JVM works directly on virtual memory, so it avoids explicit read() and write() system calls.
The JVM doesn't load the file into its own memory; it uses virtual memory instead, which makes it possible to process large data efficiently.
The OS takes care of most of the reading and writing through the shared virtual memory, without involving the JVM.
A mapped file can be used by more than one process, coordinated by the locking that the OS provides.
It is also possible to map only a region or part of a file.
The mapped data always corresponds to the file data on disk, without any buffer-to-buffer transfer.

Note: When the JVM touches a mapped page that is not yet in memory, a page fault is generated and the OS brings the file data into the shared memory. The other benefit of memory mapping is that the operating system manages the virtual memory space automatically.

public abstract class FileChannel
        extends AbstractInterruptibleChannel
        implements SeekableByteChannel, GatheringByteChannel, ScatteringByteChannel {

    public abstract MappedByteBuffer map(MapMode mode,
            long position, long size) throws IOException;

    public static class MapMode {
        public static final MapMode READ_ONLY;
        public static final MapMode READ_WRITE;
        public static final MapMode PRIVATE;
    }
}
As the code above shows, the map() method on FileChannel takes the mode, position, and size as arguments and returns a MappedByteBuffer for that part of the file. We can also map the entire file, as below:

buffer =
fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileChannel.size());

If the requested size is larger than the file size, in read/write mode the file grows to match the size of the MappedByteBuffer, whereas in read-only mode an IOException is thrown because the file cannot be modified. There are three mapping modes for the map() method:

READ_ONLY: The mapping can only be read.
READ_WRITE: The mapping can be read and updated, and updates are written to the file.
PRIVATE: Changes made to the MappedByteBuffer are not propagated to the file. They are also not visible to other programs that have mapped the same file; instead, they cause private copies of the modified portions of the buffer to be created.

The map() method throws NonWritableChannelException if the mode is READ_WRITE or PRIVATE but the channel was not opened for both reading and writing. NonReadableChannelException is thrown if the mode is READ_ONLY but the channel was not opened for reading.
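The copy-on-write behaviour of the PRIVATE mode can be sketched like this. The class PrivateMappingSketch, the method privateWrite, and the sample content "hello" are assumptions made for the example, not part of the original text.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class for illustration; not part of the JDK.
final class PrivateMappingSketch {

    // Writes "hello" to the file, modifies a PRIVATE mapping of it, and returns
    // { what the buffer shows, what the file on disk contains }.
    static String[] privateWrite(Path file) throws IOException {
        Files.write(file, "hello".getBytes(StandardCharsets.US_ASCII));
        // PRIVATE requires a channel open for both reading and writing.
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.PRIVATE, 0, channel.size());
            buffer.put(0, (byte) 'J'); // absolute put: changes only our private copy
            byte[] seen = new byte[buffer.capacity()];
            buffer.get(seen);          // relative bulk get, starting at position 0
            String inBuffer = new String(seen, StandardCharsets.US_ASCII);
            String onDisk = new String(Files.readAllBytes(file), StandardCharsets.US_ASCII);
            return new String[] { inBuffer, onDisk };
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("private-map", ".txt");
        file.toFile().deleteOnExit();
        String[] result = privateWrite(file);
        System.out.println(result[0] + " / " + result[1]); // Jello / hello
    }
}
```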

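The sample code below shows how to use MappedByteBuffer:

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MemoryMappedSample {
    public static void main(String[] args) throws Exception {
        int count = 10;
        // Open (or create) the file for reading and writing; try-with-resources closes it.
        try (RandomAccessFile memoryMappedFile = new RandomAccessFile("bigFile.txt", "rw")) {
            // Map the first 'count' bytes; the file grows to 'count' bytes if it is shorter.
            MappedByteBuffer out = memoryMappedFile.getChannel()
                    .map(FileChannel.MapMode.READ_WRITE, 0, count);
            // Relative put() writes at the current position and advances it.
            for (int i = 0; i < count; i++) {
                out.put((byte) 'A');
            }
            // Absolute get(i) reads without moving the position.
            for (int i = 0; i < count; i++) {
                System.out.print((char) out.get(i));
            }
        }
    }
}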

JAVA NIO – Channel

A channel uses ByteBuffers to transfer data between an I/O source or target and the program. We can put data into a buffer and send it to the channel, or the channel can push the data it reads back into a buffer.

Channels provide a direct connection to I/O services, transporting data between byte buffers and files or sockets. A channel sends and receives data through buffers that are compatible with the OS's native byte-oriented I/O, which minimizes the overhead of accessing the OS's filesystem.

A channel represents an open connection to an I/O entity, such as a network socket or a file, for read and write operations. In JDK 1.4 a file channel cannot be opened directly; it is obtained by calling the getChannel() method on a RandomAccessFile, FileInputStream, or FileOutputStream object. The Channel interface provides a close method; once a channel is closed it cannot be reopened, and any further I/O operation on it throws ClosedChannelException.

Below is the Channel interface

public interface Channel extends Closeable {
    public boolean isOpen();
    public void close() throws IOException;
}

Implementations of the Channel interface depend on native calls provided by the operating system. The interface provides isOpen() to check whether the channel is open and close() to close an open channel.
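The open/closed lifecycle can be illustrated with a small sketch. The class ClosedChannelSketch and the method readAfterCloseFails are hypothetical names; the channel is obtained through RandomAccessFile.getChannel().

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class for illustration; not part of the JDK.
final class ClosedChannelSketch {

    // Returns true if reading from a closed channel fails with
    // ClosedChannelException and isOpen() reports false afterwards.
    static boolean readAfterCloseFails(Path file) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r");
        FileChannel channel = raf.getChannel();
        channel.close(); // a closed channel cannot be reopened
        try {
            channel.read(ByteBuffer.allocate(8));
            return false; // not expected: the read should have been rejected
        } catch (ClosedChannelException expected) {
            return !channel.isOpen();
        } finally {
            raf.close(); // harmless if already closed via the channel
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("closed-channel", ".dat");
        try {
            System.out.println(readAfterCloseFails(file)); // true
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```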

The diagram below shows the overall package structure.

The InterruptibleChannel interface marks a channel that can be asynchronously closed and interrupted. If a thread is blocked in an I/O operation on an interruptible channel, another thread may invoke the channel's close method, and the blocked thread then receives an AsynchronousCloseException. If the blocked thread is interrupted instead, the channel is closed and the thread receives a ClosedByInterruptException.

The WritableByteChannel interface, which extends the Channel interface, provides the write(ByteBuffer src) method to write a sequence of bytes to the channel from the given buffer.

The ReadableByteChannel interface has the read(ByteBuffer) method, which reads a sequence of bytes from the channel into the buffer passed as an argument. Only one read operation may be in progress on a channel at a time; if another thread initiates a read meanwhile, it blocks until the first operation is complete.
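A common way to combine these two interfaces is a copy loop that reads into a buffer and writes it back out. The sketch below is one possible version, using in-memory streams wrapped by Channels.newChannel so it runs without any files; the class ChannelCopySketch and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the JDK.
final class ChannelCopySketch {

    // Copies everything from src to dst through a small intermediate buffer.
    static void copy(ReadableByteChannel src, WritableByteChannel dst) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        while (src.read(buffer) != -1) {   // read() returns -1 at end of stream
            buffer.flip();                 // switch the buffer from filling to draining
            while (buffer.hasRemaining()) {
                dst.write(buffer);         // write() may consume only part of the buffer
            }
            buffer.clear();                // make the buffer ready for the next read
        }
    }

    // Convenience wrapper that copies a byte array through channels.
    static byte[] copyBytes(byte[] input) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ReadableByteChannel src = Channels.newChannel(new ByteArrayInputStream(input));
             WritableByteChannel dst = Channels.newChannel(out)) {
            copy(src, dst);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] copied = copyBytes("channels move bytes through buffers"
                .getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(copied, StandardCharsets.US_ASCII));
    }
}
```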

NIO defines channel interfaces whose concrete implementations are supplied through the SPI package and can be replaced by a different provider. A SelectableChannel can be multiplexed with other channels through a Selector. The package java.nio.channels.spi provides base classes for the various channel implementations. For instance, AbstractInterruptibleChannel and AbstractSelectableChannel provide the methods needed by channel implementations that are interruptible or selectable, respectively.

A channel can communicate with an I/O entity in both directions, whereas an InputStream or OutputStream each handles one direction only. A channel can also serve reads and writes from concurrent threads, so a read does not have to wait for the channel to be reopened for writing or vice versa.

Channels Creations

There are two types of channel: file channels and socket channels. Socket channels come in three types: SocketChannel, ServerSocketChannel, and DatagramChannel. A channel can be created in several ways: socket channels have a factory method to create new socket channels, whereas a FileChannel can be created by calling getChannel() on an I/O stream or a RandomAccessFile.

SocketChannel socketChannel = SocketChannel.open();
socketChannel.connect(new InetSocketAddress("host", someport));
ServerSocketChannel serverChannel = ServerSocketChannel.open();
serverChannel.socket().bind(new InetSocketAddress(somelocalport));
DatagramChannel datagramChannel = DatagramChannel.open();
RandomAccessFile randaccfile = new RandomAccessFile("file", "r");
FileChannel fileChannel = randaccfile.getChannel();

Channels can be unidirectional or bidirectional. If a class implements both ReadableByteChannel and WritableByteChannel, it is a bidirectional channel; if it implements only one of them, it is a unidirectional channel.
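One nuance worth illustrating: FileChannel implements both interfaces, yet a FileChannel obtained from a FileInputStream is open for reading only, so a write attempt fails at run time. The sketch below probes this; the class ChannelDirectionSketch and its probe method are hypothetical names.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.NonWritableChannelException;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class for illustration; not part of the JDK.
final class ChannelDirectionSketch {

    // Describes the direction of a FileChannel obtained from a FileInputStream.
    static String probe(Path file) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile());
             FileChannel channel = in.getChannel()) {
            Object c = channel;
            boolean bothInterfaces =
                    c instanceof ReadableByteChannel && c instanceof WritableByteChannel;
            try {
                channel.write(ByteBuffer.wrap(new byte[] {1}));
                return "bidirectional";
            } catch (NonWritableChannelException e) {
                // The type is bidirectional, but this instance was opened for reading only.
                return bothInterfaces ? "both interfaces, read-only in practice" : "read-only";
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("direction", ".dat");
        try {
            System.out.println(probe(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```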

Scatter/Gather

Scatter/gather provides the capability to perform a single I/O operation across multiple buffers. A scattering read separates a single stream of bytes from the channel into multiple buffers, and a gathering write merges multiple buffers into a single stream. For example, on a write the data is gathered from multiple buffers and sent to the channel.

A scatter/gather operation in turn makes native OS calls, where the platform supports them, to fill or drain the buffers directly without first copying the data into an intermediate buffer.

public interface ScatteringByteChannel extends ReadableByteChannel {
    public long read(ByteBuffer[] dsts, int offset, int length) throws IOException;
    public long read(ByteBuffer[] dsts) throws IOException;
}

public interface GatheringByteChannel extends WritableByteChannel {
    public long write(ByteBuffer[] srcs) throws IOException;
}

public interface WritableByteChannel extends Channel {
    public int write(ByteBuffer src) throws IOException;
}
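A gathering write followed by a scattering read might look like the sketch below, which uses a FileChannel because it implements both interfaces. The class ScatterGatherSketch, the roundTrip method, and the header/body split are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class for illustration; not part of the JDK.
final class ScatterGatherSketch {

    // Writes header and body with one gathering write per call, then reads them
    // back into two separate buffers with scattering reads.
    static String[] roundTrip(Path file, String header, String body) throws IOException {
        byte[] h = header.getBytes(StandardCharsets.US_ASCII);
        byte[] b = body.getBytes(StandardCharsets.US_ASCII);
        long total = h.length + b.length;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer[] srcs = { ByteBuffer.wrap(h), ByteBuffer.wrap(b) };
            long written = 0;
            while (written < total) {
                written += channel.write(srcs); // drains srcs[0] first, then srcs[1]
            }
            channel.position(0);
            ByteBuffer[] dsts = { ByteBuffer.allocate(h.length), ByteBuffer.allocate(b.length) };
            long read = 0;
            while (read < total) {
                long n = channel.read(dsts);    // fills dsts[0] first, then dsts[1]
                if (n < 0) {
                    break;                      // unexpected end of file
                }
                read += n;
            }
            return new String[] {
                new String(dsts[0].array(), StandardCharsets.US_ASCII),
                new String(dsts[1].array(), StandardCharsets.US_ASCII)
            };
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("scatter-gather", ".dat");
        try {
            String[] parts = roundTrip(file, "HEAD", "body-bytes");
            System.out.println(parts[0] + " | " + parts[1]); // HEAD | body-bytes
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```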

JAVA NIO – Buffer

Buffer is an abstract class in java.nio that holds a fixed amount of data. It can store data and later retrieve it. A buffer has three key properties:

Capacity: A buffer's capacity is the number of elements it can contain. It is a constant that never changes.
Limit: A buffer's limit is the index of the first element that should not be read or written. It is never greater than the buffer's capacity.
Position: A buffer's position is the index of the next element to be read or written.

There is one subclass of this class for each non-boolean primitive type, e.g. CharBuffer, IntBuffer, etc. Buffer is the abstract base class that these type-specific buffers extend, and it provides the common behavior across the various buffers, as shown below.

A buffer is a linear, finite sequence of elements of a specific primitive type wrapped inside an object. It combines the data content and information about the data into a single object. Data may be transferred into or out of the buffer by the I/O operations of a channel at the current position. An I/O operation reads or writes at the current position and then increments the position by the number of elements transferred. A buffer throws BufferUnderflowException if a get operation is attempted when the position has reached the limit, and BufferOverflowException if a put operation is attempted in the same situation. The position field of the buffer specifies where the next data element is retrieved from or inserted into the buffer. The limit field indicates the end of the usable part of the buffer and can be set using the operation below:

public final Buffer limit(int newLimit)

We can also prepare the buffer for draining by using the flip method:

public final Buffer flip()

The flip() method sets the limit to the current position and the position to 0. The rewind() method is similar to flip() but does not affect the limit; it only sets the position back to 0. You can use rewind() to go back and reread the data in a buffer that has already been flipped.
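The interplay of position, limit, flip(), and rewind() can be traced with a short sketch. The class FlipRewindSketch and its drain and demo methods are hypothetical names used only for illustration.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Hypothetical helper class for illustration; not part of the JDK.
final class FlipRewindSketch {

    // Reads every remaining byte (from position up to limit) as a character.
    static String drain(ByteBuffer buffer) {
        StringBuilder sb = new StringBuilder();
        while (buffer.hasRemaining()) {
            sb.append((char) buffer.get());
        }
        return sb.toString();
    }

    static String demo() {
        ByteBuffer buffer = ByteBuffer.allocate(8);             // position 0, limit 8
        buffer.put((byte) 'N').put((byte) 'I').put((byte) 'O'); // position 3, limit 8
        buffer.flip();                                          // position 0, limit 3
        String first = drain(buffer);                           // position 3, limit 3
        buffer.rewind();                                        // position 0, limit still 3
        String second = drain(buffer);
        String third;
        try {
            buffer.get();                                       // position == limit
            third = "no exception";
        } catch (BufferUnderflowException e) {
            third = "underflow";
        }
        return first + "|" + second + "|" + third;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // NIO|NIO|underflow
    }
}
```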

Byte Buffers:

A ByteBuffer has the same byte-oriented layout as the buffers the OS uses for I/O, so it can be mapped to an OS buffer without any translation. ByteBuffer uses the byte as its core unit for reading and writing I/O data, which makes it more significant than the buffers for the other primitive data types.

Byte buffers can be created either by allocation, which allocates space for the buffer’s content, or by wrapping an existing byte array into a buffer.
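The two creation styles differ in who owns the storage, which the following sketch tries to show: a wrapped buffer shares the caller's array, so changes on either side are visible on the other. The class AllocateWrapSketch and its demo method are hypothetical names.

```java
import java.nio.ByteBuffer;

// Hypothetical helper class for illustration; not part of the JDK.
final class AllocateWrapSketch {

    // Returns { capacity of the allocated buffer, its first byte,
    //           data[0] after a put through the wrapper,
    //           wrapped.get(1) after a direct array write }.
    static int[] demo() {
        ByteBuffer allocated = ByteBuffer.allocate(4); // new zero-filled storage
        byte[] data = {1, 2, 3, 4};
        ByteBuffer wrapped = ByteBuffer.wrap(data);    // shares `data`, no copy is made
        wrapped.put(0, (byte) 9);                      // visible through the array
        data[1] = 7;                                   // visible through the buffer
        return new int[] { allocated.capacity(), allocated.get(0), data[0], wrapped.get(1) };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]); // 4 0 9 7
    }
}
```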

Byte buffers come in two types: direct buffers and indirect (non-direct) buffers.

Direct & Indirect Buffers

The key difference between a direct buffer and an indirect buffer is that native I/O calls can operate on a direct buffer directly, whereas they cannot on an indirect buffer. With a direct buffer, the JVM makes a best effort to perform native I/O operations straight on it to fill or drain the byte buffer. A direct buffer is more costly in memory and allocation time, but on the other side it provides the most efficient I/O mechanism, because the buffer's content is not copied to or from an intermediate buffer.

An indirect buffer can also be used to pass data, but native I/O operations cannot be performed on it directly. An indirect buffer goes through a temporary direct buffer to access file I/O data.

A direct buffer is created by the allocateDirect factory method, which is more expensive than ordinary allocation, and therefore it is advisable to use direct buffers only when they yield a measurable gain in program performance.

Whether a byte buffer is direct or non-direct may be determined by invoking its isDirect() method. This method is provided so that explicit buffer management can be done in performance-critical code.
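A minimal sketch of the two allocation paths and the isDirect() check follows. The class DirectBufferSketch and its methods are hypothetical names; the example only shows that both kinds of buffer expose the same API, not any performance difference.

```java
import java.nio.ByteBuffer;

// Hypothetical helper class for illustration; not part of the JDK.
final class DirectBufferSketch {

    // Returns { heap.isDirect(), direct.isDirect() }.
    static boolean[] kinds() {
        ByteBuffer heap = ByteBuffer.allocate(16);         // backed by a Java byte array
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // memory outside the Java heap
        return new boolean[] { heap.isDirect(), direct.isDirect() };
    }

    // Both kinds are used through the same ByteBuffer methods.
    static int roundTrip(ByteBuffer buffer, int value) {
        buffer.putInt(0, value); // absolute put, position unchanged
        return buffer.getInt(0);
    }

    public static void main(String[] args) {
        boolean[] k = kinds();
        System.out.println(k[0] + " " + k[1]);                              // false true
        System.out.println(roundTrip(ByteBuffer.allocateDirect(16), 42));   // 42
    }
}
```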

JAVA NIO – Introduction

The operating system allocates memory to the JVM to process its tasks. In JDK versions older than 1.4, the JVM uses the FileSystem API to access files on the hard disk, which is quite a burden for the JVM because there is no direct reference between the JVM and a file on the hard disk. The JVM uses operating system calls to access the file. The operating system moves file data in large byte buffers, which are much larger than the byte streams used by the JVM. The JVM spends extra effort converting those byte buffers to byte streams and vice versa.

So there were two key challenges in the old JDK:

It could not access a file directly from disk.
It had to spend extra effort converting file data to a byte stream.

Following are the key steps to access a file:

The operating system allocates memory to the JVM.
The client invokes the FileSystem API to access a specific file from the OS.
The JVM makes an OS system call to access the file data.
The JVM gets the OS's data as a byte buffer and converts it into a byte stream.

For large files the OS uses virtual memory to hold data beyond what fits in RAM. The benefit of virtual memory is that it is sharable across multiple processes, and hence it can be accessed by both the OS and the JVM. Transferring data from the OS to virtual memory can be quite fast by using DMA (Direct Memory Access), whereas transferring data from virtual memory to the JVM is slow because the JVM spends extra effort breaking the large data buffer into a byte stream.

JAVA FileSystem: Java uses the FileSystem API to access physical storage inside the system. When a client tries to access a particular file via the FileSystem, the FileSystem identifies the storage location and loads those disk sectors into memory.

A file's data is stored in multiple pages, and each page contains a group of blocks. The kernel establishes a mapping between memory pages and filesystem pages.

Virtual memory reads the paged content from disk and uses page faults to synchronize the file data into virtual memory. Once the page-ins have completed, the FileSystem reads the file contents and its meta information.

JAVA NIO

Java NIO, which was introduced in JDK 1.4 and has kept being enhanced in newer versions, improves I/O operations. It provides new types of buffers, such as ByteBuffer, CharBuffer, and IntBuffer, that reduce the overhead of transferring data from the OS to the JVM. Java NIO is able to map directly from virtual memory to the JVM by using the new ByteBuffer. A Java ByteBuffer has the same layout as an OS byte buffer, which is why an OS byte buffer can easily be mapped to a JVM byte buffer, with less overhead for data conversion from the OS to the JVM.

As per the above diagram, if the OS uses virtual memory and the JVM uses the new NIO interface, performance improves when processing files, especially large files. We will also discuss later the MappedByteBuffer, which is capable of processing a file directly in virtual memory without transferring data from virtual memory to the JVM.

Java Interview Reference Guide – Collection Framework

A Collection represents a group of objects known as elements. The Collection framework is basically a group of interfaces with some specific implementations to manage the elements of a collection.