This translator reduces the amount of data transferred over the wire by compressing (deflating) it before it is written to the network. The compressed data is decompressed (inflated) on the client side. The translator must therefore be loaded on the client as well as the server, with inverse operation modes.
Compression and decompression are referred to as deflate and inflate, respectively, in the rest of this document.
Data transferred over the network as a result of a file read operation is deflated: the translator implements a readv fop that deflates the data before passing it to the server protocol. On the client side, the deflated data is inflated after the client protocol hands the data read from the network over to the translator.
It may be worthwhile to deflate data for the writev fop too. In that case the operation modes would need to be flipped, as the client would deflate and the server would inflate the data (for this fop).
Deflated data is transferred in Gzip format. The Zlib library is used to deflate and inflate data and to compute data checksums.
Gzip Format: <Gzip header> + <compressed data> + <Gzip trailer>
The header is 10 bytes long and consists of:
'\037', '\213', Z_DEFLATED, 0, 0, 0, 0, 0, 0, 0x03
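The byte sequence above maps onto the fixed fields of a gzip member header. The following is a minimal sketch in Python (whose zlib module wraps the same Zlib library; the translator itself is written in C). The field names in the comments come from the gzip format description and are not taken from the translator source.

```python
import zlib

# Fixed 10-byte gzip header, matching the byte sequence listed above.
GZIP_HEADER = bytes([
    0o037,          # ID1: first magic byte (0x1f)
    0o213,          # ID2: second magic byte (0x8b)
    zlib.DEFLATED,  # CM: compression method, Z_DEFLATED (8)
    0,              # FLG: no optional header fields
    0, 0, 0, 0,     # MTIME: modification time, left unset
    0,              # XFL: no extra flags
    0x03,           # OS: 3 denotes Unix
])

print(len(GZIP_HEADER), GZIP_HEADER.hex())
```

Because the flags byte is zero, no optional fields (file name, comment, extra data) follow the header, so the compressed data starts right at byte 10.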
As of now, this header is transferred only as a debugging aid. Deflated data held in memory can be written to a file on disk and examined to check that deflation was correct. (See Debugging later in this document.)
Deflated data is identified on the client side by the presence of a key:value pair in a dictionary, not by checking bits in the Gzip header. Commit 9d3af... introduces an extra dict in all fops that can be exchanged between all layers. The presence of a chosen key indicates deflated content.
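The idea of marking deflated content out of band can be sketched as follows. This is an illustration only: the key name `example-deflated-marker` and the `send`/`receive` helpers are hypothetical and do not come from the translator source, and a plain Python dict stands in for the dict carried by the fop.

```python
import gzip

# Hypothetical key name, used only for illustration; the real key is
# defined in the translator source.
DEFLATED_KEY = "example-deflated-marker"

def send(data: bytes, compress: bool):
    """Return (payload, xdata); mark the payload in xdata if it was deflated."""
    xdata = {}
    if compress:
        data = gzip.compress(data)
        xdata[DEFLATED_KEY] = "1"
    return data, xdata

def receive(payload: bytes, xdata: dict) -> bytes:
    """Inflate only when the marker key is present; never sniff the payload."""
    if DEFLATED_KEY in xdata:
        return gzip.decompress(payload)
    return payload  # not marked: hand the bytes through untouched
```

One consequence of this design is that a file whose contents happen to begin with the gzip magic bytes is passed through unchanged when it was not deflated by the sender, since the receiver never inspects the payload to decide.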
Deflation is handled by the deflate() routine in the Zlib library; interested readers may want to look at it. Similarly, inflation is handled by the inflate() routine in the Zlib library.
Both APIs need correct pointers (in Zlib's stream structure) to the input and output buffers, along with the lengths of those buffers.
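The buffer-at-a-time use of deflate() and inflate() can be sketched in Python, where `compressobj` and `decompressobj` hide the stream structure and its buffer pointers. The sketch assumes a raw deflate stream (`wbits = -15`) and a 16384-byte chunk size mirroring the `buffer-size` option; both are choices made for the illustration, and the function names are hypothetical.

```python
import zlib

BUFFER_SIZE = 16384  # mirrors the buffer-size option shown in the volfiles

def deflate_chunks(data: bytes, level: int = -1) -> bytes:
    # Negative wbits selects a raw deflate stream (no zlib or gzip wrapper).
    comp = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
    out = []
    for off in range(0, len(data), BUFFER_SIZE):
        # Feed one input buffer at a time; output may lag behind the input.
        out.append(comp.compress(data[off:off + BUFFER_SIZE]))
    out.append(comp.flush())  # finish the stream and drain pending output
    return b"".join(out)

def inflate_chunks(blob: bytes) -> bytes:
    decomp = zlib.decompressobj(-zlib.MAX_WBITS)
    out = []
    for off in range(0, len(blob), BUFFER_SIZE):
        out.append(decomp.decompress(blob[off:off + BUFFER_SIZE]))
    out.append(decomp.flush())
    return b"".join(out)

data = b"gluster" * 10000
assert inflate_chunks(deflate_chunks(data)) == data
```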
The trailer is 8 bytes long: the first 4 bytes hold the checksum of the original data and the next 4 bytes hold its length. It is primarily used to validate the correctness of the inflated data on the client side. gzip also uses the trailer for the same purpose.
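Putting header, compressed data, and trailer together can be sketched as below. The sketch assumes the checksum is the CRC-32 that the gzip format specifies and that both trailer fields are little-endian, as in that format; `gzip_frame` and `check_trailer` are hypothetical names, not functions from the translator.

```python
import gzip
import struct
import zlib

# 10-byte header: magic bytes, Z_DEFLATED, zeroed flags/mtime/xfl, OS = Unix.
HEADER = b"\x1f\x8b" + bytes([zlib.DEFLATED, 0, 0, 0, 0, 0, 0, 0x03])

def gzip_frame(data: bytes) -> bytes:
    """Build <header> + <raw deflate data> + <trailer> by hand."""
    comp = zlib.compressobj(-1, zlib.DEFLATED, -zlib.MAX_WBITS)
    body = comp.compress(data) + comp.flush()
    # Trailer: CRC-32 of the original data, then its length modulo 2**32.
    trailer = struct.pack("<II", zlib.crc32(data), len(data) & 0xFFFFFFFF)
    return HEADER + body + trailer

def check_trailer(frame: bytes, inflated: bytes) -> bool:
    """Validate inflated data against the last 8 bytes of the frame."""
    crc, size = struct.unpack("<II", frame[-8:])
    return crc == zlib.crc32(inflated) and size == (len(inflated) & 0xFFFFFFFF)

data = b"hello gluster\n" * 100
frame = gzip_frame(data)
# The standard gzip decoder accepts the hand-built frame.
assert gzip.decompress(frame) == data
assert check_trailer(frame, data)
```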
Load the translator above server protocol with the operation mode set to compress. Other translators have been stripped out for brevity. (Translator options are described later in this document.)
volume colon-d-posix
type storage/posix
option directory /d0
option volume-id 928515dd-fc50-4612-a87a-7440cb87c258
end-volume
volume colon-d-cdc
type features/cdc
option mode compress
option buffer-size 16384
subvolumes colon-d-posix
end-volume
volume /d0
type debug/io-stats
option latency-measurement off
option count-fop-hits off
subvolumes colon-d-cdc
end-volume
volume colon-d-server
type protocol/server
option transport-type tcp
option auth.addr./d0.allow *
subvolumes /d0
end-volume
Load the translator below dht with the operation mode set to decompress. Note that loading it directly below client protocol (between protocol/client and dht) makes more sense, as the dht readv callback should see the actual data. Try it out if you want that.
volume colon-d-client-0
type protocol/client
option remote-host 192.168.1.75
option remote-subvolume /d0
option transport-type tcp
end-volume
volume colon-d-client-1
type protocol/client
option remote-host 192.168.1.75
option remote-subvolume /d1
option transport-type tcp
end-volume
volume colon-d-dht
type cluster/distribute
subvolumes colon-d-client-0 colon-d-client-1
end-volume
volume colon-d-cdc
type features/cdc
option mode decompress
option buffer-size 16384
subvolumes colon-d-dht
end-volume
volume colon-d
type debug/io-stats
option latency-measurement on
option count-fop-hits on
subvolumes colon-d-cdc
end-volume
Some of the translator options correspond directly to Zlib options. The most relevant options are:
buffer-size
: Internal buffer size used by Zlib (best is 16K).
cdc-level
: The compression level. Ranges from 0 (no compression) and 1 (best speed, weaker compression) up to 9 (best compression, slowest speed). Defaults to -1, which provides a good compromise between compression and speed.
mode
: compress or decompress, depending on where the translator is loaded.
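The effect of the compression level can be explored with a short script. This sketch calls Zlib through Python's zlib module rather than through the translator, and the sample data is made up; actual ratios depend heavily on the file contents.

```python
import zlib

# Made-up, highly repetitive sample; real files will compress differently.
data = b"GlusterFS network compression translator\n" * 2000

for level in (0, 1, 9, zlib.Z_DEFAULT_COMPRESSION):
    deflated = zlib.compress(data, level)
    # Every level must round-trip to the original bytes.
    assert zlib.decompress(deflated) == data
    print(f"level {level:2d}: {len(data)} -> {len(deflated)} bytes")
```

Level 0 stores the data uncompressed and adds framing overhead, so its output is slightly larger than the input.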
Gzip data is hard to inspect by eye, since the raw bytes make little sense. One way to check whether the data was deflated properly is to attach gdb to one of the brick processes and follow the steps outlined below:
gdb -p <pidof-glusterfsd>
(gdb) break cdc_readv_cbk
Breakpoint 1 at 0x7ff918b54832: file cdc.c, line 566.
(gdb) c
Continuing.
At this point, cat a file on the Gluster mount.
Breakpoint 1, cdc_readv_cbk (frame=0x7ff91b86e0d8, cookie=0x7ff91b86e184, this=0x67e5b0, op_ret=14, op_errno=2, vector=0x7fffd70231d0, count=1, buf=0x7fffd7023150, iobref=0x678d60, xdata=0x0) at cdc.c:566
(gdb) # 'n' a number of times till you reach
587         ret = cdc_compress (this, priv, &ci);
(gdb) n
588         if (ret)
(gdb) call cdc_dump_iovec_to_disk (this, &ci, "/tmp/file.gz")
At this point the compressed data will have been written to /tmp/file.gz. Running gzip -d /tmp/file.gz inflates the file and produces the actual data.
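As an alternative to gzip -d, the dumped file can be checked programmatically. The sketch below assumes the dump is a single gzip member with nothing after the trailer; `validate_gzip_dump` is a hypothetical helper, and a temporary file stands in for /tmp/file.gz so the example runs without a brick process.

```python
import gzip
import os
import struct
import tempfile
import zlib

def validate_gzip_dump(path):
    """Inflate a gzip dump and compare it against its 8-byte trailer."""
    with open(path, "rb") as f:
        blob = f.read()
    if blob[:2] != b"\x1f\x8b":
        raise ValueError("missing gzip magic bytes")
    # wbits = 16 + MAX_WBITS makes zlib parse the gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    data = decomp.decompress(blob) + decomp.flush()
    crc, size = struct.unpack("<II", blob[-8:])
    ok = crc == zlib.crc32(data) and size == (len(data) & 0xFFFFFFFF)
    return ok, data

# Stand-in for /tmp/file.gz, written with the standard gzip encoder.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "file.gz")
    with open(sample, "wb") as f:
        f.write(gzip.compress(b"contents of the file that was read\n"))
    ok, data = validate_gzip_dump(sample)

print(ok, len(data))
```

To check a real dump, call `validate_gzip_dump("/tmp/file.gz")` and compare the returned data with the file that was read on the mount.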