Project author: clsync

Project description:
File live sync daemon based on inotify/kqueue/bsm (Linux, FreeBSD), written in GNU C
Primary language: C
Project URL: git://github.com/clsync/clsync.git
Created: 2013-07-18T08:22:29Z
Project community: https://github.com/clsync/clsync

License: Other


clsync

0 - Contents

  1. Name
  2. Motivation
  3. inotify vs fanotify
  4. Installing
  5. How to use
  6. An example from scratch
  7. More examples (use cases)
  8. Clustering
  9. FreeBSD support
  10. Support
  11. Developing
  12. Articles
  13. See also

1 - Name

Why clsync? The utility was first named insync (after inotify), but then it
was suggested that I use fanotify instead of inotify, and the utility was
renamed to fasync. Then I started writing the program intensively and
ran into some problems with fanotify (see “inotify vs fanotify”). So I
had to fall back to inotify temporarily, and I decided that the best name would
be “Runtime Sync” or “Live Sync”, but rtsync is the name of some corporation and
lsync is already taken by “lsyncd”. So I called it
clsync, which should be read as “lsync, but in C”, in contrast to “lsyncd”, which
is written in Lua and may be used for similar purposes.

2 - Motivation

This utility was written for two purposes:

  • building high-availability (HA) clusters
  • making backups of them

To build an HA cluster I tried a lot of different solutions, such as “simple
rsync by cron”, “glusterfs”, “ocfs2 over drbd”, “shared replicated external
storage”, “incron + perl + rsync”, “inosync”, “lsyncd” and so on. When I
started writing the utility we were using “lsyncd”, “ceph” and
“ocfs2 over drbd”. However, none of these solutions satisfied me, so I
had to write my own utility for this purpose.

For backups we also tried a lot of different solutions, and again I had
to write my own utility for this purpose.

The best-known (to me) alternative to this utility is “lsyncd”, however:

  • Its code is written in Lua, which causes a number of problems,
    for example:
    • The code is harder for an ordinary sysadmin to maintain.
    • It really does eat 100% CPU sometimes.
    • It requires Lua libraries that cannot be easily installed on a few
      of our systems.
  • It’s a little buggy (it crashed in our use cases).
  • Sometimes its configuration is too complex for our situation (not flexible
    enough).
  • It doesn’t support kqueue/bsm (we also had a FreeBSD-based system).
  • And so on.

Long story short: “lsyncd” is a good and useful utility, it just did not
fit our needs well enough. We spent so much time tuning
“lsyncd” that we realized we could already have written a new tool
specialized for our tasks. So here it is :)

clsync has also been used for some other small tasks, such as replacing
incron/csync2/etc. in our HPC clusters for syncing the /etc/{passwd,shadow,group,shells}
files and running post-scripts.

3 - inotify vs fanotify

It’s said that fanotify is much better than inotify, so I started writing
this program using fanotify. However, I ran into the problem that, at the
time of writing, fanotify was unable to catch some important events,
such as “directory creation” or “file deletion”. So I switched to
“inotify”, leaving the “fanotify” code in place but unused. So, don’t use
“fanotify” in this utility ;).

UPD: Starting with kernel 5.1 it becomes possible to use fanotify for all events ;)
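
To see on which side of the 5.1 boundary a given machine falls, something like the following could be used. This is only a sketch: the function name kernel_at_least is made up for illustration, it checks nothing but the version string, and it says nothing about whether clsync's fanotify code is usable.

```shell
#!/bin/sh
# Succeed if version string "$1" (e.g. the output of `uname -r`) is at least
# major "$2", minor "$3". Hypothetical helper, not part of clsync.
# Assumes the version string contains at least "MAJOR.MINOR".
kernel_at_least() {
    ver=${1%%[!0-9.]*}     # drop suffixes such as "-8-amd64"
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if kernel_at_least "$(uname -r)" 5 1; then
    echo "kernel is 5.1 or newer"
else
    echo "kernel is older than 5.1"
fi
```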

4 - Installing

Linux Distributions

Some distributions already ship clsync in their main repository:

Debian/Ubuntu:

    apt-get install clsync

An optional clsync socket monitoring and control library is available
in the libclsync0 package, and its development files are in the
libclsync-dev package.

Gentoo:

    emerge clsync

You may customize all clsync features via a multitude of USE flags.

Alt Linux:

    apt-get install clsync

An optional clsync socket monitoring and control library is available
in the libclsync package, and its development files are in the
libclsync-devel package. Examples are located in the clsync-examples
package and the doxygen API documentation is in clsync-apidocs.

From the Source Code

If you need to install clsync from source, first install the dependencies
required to compile it. Package names vary between distributions, but
you’ll get the idea.

Only the following packages are mandatory:
glib2-devel autoreconf gcc

Dependencies for optional features:

  • libcap-devel — capabilities support for privilege separation
  • libcgroup-devel — cgroups support for privilege separation
  • libmhash-devel — use mhash for faster Adler-32 implementation
    (used only in cluster and kqueue code)
  • doxygen — to build API documentation
  • graphviz — to build API documentation

The next step is generating the Makefile. Usually it’s enough to run:

    autoreconf -i && ./configure

You may be interested in the various configuration options; for details
see:

    ./configure --help

The next step is compiling. Usually it’s enough to run:

    make -j$(nproc)

The next step is installing. Usually it’s enough to run:

    su -c 'make install'

Portable binary

It is also possible to build a portable static binary:

    ./configure --without-libcgroup --without-gio --disable-shared
    make clean all -j$(nproc) LDFLAGS='-all-static'
    ldd ./clsync

5 - How to use

Usage is described in the man page ;). For anything not covered there, you
can ask me personally (see “Support”).

See also section 7 of this document.

6 - An example from scratch

A usage example that works on my PC is in the “examples” directory. Just run
“clsync-start-rsyncdirect.sh” and try to create/modify/delete files/dirs in
“examples/testdir/from”. All modifications should appear (with some delay) in
the directory “examples/testdir/to” ;)

For dummies:

    pushd /tmp
    git clone https://github.com/clsync/clsync
    cd clsync
    autoreconf -fi
    ./configure
    make
    export PATH_OLD="$PATH"
    export PATH="$(pwd):$PATH"
    cd examples
    ./clsync-start-rsyncdirect.sh
    export PATH="$PATH_OLD"

Now you can try making changes in the directory
“/tmp/clsync/examples/testdir/from” (in another terminal).
Wait about 7 seconds after the changes and check the directory
“/tmp/clsync/examples/testdir/to”. To finish the experiment, press ^C
(control+c) in clsync’s terminal, then clean up:

    cd ../..
    rm -rf clsync
    popd

Note: There’s no need to change the value of PATH if clsync is installed
system-wide, e.g. with

    make install

For dummies, again (with “make install”):

    pushd /tmp
    git clone https://github.com/clsync/clsync
    cd clsync
    autoreconf -fi
    ./configure
    make
    sudo make install
    cd examples
    ./clsync-start-rsyncdirect.sh

The directory “/tmp/clsync/examples/testdir/from” is now synced to
“/tmp/clsync/examples/testdir/to” with a delay of 7 seconds. To terminate
clsync, press ^C (control+c) in clsync’s terminal, then clean up:

    cd ..
    sudo make uninstall
    cd ..
    rm -rf clsync
    popd

For complete dummies and/or lazy users, there’s a video demonstration:
http://ut.mephi.ru/oss/clsync
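
Instead of waiting and comparing the two directories by eye, the result of either walkthrough above could be checked from another terminal with a small polling helper. This is only a sketch: the function name wait_for_sync and the 15-second default are made up for illustration, and it assumes the example mirrors the whole tree, so that `diff -r` finds no differences once the sync is done.

```shell
#!/bin/sh
# Poll until two directory trees have identical contents (per `diff -r`),
# giving up after "$3" seconds (default 15, an arbitrary choice).
# Hypothetical helper, not part of clsync.
wait_for_sync() {
    src=$1; dst=$2; remaining=${3:-15}
    until diff -r "$src" "$dst" >/dev/null 2>&1; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}

# Usage, while the example from this section is running:
#   wait_for_sync /tmp/clsync/examples/testdir/from \
#                 /tmp/clsync/examples/testdir/to \
#       && echo "in sync" || echo "still different"
```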

7 - More examples (use cases)

Mirroring a directory:

    clsync -Mrsyncdirect -W/path/to/source_dir -D/path/to/destination_dir

Syncing authorized_keys files:

    mkdir -p /etc/clsync/rules
    printf "+w^$\n+w^[^/]+$\n+W^[^/]+/.ssh$\n+f^[^/]+/.ssh/authorized_keys$\n-*" > /etc/clsync/rules/authorized_files_only
    clsync -Mdirect -Scp -W/mnt/master/home/ -D/home -R/etc/clsync/rules/authorized_files_only -- -Pfp --parents %INCLUDE-LIST% %destination-dir%

Mirroring a directory, but faster:

    clsync -w5 -t5 -T5 -Mrsyncdirect -W/path/to/source_dir -D/path/to/destination_dir

Instant mirroring of a directory:

    clsync -w0 -t0 -T0 -Mrsyncdirect -W/path/to/source_dir -D/path/to/destination_dir

Keeping two directories in sync with each other:

    clsync -Mrsyncdirect --background -z /var/run/clsync0.pid --output syslog -W/path/to/dir1 -D/path/to/dir2 --modification-signature '*'
    clsync -Mrsyncdirect --background -z /var/run/clsync1.pid --output syslog -W/path/to/dir2 -D/path/to/dir1 --modification-signature '*'

Fixing privileges of a web-site:

    clsync -w3 -t3 -T3 -x1 -W/var/www/site.example.org/root -Mdirect -Schown --uid 0 --gid 0 -Ysyslog -b1 --modification-signature uid,gid -- --from=root www-data:www-data %INCLUDE-LIST%

‘Atomic’ sync:

    clsync --exit-on-no-events --max-iterations=20 --mode=rsyncdirect -W/var/www_new -Srsync -- %RSYNC-ARGS% /var/www_new/ /var/www/

Moving a web-server:

    clsync --exit-on-no-events --max-iterations=20 --pre-exit-hook=/root/stop-here.sh --exit-hook=/root/start-there.sh --mode=rsyncdirect --ignore-exitcode=23,24 --retries=3 -W /var/www -S rsync -- %RSYNC-ARGS% /var/www/ rsync://clsync@another-host/var/www/

Copying files to slave-nodes using pdcp(1):

    clsync -Msimple -S pdcp -W /opt/global -b -Y syslog -- -a %INCLUDE-LIST% %INCLUDE-LIST%

Copying files to slave-nodes using uftp(1):

    clsync -Mdirect -S uftp -W/opt/global --background=1 --output=syslog -- -M 248.225.233.1 %INCLUDE-LIST%

A dry run to see the rsync(1) arguments that clsync will use:

    clsync -Mrsyncdirect -S echo -W/path/to/source_dir -D/path/to/destination_dir

Another dry run, to see how clsync will call pdcp(1):

    clsync -Msimple -S echo -W /opt/global -b0 -- pdcp -a %INCLUDE-LIST% %INCLUDE-LIST%

Automatically running “make build” whenever any *.c file changes:

    printf "%s\n" '+f\.c$' '-f' | clsync --have-recursive-sync -W . -R /dev/stdin -Mdirect -r1 --ignore-failures -t1 -w1 -Smake -- build

8 - Clustering

I started implementing bi-directional syncing that uses
multicast notifications to the other nodes. However, it turned into a long
task, so it was postponed to later releases.

Still, let’s solve the following hypothetical problem. Suppose you’re using
LXC and trying to replicate containers between two servers (for failover
and load balancing).

In this case you have to sync containers in both directions. However, if you
just run clsync on both nodes to sync containers to the neighboring node, you’ll
get a sync loop [file update on A causes file update on B causes file update
on A causes …].

In this situation my colleagues and I used a separate directory for
every node of the cluster (e.g. “/srv/nodes/<NODE NAME>/containers/<CONTAINERS>”)
and synced every directory in one direction only. That gave us failover with
load balancing, but it was very inconvenient. So I started writing code for
bi-directional syncing, but there has been no time to complete it :(. Then
Andrew Savchenko proposed running one clsync instance per container, and this is
a really good solution: you just need to start a clsync process when a container
starts and stop the process when the container stops. The only problem is
split-brain, which can be solved in two ways:

  • by a human every time;
  • by a script that chooses which variant of the container to keep.

An example of such a script is simply one that calls “find” on both sides to
determine which side has the latest changes :)
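
A minimal sketch of such a “find”-based script follows. The function names are made up for illustration, GNU find (for -printf) is assumed, and both copies of the container are assumed to be reachable as local paths (for a remote node, newest_mtime would have to be run there, e.g. over ssh). Note that it only looks at file modification times, so deletions are not taken into account.

```shell
#!/bin/sh
# Print the newest file modification time (seconds since the epoch) under "$1".
# Relies on GNU find's -printf; prints nothing if the tree has no regular files.
newest_mtime() {
    find "$1" -type f -printf '%T@\n' | sort -n | tail -n 1
}

# Print whichever of the two trees "$1" and "$2" holds the most recently
# modified file; ties go to "$1". Hypothetical helper, not part of clsync.
pick_newest_side() {
    a=$(newest_mtime "$1")
    b=$(newest_mtime "$2")
    if awk -v a="${a:-0}" -v b="${b:-0}" 'BEGIN { exit !(a >= b) }'; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$2"
    fi
}

# Usage (paths are placeholders):
#   keep=$(pick_newest_side /path/to/variant_a /path/to/variant_b)
```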

UPD: I’ve added the option “--modification-signature”, which helps to avoid syncing a file that has not changed. You can easily use it to prevent sync loops in bi-directional syncing.

9 - FreeBSD support

clsync has been ported to FreeBSD.

FreeBSD doesn’t support inotify, so there are 3.5 ways to use clsync on it:

  • using libinotify;
  • using the BSM API (with or without a prefetcher thread);
  • using kqueue/kevent directly.

Each of these methods is bad in its own way; see the excerpt from the
manpage:

    Possible values:

        inotify
            inotify(7) [Linux, (FreeBSD via libinotify)]

            Native, fast, reliable and well tested Linux FS monitor subsystem.

            There's no essential performance profit to use "inotify" instead of
            "kevent" on FreeBSD using "libinotify". It backends to "kevent"
            anyway.

            FreeBSD users: The libinotify on FreeBSD is still not ready and
            unusable for clsync to sync a lot of files and directories.

        kqueue
            kqueue(2) [FreeBSD, (Linux via libkqueue)]

            A *BSD kernel event notification mechanism (inc. timer, sockets,
            files etc).

            This monitor subsystem cannot determine file creation event, but it
            can determine a directory where something happened. So clsync is have
            to rescan whole dir every time on any content change. Moreover,
            kqueue requires an open() on every watched file/dir. But FreeBSD
            doesn't allow to open() symlink itself (without following) and it's
            highly invasively to open() pipes and devices. So clsync just won't
            call open() on everything except regular files and directories.
            Consequently, clsync cannot determine if something changed in
            symlink/pipe/socket and so on. However it still can determine if it
            will be created or deleted by watching the parent directory and
            rescanning it on every appropriate event.

            Also this API requires to open every monitored file and directory. So
            it may produce a huge amount of file descriptors. Be sure that
            kern.maxfiles is big enough (in FreeBSD).

            CPU/HDD expensive way.

            Not well tested. Use with caution!

            Linux users: The libkqueue on Linux is not working. He-he :)

        bsm
            bsm(3) [FreeBSD]

            Basic Security Module (BSM) Audit API.

            This is not a FS monitor subsystem, actually. It's just an API to
            access to audit information (inc. logs). clsync can setup audit to
            watch FS events and report it into log. After that clsync will just
            parse the log via auditpipe(4) [FreeBSD].

            Reliable, but hacky way. It requires global audit reconfiguration
            that may hopple audit analysis.

            Warning! FreeBSD has a limit for queued events. In default FreeBSD
            kernel it's only 1024 events. So choose one of:
            - To patch the kernel to increase the limit.
            - Don't use clsync on systems with too many file events.
            - Use bsm_prefetch mode (but there's no guarantee in this case
              anyway).

            See also option --exit-on-sync-skip.

            Not well tested. Use with caution! Also file
            /etc/security/audit_control will be overwritten with:

                #clsync
                dir:/var/audit
                flags:fc,fd,fw,fm,cl
                minfree:0
                naflags:fc,fd,fw,fm,cl
                policy:cnt
                filesz:1M

            unless it's already starts with "#clsync\n" ("\n" is a new line
            character).

        bsm_prefetch
            The same as bsm but all BSM events will be prefetched by an
            additional thread to prevent BSM queue overflow. This may utilize a
            lot of memory on systems with a high FS events frequency.

            However the thread may be not fast enough to unload the kernel BSM
            queue. So it may overflow anyway.

    The default value on Linux is "inotify". The default value on FreeBSD is "kqueue".
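
Two of the warnings in the excerpt above can be checked before starting clsync on FreeBSD: how many descriptors the kqueue monitor is likely to need, and whether /etc/security/audit_control already carries the “#clsync” marker. The following is only a sketch; both function names and the backup file name in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Count the objects the kqueue monitor would have to open() under "$1".
# Per the manpage excerpt, that is regular files and directories only.
# Hypothetical helper, not part of clsync.
count_kqueue_candidates() {
    find "$1" \( -type f -o -type d \) -print | wc -l | tr -d ' '
}

# Succeed if file "$1" begins with the "#clsync" marker line, in which case
# the bsm monitor is documented to leave it alone.
audit_control_has_marker() {
    [ "$(head -n 1 "$1" 2>/dev/null)" = "#clsync" ]
}

# Usage on FreeBSD (paths are placeholders):
#   needed=$(count_kqueue_candidates /path/to/source_dir)
#   limit=$(sysctl -n kern.maxfiles)
#   [ "$needed" -lt "$limit" ] || echo "kern.maxfiles may be too small" >&2
#   audit_control_has_marker /etc/security/audit_control \
#       || cp -p /etc/security/audit_control /etc/security/audit_control.orig
```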

I hope you will send me bug reports so that I can improve the FreeBSD support :)

10 - Support

To get support, you can contact me in the following ways:

  • Official IRC channel of “clsync”: irc.freenode.net#clsync
  • E-mail: xaionaro@gmail.com

11 - Developing

I have started writing the “DEVELOPING” and “PROTOCOL” files.
You can look there if you wish. ;)

I’ll be glad to receive code contributions :)

The astyle command:

    astyle --style=linux --indent=tab --indent-cases --indent-switches --indent-preproc-define --break-blocks --pad-oper --pad-paren --delete-empty-lines

12 - Articles

Russian:

LVEE (Russian):

13 - See also