Bean Counting
… rsync, integrit, aide – all these tools monitor the system's directory tree and issue an alarm as soon as they detect unauthorized changes.
Data on storage media does not last forever. It can be lost through natural aging and defects of the storage medium, but also through mistakes or even intrusions into the system. Checking whether the data is intact and whether it has changed is therefore part of every responsible system administrator's job.
To prevent write access to a storage medium or directory, you can use a read-only medium like a DVD or activate write protection, for example on an SD card. Experienced users often mount selected directories as read-only or set their write protection to on (see the "Mount Filesystems as Read-Only" box).
Mount Filesystems as Read-Only
The file /etc/fstab lists all partitions the system mounts into the directory tree. In the fourth column of each row, the option ro (read-only) or rw (read-write) defines whether the partition can only be read or also written to (Listing 1).
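As a minimal sketch of how the fourth column can be evaluated, the following script prints the mount point of every entry that carries the ro option. The function name list_readonly and the sample entries are made up for illustration; the script reads a temporary file rather than the real /etc/fstab.

```shell
#!/bin/bash
# Sketch: print the mount point of every fstab-style entry whose options
# (fourth column) contain "ro". list_readonly is a hypothetical helper name.
list_readonly() {
    awk '
        /^[[:space:]]*#/ || NF < 4 { next }   # skip comments and incomplete lines
        {
            n = split($4, opts, ",")          # options are comma-separated
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") { print $2; break }
        }
    ' "${1:-/etc/fstab}"
}

# Usage with made-up sample entries instead of the real /etc/fstab
sample=$(mktemp)
cat > "$sample" <<'EOF'
# <device> <mount point> <type> <options> <dump> <pass>
/dev/sda1  /      ext4  errors=remount-ro  0 1
/dev/sdb1  /data  ext4  ro,noatime         0 0
/dev/sdc1  /home  ext4  rw                 0 2
EOF
list_readonly "$sample"
rm -f "$sample"
```

The options are split at the commas on purpose, so that an option such as errors=remount-ro is not mistaken for ro.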
Another option is to use the immutable flag [1]. This flag is part of a directory entry's advanced attributes, which only a few Linux users know and even fewer use in practice. If File-based Access Control Lists (FACLs) come into play on top of that, for example in the context of SELinux [2] in distributions like Red Hat, Fedora, or CentOS, one can define access rights even more precisely.
With the immutable flag set, the directory entry cannot be changed any more: The file or directory is write protected, and the operating system denies every attempt to modify the data. Only the root user can set and remove this flag. Set it with chattr +i <file> and remove it with chattr -i <file> . The i in the fourth line of Listing 2 shows that the file example.txt carries the immutable flag.
Listing 1
fstab
    $ grep data /etc/fstab
    /dev/sdb1 /data ext4 ro 0 0
Listing 2
Attributes of example.txt
    # touch example.txt
    # chattr +i example.txt
    # lsattr example.txt
    ----i--------e-- example.txt
    # echo "# Comment" >> example.txt
    bash: example.txt: Operation not permitted
    # chattr -i example.txt
A change to the data pool can affect either its content, through additions and deletions, or its access privileges. Possible modifications also include adding, renaming, moving, and deleting files, directories, and (symbolic) links. As a system administrator, you want to know when each modification happened, which user made it, and, in case of errors, how to repair the damage.
Detecting modifications beforehand including an appropriate reaction to such an incident goes beyond the scope of this article, so I will focus on how you can detect such changes retroactively, after they happen. In the examples below, you'll see the steps to follow with rsync and integrit .
In principle, the procedures can be used with other tools as well (e.g., Tripwire , Aide , and Iwatch ). However, the configuration and the evaluation of the results differ from tool to tool.
If you have two data pools, for example an original and a backup, you can use the standard tool rsync [3] to detect differences between the two datasets. Rsync was originally designed to synchronize two directories, and it reports on the terminal which entries differ from each other.
Listing 3 demonstrates this with the two directories original/ and copy/ . Each contains three initially identical files. But in the copy/ directory, I have modified data. While alright.txt remains unchanged, I set the execution bit for the group for anything.txt and added additional content to somewhat.txt .
Listing 3
Using rsync to find changes
    $ ls -la {original,copy}
    copy:
    total 16
    drwxr-xr-x 2 frank frank 4096 Jun 1 14:28 .
    drwxr-xr-x 4 frank frank 4096 Jun 1 14:25 ..
    -rw-r-xr-- 1 frank frank 15 Jun 1 16:36 alright.txt
    -rw-r-xr-- 1 frank frank 10 Jun 1 14:28 anything.txt
    -rw-r--r-- 1 frank frank 24 Jun 1 14:30 somewhat.txt
    original:
    total 16
    drwxr-xr-x 2 frank frank 4096 Jun 1 14:26 .
    drwxr-xr-x 4 frank frank 4096 Jun 1 14:25 ..
    -rw-r-xr-- 1 frank frank 15 Jun 1 16:36 alright.txt
    -rw-r--r-- 1 frank frank 10 Jun 1 14:26 anything.txt
    -rw-r--r-- 1 frank frank 10 Jun 1 14:26 somewhat.txt
    $ rsync -anv --out-format="[%t]:%o:%f:Last Modified %M" copy/* original
    sending incremental file list
    [2016/06/01 16:40:25]:send:copy/alright.txt:Last Modified 2016/06/01-16:36:14
    [2016/06/01 16:40:25]:send:copy/anything.txt:Last Modified 2016/06/01-14:28:49
    [2016/06/01 16:40:25]:send:copy/somewhat.txt:Last Modified 2016/06/01-14:30:23
    sent 137 bytes received 25 bytes 324.00 bytes/sec
    total size is 34 speedup is 0.21 (DRY RUN)
Rsync allows a dry run using the -n switch (long option --dry-run ). Here you use this mode of operation to detect modifications without actually synchronizing the two directories. With -a (long option --archive ), rsync compares both folders, taking into account the names of the existing entries, their size, and the access permissions set.
Without additional options, rsync is rather tight-lipped. Only with -v (long option --verbose ) does it show details of the transfers taking place. You can specify -v multiple times to increase the level of detail. The additional switch --out-format defines how rsync reports the details of each transfer.
In my example, %t prints the transfer's timestamp, %o the action to be executed (send or receive), %f the file name, and %M the timestamp of the last modification (see Table 1 [4]). Additional help for rsync can be obtained from an introductory article [5], as well as the rsync man page.
Table 1
Rsync Format Placeholder
Placeholder | Meaning |
---|---|
%a | Remote IP address |
%b | Number of bytes actually transferred |
%B | Permission bits of the file (e.g., rwxrwxrwt ) |
%c | Total size of the block checksums received for the basis file (only when sending) |
%f | File name (long form on sender; no trailing "/") |
%G | GID of the file (decimal) or DEFAULT |
%h | Remote hostname |
%i | Itemized list of what is being updated |
%l | Length of the file in bytes |
%L | String -> SYMLINK , => HARDLINK , or empty |
%m | Module name |
%M | Last-modified time of the file |
%n | Filename (short form; trailing "/" on dir) |
%o | Operation (send , recv , or del ) |
%p | PID of the rsync session |
%P | Module path |
%t | Current date and time |
%u | Authenticated username or an empty string |
%U | UID of the file (decimal) |
Because you only care about the modified entries, you can also combine rsync -i (long form --itemize-changes ) with the filter tool grep . From the detailed but still compact output of rsync, you filter out only the lines that report modifications. All other lines are dropped.
The output contains one line per file, each of which starts with a > . The following 10 characters represent the properties rsync uses to compare the two entries. A dot in a position means there is no difference between the files regarding that property. A letter indicates a modification. For example, c stands for checksum, meaning the files' checksums or hash values are different; s indicates different sizes; t indicates different modification times; and p indicates different permissions.
You filter the relevant lines from the output by using grep and an appropriate regular expression. The expression used in Listing 4 matches sequences that start with an f , followed by an arbitrary character, which is followed by either a dot and tp , st and a dot, or three dots. The last two lines contain the matches.
Listing 4
Comparing checksums
    $ rsync -acniv copy/* original | grep --color -E "f.(\.tp|st\.|\.\.\.)"
    >f..tp..... anything.txt
    >fcst...... somewhat.txt
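The itemized strings from Listing 4 can also be translated into words. The following sketch assumes the 11-character layout documented in the rsync man page (update type, file type, then the attribute letters c, s, t, p, o, and g, among others) and reads the sample lines from a here-document instead of calling rsync. The function name decode_itemized is hypothetical.

```shell
#!/bin/bash
# Sketch: name the attributes that rsync's itemized output marks as changed.
# decode_itemized is a hypothetical helper; it reads "FLAGS FILENAME" lines.
decode_itemized() {
    local letters="cstpog"
    local names=(checksum size mtime permissions owner group)
    local flags file i changed
    while read -r flags file; do
        changed=""
        for i in 0 1 2 3 4 5; do
            # Attribute i sits at offset i+2, after update type and file type
            if [ "${flags:i+2:1}" = "${letters:i:1}" ]; then
                changed+="${changed:+, }${names[i]}"
            fi
        done
        echo "$file: ${changed:-no differences}"
    done
}

# The two result lines from Listing 4 serve as sample input
decode_itemized <<'EOF'
>f..tp..... anything.txt
>fcst...... somewhat.txt
EOF
```

For the two sample lines, the script reports mtime and permissions for anything.txt, and checksum, size, and mtime for somewhat.txt.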
The -c switch in the rsync call is a special case: It makes the program compare the files not only by their size but also by a checksum in the form of a hash value (see the "Hash Values" box). This lets you detect modifications to the content that do not change the size, even when the timestamp has been set back to the original date afterwards.
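The following sketch illustrates why a checksum is needed in such a case. It creates two files of identical size, copies the modification time from one to the other, and then compares metadata and SHA256 digests. The file names and contents are made up for illustration, and the stat format option assumes GNU coreutils.

```shell
#!/bin/bash
# Sketch: same size, same modification time, different content.
workdir=$(mktemp -d)
printf 'balance=100\n' > "$workdir/original.txt"
printf 'balance=900\n' > "$workdir/tampered.txt"

# Copy the modification time of the original onto the altered file
touch -r "$workdir/original.txt" "$workdir/tampered.txt"

# Size (%s) and modification time (%Y) agree, so a check based on
# metadata alone sees no difference ...
meta_a=$(stat -c '%s %Y' "$workdir/original.txt")
meta_b=$(stat -c '%s %Y' "$workdir/tampered.txt")
[ "$meta_a" = "$meta_b" ] && echo "size and mtime: identical"

# ... but the SHA256 digests of the contents differ
hash_a=$(sha256sum < "$workdir/original.txt")
hash_b=$(sha256sum < "$workdir/tampered.txt")
[ "$hash_a" != "$hash_b" ] && echo "checksum: different"

rm -rf "$workdir"
```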
Hash Values
Hash functions are cryptographic methods that can be used to calculate checksums. On Linux, you can use the tools md5sum (MD5 with 128 bits), sha1sum (SHA1 with 160 bits), sha224sum (SHA2 with 224 bits), sha256sum (SHA2 with 256 bits), sha384sum (SHA2 with 384 bits), and sha512sum (SHA2 with 512 bits). The number in the name usually indicates the length of the resulting hash value in bits; MD5 and SHA1 are the exceptions. If your system isn't equipped with any of the listed applications, you can use openssl , which also calculates hash values.
To check quickly whether two files have the same content, the Linux tools cmp , comm , diff , and sdiff can only partly help. They work line-by-line, byte-by-byte, or block-by-block and are excruciatingly slow in some cases. Instead, the shell script from Listing 5 computes SHA256 hashes on lines 3 and 4, because MD5 and SHA1 are no longer considered secure.
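The digest lengths can be checked directly: The tools print the hash value as hexadecimal digits, and each digit encodes four bits. The following sketch relies on that; the function name digest_bits and the input string are made up for illustration.

```shell
#!/bin/bash
# Sketch: derive the digest length in bits from a tool's hexadecimal output.
# digest_bits is a hypothetical helper; $1 is the name of the hashing tool.
digest_bits() {
    local hex
    hex=$(printf 'example' | "$1" | awk '{ print $1 }')
    echo $(( ${#hex} * 4 ))   # each hexadecimal digit encodes 4 bits
}

for tool in md5sum sha1sum sha256sum sha512sum; do
    echo "$tool: $(digest_bits "$tool") bits"
done
```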
Listing 5
Comparing with SHA256
    #! /bin/bash
    # Create hash values
    hashValue1=$(sha256sum "$1" | awk '{ print $1 }')
    hashValue2=$(sha256sum "$2" | awk '{ print $1 }')
    # Compare hash values
    if [ "$(echo -e "$hashValue1\n$hashValue2" | uniq | wc -l)" -eq 1 ]; then
      echo "$1 and $2 are identical."
      exit 0
    fi
    echo "$1 and $2 are not identical."
    exit 1
The more compact Listing 6 solves the problem with less computational cost but requires a deeper understanding of shell programming. You execute it with two files as parameters. Following usual Unix practice, the exit status 0 in line 3 signals that the files are identical, and the value 1 in line 5 signals that they differ.
Listing 6
Comparing with SHA256 (II)
    #! /bin/bash
    if [ "$(sha256sum "$1" | awk '{ print $1 }')" == "$(sha256sum "$2" | awk '{ print $1 }')" ]; then
      echo "$1 and $2 are identical."; exit 0
    fi
    echo "$1 and $2 are not identical."; exit 1
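Because the result is reported through the exit status, the comparison can be used directly in a condition. The following sketch wraps the idea of Listing 6 in a function and calls it from an if statement. The function name same_content, the additional status 2 for wrong usage, and the sample files are assumptions made for illustration; they are not part of the listings above.

```shell
#!/bin/bash
# Sketch: the comparison as a function whose exit status drives an if statement.
# same_content is a hypothetical name; status 2 (wrong usage) is an addition.
same_content() {
    if [ "$#" -ne 2 ] || [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "usage: same_content FILE1 FILE2" >&2
        return 2
    fi
    # Reading from stdin keeps the file name out of the sha256sum output
    [ "$(sha256sum < "$1")" = "$(sha256sum < "$2")" ]
}

# Made-up sample files
tmp=$(mktemp -d)
echo "hello" > "$tmp/a"
echo "hello" > "$tmp/b"
echo "world" > "$tmp/c"

if same_content "$tmp/a" "$tmp/b"; then
    echo "a and b are identical."
fi
if ! same_content "$tmp/a" "$tmp/c"; then
    echo "a and c are not identical."
fi
rm -rf "$tmp"
```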
© 2024 Linux New Media USA, LLC – Legal Notice