GSoC: Improvements in smbclient backup mode



The goal of this project is to replace smbclient's own limited tar implementation with a more complete one using a separate library (libarchive).

Using libarchive would fix several bugs related to the current tar implementation, provide multiple archive/compression formats and might improve performance.

You can read more about my initial plans in my proposal.


About the journal

The journal is just an Org-mode document edited in GNU Emacs. The CSS is inlined in the document using a trick I described in my blog.

I've also made these small functions to quickly edit, export and upload the document from Emacs:

(setq my-journal "~/work/gsoc13/")

(defun my-back-command (cmd)
  "Start async process silently"
  (interactive "sCmd: ")
  (start-process (concat " *cmd: " cmd) nil "bash" "-c" cmd))

(defun my-new-journal ()
  "Open journal, insert date and place point for a new entry"
  (find-file my-journal)
  (goto-char (point-max))
  (insert "\n** " (format-time-string "%Y-%m-%d") "\n\n"))

(defun my-test-journal ()
  "Export journal locally and preview it in firefox"
  (with-current-buffer (find-file-noselect my-journal)
    (let ((fn (expand-file-name (org-html-export-to-html))))
      (my-back-command (concat "firefox file://"
                               (shell-quote-argument fn)))
      (my-back-command "xdotool search 'mozilla firefox' windowraise"))))

(defun my-push-journal ()
  "Export journal, copy it on server using scp and view it in firefox"
  (with-current-buffer (find-file-noselect my-journal)
    (let ((fn (expand-file-name (org-html-export-to-html))))
      (my-back-command (format "scp %s diobla:old/doc/gsoc13/"
                               (shell-quote-argument fn)))
      (my-back-command (concat "firefox"))
      (my-back-command "xdotool search 'mozilla firefox' windowraise"))))

The const nightmare

Consider this code, which is very close in form to what lib/util/util_strlist.c exposes:

char** list_empty (void)
{
    return 0;
}

const char** list_add (const char** list, const char *s)
{
    return 0;
}

int main (void)
{
    char** list = list_empty();
    list = list_add(list, "foo");
    return 0;
}

But the const keyword in C has complex semantics. If you compile this program you get this from GCC (4.8.1):

$ gcc -Wall const_nightmare.c

const_nightmare.c: In function ‘main’:
const_nightmare.c:15:5: warning: passing argument 1 of ‘list_add’ from incompatible pointer type [enabled by default]
     list = list_add(list, "foo");
const_nightmare.c:7:14: note: expected ‘const char **’ but argument is of type ‘char **’
 const char** list_add (const char** list, const char *s)
const_nightmare.c:15:10: warning: assignment from incompatible pointer type [enabled by default]
     list = list_add(list, "foo");

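You can't pass a mutable variable to a function which expects a constant when its type has more than one level of dereferencing without triggering a warning. The implicit conversion only adds const to the type directly pointed to, so char * converts to const char * but char ** does not convert to const char **.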

  • char** is basically a table of strings
  • const char**: table of const strings, i.e. each character is constant.
  • char const ** : the same type as above.
  • char * const *: you can't change the pointer in each cell of the table (but you can change the characters).
  • char ** const: you can't point to another table, i.e. the variable can only be set when it's defined.
  • char const * const * const: you can't change anything once the variable is defined.

You can always either cast or change the type of the variable list but:

  • casting is not always safe! This is why we have all those warnings in the first place.
  • changing char** list to const char** list will make the list_empty() call emit warnings because the type still differs.

The problem here is that the list_empty() function doesn't return a const. Thus, by adding a const to the return type and the variable type we can finally compile without any warnings:

const char** list_empty (void)
{
    return 0;
}

const char** list_add (const char** list, const char *s)
{
    return 0;
}

int main (void)
{
    const char ** list = list_empty();
    list = list_add(list, "foo");
    return 0;
}

Cool! … But now I can't change the characters of the strings, even out of the functions… Noooo..

You can cast everything but it's ugly:

int main (void)
{
    char ** list = (char**)list_empty();
    list = (char**)list_add((const char**)list, "foo");
    list[0][0] = 4;
    return 0;
}

Samba uses the -Wcast-qual flag, which triggers warnings when you discard a const by casting, so this is useless.

The only real solution is to drop all the const qualifiers from the function and variable types.

I've come to the conclusion that, with multiple levels of dereferencing, there is no way to use a const qualifier in the prototype just to indicate that a function will not mutate its arguments without triggering warnings/errors.
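To make the const-free approach concrete, here is a minimal sketch of what such an interface could look like. The helpers list_length() and list_free() and all the function bodies are assumptions made for illustration only; the real helpers in lib/util/util_strlist.c are talloc-based and differ from this. const is kept only on the single-level const char *s parameter, where it causes no trouble.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-ins, NOT Samba's implementation.
 * A list is a NULL-terminated array of heap-allocated strings. */

char **list_empty(void)
{
    char **list = malloc(sizeof(char *));
    if (list != NULL) {
        list[0] = NULL;
    }
    return list;
}

size_t list_length(char **list)
{
    size_t n = 0;
    while (list != NULL && list[n] != NULL) {
        n++;
    }
    return n;
}

/* Appends a copy of s. Returns the (possibly moved) list, or NULL on
 * allocation failure, in which case the original list is left untouched. */
char **list_add(char **list, const char *s)
{
    size_t n = list_length(list);
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    char **grown;

    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, s, len);

    grown = realloc(list, (n + 2) * sizeof(char *));
    if (grown == NULL) {
        free(copy);
        return NULL;
    }
    grown[n] = copy;
    grown[n + 1] = NULL;
    return grown;
}

void list_free(char **list)
{
    if (list == NULL) {
        return;
    }
    for (size_t i = 0; list[i] != NULL; i++) {
        free(list[i]);
    }
    free(list);
}
```

With these prototypes, char **list = list_empty(); list = list_add(list, "foo"); compiles without any pointer-type warning, and the caller can still modify list[0][0] afterwards.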

How file matching works in smbclient

There are 2 selection modes (inclusion, the default, and exclusion) and 2 operation modes (creation and extraction).

Additionally, there are filters set by the tarmode command to include/exclude certain files based on their DOS attributes. This behaviour is not affected by the selection and only works in creation mode so I consider the files as already filtered when they pass the matching test.
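The attribute filtering can be pictured with a small sketch. The bit values are the standard DOS/SMB attribute bits; the struct tar_mode type and the passes_attr_filter() function are hypothetical names used for illustration and are not taken from clitar.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard DOS/SMB file attribute bits. */
#define ATTR_READONLY 0x01u
#define ATTR_HIDDEN   0x02u
#define ATTR_SYSTEM   0x04u
#define ATTR_ARCHIVE  0x20u

/* Hypothetical container for the settings changed by the tarmode command. */
struct tar_mode {
    bool incremental; /* tarmode inc: only files with the archive bit set */
    bool nosystem;    /* tarmode nosystem: skip system files */
    bool nohidden;    /* tarmode nohidden: skip hidden files */
};

/* Returns true if a file with attributes `attr` should go in the archive. */
bool passes_attr_filter(const struct tar_mode *mode, uint32_t attr)
{
    if (mode->incremental && !(attr & ATTR_ARCHIVE)) {
        return false;
    }
    if (mode->nosystem && (attr & ATTR_SYSTEM)) {
        return false;
    }
    if (mode->nohidden && (attr & ATTR_HIDDEN)) {
        return false;
    }
    return true;
}
```

In this sketch the attribute check is independent of the include/exclude selection, which mirrors the idea of treating files as already filtered by the time they reach the matching test.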

When I talk about the list (or the path list) I'm referring to the path list provided either on the CLI or via a file (F switch).

If no list is provided everything is included, no matter the mode.



In creation mode with inclusion, each path in the list is used as a pattern to recursively list the remote share. Thus, although it's not documented, you can use wildcards without using the r switch in this case because the actual matching is done by the server.

In the old version, using r in inclusion mode with a path containing wildcards excludes everything. I consider this a bug. In the new version, the r switch is simply ignored in this case.

Instead of listing each path of the path list, we could use the current directory as a "starting" point, list it recursively and match each file against the path list, but this has a non-negligible cost if the share is large. By using each path of the path list as a "starting" pattern, the include selection is already made and only the included paths are listed, which is the optimal solution. On the other hand, the usage of wildcards is inconsistent with the rest of the program because the matching engine works differently.


In creation mode with exclusion, the remote share is listed recursively from the current directory. Each file that either exactly matches a path in the list or is contained in a path on the list is excluded. You can't use wildcards here without r.
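The "matches exactly or is contained in" rule can be sketched as a prefix test on path components. This is only an illustration under assumptions: the function names are made up, and '/' is used as the separator as in the examples of this journal, whereas the real code has to deal with SMB path conventions.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if `path` is exactly `entry` or lies somewhere below it. */
int path_is_covered_by(const char *entry, const char *path)
{
    size_t n = strlen(entry);

    /* Ignore trailing slashes so that "foo/" behaves like "foo". */
    while (n > 0 && entry[n - 1] == '/') {
        n--;
    }
    if (strncmp(entry, path, n) != 0) {
        return 0;
    }
    /* Accept "foo" itself and "foo/bar", but reject "foobar". */
    return path[n] == '\0' || path[n] == '/';
}

/* Return 1 if any of the `count` entries of `list` covers `path`. */
int is_excluded(const char *const *list, size_t count, const char *path)
{
    for (size_t i = 0; i < count; i++) {
        if (path_is_covered_by(list[i], path)) {
            return 1;
        }
    }
    return 0;
}
```

Checking the character right after the prefix is what keeps an entry such as foo from accidentally excluding foobar.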

If you use the r switch, each file is matched locally with mask_match() which means you can use wildcards to match the basename of each file.

Unfortunately, the pattern matching engine used in the server is different from the one used in mask_match():

Pattern   *.exe        *.exe    foo/*.exe
File      /foo/a.exe   a.exe    foo/a.exe
Server    No           Yes      Yes
Client    Yes          Yes      No
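The difference in the table can be reproduced with a rough model. This sketch uses POSIX fnmatch() purely as a stand-in: it is not the engine used by the server nor by mask_match(), and both function names below are hypothetical. The "client-like" function compares the pattern with the basename only, while the "server-like" one compares it with the whole path and never lets a wildcard cross a directory separator.

```c
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* Client-like model: only the basename of the path is matched. */
int client_like_match(const char *pattern, const char *path)
{
    const char *slash = strrchr(path, '/');
    const char *base = (slash != NULL) ? slash + 1 : path;

    return fnmatch(pattern, base, 0) == 0;
}

/* Server-like model: the whole path is matched and '*' stops at '/'. */
int server_like_match(const char *pattern, const char *path)
{
    return fnmatch(pattern, path, FNM_PATHNAME) == 0;
}
```

Under these assumptions the two functions give the same six answers as the table, which shows why a pattern that selects a file during creation may not select it during a locally matched operation, and vice versa.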



In extraction mode with inclusion, without r no fancy matching is done and both versions do what you would expect. But in the presence of the r switch, the old version is completely broken: extracting in inclusion mode with regex is interpreted as excluding. In other words, if you are extracting using the r switch you will always be in exclusion mode. This is fixed in the new version, but since the matching is done locally with mask_match() we have the same problem as above.


In extraction mode with exclusion, as with inclusion, without r no fancy matching is done and wildcards are not handled.

With r, matching is done locally, with the same problem as above.



  • use 4.0.5 tarball lying around from the time of my proposal
  • getting familiar with waf
  • python headers/GCC issue I reported a few weeks ago is now resolved
  • abi-check reports differences → --abi-check-disable
  • compiles successfully!
  • ./bin/smbclient fails to run:
./bin/smbclient: /home/knarf/prog/c/samba-4.0.5/bin/shared/private/ version `SAMBA_4.0.6' not found (required by /usr/lib/
./bin/smbclient: /home/knarf/prog/c/samba-4.0.5/bin/shared/private/ version `SAMBA_4.0.6' not found (required by /usr/lib/
./bin/smbclient: /home/knarf/prog/c/samba-4.0.5/bin/shared/private/ version `SAMBA_4.0.6' not found (required by /usr/lib/
./bin/smbclient: /home/knarf/prog/c/samba-4.0.5/bin/shared/private/ version `SAMBA_4.0.6' not found (required by /usr/lib/
./bin/smbclient: /usr/lib/samba/ version `SAMBA_4.0.5' not found (required by /home/knarf/prog/c/samba-4.0.5/bin/shared/private/
./bin/smbclient: /usr/lib/samba/ version `SAMBA_4.0.5' not found (required by /home/knarf/prog/c/samba-4.0.5/bin/shared/private/
  • system libs are somehow mixed with my build
  • try several samba versions → same results
  • solution: bundle everything using -rpath, which I thought was done by default

Final configuration/compilation command:

./buildtools/bin/waf configure \
    --enable-developer         \
    --enable-socket-wrapper    \
    --enable-nss-wrapper       \
    --abi-check-disable        \
    --bundled-libraries=ALL    \
    -j8 && make JOBS=8


  • review and push patch from ddiss to fix tab completion in smbclient. public repo on bitbucket.
  • setup a samba server on my machine
  • test various tar options/switches, in particular archive (a) and incremental mode (g).
  • start a small perl script to automate the tests
  • use allinfo <file> to get the DOS attributes of a file
  • use setmode <file> <+-rhsa> to change the attributes of a file
  • no attributes visible on the samba server → setup a virtual machine with a windows install
  • install virtualbox → update linux host system first for it to work
  • somehow break grub during update → fix system using a live cd…
# mount local filesystem somewhere on the livecd system
mount /dev/sda4 /mnt/root
mount -o bind /dev /mnt/root/dev
mount -t proc none /mnt/root/proc

# chroot into it
chroot /mnt/root

# regenerate conf and install
grub-mkconfig -o /boot/grub/grub.cfg
grub-install /dev/sda


  • finish installing windows vm
  • experiment with it using attrib cli tool to change file attributes from windows
  • write a script to make every possible combination of attributes
  • continue working on perl script to test the various tar modes
  • after some discussion with ddiss on irc start working on a full blown test suite for samba selftests

The plan is to make a script which will test the behaviour of smbclient tar creation/restoration including its handling of:

  • the include/exclude list (resp. I and X)
  • the file list (F)
  • the regex switch (r) which changes the semantics of F, I and X
  • "newer than" (N)
  • tar modes (full, inc, nosystem, nohidden, reset)
  • archive bit removal (a)

The script will work with samba itself since there's already code doing that (the "selftest" suite) but it could also work on an actual windows box if there's a way or some kind of framework to do some things remotely on it.

For each creation (c) test:

  • setup the environment (files and their attributes on the server)
  • fetch according to the test parameters
  • compare what the tarball contains vs. what's expected

For each restoration (x) test:

  • setup empty environment
  • restore

note: the doc on recurse could be updated because it changes the behaviour of commands other than mget and mput (ls / dir at least).


  • read and understand current test script (
  • set store dos attributes = yes in my smb.conf
  • set force user = myuser to let anyone change attributes (where myuser is the user running the script)
  • continue writing my script
  • reached a usable point
  • push on my public repo on branch gsoc_test_tarmode. commit url.
  • run example:
$ perl
TEST: creation -- normal files (no attributes)
 CMD: tarmode full
 ARG: -Tc /tmp/smb-tmp/tarmode.tar tarmode
    5 files, +0, -0, !0

TEST: creation -- incremental w/ -g (backup only archived files)
 ARG: -Tcg /tmp/smb-tmp/tarmode.tar tarmode
    2 files, +0, -0, !0

TEST: creation -- incremental w/ tarmode inc (backup only archived files)
 CMD: tarmode inc
 ARG: -Tc /tmp/smb-tmp/tarmode.tar tarmode
    2 files, +0, -0, !0
  • the results are colored in bold red/green in the terminal
  • the line above the result shows:
    • how many files were downloaded in total in the tarball
    • the number of files in excess
    • the number of missing files
    • the number of different files (correct file but corrupted)
  • without the force user trick, all the setmode calls fail, resulting in this output:
TEST: creation -- normal files (no attributes)
 CMD: tarmode full
 ARG: -Tc /tmp/smb-tmp/tarmode.tar tarmode
    5 files, +0, -0, !0

TEST: creation -- incremental w/ -g (backup only archived files)
 ARG: -Tcg /tmp/smb-tmp/tarmode.tar tarmode
 -    ./tarmode/file-1
 -    ./tarmode/file-2
    2 files, +0, -2, !0

TEST: creation -- incremental w/ tarmode inc (backup only archived files)
 CMD: tarmode inc
 ARG: -Tc /tmp/smb-tmp/tarmode.tar tarmode
 -    ./tarmode/file-2
 -    ./tarmode/file-1
    2 files, +0, -2, !0
  • only creation tests for now
  • improvement: no tar extraction, computing md5sum of each file in the tarball in memory
  • setup/env more or less hardcoded in the script → need to add cli options
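The excess/missing/different counters reported on the result line boil down to comparing two sets of entries. The test script itself is written in Perl; the following is only a sketch of the same idea in C, with hypothetical type and function names, and with each file's digest passed as a precomputed string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a path plus a hex digest of the file content. */
struct entry {
    const char *path;
    const char *digest;
};

struct report {
    size_t total;     /* entries found in the tarball */
    size_t excess;    /* in the tarball but not expected (+) */
    size_t missing;   /* expected but not in the tarball (-) */
    size_t different; /* in both, but the digests differ (!) */
};

/* Linear lookup of `path` in `set`; returns NULL when absent. */
static const struct entry *find_entry(const struct entry *set, size_t n,
                                      const char *path)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(set[i].path, path) == 0) {
            return &set[i];
        }
    }
    return NULL;
}

struct report compare_entries(const struct entry *expected, size_t n_exp,
                              const struct entry *actual, size_t n_act)
{
    struct report r = { n_act, 0, 0, 0 };

    for (size_t i = 0; i < n_act; i++) {
        const struct entry *e = find_entry(expected, n_exp, actual[i].path);
        if (e == NULL) {
            r.excess++;
        } else if (strcmp(e->digest, actual[i].digest) != 0) {
            r.different++;
        }
    }
    for (size_t i = 0; i < n_exp; i++) {
        if (find_entry(actual, n_act, expected[i].path) == NULL) {
            r.missing++;
        }
    }
    return r;
}
```

The lookup is quadratic, which is fine for a handful of test files; a hash table keyed on the path would be the natural choice for larger sets.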


  • read up on Getopt::Long and Pod::Usage modules
  • add proper option parsing and corresponding documentation
  • add tests for:
    • reset
    • improve the one for inc
    • N flag (newer than)
  • pushed everything on the repo


  • refactored every file-related function into a File package
    • less complex: less code dealing with path transformation (remote vs. local)
    • more extensible: set_time, set_attr, etc
    • test simpler to write
  • add test for creation of nested directories
  • add first extraction test
  • add (or bring back)
    • File::list() – return list of File in a path
    • File::tree() – same but recursive
    • File::walk() – higher-order function to iterate on a File hierarchy
  • compute md5 sum when not cached
  • add check_remote(), will be useful for all extraction tests
  • fix consecutive slashes bug in some File::xxxpath()
  • add option to run single test
  • send a small patch to smbclient manpage. applied on master branch.

The script can now replace the existing test script, which only did a simple creation and extraction → my script does that and more (g, a, N, nested files).


  • add tests for cI, cX, cF, xI, xX
  • report cX bug on mailing list


  • open bug #9989
  • add test for xF (file containing list of files)
  • try to wrap my head around r, which no longer works as described in the man page.
  • samba has no regex support, only Windows-style shell expansion
  • this has been the case for ~8 years!
  • including paths with r without using a pattern is equivalent to not using r
  • including pattern-paths with r never copies anything in the tar (bug?)
  • excluding prefixed path with r includes everything anyway (bug?)
  • excluding unprefixed paths with a pattern with r excludes any file which matches the pattern
  • ugh. tiiired…


  • add a clean option to completely erase the local path
  • add tests for wildcard pattern (cI, cX, cF, xF)
  • some usages of the r flag are really buggy, not sure about including them
  • toy with libarchive
  • start rewrite of clitar.c from scratch
  • new branch gsoc_clitar_libarchive on my public repo.


  • added About the journal section
  • refactored and documented the test script
  • doc is embedded in POD format in the script
  • html export here.
$ cat << 'EOF' > pod.css
p code {
    font-size: 1.2em;
    text-shadow: .1em .1em .1em #AAA5A5;
}
pre {
    background: #DEDEDE;
    padding: 1em;
}
p {
    font-family: verdana;
    font-size: .8em;
    width: 40em;
}
h1, h2, h3, h4 {
    font-family: georgia;
}
h1 {
    border-bottom: 1px solid black;
}
h3 code, h4 code {
    border-bottom: 1px solid black;
    padding-bottom: .4em;
}
body {
    background: #E1E1E1;
}
#index {
    font-size: .7em;
    font-family: verdana;
}
EOF
$ pod2html --css pod.css                     \
           --title \
           --header                          \
           > test_smbclient_tarmode.html


  • work on new clitar.c
  • cmd_setmode
  • tar_parseargs
    • can't really use popt without breaking previous argument parsing
    • not finished
  • document code


  • implement argument parsing
  • read inclusion file list (F)
  • re-use some utils in lib/util for files and strings
  • use a separate header for clitar.c prototype (clitar_proto.h)
  • remove unused external declaration in client.c


  • stumble upon the const nightmare
  • fix a few bugs
  • add libarchive dependency
  • remove last clitar.c global from client.c!
  • client.c imports the tar context only in certain functions
  • definition of the context struct opaque to the client, i.e. not exported
  • less side effects → safer!
  • start processing tar archive for extraction
  • only list file name for now


  • basic tar extraction working
  • basic tar creation almost working


  • implement exclusion selection
  • tar creation working
  • honor several modes/flags
    • incremental
    • nosystem
    • nohidden
    • dry run
  • fix bad entry header bug
  • lots of tests fail, needs more debugging
  • reading from compressed archives works
    • auto-detection of compression algorithm


  • currently working at a different place with a temporary setup
  • reinstall dev environment
    • older version of Perl (v5.14.2)
    • older version of Samba server (v3.6.9)
  • update test script to work on Perl <5.16 (now only need 5.14)
  • fix ls parsing (no N flag on normal files with Samba client/server v3.6.9, haven't looked into it…)
  • implement tar extraction exclude filter


  • merge gsoc_test_tarmode into gsoc_clitar_libarchive branch (rebase)
  • add GPLv3+ copyright notice to test suite
  • cleanup style issue (space around parens in while/for/if (…) blocks)
  • sanitize input in test script
  • use File::Temp module instead of hardcoding temp path
  • --test option can now run multiple tests (you can pass a list of test numbers or intervals)
  • honor reset mode (flag a, or tarmode reset)
  • fix include/exclude bug
  • unify INCLUDE and INCLUDE_LIST selections
  • implement interactive tar command
  • add test for it


  • implement wildcard inclusion
  • use included path as starting listing point instead of whole share + filtering
  • add long path test
  • add large file test


  • standardize error code across clitar.c
    • when returning an int, 0 for success
  • blocksize is a multiple of 512
  • honor blocksize when writing the tar file
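The arithmetic behind the blocksize handling is simple enough to sketch. This assumes the blocksize is given as a number of 512-byte tar blocks per record (tar's usual "blocking factor"); the function names are hypothetical and this is not the code from clitar.c, where the actual writing is delegated to libarchive.

```c
#include <stddef.h>

#define TAR_BLOCK_SIZE 512u /* size of one tar block in bytes */

/* Bytes per record for a blocking factor given in 512-byte blocks. */
size_t tar_record_bytes(unsigned int blocking_factor)
{
    return (size_t)blocking_factor * TAR_BLOCK_SIZE;
}

/* Round n up to the next multiple of unit; unit must be greater than 0. */
size_t round_up(size_t n, size_t unit)
{
    size_t rem = n % unit;

    return (rem == 0) ? n : n + (unit - rem);
}

/* Number of zero bytes to append so that the archive ends on a record
 * boundary. Returns 0 for an invalid (zero) blocking factor. */
size_t tar_padding(size_t bytes_written, unsigned int blocking_factor)
{
    size_t record = tar_record_bytes(blocking_factor);

    if (record == 0) {
        return 0;
    }
    return round_up(bytes_written, record) - bytes_written;
}
```

For example, with a blocking factor of 20 a record is 10240 bytes, so an archive holding 1536 bytes of headers and data would be padded with 8704 zero bytes.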


  • implement regex switch
  • add many tests


  • write How file matching works in smbclient.
  • write extraction regex tests
  • add verbose options to test script (hide output by default)


  • document code
    • general overview
    • most functions doc
  • rearrange definition for easier reading
  • add prototypes
  • start better handling of talloc context


  • fill in GSoC mid-term evaluation


  • update smbclient man page


  • remove some warnings regarding usage of given/when in test script
  • add subunit output for test script
  • add the new test to the test suite under samba3.blackbox.smbclient_tar
$ make test TESTS=samba3.blackbox.smbclient_tar
WAF_MAKE=1 python ./buildtools/bin/waf test
'test' finished successfully (0.013s)
Waf: Entering directory `/home/knarf/prog/c/samba/bin'
    Selected embedded Heimdal build
[ 168/4114] Generating smbd/build_options.c
Waf: Leaving directory `/home/knarf/prog/c/samba/bin'
'build' finished successfully (4.864s)
test: running (/usr/bin/perl /home/knarf/prog/c/samba/selftest/ --target=samba --prefix=./st --srcdir=/home/knarf/prog/c/samba --exclude=/home/knarf/prog/c/samba/selftest/skip --testlist="/home/knarf/bin/python /home/knarf/prog/c/samba/selftest/|" --testlist="/home/knarf/bin/python /home/knarf/prog/c/samba/source3/selftest/|" --testlist="/home/knarf/bin/python /home/knarf/prog/c/samba/source4/selftest/|" --binary-mapping=nmblookup3:nmblookup,nmblookup4:nmblookup4,smbclient3:smbclient,smbclient4:smbclient4,smbtorture4:smbtorture,ntlm_auth3:ntlm_auth --exclude=/home/knarf/prog/c/samba/selftest/slow --socket-wrapper samba3.blackbox.smbclient_tar && touch ./st/st_done) | /home/knarf/bin/python -u /home/knarf/prog/c/samba/selftest/filter-subunit --expected-failures=/home/knarf/prog/c/samba/selftest/knownfail --flapping=/home/knarf/prog/c/samba/selftest/flapping | tee ./st/subunit | /home/knarf/bin/python -u /home/knarf/prog/c/samba/selftest/format-subunit --prefix=./st --immediate
smbtorture 4.2.0rc1-DEVELOPERBUILD
Version 4.2.0rc1-DEVELOPERBUILD
OPTIONS --configfile=$SMB_CONF_PATH --maximum-runtime=$SELFTEST_MAXTIME --basedir=$SELFTEST_TMPDIR --format=subunit --option=torture:progress=no
smbtorture 4.2.0rc1-DEVELOPERBUILD
Version 4.2.0rc1-DEVELOPERBUILD
WARNING: allowing empty subunit output from samba4.urgent_replication.python(dc)
WARNING: allowing empty subunit output from samba4.blackbox.samba3dump
tdbsam_open: Converting version 0.0 database to version 4.0.
WARNING: database '/home/knarf/prog/c/samba/st/s3dc/private/passdb.tdb.tmp' does not end in .[n]tdb: treating it as a TDB file!
tdbsam_convert_backup: updated /home/knarf/prog/c/samba/st/s3dc/private/passdb.tdb file.
account_policy_get: tdb_fetch_uint32 failed for type 1 (min password length), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 2 (password history), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 3 (user must logon to change password), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 4 (maximum password age), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 5 (minimum password age), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 6 (lockout duration), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 7 (reset count minutes), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 8 (bad lockout attempt), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 9 (disconnect time), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 10 (refuse machine password change), returning 0
delaying for nbt name registration
querying __SAMBA__ on __SAMBA__<00> __SAMBA__<00>
querying __SAMBA__ on __SAMBA__<00>
querying LOCALS3DC2 on LOCALS3DC2<00> LOCALS3DC2<00>
checking for winbindd
Ping to winbindd succeeded
wait for smbd
Domain=[SAMBA-TEST] OS=[Unix] Server=[Samba 4.2.0rc1-DEVELOPERBUILD]

    Sharename       Type      Comment
    ---------       ----      -------
    tmp             Disk      smb username is []
    tmpenc          Disk      encrypt smb username is []
    tmpguest        Disk
    guestonly       Disk
    forceuser       Disk
    forcegroup      Disk
    ro-tmp          Disk
    write-list-tmp  Disk
    valid-users-tmp Disk
    msdfs-share     Disk
    hideunread      Disk      smb username is []
    tmpcase         Disk      smb username is []
    hideunwrite     Disk      smb username is []
    durable         Disk      smb username is []
    print1          Printer   smb username is []
    print2          Printer   smb username is []
    print3          Printer   smb username is []
    lp              Printer   smb username is []
    nfs4acl_simple  Disk      smb username is []
    nfs4acl_special Disk      smb username is []
    xcopy_share     Disk      smb username is []
    posix_share     Disk      smb username is []
    print$          Disk      smb username is []
    IPC$            IPC       IPC Service (Samba 4.2.0rc1-DEVELOPERBUILD)

    Server               Comment
    ---------            -------
    LOCALS3DC2           Samba 4.2.0rc1-DEVELOPERBUILD

    Workgroup            Master
    ---------            -------
Successfully added group domusers to the mapping db as a domain group
Successfully added group domadmins to the mapping db as a domain group
You are not root, most things won't work
Created BUILTIN group Users with RID 545
Unable to setup corepath for smbd: No such file or directory
smbd version 4.2.0rc1-DEVELOPERBUILD started.
Copyright Andrew Tridgell and the Samba Team 1992-2013
standard input is not a socket, assuming -D option
Unable to setup corepath for nmbd: No such file or directory
nmbd version 4.2.0rc1-DEVELOPERBUILD started.
Copyright Andrew Tridgell and the Samba Team 1992-2013
standard input is not a socket, assuming -D option
Attempting to become logon server for workgroup SAMBA-TEST on subnet
Attempting to become domain master browser on workgroup SAMBA-TEST on subnet
become_domain_master_browser_bcast: querying subnet for domain master browser on workgroup SAMBA-TEST
become_logon_server_success: Samba is now a logon server for workgroup SAMBA-TEST on subnet

Samba server LOCALS3DC2 is now a domain master browser for workgroup SAMBA-TEST on subnet

Starting winbindd with config /home/knarf/prog/c/samba/st/s3dc/lib/server.conf
Unable to setup corepath for winbindd: No such file or directory
Unable to setup corepath for winbindd: No such file or directory
winbindd version 4.2.0rc1-DEVELOPERBUILD started.
Copyright Andrew Tridgell and the Samba Team 1992-2013
Unable to setup corepath for winbindd: Operation not permitted
initialize_winbindd_cache: clearing cache and re-creating with version number 2
[1/2 in 0s] samba3.blackbox.smbclient_tarmode (s3dc)
[2/2 in 12s] samba3.blackbox.smbclient_tar (s3dc)
smbd child process 21404 exited with value 0
nmbd child process 21402 exited with value 0
winbindd child process 21403 exited with value 0

ALL OK (24 tests in 2 testsuites)

A summary with detailed information can be found in:
'testonly' finished successfully (28.270s)


  • project considered finished
  • talk with ddiss about other possible work
  • settle on stat visualisation project
  • email Steve French about samba plugin for PCP


  • make libarchive test script work on older (v5.10) Perl
  • look at ddiss FSCTL_GET/SET_COMPRESSION branch
  • setup PCP on my machine
  • no package, build from source
  • pmcd daemon started
  • read doc about collectors.

Author: Aurélien Aptel

Created: 2013-08-12 Mon 20:03

Emacs 24.3.1 (Org mode 8.0.2)
