TODO
Table of Contents
- Setup
- Work
- TODO dup extent
- Implementation
- smbd_smb2_ioctl_send()
- IOCTL_DEV_TYPE_MASK
- FSCTL_NETWORK_FILESYSTEM
- FSCTL_DUP_EXTENTS_TO_FILE
- VFS
- TODO ask about version bump in vfs.h
- DONE ask semantic of fid_volatile
- DONE use ddiss tests on his wip_fsctl_dup_extents branch
- TODO run them on windows 10 & fix them
- tevent, send/recv
- tevent_req_post(req, ev)
- Tests
- SMB2 call
- test_setup_create_fill()
- TODO kernel cifs bug bnc#799133
- TODO patch linux/fs/namei.c (fix lookup_one_len @len doc)
- TODO report bug gdb python api, check_typedef segfault
- TODO dup extent
1 Setup
The Samba project (as in "do stuff with Windows shares") is split into 3 subprojects:
- Samba (the *nix server smbd, the standalone smbclient, …)
- the CIFS kernel module (to mount remote shares on the system)
- the CIFS utils (mount.cifs and other small cifs tools)
1.1 Samba
You'll need some packages for the default (auto-)configuration to work:
$ zypper install git libacl-devel gnutls-devel python-devel
$ zypper install -t pattern devel_basis
If you want to be able to run smbd without root privileges, add a prefix and a localstatedir your user can write to, and use ports above 1024.
Also don't forget to install it and run it from the installed location (not the samba bin/ dir), otherwise the library paths hardcoded in the binary (using rpath) won't work and smbd will try to look for the dev libs in your system folders.
$ git clone git://git.samba.org/samba.git samba
$ cd samba
$ ./buildtools/bin/waf configure -C -j8 \
    --enable-developer \
    --enable-debug \
    --bundled-libraries=ALL \
    --prefix=/tmp/smb/ \
    --localstatedir=/tmp/smbstate
$ ./buildtools/bin/waf build
$ ./buildtools/bin/waf install
1.1.1 Rtags
Rtags is a clever code tag server based on clang that can do smart autocompletion, symbol lookup, etc. It's a long-running server that rescans any watched file when it changes (inotify). You can then make queries using the client from your editor.
$ zypper install llvm-clang llvm-devel llvm-clang-devel libopenssl-devel
$ git clone https://github.com/Andersbakken/rtags.git
$ cd rtags
$ git checkout v2.0
# do this if the submodule command fails
$ rm -rf src/rct/CMakeFiles
$ git submodule init
$ git submodule update
$ mkdir build && cd build
$ cmake ..
If cmake fails because of the CURSES_CURSES_LIBRARY variable, copy the path to ncurses.so from one of the other variables into it and re-run cmake.
$ cd ..
$ make
Don't forget to add the rtags bin dir to your $PATH.
I've also added 2 shell functions to set up and remove the "man-in-the-middle" compiler configuration:
function start-rtags-wrapper() {
    ln -s ~/prog/rtags/bin/gcc-rtags-wrapper.sh ~/bin/gcc
    ln -s ~/prog/rtags/bin/gcc-rtags-wrapper.sh ~/bin/c++
    ln -s ~/prog/rtags/bin/gcc-rtags-wrapper.sh ~/bin/cc
    ln -s ~/prog/rtags/bin/gcc-rtags-wrapper.sh ~/bin/g++
}
function stop-rtags-wrapper() {
    rm -f ~/bin/{gcc,cc,g++,c++}
}
To get the samba project in rtags you need to configure samba with the rtags wrapper in place.
$ start-rtags-wrapper
$ ./buildtools/waf.. # configure samba so that the produced makefiles use the wrapper
Once the configuration is complete, you can start the rtags daemon in a new shell and let it run there (it stays in the foreground, it doesn't detach).
$ rdm
Go back to your original shell, we can start compiling and let rtags know about samba:
$ make
1.1.2 Packaging for SUSE
The SUSE samba package is obviously built from the samba sources, but it also adds SUSE-specific patches, backported bug fixes, etc.
There is a separate repo that hosts the scripts, patches and various resources to manage the creation of the package. This repo is still using svn at the time of writing.
$ svn co https://svn.suse.de/svn/samba
This repo has a very detailed README.
Samba predates the Open Build Service (OBS), which is usually used to host a project's packaging metadata like this. Samba still has an OBS project, but its package metadata is generated by the scripts in SVN, which makes the SVN a meta-meta-data repo for Samba… All the editing is done in SVN, so the Samba OBS project is never edited manually unless you want to quickly try something.
More explanations from Jim McDonough follow (slightly edited for better reading):
<jmcd> if you are in the svn tree
<jmcd> pick a branch, say 4.1.12.SLE12_GA
<jmcd> this is what's shipped on SLE12
<jmcd> plus any shipped fixes
<jmcd> so from that tree, in branches/4.1.12.SLE12_GA
<jmcd> run 'make setup'
<jmcd> when complete, it will tell you where the sources are
<jmcd> make setup will dl the release tarball and setup the quilt environment
<jmcd> quilt manages the patches
<jmcd> so what's prepped for you is the upstream sources only
<jmcd> it will tell you "The sources are prepared in build/pac/stable/samba/samba-4.1.12/ "
<jmcd> from that directory, you can do a "quilt push", one by one, or "quilt push -a" to push all patches
<jmcd> that's the basic environment we use to merge patches
<jmcd> or add patches
<jmcd> first, all of the patches that are from (or based on) upstream are applied
<jmcd> then, suse-specific ones are applied
<jmcd> so if you're trying to backport a patch from upstream, you'd only push the upstream ones
<jmcd> "quilt push" can also take a number
<aaptel> so what im left with is whats compiled and packaged for suse right?
<jmcd> yes
<jmcd> but
<jmcd> only what's actually known to svn
<jmcd> so if you're back in the directory <svn>/branches/4.1.12.SLE12_GA
<jmcd> you say "make DIST=sle12", for example
<jmcd> this will prep the directory to do send to the build service
<jmcd> meanwhile, on the build service
<jmcd> obs or ibs for short (opensuse build service) and (internal build service)
<jmcd> ok, make sure you have access to build.opensuse.org and build.suse.de
<jmcd> ok, so let's say you're going to test out a patch for SLE12
<jmcd> first, let's start with the external repository
<jmcd> you might later change where you want this...
<jmcd> in some directory
<jmcd> "osc branch network:samba:MAINTAINED:SLE_12 samba"
<jmcd> this will create a new project/package for you
<jmcd> it will tell you at the end the command to actually checkout the code
<jmcd> which should create a new directory home:aaptel:branches:network:samba:MAINTAINED:SLE_12/samba
<jmcd> (this can be specified if you want something shorter)
<jmcd> just by branching from the existing project
<jmcd> it cloned the attributes
<jmcd> it's all just linked until you change something, so it's fairly lightweight
<aaptel> so how was it configured for building?
<aaptel> say for example my patch adds a waf configure flag
<aaptel> that i want to use in the build
<aaptel> where would i change that?
<jmcd> well
<jmcd> that you'd change in the samba.spec
<jmcd> which is in the directory you just checked out via osc
<jmcd> but that gets generated from svn (this is specific to samba, or the kernel, most projects just have you change that spec file directly)
<jmcd> but you could do a quick test that way :-)
<jmcd> so in the samba case, we really do all the editing in the svn subtree
<jmcd> source code gets changed via patches that are managed with quilt (a later step in learning :-)
<jmcd> but if you want to do a quick test on configure changes, you can just edit the file and commit it with osc
<jmcd> if you want to build it locally (which, given your hardware, is likely faster)
<jmcd> from inside the package directory, you can run: osc build
<jmcd> or, like make, osc build -j8
<jmcd> to allow parallel build
<jmcd> if you explore the command, you can build for more platforms
<jmcd> and architectures
<jmcd> but in this case, only 32 and 64-bit intel is available, I think, and only for SLE12
<jmcd> another project, network:samba:STABLE, contains the latest released upstream
<jmcd> and typically matches what we have in svn trunk
<jmcd> so basically, to get from svn -> osc
<jmcd> you would go to the root directory of the svn branch you want to build
<jmcd> "make clean"
<jmcd> "make DIST=sle12" (or just 'make' for opensuse)
<jmcd> then you rsync build/pac/stable/samba/samba-4.1.12/ to the osc directory
<jmcd> the source is what the "make" command returned to you
<jmcd> you'll notice that the content looks roughly the same
<jmcd> "osc status" will tell you what's changed
<jmcd> "osc commit" will push it to the server, and initiate any builds required from changes
<jmcd> "osc build" will build it locally based on what you have, committed or not
1.2 CIFS kernel module
The CIFS module is part of the Linux kernel, so you can just clone the kernel from Linus's repo.
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-git
1.3 Packaging for SUSE
Again, SUSE has its own kernels/packages. So there is a repo that hosts the scripts, patches, etc. to produce the final SUSE kernel source for any supported architecture.
$ git clone git://kerncvs.suse.de/kernel-source.git linux-suse-git
1.3.1 Expanded repo
The final expanded and patched source is also available in a git repo, presumably to make diffs and other source work more convenient. You can add it as a new remote on an existing linux repo:
$ git remote add suse-expanded git://kerncvs.suse.de/kernel.git
$ git remote update
Each branch corresponds to a SLE version.
1.3.2 Adding a patch
To add a patch to a SLE kernel, look at the README of the kernel-source repo. Basically: make a new branch in the expanded repo, export the diff to one of the folders of the kernel-source repo (e.g. patches.fixes), rewrite it to the proper patch format (patch-tag -e, see README), and add an entry in series.conf.
Once that's done you can run ./scripts/tar-up.sh && ./scripts/osc_wrapper build. It will create RPM packages you can copy to the right SLE system and install with rpm -i kernel-default-VERSION kernel-default-base-VERSION.
Reboot the SLE system and choose the new kernel in the GRUB menu. uname -a will give you the commit id the kernel was built from.
linux-suse-git % git show
commit 31432b3
       ^^^^^^^
Merge: bbf8716 5f8e921
Author: Jiri Kosina <jkosina@suse.cz>
Date:   Fri Oct 30 14:57:37 2015 +0100

    Merge remote-tracking branch 'origin/users/tiwai/SLE11-SP3/for-next' into SLE11-SP3

linux-suse-git % ssh root@sles11sp3 'uname -a'
Linux sles11sp3 3.0.101-0-default #1 SMP Fri Oct 30 13:57:37 UTC 2015 (31432b3) x86_64 x86_64 x86_64 GNU/Linux
                                                                       ^^^^^^^
To uninstall I use zypper rm kernel-default-base-VERSION kernel-default-VERSION.
To list available/installed kernels you can use:
$ zypper se -s 'kernel*'
You can install a version listed by zypper like this:
$ zypper in kernel-default-2.6.32.10-0.4.1
1.4 CIFS utils
The mount.cifs utility was separated from samba for various reasons.
$ git clone git://git.samba.org/cifs-utils.git cifs-utils
The packaging metadata is hosted on OBS.
1.5 Virtual machines
People in SUSE use KVM so if you want to share VMs and stuff, use it:
https://www.suse.com/documentation/sles-12/singlehtml/book_virt/book_virt.html
1.6 Windows
You can find free, legal Windows VMs for XP, 7, 8 and 8.1 on Microsoft's web dev portal:
http://dev.modern.ie/tools/vms/
Don't forget to make a snapshot, because the license in those VMs still expires after 30 days. Get the VirtualBox images and convert them to QEMU qcow2 format:
$ mkdir -p vm/win7
$ mv IE11.Win7.For.Linux.VirtualBox.zip vm/win7
$ cd vm/win7
$ unzip *zip \
    && tar vxf *ova \
    && qemu-img convert -O qcow2 *vmdk win7.qcow2 \
    && rm *zip *ova *ovf *vmdk
Just make a new KVM machine with the qcow2 as its hard disk.
Alternatively, you can find many Windows OS ISOs (including server editions) in Lutze's D:/iso.
For any other Windows version/edition, ask for an MSDN account by sending an email describing your needs to msdn-admin@novell.com.
1.7 SUSE
You can find current and old ISOs for SUSE Linux Enterprise (SLE) here:
You can skip the registration part at the installation.
1.8 SMB protocol
- CIFS – The ancient version of SMB that was part of Microsoft Windows NT 4.0 in 1996. SMB1 supersedes this version.
- SMB 1.0 (or SMB1) – The version used in Windows 2000, Windows XP, Windows Server 2003 and Windows Server 2003 R2
- SMB 2.0 (or SMB2) – The version used in Windows Vista (SP1 or later) and Windows Server 2008
- SMB 2.1 (or SMB2.1) – The version used in Windows 7 and Windows Server 2008 R2
- SMB 3.0 (or SMB3) – The version used in Windows 8 and Windows Server 2012
- SMB 3.02 (or SMB3) – The version used in Windows 8.1 and Windows Server 2012 R2
- SMB 3.11 – The version used in Windows 10 and Windows Server 2016
2 Work
2.1 TODO dup extent
2.1.1 Implementation
2.1.2 smbd_smb2_ioctl_send()
contrary to its name, the _send function handles received ioctl packets
2.1.3 IOCTL_DEV_TYPE_MASK
this mask corresponds to the "general" packet category
2.1.4 FSCTL_NETWORK_FILESYSTEM
actual category value for CIFS-related (network filesystem) packets
2.1.5 FSCTL_DUP_EXTENTS_TO_FILE
actual value of DUP_EXTENTS request packets.
Note: dup_extents is in the FSCTL_FILESYSTEM category (not netfs). This category is handled by smb2_ioctl_filesys() in s3/smbd/smb2_ioctl_filesys.c
#define FSCTL_DUP_EXTENTS_TO_FILE (FSCTL_FILESYSTEM | FSCTL_ACCESS_WRITE | 0x0344 | FSCTL_METHOD_BUFFERED)

struct fsctl_dup_extents_to_file {
	uint64_t fid_volatile;
	uint64_t source_off;
	uint64_t target_off;
	uint64_t byte_count;
};
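To check my understanding of how the control code is put together and then classified with IOCTL_DEV_TYPE_MASK, here is a small standalone sketch. The numeric values are written from my memory of Samba's headers and MS-FSCC, so treat them as assumptions to double check; fsctl_category() is a made-up helper for illustration, not a Samba function.

```c
#include <assert.h>
#include <stdint.h>

/* Values assumed from Samba's headers / MS-FSCC; verify before relying on them. */
#define IOCTL_DEV_TYPE_MASK       0xFFFF0000u
#define FSCTL_FILESYSTEM          0x00090000u
#define FSCTL_NETWORK_FILESYSTEM  0x00140000u
#define FSCTL_ACCESS_WRITE        0x00008000u
#define FSCTL_METHOD_BUFFERED     0x00000000u

#define FSCTL_DUP_EXTENTS_TO_FILE \
	(FSCTL_FILESYSTEM | FSCTL_ACCESS_WRITE | 0x0344u | FSCTL_METHOD_BUFFERED)

/* With the assumed values above, the composed code is 0x00098344. */
static_assert(FSCTL_DUP_EXTENTS_TO_FILE == 0x00098344u, "unexpected ctl code");

/* Hypothetical helper: keep only the "general" category bits of a code. */
static uint32_t fsctl_category(uint32_t ctl_code)
{
	return ctl_code & IOCTL_DEV_TYPE_MASK;
}
```

Masking the dup extents code this way yields FSCTL_FILESYSTEM rather than FSCTL_NETWORK_FILESYSTEM, which is why the request ends up in the filesystem handler.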
2.1.6 VFS
the access to the system fs is abstracted away in the vfs.
each vfs module implements the vfs api via function pointers (struct vfs_fn_pointers).
when samba needs to do io, it goes through the list of loaded vfs modules until it finds one that implements the call needed.
from inside your own method you can call the next available vfs method (e.g. SMB_VFS_NEXT_FS_CAPABILITIES)
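A toy model of that stacking, to make the idea concrete. None of this is Samba code: every toy_* name and TOY_* flag is invented for the sketch, and the real struct vfs_fn_pointers has many more entries. It only shows "first module that implements the call wins" and "a method can delegate to the next one".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_vfs_handle;

/* One optional function pointer per VFS call (only one call here). */
struct toy_vfs_fns {
	uint32_t (*fs_capabilities)(const struct toy_vfs_handle *h);
};

/* A loaded module: its function table plus a link to the next module. */
struct toy_vfs_handle {
	const char *name;
	const struct toy_vfs_fns *fns;
	const struct toy_vfs_handle *next;
};

/* Walk the chain until a module implements the call, then invoke it. */
static uint32_t toy_vfs_fs_capabilities(const struct toy_vfs_handle *h)
{
	while (h != NULL) {
		assert(h->fns != NULL);
		if (h->fns->fs_capabilities != NULL) {
			return h->fns->fs_capabilities(h);
		}
		h = h->next;
	}
	return 0;
}

/* Toy counterpart of the SMB_VFS_NEXT_* idea: start at the next module. */
#define TOY_VFS_NEXT_FS_CAPABILITIES(h) toy_vfs_fs_capabilities((h)->next)

#define TOY_CAP_BASE        0x1u
#define TOY_CAP_DUP_EXTENTS 0x2u

/* Last module in the chain: answers on its own. */
static uint32_t toy_default_caps(const struct toy_vfs_handle *h)
{
	(void)h;
	return TOY_CAP_BASE;
}

/* Stacked module: asks the next module, then adds its own flag. */
static uint32_t toy_cow_caps(const struct toy_vfs_handle *h)
{
	return TOY_VFS_NEXT_FS_CAPABILITIES(h) | TOY_CAP_DUP_EXTENTS;
}

static const struct toy_vfs_fns toy_default_fns = { toy_default_caps };
static const struct toy_vfs_fns toy_passthrough_fns = { NULL };
static const struct toy_vfs_fns toy_cow_fns = { toy_cow_caps };
```

Chaining cow -> passthrough -> default and calling toy_vfs_fs_capabilities() on the head returns both flags: the passthrough module is skipped because it leaves the pointer NULL.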
2.1.7 TODO ask about version bump in vfs.h
2.1.8 DONE ask semantic of fid_volatile
file handles are the same as the ones returned by smb2_open calls, for example.
<ddiss> I think _volatile refers to the characteristics of the handle (it changes across open/close, rather than being like an inode number)
2.1.9 DONE use ddiss tests on his wip_fsctl_dup_extents branch
2.1.10 TODO run them on windows 10 & fix them
the ioctl is made on the target file (the file that will be copied into); fid_volatile identifies the source file.
2.1.11 tevent, send/recv
all io is done asynchronously. the _send functions are called first and they register, via tevent, a callback to run when the request completes.
- _send sets up callbacks
- _done is the callback
- _recv returns the results and abstracts away the request's internal data; it is usually called by the caller of _send, from its own completion callback
2.1.12 tevent_req_post(req, ev)
Finish a request before the caller had the chance to set the callback.
An implementation of an async request might find that it can either finish the request without waiting for an external event, or it can not even start the engine. To present the illusion of a callback to the user of the API, the implementation can call this helper function which triggers an immediate event. This way the caller can use the same calling conventions, independent of whether the request was actually deferred.
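To convince myself of the ordering, here is a tiny imitation of the _send/_done/_recv pattern with a "post" helper. It does not use tevent at all: the toy_* types and functions are invented for this sketch and only mimic the behaviour described above (the request is finished inside _send, but the caller's callback still runs later from the loop).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PENDING 8

struct toy_req;
typedef void (*toy_callback_fn)(struct toy_req *req);

struct toy_req {
	int done;
	int result;
	toy_callback_fn callback;
	void *callback_data;
};

struct toy_loop {
	struct toy_req *pending[TOY_MAX_PENDING];
	size_t count;
};

/* Same idea as tevent_req_post(): the request is already finished, but the
 * caller's callback is deferred until the event loop runs. */
static struct toy_req *toy_req_post(struct toy_req *req, struct toy_loop *ev)
{
	assert(ev->count < TOY_MAX_PENDING);
	ev->pending[ev->count++] = req;
	return req;
}

/* _send: start the work. In this toy it completes immediately. */
static struct toy_req *toy_add_send(struct toy_req *req, struct toy_loop *ev,
				    int a, int b)
{
	req->result = a + b;
	req->done = 1;
	req->callback = NULL;
	req->callback_data = NULL;
	return toy_req_post(req, ev);
}

/* The caller sets its callback after _send has returned. */
static void toy_req_set_callback(struct toy_req *req, toy_callback_fn fn,
				 void *data)
{
	req->callback = fn;
	req->callback_data = data;
}

/* _recv: hide the request internals and hand back the result. */
static int toy_add_recv(struct toy_req *req, int *sum)
{
	if (!req->done) {
		return -1;
	}
	*sum = req->result;
	return 0;
}

/* The caller's completion callback ("_done"): collect the result via _recv. */
static void toy_add_done(struct toy_req *req)
{
	int *out = req->callback_data;
	toy_add_recv(req, out);
}

/* Run every posted request's callback once, then empty the queue. */
static void toy_loop_run(struct toy_loop *ev)
{
	for (size_t i = 0; i < ev->count; i++) {
		struct toy_req *req = ev->pending[i];
		if (req->callback != NULL) {
			req->callback(req);
		}
	}
	ev->count = 0;
}
```

Calling toy_add_send(), then toy_req_set_callback(), then toy_loop_run() delivers the result only in the last step, so the caller uses the same convention whether or not the request was really deferred.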
2.1.13 Tests
source4/torture/smb2/ioctl.c
2.1.14 SMB2 call
to do a network call:
struct smb2_create io;
NTSTATUS status;

io.in.xxx = xxx;
io.in.xxx = xxx;
io.in.xxx = xxx;
status = smb2_create(tree, mem_ctx, &io);
// result in io.out
2.1.15 test_setup_create_fill()
create a remote file of <size>, filled with 0.
the created file has a fid which is available after the call.
2.2 TODO kernel cifs bug bnc#799133
The setup on LUTZE (Windows Server 2012) is the following.
C:\sspshare (shared as "sspshare")
|
+ dir1
  |
  + dir11 (only accessible dir for LURCH\bill)
- With the permissions set on the server, neither C:\sspshare nor C:\sspshare\dir1 is accessible to LURCH\bill. LURCH\bill only has full access to sspshare\dir1\dir11.
- dir11 should be mountable using bill's credentials, but it's not:
mount.cifs kernel mount options: ip=10.160.5.42,unc=\\LUTZE\sspshare,user=bill,,domain=LURCH,prefixpath=dir1/dir11,pass=********
mount error(2): No such file or directory
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
patch @ bso8950_mount_restricted_root_subdir branch on git://git.samba.org/ddiss/linux.git
When mounting the remote share HOST/share/sub/path, the CIFS client sends listing requests for each of the path components (/share/*, /share/sub/*, /share/sub/path/*).
Since in this case the user only has access to the innermost subpath and not to the paths above it, the first listing request (/share/*) fails and returns ENOENT ("no such file or dir"). The CIFS client aborts there.
On the other hand, using smbclient I can cd to share/sub/path and ls there. It directly sends a listing request for share/sub/path/*, which succeeds:
smbclient //LUTZE/sspshare/ -U 'LURCH\bill%xxxx'
Domain=[LURCH] OS=[Windows Server 2012 Standard 9200] Server=[Windows Server 2012 Standard 6.2]
smb: \> cd dir1/dir11
smb: \dir1\dir11\> ls
  .                                   D        0  Tue Sep  2 04:57:22 2014
  ..                                  D        0  Tue Sep  2 04:57:22 2014
  dir111                              D        0  Tue Sep  2 04:57:29 2014
  file111                             A       20  Tue Sep  2 04:57:22 2014

        34824 blocks of size 2097152. 19174 blocks available
On the wire, we see a FIND_FIRST2 on dir1/dir11/
SMB 198 Trans2 Request, FIND_FIRST2, Pattern: \dir1\dir11\*
SMB 586 Trans2 Response, FIND_FIRST2, Files: . .. dir111 file111
The patch is supposed to create a disconnected ("fake") directory for each component of the path the CIFS client cannot access. But the patch looks for access/permission errors (EACCES). In my setup, when permissions are insufficient the server sends "no such file or directory" instead. In other words the patch code is never triggered.
I've tried changing the error check to also trigger on ENOENT.
-if (IS_ERR(dentry) && PTR_ERR(dentry) == -EACCES && *s) {
+if (IS_ERR(dentry) && (PTR_ERR(dentry) == -EACCES || PTR_ERR(dentry) == -ENOENT) && *s) {
 	dentry = create_root_dis_dentry(sb, rinode, full_path);
 }
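To make the effect of that changed check concrete, here is a userspace toy model of the component walk. It is not the kernel code: toy_lookup() is a made-up stand-in for the server (only the full path dir1/dir11 is visible, everything else returns -ENOENT as in my setup), and "tolerating" an error simply skips the component instead of creating a disconnected dentry like the real patch does.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for the server: only "dir1/dir11" is visible. */
static int toy_lookup(const char *path, size_t len)
{
	static const char visible[] = "dir1/dir11";
	size_t vlen = sizeof(visible) - 1;

	if (len == vlen && memcmp(path, visible, vlen) == 0) {
		return 0;
	}
	return -ENOENT;
}

/* Original check tolerates -EACCES only; the modified one also -ENOENT. */
static int toy_is_tolerated(int err, int also_enoent)
{
	return err == -EACCES || (also_enoent && err == -ENOENT);
}

/* Look up each leading part of the prefix path, left to right.
 * Returns 0 on success or the first error that is not tolerated. */
static int toy_walk_prefix(const char *prefix, int also_enoent)
{
	size_t len, i;

	assert(prefix != NULL);
	len = strlen(prefix);
	for (i = 1; i <= len; i++) {
		int more_to_come = (i < len); /* plays the role of "*s" */
		int rc;

		if (more_to_come && prefix[i] != '/') {
			continue; /* still in the middle of a component */
		}
		rc = toy_lookup(prefix, i);
		if (rc != 0 && !(more_to_come && toy_is_tolerated(rc, also_enoent))) {
			return rc;
		}
	}
	return 0;
}
```

With also_enoent set to 0 the walk of "dir1/dir11" stops at dir1 with -ENOENT, which matches the failed mount; with it set to 1 the intermediate failure is tolerated and the walk reaches dir11.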
It mounts successfully but listing the directory I'm supposed to have access to still fails.
$ ls /mnt
Status code returned 0xc0000022 NT_STATUS_ACCESS_DENIED
ls: reading directory /mnt: Permission denied
And on the wire:
SMB 154 Trans2 Request, FIND_FIRST2, Pattern: \*
SMB 138 Trans2 Response, FIND_FIRST2, Error: STATUS_ACCESS_DENIED
Now if I try to ls /mnt/dir1/dir11 (which is incorrect because we mounted dir1/dir11 on /mnt):
$ ls /mnt/dir1/dir11
Status code returned 0xc0000034 NT_STATUS_OBJECT_NAME_NOT_FOUND
ls: cannot access /mnt/dir1/dir11: No such file or directory
On the wire:
SMB 154 Trans2 Request, QUERY_PATH_INFO, Query File All Info, Path: \dir1
SMB 105 Trans2 Response, QUERY_PATH_INFO, Error: STATUS_OBJECT_NAME_NOT_FOUND
It seems CIFS walks along the requested path and queries each component in turn, stopping at the first one that fails.
If we try to list /mnt/xxx with user bill, the QUERY_PATH_INFO is done on \xxx. The expected query would be \dir1\dir11\xxx. Doing the same as administrator does the right thing and queries \dir1\dir11\xxx.
Conclusion: somehow the prefixpath is missing when doing queries.
Either:
- something is not set up right at mount time
- something is not done properly when doing a lookup
- both
The actual lookup is done in cifs_lookup() in dir.c.
cifs_get_root() loops over each element of the path and calls lookup_one_len(), which calls cifs_lookup().
If an error occurs (an intermediate element of the path cannot be accessed), the patch is triggered: it fetches the final element inode and makes the corresponding dentry.
Problem: instead of returning a dentry with a name of dir11, cifs_get_root() returns a dentry named / (IIUC, because it's disconnected). During the listing of the directory, build_path_from_dentry() is called with this dentry and builds an empty path instead of /dir1/dir11. This empty path is used to issue the incorrect FindFirst packet.
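Here is a toy version of "build a path by walking the parent pointers", which shows why a disconnected dentry gives an empty path. It is not the CIFS build_path_from_dentry(): struct toy_dentry and toy_build_path() are invented for the sketch, and the only thing borrowed from the kernel is the convention that a root dentry is its own parent.

```c
#include <assert.h>
#include <string.h>

struct toy_dentry {
	const char *name;
	struct toy_dentry *parent; /* a root points to itself */
};

/* Fill buf with the path from the root down to d, built right to left.
 * Returns 0 on success, -1 if buf is too small. */
static int toy_build_path(const struct toy_dentry *d, char *buf, size_t len)
{
	size_t pos;

	assert(len > 0);
	pos = len - 1;
	buf[pos] = '\0';
	while (d->parent != d) {
		size_t n = strlen(d->name);

		if (pos < n + 1) {
			buf[0] = '\0';
			return -1;
		}
		pos -= n;
		memcpy(buf + pos, d->name, n);
		buf[--pos] = '/';
		d = d->parent;
	}
	/* Move the finished string to the start of the buffer. */
	memmove(buf, buf + pos, len - pos);
	return 0;
}
```

For a connected dir11 -> dir1 -> root chain this produces /dir1/dir11. For a disconnected dentry, which is its own parent, the loop never runs and the result is the empty string, the same symptom as in the FindFirst request above.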
The commit that introduced the bug made a big change: the superblock must now always point to the root of the share, which is why we create a new dentry for each intermediate path element.
- I think the fix is to create the whole fake hierarchy, and not just the last element?
- Is the patch misusing d_splice_alias()?
- Where are disconnected dentries stored?
The call stack when we ls is the following (getdents64 syscall -> iterate_dir() -> cifs_readdir()).
#0 cifs_readdir (file=0xffff880006ffed00, ctx=0xffff88000005fef0) at fs/cifs/readdir.c:770
#1 0xffffffff810dff40 in iterate_dir (file=0xffff880006ffed00, ctx=0xffff88000005fef0) at fs/readdir.c:42
#2 0xffffffff810e0471 in SYSC_getdents64 (count=<optimized out>, dirent=<optimized out>, fd=<optimized out>)
2.2.1 DONE test patch on master kernel
Doesn't work.
2.2.2 DONE update bugzilla
2.2.3 DONE VFS inner workings
Each struct super_block has a generic (void*) "super block info" ptr. CIFS uses struct cifs_sb_info.
- superblock has metadata of the FS.
- struct file: instance created with each open() call.
- superblock has pointer to root dentry.
- struct dentry: these are the fs hierarchy (they hold file names & parent/children pointers to other dentries & pointer to inode)
- struct inode: metadata on file
Multiple dentries can point to the same inode.
real_<syscall>() does the VFS switching.
struct dentry {
	struct hlist_bl_node d_hash;	/* lookup hash list */
	struct dentry *d_parent;	/* parent directory */
	struct qstr d_name;
	struct inode *d_inode;		/* Where the name belongs to - NULL is
					 * negative */
	const struct dentry_operations *d_op;
	struct super_block *d_sb;	/* The root of the dentry tree */
	struct list_head d_lru;		/* LRU list */
	struct list_head d_child;	/* child of parent list XXX: should be called siblings! */
	struct list_head d_subdirs;	/* our children XXX: all children, including files */
};

struct inode {
	umode_t i_mode;
	unsigned short i_opflags;
	kuid_t i_uid;
	kgid_t i_gid;
	unsigned int i_flags;
	const struct inode_operations *i_op;
	struct super_block *i_sb;
	struct address_space *i_mapping;

	/* Stat data, not accessed from path walking */
	unsigned long i_ino;
	/*
	 * Filesystems may only read i_nlink directly. They shall use the
	 * following functions for modification:
	 *
	 * (set|clear|inc|drop)_nlink
	 * inode_(inc|dec)_link_count
	 */
	union {
		const unsigned int i_nlink;
		unsigned int __i_nlink;
	};
	dev_t i_rdev;
	loff_t i_size;
	struct timespec i_atime;
	struct timespec i_mtime;
	struct timespec i_ctime;
	spinlock_t i_lock;	/* i_blocks, i_bytes, maybe i_size */
	unsigned short i_bytes;
	unsigned int i_blkbits;
	blkcnt_t i_blocks;

	/* Misc */
	unsigned long i_state;
	struct mutex i_mutex;
	unsigned long dirtied_when;	/* jiffies of first dirtying */
	unsigned long dirtied_time_when;
	struct hlist_node i_hash;
	struct list_head i_wb_list;	/* backing dev IO list */
	struct list_head i_lru;		/* inode LRU list */
	struct list_head i_sb_list;
	union {
		struct hlist_head i_dentry;
		struct rcu_head i_rcu;
	};
	u64 i_version;
	atomic_t i_count;
	atomic_t i_dio_count;
	atomic_t i_writecount;
	const struct file_operations *i_fop;	/* former ->i_op->default_file_ops */
	struct file_lock_context *i_flctx;
	struct address_space i_data;
	struct list_head i_devices;
	union {
		struct pipe_inode_info *i_pipe;
		struct block_device *i_bdev;
		struct cdev *i_cdev;
		char *i_link;
	};
};

struct file {
	union {
		struct llist_node fu_llist;
		struct rcu_head fu_rcuhead;
	} f_u;
	struct path f_path;
	struct inode *f_inode;	/* cached value */
	const struct file_operations *f_op;
	/*
	 * Protects f_ep_links, f_flags.
	 * Must not be taken from IRQ context.
	 */
	spinlock_t f_lock;
	atomic_long_t f_count;
	unsigned int f_flags;
	fmode_t f_mode;
	struct mutex f_pos_lock;
	loff_t f_pos;
	struct fown_struct f_owner;
	const struct cred *f_cred;
	struct file_ra_state f_ra;
	u64 f_version;
	/* needed for tty driver, and maybe others */
	void *private_data;
	struct address_space *f_mapping;
};
2.2.4 Debugging CIFS in gdb
- Setup
$ xsamba kernel run
$ gdb vmlinux
(gdb) target remote localhost:1234
- Superblock fs type list
(gdb) set $sb1 = $container_of(superblocks, "struct super_block", "s_list")
(gdb) p $sb1->s_id
$3 = "tmpfs", '\000' <repeats 26 times>
(gdb) set $sb2 = $container_of(superblocks->next->next, "struct super_block", "s_list")
(gdb) p $sb2->s_id
$4 = "rootfs", '\000' <repeats 25 times>
(gdb) set $sb3 = $container_of(superblocks->next->next->next, "struct super_block", "s_list")
(gdb) p $sb3->s_id
$5 = "bdev", '\000' <repeats 27 times>
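For reference, this is what the container_of trick used in that gdb session boils down to in plain C. The toy_* names are mine, and the struct is a deliberately reduced stand-in for struct super_block (field order and sizes are not the kernel's); the point is only the pointer arithmetic from an embedded list node back to its enclosing struct.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal doubly linked list node, in the style of the kernel's list_head. */
struct toy_list_head {
	struct toy_list_head *next, *prev;
};

/* Subtract the member's offset from the member's address to get the
 * address of the enclosing struct. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-in for struct super_block; layout is illustrative only. */
struct toy_super_block {
	char s_id[32];
	struct toy_list_head s_list;
};

/* Given a node of the superblock list, return the superblock holding it. */
static struct toy_super_block *toy_sb_from_list(struct toy_list_head *node)
{
	assert(node != NULL);
	return toy_container_of(node, struct toy_super_block, s_list);
}
```

Given the address of sb.s_list, toy_sb_from_list() returns the address of sb itself, which is what lets the session above print s_id from a bare list pointer.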
2.2.5 Docs
- http://web.archive.org/web/20121029133332/http://thecoffeedesk.com/geocities/rkfs.html
- http://www.win.tue.nl/~aeb/linux/vfs/
- http://unix.stackexchange.com/questions/4402/what-is-a-superblock-inode-dentry-and-a-file
- http://www.ibm.com/developerworks/library/l-virtual-filesystem-switch/
- http://web.archive.org/web/20150505112327/http://www.ibm.com/developerworks/linux/library/l-linux-filesystem/
- http://www.tldp.org/LDP/tlk/fs/filesystem.html
- http://www.makelinux.net/books/lkd2/ch12lev1sec7
2.2.6 How was the bug introduced
The bug was introduced with commit f87d39d.
Author: Steve French <sfrench@us.ibm.com>
Date:   Fri May 27 03:50:55 2011 +0000

    [CIFS] Migrate from prefixpath logic

    Now we point superblock to a server share root and set a root dentry appropriately. This let us share superblock between mounts like //server/sharename/foo/bar and //server/sharename/foo further.

    Reviewed-by: Jeff Layton <jlayton@redhat.com>
    Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
    Signed-off-by: Steve French <sfrench@us.ibm.com>
2.2.7 What the patch does
the cifs superblock info has a new linked list of "disconnected root dentries"
struct list_head rtdislist;	/* list of disconnected root dentries */
spinlock_t rtdislock;		/* lock for disconnected root dentry list */
Quoting Shirish Pargaonkar 2014-03-12 17:42:32 UTC:
If during mounting a share, if the share path is accessible but if any of the intermediate paths to the share is inaccessible i.e. returns error EACCES,
- get inode info for the share path using query path info
- look for a dentry and if not found, add - using d_obtain_alias()
- if server does not support unique ids/inode numbers, add to the beginning of the list maintained in superblock, an element consisting of full path, pointer to this dentry, and a pointer to the inode.
The implication that intermediate share paths are inaccessible -> return EACCES is false.
During lookup, for a case where server does support unique ids/inode numbers,
- obtain inode info using query path info
- splice (d_splice_alias()) the dentry corresponding to this inode if any. if spliced, remove the element from the list maintained in the superblock.
for a case where server does not support unique id/inode number,
- search for an entry for inode based on the full path if no such entry, obtain inode info using query path info
- splice (d_splice_alias()) the dentry corresponding to this inode if any if successful, remove the element from the list maintained in the superblock.
That is one way an element will come off of the list maintained in the superblock. If the dentry does not get ever get spliced, whenever either it is deleted with zero d_count (unmount e.g.) or it is released (superblock is being deleted), cifs d_delete and cifs_d_release respectively, whichever applies first, will take the element off the list and free it.
If the superblock is intact and anonymous root dentry does not get deleted during unmount (d_count is not zero i.e. has child dentries with reference to the parent) i.e. d_delete does not get called, the dentries would remain till vfs code deletes them as needed (as their d_count starts approaching zero).
Whenever we have the same mount again with a anonymous dentry, the elements are added to the head of the list maintained in the superblock, so stale elements will not be reachable.
2.2.8 TODO grok what d_obtain_alias() does
2.2.9 TODO grok what d_splice_alias() does
2.2.10 TODO build_path_from_dentry, rkfs
$ mkdir -p /tmp/x
$ touch /tmp/x/{a,b,c}
$ strace ls /tmp/x
stat("/tmp/x", {st_mode=S_IFDIR|0755, st_size=33, ...}) = 0
open("/tmp/x", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=33, ...}) = 0
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
getdents64(3, /* 5 entries */, 4096) = 120
lstat("/tmp/x/a", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
lstat("/tmp/x/b", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
lstat("/tmp/x/c", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
getdents64(3, /* 0 entries */, 4096) = 0
close(3) = 0
open() calls:
- file_operations->open()
- there's also inode_operations->create()