Exploring Samba's new registry-based configuration
Samba's new registry-based configuration system conserves resources and lets the administrator configure entire clusters with a single command.
As most Linux users know, Samba is an open source file and print system that provides interoperability with Windows environments. For more than 15 years, Samba has managed configuration settings through the plain-text file smb.conf. With the release of Samba 3.2.0 in July 2008, plain-text configuration is no longer the only option. A new configuration back end can store the configuration data in Samba's internal registry database. The default behavior is the same as before, but if you explicitly enable registry-based configuration through smb.conf, you can manage your Samba settings through a Windows-style registry.
Registry-based configuration opens many new options, such as remote administration and administration of Samba from Windows computers.
Why a Registry?
Samba has always maintained a registry database so that Windows clients could access the registry over the WINREG RPC interface to retrieve information for the connection. (Figure 1 shows WINREG access of a Windows client connecting to a Windows server.) Until recently, however, Samba did not use the registry for any other purposes.
The conventional text-based configuration system (see the box titled "Traditional Samba") is very flexible, very convenient, and very easy to use. In fact, you can configure your Samba implementation with nothing more than your favorite text editor. However, some special use cases create new demands that reduce the effectiveness of the text configuration model.
Samba's traditional plain-text configuration uses the very simple syntax of INI files well known to Windows users. This approach is also used by several Linux/Unix programs.
The smb.conf configuration file consists of sections and parameters. The file format is line oriented: Each line represents a comment, a configuration parameter, or the beginning of a new section.
- Lines starting with a semicolon (;) or a hash sign (#) are comment lines.
- A new section is initiated by a name in brackets, for example [share1].
- A parameter consists of a parameter name and the assigned parameter value, separated by an equals sign (=), and it is associated with the last section started. The value can be a boolean value (yes/no), an integer value, or a general string.
Sections can be started multiple times, and the same parameter can be specified several times. The last occurrence of a parameter in one section overrides earlier instances of it in the same section.
The [global] section has a special meaning. Whereas all other sections define shared resources, the [global] section contains the parameters that control the overall behavior of the Samba daemon and the default values for the share parameters. The two other share names that have a special meaning are [homes] and [printers]. These sections dynamically create shares when they are configured appropriately. The manual page of the smb.conf configuration file has all the details on the syntax and the semantics.
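The syntax rules above can be sketched in a few lines of Python. This is a simplified illustration, not Samba's actual parser: the function name parse_smb_conf is made up for this example, and details such as line continuation and macro expansion are ignored.

```python
def parse_smb_conf(text):
    """Parse smb.conf-style text into {section: {parameter: value}}."""
    config = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines starting with ';' or '#'.
        if not line or line[0] in ";#":
            continue
        if line.startswith("[") and line.endswith("]"):
            # A section may be started several times; reuse its existing dict.
            current = config.setdefault(line[1:-1].strip().lower(), {})
        elif "=" in line and current is not None:
            name, value = line.split("=", 1)
            # The last occurrence of a parameter within a section wins.
            current[name.strip().lower()] = value.strip()
    return config


sample = """
[global]
log level = 1
; a comment
[share1]
path = /data/share1
[global]
log level = 3
"""
parsed = parse_smb_conf(sample)
print(parsed["global"]["log level"])
print(parsed["share1"]["path"])
```

Because the [global] section is started twice in the sample, the second log level setting overrides the first.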
The text configuration can be structured with the meta directives include and config file. Whereas config file simply drops all configuration data read so far and switches the configuration source to the specified file, the include directive builds up a whole tree of config files. An include file is parsed when its include statement is encountered within the parent file. The options read from the included file are activated in place of the include statement. This means that options read before the include can be overwritten by parameters inside the include; on the other hand, parameters set inside the include can be overwritten by parameters that occur after the include.
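The ordering effect of include can be illustrated with a small Python sketch. It works on a single section and on in-memory "files"; the file names and the helper load are hypothetical and only serve to show the order in which values are applied.

```python
def load(name, files, params=None):
    """Apply 'name = value' lines from files[name] in order, expanding includes in place."""
    params = {} if params is None else params
    for line in files[name].splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "include":
            # The included file is parsed at the point of the include statement.
            load(value, files, params)
        else:
            params[key] = value
    return params


files = {
    "smb.conf": "log level = 1\nmax log size = 100\ninclude = extra.conf\nmax log size = 500",
    "extra.conf": "log level = 3\nmax log size = 200",
}
result = load("smb.conf", files)
print(result)
```

Here the included file overrides log level, which was set before the include, while the max log size line that follows the include overrides the value from the included file.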
An interesting dynamic aspect is added with the possibility of expanding run-time macros in the configuration file. Table 1 shows some of the most important macros. These macros are useful for tasks such as configuring individual logging for specific clients.
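As a rough illustration of macro expansion, the following Python sketch substitutes values for single-letter macros. The macro meanings assumed here (%I for the client's IP address, %m for the client's NetBIOS name) follow the smb.conf manual page; the helper expand_macros, the address, and the client name are invented for the example.

```python
import re


def expand_macros(value, context):
    """Replace %X macros with values from context; unknown macros are left untouched."""
    return re.sub(r"%(\w)", lambda m: context.get(m.group(1), m.group(0)), value)


# Hypothetical connection data for one client.
context = {"I": "192.168.1.17", "m": "client17"}
include_path = expand_macros("/etc/samba/smb.conf.%I", context)
log_path = expand_macros("/var/log/samba/log.%m", context)
print(include_path)
print(log_path)
```

Each client connection supplies its own context, which is how a single configuration line can yield a separate include file or log file per client.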
Listing 1 shows an example configuration file that demonstrates sections, comments, includes, and macros. It should be noted that the server reads the complete configuration at startup and builds up the list of shares. Then the server periodically checks to see whether one of the configuration files has changed and, if so, reloads the whole configuration.
Listing 1: Example smb.conf
[global]
    netbios name = fileserver
    workgroup = samba
    passdb backend = user

    ; debugging options
    log level = 3
    max log size = 10000
    debug hires timestamp = yes

    # include client specific configuration
    include = /etc/samba/smb.conf.%I

[homes]
    valid users = %S
    browseable = no
    writeable = yes

[share1]
    path = /data/share1
    read only = no
One disadvantage that often appears on large networks is memory consumption: A client is only connected to one share, but the Samba daemon process that serves this connection builds up the entire list of shares all the same. On a 64-bit machine, a share structure internally consumes about 1KB of memory. So when the configuration contains a thousand shares, each smbd daemon process wastes around 1MB of RAM. When a thousand clients are connected to the server, 1GB of RAM is wasted.
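The estimate can be reproduced with a few lines of Python. The figures are the approximate ones from the text, and the calculation assumes one smbd process per connected client.

```python
SHARE_STRUCT_BYTES = 1024  # roughly 1KB per share structure (figure from the text)
shares = 1000
clients = 1000             # assumption: one smbd process per connected client

per_process = SHARE_STRUCT_BYTES * shares
total = per_process * clients
print(f"per smbd process: {per_process / 2**20:.2f} MiB")
print(f"all processes:    {total / 2**30:.2f} GiB")
```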
Worse still, when the configuration file is changed, a thousand smbd processes will start re-reading smb.conf, which will be some 250KB or more in size. All these near-simultaneous read operations will have a noticeable effect not only on the I/O performance of the server, but also on the CPU load, because the thousand shares read from the file must be compared with and integrated into the internal list of shares that was constructed when the configuration was first read. These performance problems have hit production servers in the past, and admins have created workarounds, but these improvised solutions are just band-aids.
Usability issues include:
- One always needs to read and write the whole file; Samba cannot access individual parameters or sections.
- The system offers no protection against data corruption on simultaneous write access. This problem arises for creators of graphical or text-based configuration tools.
- In clustered environments, one has to copy the configuration files to all cluster nodes.
These disadvantages disappear when the configuration parameters are stored in a database. The cluster configuration would suggest a Trivial Database (TDB) structure, which could then be distributed to all cluster nodes automatically by CTDB. The approach that was started in 2007 and officially released in July 2008 with Samba 3.2 is to use Samba's registry database, because its data model fits the smb.conf schema of sections and parameters nicely. Samba stores its registry in a TDB database called registry.tdb.
The Windows Registry
The registry organizes its data in a tree structure of keys: a registry key consists of a name, a list of its subkeys, and a list of its registry values. A value comprises a name and the value data. Samba now stores the configuration data in the registry key HKLM\Software\Samba\smbconf, which is sometimes called the smbconf key. The sections in an smb.conf file correspond to the subkeys in this key; the configuration parameters match the registry values in the corresponding subkey.
One should imagine the contents of the smbconf key as a parsed but not otherwise interpreted or activated smb.conf file. Just like the smb.conf file syntax, the registry does not differentiate between the global section and the share definitions. Because the registry is a database with keys and values, it does not support a fixed order or multiple parameter records.
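The mapping from sections and parameters to keys and values might be pictured as in the following Python sketch. It only models the data layout; the function to_registry and the placeholder base key name "smbconf" are made up for illustration and do not reflect how Samba accesses registry.tdb.

```python
def to_registry(sections, base_key="smbconf"):
    """Map {section: [(name, value), ...]} to {subkey path: {value name: data}}."""
    tree = {}
    for section, params in sections.items():
        # Each section becomes a subkey; each parameter becomes a registry value.
        values = tree.setdefault(base_key + "\\" + section, {})
        for name, data in params:
            values[name] = data  # a value name is unique within a key
    return tree


sections = {
    "global": [("log level", "1"), ("log level", "3")],
    "share1": [("path", "/data/share1"), ("read only", "no")],
}
tree = to_registry(sections)
for key, values in tree.items():
    print(key, values)
```

The duplicate log level entry collapses into a single value, which mirrors the point that a key/value store cannot hold multiple records for the same parameter.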
The include statements need special treatment – they are not configuration parameters as such, but meta directives that are activated at the time of parsing. An include directive can be specified multiple times with different values, and the order and placement of the real parameters are very important in this case. Because the parameters stored in the registry do not have a guaranteed order, registry configuration can only offer a compromise with respect to the treatment of include statements: For registry configuration, Samba maintains one ordered list of include statements per section that is only evaluated at the end of the section, after the parameters have been activated.
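One way to picture this compromise is the Python sketch below. It assumes that includes evaluated at the end of a section can override the parameters stored for that section; the function activate_section and the file names are hypothetical, and the included files are modeled as ready-parsed dictionaries.

```python
def activate_section(values, includes, files):
    """Activate a registry section: stored parameters first, then includes in list order."""
    active = dict(values)
    for name in includes:
        # Includes run after the section's own parameters have been activated.
        active.update(files[name])
    return active


files = {
    "a.conf": {"log level": "2"},
    "b.conf": {"log level": "5", "max log size": "50"},
}
active = activate_section(
    {"log level": "1", "read only": "no"}, ["a.conf", "b.conf"], files
)
print(active)
```

Only the order of the include list matters here; the position of an include relative to the other parameters of the section is no longer expressible.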
Activating Registry Configuration
In current Samba code, the smb.conf text configuration file is still the initial source of configuration. The administrator can enable the registry configuration in three stages, always starting with smb.conf.
The parameter registry shares = yes in the [global] section of smb.conf tells Samba to read share definitions from the registry. In contrast to the text-based configuration, smbd does not read all registry-based share definitions when launching; it only picks up the individual shares at run time – a share is loaded just as the client connects to it.
This approach solves the memory consumption problem. If a share is defined in the registry and also in smb.conf, these definitions are not merged, and the text-based variant wins. So when registry shares are active, it is advisable not to use text-defined shares at all in order to avoid confusion.
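The lookup behavior described here can be modeled roughly as follows. The class ShareTable is a made-up illustration of the idea, with plain dictionaries standing in for smb.conf and the registry database; it is not how smbd is implemented.

```python
class ShareTable:
    """Toy model: text shares are loaded at startup, registry shares on demand."""

    def __init__(self, text_shares, registry):
        self.loaded = dict(text_shares)  # everything from smb.conf is in memory
        self.registry = registry         # registry shares stay in the database

    def connect(self, name):
        # A text-defined share wins; otherwise fetch from the registry on first use.
        if name not in self.loaded:
            if name not in self.registry:
                raise KeyError(name)
            self.loaded[name] = self.registry[name]
        return self.loaded[name]


registry = {
    "share1": {"path": "/registry/share1"},
    "share2": {"path": "/registry/share2"},
}
table = ShareTable({"share1": {"path": "/text/share1"}}, registry)
print(table.connect("share1")["path"])
print(table.connect("share2")["path"])
print(sorted(table.loaded))
```

Only the shares that a client actually connects to end up in the process's memory, and the duplicate share1 resolves to the text definition without any merging.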
The directive include = registry in the [global] section of the smb.conf file has a special new meaning: it mixes global options from the registry with the text-based configuration. This setup is frequently used for clustering. The full semantics of the include statement are kept for the global options: parameters that occur before include = registry in smb.conf can be overwritten by registry parameters, and global parameters from the registry can be overwritten by global parameters from smb.conf that are specified after include = registry.
Setting include = registry implicitly activates registry shares. Be careful when you mix text and registry-based configurations – for example, the lock directory parameter is a sure-fire way of telling the registry to shoot itself in the foot because, among other things, it sets the location where Samba looks for its registry database.
The recommended way of using include = registry is to have an initial portion of configuration in the [global] section of smb.conf and conclude the section and the file with include = registry. Listing 2 shows a minimal configuration for use in clustered environments.
Listing 2: smb.conf for a Cluster
[global]
    clustering = yes
    include = registry
Just as config file switches the whole configuration to another file, admins can use a new meta directive, config backend = registry, in the [global] section of smb.conf to stipulate a registry-only configuration. This directive tells Samba to discard any settings it has parsed from smb.conf and rely entirely on the global parameters read from the registry. Again, the directive implies registry shares = yes.
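The difference between include = registry and config backend = registry for the global options can be summarized in a short Python sketch. The function effective_globals and its arguments are hypothetical; the dictionaries stand for global parameters found before and after the directive in smb.conf and for those stored in the registry.

```python
def effective_globals(before, registry, after=None, config_backend_registry=False):
    """Return the global parameters that end up active under either directive."""
    if config_backend_registry:
        # config backend = registry: discard everything parsed from smb.conf.
        return dict(registry)
    merged = dict(before)
    merged.update(registry)     # registry overrides what came before the include
    merged.update(after or {})  # later text parameters override the registry
    return merged


before = {"clustering": "yes", "log level": "1"}
registry = {"log level": "3", "workgroup": "samba"}
after = {"workgroup": "lab"}

mixed = effective_globals(before, registry, after)
registry_only = effective_globals(before, registry, after, config_backend_registry=True)
print(mixed)
print(registry_only)
```

With include = registry, the text file still contributes settings on both sides of the directive; with config backend = registry, only the registry contents survive.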