doc/manual.docbook
author Paul Crowley <paul@lshift.net>
Thu, 15 Oct 2009 12:47:39 +0100
changeset 163 8d73bcd75243
remove precursors to Docbook manual

<?xml version="1.0" encoding="utf-8"?>
<article xmlns="http://docbook.org/ns/docbook" version="5.0" xml:lang="en"
  xmlns:xlink="http://www.w3.org/1999/xlink">
<info>
  <title>Sharing Mercurial repositories with mercurial-server</title>
  <author><firstname>Paul</firstname><surname>Crowley</surname></author>
  <copyright><year>2009</year><holder>Paul Crowley, LShift Ltd</holder></copyright>
</info>
<section>
<title>About mercurial-server</title>
<para>
Home page: <link xlink:href="http://www.lshift.net/mercurial-server.html"/>
</para>
<para>
mercurial-server gives your developers remote read/write access to
centralized <link xlink:href="http://hg-scm.org/">Mercurial</link>
repositories using SSH public key authentication; it provides convenient
and fine-grained key management and access control.
</para>
<para>
Though mercurial-server is currently targeted at Debian-based systems such
as Ubuntu, other users have reported success getting it running on other
Unix-based systems such as Red Hat. Running it on a non-Unix system such as
Windows is not supported. You will need root privileges to install it.
</para>
</section>
<section>
<title>Step by step</title>
<para>
mercurial-server authenticates users with SSH public keys rather than
passwords; everyone who wants access to a mercurial-server repository
will need such a key. In combination with <command>ssh-agent</command> (or
equivalents such as the Windows program <link
xlink:href="http://the.earth.li/~sgtatham/putty/0.60/htmldoc/Chapter9.html#pageant">Pageant</link>),
this means that users will not need to type in a password to access the
repository. If you're not familiar with SSH public keys, the <link
xlink:href="http://sial.org/howto/openssh/publickey-auth/">OpenSSH Public
Key Authentication tutorial</link> may be helpful.
</para>
<section>
<title>Installing mercurial-server</title>
<para>
In what follows, we assume that your username is <systemitem
class="username">jay</systemitem>, that you usually sit at a machine called
<systemitem class="systemname">spoon</systemitem>, and that you wish to
install mercurial-server on <systemitem
class="systemname">jeeves</systemitem>. We also assume that you have created
your SSH public key, set up your SSH agent with this key, and that this key
gives you access to <systemitem class="systemname">jeeves</systemitem>.
</para>
<para>First install mercurial-server on <systemitem
class="systemname">jeeves</systemitem>:</para>
<screen><computeroutput>jay@spoon:~$ </computeroutput><userinput>scp mercurial-server_0.6.1_amd64.deb jeeves:</userinput>
<computeroutput>mercurial-server_0.6.1_amd64.deb 100%
jay@spoon:~$ </computeroutput><userinput>ssh -A jeeves</userinput>
<computeroutput>jay@jeeves:~$ </computeroutput><userinput>sudo dpkg -i mercurial-server_0.6.1_amd64.deb</userinput>
<computeroutput>[sudo] password for jay: 
Selecting previously deselected package mercurial-server.
(Reading database ... 144805 files and directories currently installed.)
Unpacking mercurial-server (from .../mercurial-server_0.6.1_amd64.deb) ...
Setting up mercurial-server (0.6.1) ...
jay@jeeves:~$ </computeroutput></screen>
<para>
mercurial-server is now installed on the repository host.  Next, we need to give you permission to access its repositories.
</para>
<screen><computeroutput>jay@jeeves:~$ </computeroutput><userinput>ssh-add -L > my-key</userinput>
<computeroutput>jay@jeeves:~$ </computeroutput><userinput>sudo mkdir -p /etc/mercurial-server/keys/root/jay</userinput>
<computeroutput>jay@jeeves:~$ </computeroutput><userinput>sudo cp my-key /etc/mercurial-server/keys/root/jay/spoon</userinput>
<computeroutput>jay@jeeves:~$ </computeroutput><userinput>sudo -u hg /usr/share/mercurial-server/refresh-auth</userinput>
<computeroutput>jay@jeeves:~$ </computeroutput><userinput>exit</userinput>
<computeroutput>Connection to jeeves closed.
jay@spoon:~$ </computeroutput></screen>
<para>
You can now create repositories on the remote machine and have complete
read-write access to all of them.
</para>
</section>
<section>
<title>Creating repositories</title>
<para>
To store a repository on the server, clone it over.
</para>
<screen><computeroutput>jay@spoon:~$ </computeroutput><userinput>cd myproj</userinput>
<computeroutput>jay@spoon:~/myproj$ </computeroutput><userinput>hg clone . ssh://hg@jeeves/jays/project</userinput>
<computeroutput>searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 119 changesets with 284 changes to 61 files
jay@spoon:~/myproj$ </computeroutput><userinput>hg pull ssh://hg@jeeves/jays/project</userinput>
<computeroutput>pulling from ssh://hg@jeeves/jays/project
searching for changes
no changes found
jay@spoon:~/myproj$ </computeroutput><userinput>cd ..</userinput>
<computeroutput>jay@spoon:~$ </computeroutput></screen>
</section>
<section>
<title>Adding other users</title>
<para>
At this stage, no-one but you has any access to any repositories you
create on this system. In order to give anyone else access, you'll need a
copy of their SSH public key; we'll assume you have that key in
<filename>~/sam-saucer-key.pub</filename>.  To manage access, you make changes to the special <filename
class='directory'>hgadmin</filename> repository.
</para>
<screen><computeroutput>jay@spoon:~$ </computeroutput><userinput>hg clone ssh://hg@jeeves/hgadmin</userinput>
<computeroutput>destination directory: hgadmin
no changes found
updating working directory
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
jay@spoon:~$ </computeroutput><userinput>cd hgadmin</userinput>
<computeroutput>jay@spoon:~/hgadmin$ </computeroutput><userinput>mkdir -p keys/users/sam</userinput>
<computeroutput>jay@spoon:~/hgadmin$ </computeroutput><userinput>cp ~/sam-saucer-key.pub keys/users/sam/saucer</userinput>
<computeroutput>jay@spoon:~/hgadmin$ </computeroutput><userinput>hg add</userinput>
<computeroutput>adding keys/users/sam/saucer
jay@spoon:~/hgadmin$ </computeroutput><userinput>hg commit -m "Add Sam's key"</userinput>
<computeroutput>jay@spoon:~/hgadmin$ </computeroutput><userinput>hg push</userinput>
<computeroutput>pushing to ssh://hg@jeeves/hgadmin
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
jay@spoon:~/hgadmin$ </computeroutput></screen>
<para>
Sam can now read and write to your
<uri>ssh://hg@jeeves/jays/project</uri> repository.
Most other changes to access control can be made simply by making and
pushing changes to <filename
class='directory'>hgadmin</filename>, and you can use Mercurial to
cooperate with other root users in the normal way.
</para>
<para>
If you prefer, you could instead give other users access by
logging into <systemitem class="systemname">jeeves</systemitem>,
putting their keys in the right place under <filename
class='directory'>/etc/mercurial-server/keys</filename>, and re-running
<userinput>sudo -u hg /usr/share/mercurial-server/refresh-auth</userinput>.
However, using <filename
class='directory'>hgadmin</filename> is usually more convenient if you need to make more than a handful of changes; it also makes it easier to share administration with others and provides a log of all changes.
</para>
</section>
</section>
<section>
<title>Access control</title>
<para>
Out of the box, mercurial-server supports two kinds of users: "root" users and normal users.  If you followed the steps above, you are a "root" user because your key is under <filename class='directory'>keys/root</filename>, while the other user you gave access to is a normal user since their key is under <filename class='directory'>keys/users</filename>.  Keys that are not in either of these directories will by default have no access to anything.
</para>
<para>
Root users can edit <filename
class='directory'>hgadmin</filename>, create new repositories and read and write to existing ones.  Normal users cannot access <filename
class='directory'>hgadmin</filename> or create new repositories, but they can read and write to any other repository.
</para>
<section>
<title>Using access.conf</title>
<para>
mercurial-server offers much more fine-grained access control than this division into two classes of users.  Let's suppose you wish to give Pat access to the <filename
class='directory'>widget</filename> repository, but no other.  We first copy Pat's SSH public key into the <filename
class='directory'>keys/pat</filename> directory in <filename
class='directory'>hgadmin</filename>.  This tells mercurial-server about Pat's key, but gives Pat no access to anything because the key is not under either <filename
class='directory'>keys/root</filename> or <filename
class='directory'>keys/users</filename>.  To grant this key access, we must give mercurial-server a new access rule, so we create a file in <filename
class='directory'>hgadmin</filename> called <filename>access.conf</filename>, with the following contents:</para>
<programlisting># Give Pat access to the "widget" repository
write repo=widget user=pat/*
</programlisting>
<para>
Pat will have read and write access to the <filename
class='directory'>widget</filename> repository as soon as we add, commit, and push these files.
</para>
<para>
Each line of <filename>access.conf</filename> has the following syntax:
</para>
<programlisting><replaceable>rule</replaceable> <replaceable>condition</replaceable> <replaceable>condition...</replaceable>
</programlisting>
<para>
Blank lines and lines that start with <code>#</code> are ignored.
<replaceable>rule</replaceable> is one of:
</para>
<itemizedlist>
<listitem>
<para><literal>init</literal>: allow reads, writes, and the creation of new repositories</para>
</listitem>
<listitem>
<para><literal>write</literal>: allow reads and writes</para>
</listitem>
<listitem>
<para><literal>read</literal>: allow only read operations</para>
</listitem>
<listitem>
<para><literal>deny</literal>: deny all requests</para>
</listitem>
</itemizedlist>
<para>
Each condition gives a glob pattern that is matched against a relative
path. The two most important conditions are:
</para>
<itemizedlist>
<listitem>
<para><code>user=<replaceable>globpattern</replaceable></code>: path to the user's key, relative to the <filename class='directory'>keys</filename> directory</para>
</listitem>
<listitem>
<para><code>repo=<replaceable>globpattern</replaceable></code>: path to the repository</para>
</listitem>
</itemizedlist>
<para>
<code>*</code> only matches one directory level, whereas <code>**</code>
matches as many as you want. More precisely, <code>*</code> matches zero or
more characters not including <code>/</code> while <code>**</code> matches
zero or more characters including <code>/</code>. So
<code>projects/*</code> matches <filename
class='directory'>projects/foo</filename> but not <filename
class='directory'>projects/foo/bar</filename>, while
<code>projects/**</code> matches both.
</para>
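<para>
As an illustration of the difference, the following sketch grants access to
two groups of keys. The directory names <filename
class='directory'>interns</filename> and <filename
class='directory'>staff</filename>, like the <filename
class='directory'>projects</filename> layout, are assumptions made for this
example and are not part of a default installation:
</para>
<programlisting># Keys directly inside keys/interns (for example keys/interns/kim) may
# read projects/foo, but not the nested repository projects/foo/bar
read repo=projects/* user=interns/*
# Keys anywhere below keys/staff may write to repositories at any depth
# below projects/
write repo=projects/** user=staff/**
</programlisting>
<para>
Note that a key stored one level deeper, such as <filename>keys/interns/kim/laptop</filename>, would not match <code>interns/*</code>; a pattern of <code>interns/**</code> would be needed to cover it.
</para>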
<para>
When considering a request, mercurial-server steps through all the rules in
<filename>/etc/mercurial-server/access.conf</filename> and then all the
rules in <filename>access.conf</filename> in <filename
class='directory'>hgadmin</filename>
looking for a rule which matches on every condition. The first match
determines whether the request will be allowed; if there is no match in
either file, the request will be denied.
</para>
<para>
By default, <filename>/etc/mercurial-server/access.conf</filename> has the
following rules:
</para>
<programlisting>init user=root/**
deny repo=hgadmin
write user=users/**
</programlisting>
<para>
These rules ensure that root users can do any operation on any repository,
that no other users can access the <filename
class='directory'>hgadmin</filename> repository,
and that those with keys in <filename
class='directory'>keys/users</filename> can read or write to any repository
but not create repositories.  Some examples of how these rules work:
</para>
<itemizedlist>
<listitem>
<para>User <filename class='directory'>root/jay</filename> creates a repository
<filename class='directory'>foo/bar/baz</filename>. This matches the first
rule and so will be allowed.</para>
</listitem>
<listitem>
<para>User <filename class='directory'>root/jay</filename> changes repository
<filename class='directory'>hgadmin</filename>. Again, this matches the
first rule and so will be allowed; later rules have no effect.</para>
</listitem>
<listitem>
<para>User <filename class='directory'>users/sam</filename> tries to read
repository <filename class='directory'>hgadmin</filename>. This does not
match the first rule, but matches the second, and so will be denied.</para>
</listitem>
<listitem>
<para>User <filename class='directory'>users/sam</filename> tries to create
repository <filename class='directory'>sams-project</filename>. This does
not match the first two rules, but matches the third; this is a
<literal>write</literal> rule, which doesn't grant the privilege to create
repositories, so the request will be denied.</para>
</listitem>
<listitem>
<para>User <filename class='directory'>users/sam</filename> writes to existing
repository <filename class='directory'>projects/main</filename>. Again,
this matches the third rule, which allows the request.</para>
</listitem>
<listitem>
<para>User <filename class='directory'>pat</filename> tries to write to existing
repository <filename class='directory'>widget</filename>. Until we change
the <filename>access.conf</filename> file in <filename
class='directory'>hgadmin</filename>, this will match no rule, and so will
be denied.</para>
</listitem>
<listitem>
<para>Any request from a user whose key is not under the <filename
class='directory'>keys</filename> directory at all will always be denied,
no matter what rules are in effect; because of the way SSH authentication
works, they will be prompted to enter a password, but no password will
work. This can't be changed.</para>
</listitem>
</itemizedlist>
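<para>
Because the first matching rule decides the outcome, the order of rules
matters. As a sketch, suppose keys for outside contractors are kept in a
hypothetical <filename class='directory'>keys/contractors</filename>
directory, and repositories under a hypothetical <filename
class='directory'>payroll</filename> directory must stay hidden from them.
Since these keys are not under <filename
class='directory'>keys/users</filename>, the default <literal>write</literal>
rule does not apply to them, and rules such as these in
<filename>access.conf</filename> in <filename
class='directory'>hgadmin</filename> would decide:
</para>
<programlisting># The specific deny rule must come first...
deny repo=payroll/** user=contractors/**
# ...because this broader rule would otherwise match first and allow the read
read user=contractors/**
</programlisting>
<para>
With the two rules swapped, a contractor's request for <filename
class='directory'>payroll/2009</filename> would match the
<literal>read</literal> rule first, and the <literal>deny</literal> rule
would never be consulted.
</para>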
</section>
<section>
<title>/etc/mercurial-server and hgadmin</title>
<para>
mercurial-server consults two distinct locations to collect information about what to allow: <filename
class='directory'>/etc/mercurial-server</filename> and its own <filename
class='directory'>hgadmin</filename> repository.  This is useful for several reasons:
</para>
<itemizedlist>
<listitem>
<para>Some users may not need the convenience of access control via Mercurial; for these users updating <filename
class='directory'>/etc/mercurial-server</filename> may offer a simpler route.</para>
</listitem>
<listitem>
<para><filename class='directory'>/etc/mercurial-server</filename> is suitable
for management with tools such as <link
xlink:href="http://reductivelabs.com/products/puppet">Puppet</link>.</para>
</listitem>
<listitem>
<para>If a change to <filename
class='directory'>hgadmin</filename> leaves you "locked out", <filename
class='directory'>/etc/mercurial-server</filename> allows you a way back in.</para>
</listitem>
<listitem>
<para>At install time, all users are "locked out", and so some mechanism to allow some users in is needed.</para>
</listitem>
</itemizedlist>
<para>
Rules in <filename>/etc/mercurial-server/access.conf</filename> are checked before those in <filename
class='directory'>hgadmin</filename>, and keys in <filename class='directory'>/etc/mercurial-server/keys</filename> will be present no matter how <filename
class='directory'>hgadmin</filename> changes.
</para>
<para>
We anticipate that once mercurial-server is successfully installed and
working you will usually want to use <filename
class='directory'>hgadmin</filename> for most
access control tasks. Once you have the right keys and
<filename>access.conf</filename> set up in <filename
class='directory'>hgadmin</filename>, you
can delete <filename>/etc/mercurial-server/access.conf</filename> and all
of <filename class='directory'>/etc/mercurial-server/keys</filename>,
turning control entirely over to <filename
class='directory'>hgadmin</filename>.
</para>
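<para>
Before deleting <filename>/etc/mercurial-server/access.conf</filename>, the
<filename>access.conf</filename> in <filename
class='directory'>hgadmin</filename> needs rules that still let
administrators in. One possible starting point, sketched here on the
assumption that you want to keep the default behaviour, is to copy the
default rules listed earlier and to commit the root users' keys under
<filename class='directory'>keys/root</filename> in <filename
class='directory'>hgadmin</filename>:
</para>
<programlisting># access.conf in hgadmin: same behaviour as the default rules
init user=root/**
deny repo=hgadmin
write user=users/**
# site-specific rules would follow here
</programlisting>
<para>
It is prudent to push this change and confirm that you can still reach
<filename class='directory'>hgadmin</filename> before removing anything from
<filename class='directory'>/etc/mercurial-server</filename>.
</para>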
<para>
<filename>/etc/mercurial-server/remote-hgrc</filename> is in the
<systemitem>HGRCPATH</systemitem> for all remote access to mercurial-server
repositories. This file contains the hooks that mercurial-server uses for
access control and logging. You can add hooks to this file, but obviously
breaking the existing hooks will disable the relevant functionality and
isn't advisable.
</para>
</section>
<section>
<title>File and branch conditions</title>
<para>
mercurial-server supports file and branch conditions, which restrict an
operation depending on what files it modifies and what branch the work is
on. </para>
<caution>
<para>
The way these conditions work is subtle and can be counterintuitive. Unless
you need what they provide, ignore this section, stick to user and repo
conditions, and then things are likely to work the way you would expect. If
you do need what they provide, read what follows very carefully.
</para>
</caution>
<para>
File and branch conditions are added to the conditions against which a rule
matches, just like user and repo conditions; they have this form:
</para>
<itemizedlist>
<listitem>
<para><code>file=<replaceable>globpattern</replaceable></code>: file within the repo</para>
</listitem>
<listitem>
<para><code>branch=<replaceable>globpattern</replaceable></code>: Mercurial branch name</para>
</listitem>
</itemizedlist>
<para>
However, in order to understand what effect adding these conditions will
have, it helps to understand how and when these rules are applied.
</para>
<para>
The rules file is used to make three decisions:
</para>
<itemizedlist>
<listitem>
<para>Whether to allow a repository to be created</para>
</listitem>
<listitem>
<para>Whether to allow any access to a repository</para>
</listitem>
<listitem>
<para>Whether to allow a changeset</para>
</listitem>
</itemizedlist>
<para>
When the first two of these decisions are being made, nothing is known
about any changesets that might be pushed, and so all file and branch
conditions automatically succeed for the purpose of such decisions. For the
third decision, every file changed in the changeset must be allowed by a
<literal>write</literal> or <literal>init</literal> rule for the changeset
to be allowed.
</para>
<para>
This means that doing tricky things with file conditions can have
counterintuitive consequences:
</para>
<itemizedlist>
<listitem>
<para>You cannot limit read access to a subset of a repository with a <literal>read</literal>
rule and a file condition: any user who has access to a repository can read
all of it and its full history. Such a rule can only have the effect of
masking a later <literal>write</literal> rule, as in this example:</para>
<programlisting>read repo=specialrepo file=dontwritethis
write repo=specialrepo
</programlisting>
<para>
These rules allow all users to read <literal>specialrepo</literal>, and to write to all files
<emphasis>except</emphasis> that any changeset which writes to
<filename>dontwritethis</filename> will be rejected.
</para>
</listitem>
<listitem>
<para>For similar reasons, don't give <literal>init</literal> rules file conditions.</para>
</listitem>
<listitem>
<para>Don't try to deny write access to a particular file on a particular
branch&#x2014;a developer can write to the file on another branch and then merge
it in. Either deny all writes to the branch from that user, or allow them
to write to all the files they can write to on any branch.
</para>
<programlisting>write user=docs/* branch=docs file=docs/*
</programlisting>
<para>
This rule grants users whose keys are in the <filename
class='directory'>docs</filename> subdirectory the power to push changesets
into any repository only if those changesets are on the
<literal>docs</literal> branch and they affect only those files directly
under the <filename class='directory'>docs</filename> directory. However,
the rules below have more counterintuitive consequences.
</para>
<programlisting>write user=docs/* branch=docs
write user=docs/* file=docs/*
read user=docs/*
</programlisting>
<para>
These rules grant users whose keys are in the <filename
class='directory'>docs</filename> subdirectory the power to change any file directly under the <filename class='directory'>docs</filename> directory, or any file at all in the <literal>docs</literal> branch.  Indirectly, however, this adds up to the power to change any file on any branch, simply by making the change on the docs branch and then merging the change into another branch.
</para>
</listitem>
</itemizedlist>
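<para>
The advice above to deny a user all writes to a branch, rather than to
particular files on it, can be sketched by following the same masking
pattern as the <literal>specialrepo</literal> example. The branch name
<literal>release</literal> is an assumption made for illustration, and the
comments describe the behaviour expected from the rules as documented here:
</para>
<programlisting># Changesets on the "release" branch match this read rule first,
# which does not permit writes, so they should be rejected
read user=docs/* branch=release
# Changesets on any other branch fall through to this rule
write user=docs/*
</programlisting>
<para>
Because the restriction covers the whole branch rather than particular
files, merging work from another branch should not get around it: the merge
changeset would itself be on the <literal>release</literal> branch.
</para>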
</section>
</section>
<section>
<title>How mercurial-server works</title>
<para>
All of the repositories controlled by mercurial-server are owned by a
single user, the <systemitem
class="username">hg</systemitem> user, which is why all URLs for
mercurial-server repositories start with <uri>ssh://hg@...</uri>.
Each SSH key that has access to the repository has an entry in
<filename>~hg/.ssh/authorized_keys</filename>; this is how the SSH daemon
knows to give that key access. When the user connects over SSH, their
commands are run in a custom restricted shell; this shell knows which key
was used to connect, determines what the user is trying to do, checks the
access rules to decide whether to allow it, and if allowed invokes
Mercurial internally, without forking.
</para>
<para>
This restricted shell also ensures that certain Mercurial extensions are
loaded when the user acts on a repository; these extensions check the
access control rules for any changeset that the user tries to push, and
log all pushes and pulls into a per-repository access log.
</para>
<para>
<command>refresh-auth</command> recurses through <filename
class='directory'>/etc/mercurial-server/keys</filename> and the <filename
class='directory'>keys</filename> directory in the
<filename
class='directory'>hgadmin</filename> repository, creating an entry in
<filename>~hg/.ssh/authorized_keys</filename> for each key it finds. This is redone
automatically whenever a change is pushed to <filename
class='directory'>hgadmin</filename>.
</para>
</section>
<section>
<title>Security</title>
<para>
mercurial-server relies entirely on <command>sshd</command> to grant access to remote users.
As a result, it runs no daemons, installs no setuid programs, and no part
of it runs as <systemitem
class="username">root</systemitem> except the install process: all programs run as the user
<systemitem
class="username">hg</systemitem>. An attack on mercurial-server can only be started if the attacker
already has a public key in <filename>~hg/.ssh/authorized_keys</filename>;
otherwise <command>sshd</command> will bar the way.
</para>
<para>
No matter what command the user tries to run on the remote system via SSH,
mercurial-server is run. It parses the command line the user asked for, and
interprets and runs the corresponding operation itself if access is
allowed, so users can only read and add to history within repositories;
they cannot run any other command. In addition, every push and pull is
logged with a datestamp, changeset ID and the key that performed the
operation.
</para>
<para>
However, while the first paragraph holds no matter what bugs
mercurial-server contains, the second depends on the relevant code being
correct; though the entire codebase is short, mercurial-server is a fairly
new program and may harbour bugs. Backups are essential!
</para>
</section>
<section>
<title>Legalese</title>
<para>
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
</para>
<para>
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
</para>
<para>
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc., 51
Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
</para>
</section>
<section>
<title>Thanks</title>
<para>
Thanks for reading this far. If you use mercurial-server, please tell me about
it.
</para>
<para>
Paul Crowley, <email>paul@lshift.net</email>, 2009
</para>
</section>
</article>