doc/manual.docbook
author Paul Crowley <paul@lshift.net>
Thu, 15 Oct 2009 09:23:51 +0100
fix output that should read "repository-host"

<?xml version="1.0" encoding="utf-8"?>
<article xmlns="http://docbook.org/ns/docbook" version="5.0" xml:lang="en"
  xmlns:xlink="http://www.w3.org/1999/xlink">
<info>
  <title>Sharing Mercurial repositories with mercurial-server</title>
  <author><firstname>Paul</firstname><surname>Crowley</surname></author>
  <copyright><year>2009</year><holder>Paul Crowley, LShift Ltd</holder></copyright>
</info>
<section>
<title>About mercurial-server</title>
<para>
Home page: <link xlink:href="http://www.lshift.net/mercurial-server.html"/>
</para>
<para>
mercurial-server gives your developers remote read/write access to
centralized <link xlink:href="http://hg-scm.org/">Mercurial</link>
repositories using SSH public key authentication; it provides convenient
and fine-grained key management and access control.
</para>
<para>
Though mercurial-server is currently targeted at Debian-based systems such
as Ubuntu, other users have reported success getting it running on other
Unix-based systems such as Red Hat. Running it on a non-Unix system such as
Windows is not supported. You will need root privileges to install it.
</para>
</section>
<section>
<title>Step by step</title>
<para>
mercurial-server authenticates users with SSH public keys rather than
passwords; everyone who wants access to a mercurial-server repository
will need such a key. In combination with <command>ssh-agent</command> (or
equivalents such as the Windows program <link
xlink:href="http://the.earth.li/~sgtatham/putty/0.60/htmldoc/Chapter9.html#pageant">Pageant</link>),
this means that users will not need to type in a password to access the
repository. If you're not familiar with SSH public keys, the <link
xlink:href="http://sial.org/howto/openssh/publickey-auth/">OpenSSH Public
Key Authentication tutorial</link> may be helpful.
</para>
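<para>
If you have not yet created a key, the following is a minimal sketch using
the standard OpenSSH tools. The key type is an assumption, prompts and
output are omitted, and your site's policy may call for different options:
</para>
<screen>
<computeroutput>$ </computeroutput><userinput>ssh-keygen -t rsa</userinput>
<computeroutput>$ </computeroutput><userinput>ssh-add</userinput>
<computeroutput>$ </computeroutput><userinput>ssh-add -L</userinput>
</screen>
<para>
<command>ssh-keygen</command> creates the key pair,
<command>ssh-add</command> loads the default private key into the running
agent, and <command>ssh-add -L</command> prints the public keys the agent
currently holds.
</para>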
<section>
<title>Installing mercurial-server</title>
<para>
In what follows, we assume that your username is <systemitem
class="username">jay</systemitem>, that you usually sit at a machine called
<systemitem class="systemname">my-workstation</systemitem> and you wish to
install mercurial-server on <systemitem
class="systemname">repository-host</systemitem>. We assume that you have
created your SSH public key and loaded it into your SSH agent, and that this
key gives you access to <systemitem
class="systemname">repository-host</systemitem>.
</para>
<para>First install mercurial-server on <systemitem
class="systemname">repository-host</systemitem>:</para>
<screen>
<computeroutput>jay@my-workstation:~$ </computeroutput><userinput>scp mercurial-server_0.6.1_amd64.deb repository-host:</userinput>
<computeroutput>mercurial-server_0.6.1_amd64.deb 100%
jay@my-workstation:~$ </computeroutput><userinput>ssh -A repository-host</userinput>
<computeroutput>jay@repository-host:~$ </computeroutput><userinput>sudo dpkg -i mercurial-server_0.6.1_amd64.deb</userinput>
<computeroutput>[sudo] password for jay: 
Selecting previously deselected package mercurial-server.
(Reading database ... 144805 files and directories currently installed.)
Unpacking mercurial-server (from .../mercurial-server_0.6.1_amd64.deb) ...
Setting up mercurial-server (0.6.1) ...
jay@repository-host:~$ </computeroutput></screen>
<para>
mercurial-server is now installed on the repository host.  Next, we need to give you permission to access its repositories.
</para>
<screen>
<computeroutput>jay@repository-host:~$ </computeroutput><userinput>ssh-add -L > my-key</userinput>
<computeroutput>jay@repository-host:~$ </computeroutput><userinput>sudo mkdir -p /etc/mercurial-server/keys/root/jay</userinput>
<computeroutput>jay@repository-host:~$ </computeroutput><userinput>sudo cp my-key /etc/mercurial-server/keys/root/jay/my-workstation</userinput>
<computeroutput>jay@repository-host:~$ </computeroutput><userinput>sudo -u hg /usr/share/mercurial-server/refresh-auth</userinput>
<computeroutput>jay@repository-host:~$ </computeroutput><userinput>exit</userinput>
<computeroutput>Connection to repository-host closed.
jay@my-workstation:~$ </computeroutput></screen>
<para>
You can now create repositories on the remote machine and have complete
read-write access to all of them.
</para>
</section>
<section>
<title>Creating repositories</title>
<para>
To store a repository on the server, clone it over.
</para>
<screen>
<computeroutput>jay@my-workstation:~$ </computeroutput><userinput>cd my-mercurial-project</userinput>
<computeroutput>jay@my-workstation:~/my-mercurial-project$ </computeroutput><userinput>hg clone . ssh://hg@repository-host/repository/name</userinput>
<computeroutput>searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 119 changesets with 284 changes to 61 files
jay@my-workstation:~/my-mercurial-project$ </computeroutput><userinput>hg pull ssh://hg@repository-host/repository/name</userinput>
<computeroutput>pulling from ssh://hg@repository-host/repository/name
searching for changes
no changes found
jay@my-workstation:~/my-mercurial-project$ </computeroutput><userinput>cd ..</userinput>
<computeroutput>jay@my-workstation:~$ </computeroutput></screen>
</section>
<section>
<title>Adding other users</title>
<para>
As things stand, no-one but you has any access to any repositories you
create on this system. In order to give anyone else access, you'll need a
copy of their SSH public key; we'll assume you have that key in
<filename>~/sam-key.pub</filename>.  To manage access, you make changes to the special <literal>hgadmin</literal> repository.
</para>
<screen>
<computeroutput>jay@my-workstation:~$ </computeroutput><userinput>hg clone ssh://hg@repository-host/hgadmin</userinput>
<computeroutput>destination directory: hgadmin
no changes found
updating working directory
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
jay@my-workstation:~$ </computeroutput><userinput>cd hgadmin</userinput>
<computeroutput>jay@my-workstation:~/hgadmin$ </computeroutput><userinput>mkdir -p keys/users/sam</userinput>
<computeroutput>jay@my-workstation:~/hgadmin$ </computeroutput><userinput>cp ~/sam-key.pub keys/users/sam/their-workstation</userinput>
<computeroutput>jay@my-workstation:~/hgadmin$ </computeroutput><userinput>hg add</userinput>
<computeroutput>adding keys/users/sam/their-workstation
jay@my-workstation:~/hgadmin$ </computeroutput><userinput>hg commit -m "Add Sam's key"</userinput>
<computeroutput>jay@my-workstation:~/hgadmin$ </computeroutput><userinput>hg push</userinput>
<computeroutput>pushing to ssh://hg@repository-host/hgadmin
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
jay@my-workstation:~/hgadmin$ </computeroutput></screen>
<para>
Sam can now read and write to your
<literal>ssh://hg@repository-host/repository/name</literal> repository.
Most other changes to access control can be made simply by making and
pushing changes to <literal>hgadmin</literal>, and you can use Mercurial to
cooperate with other root users in the normal way.
</para>
<para>
If you prefer, you could give them access by
logging into <systemitem class="systemname">repository-host</systemitem>,
putting the key in the right place under <filename
class='directory'>/etc/mercurial-server/keys</filename>, and re-running
<userinput>sudo -u hg /usr/share/mercurial-server/refresh-auth</userinput>.
However, using <literal>hgadmin</literal> is usually more convenient if you need to make more than a very few changes; it also makes it easier to share administration with others and provides a log of all changes.
</para>
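<para>
As a sketch of that manual route, the steps mirror those used for your own
key, except that the key goes under <filename
class='directory'>keys/users</filename>. This assumes Sam's public key has
already been copied to <filename>sam-key.pub</filename> in your home
directory on <systemitem class="systemname">repository-host</systemitem>;
that location is an assumption made for illustration:
</para>
<screen>
<computeroutput>jay@repository-host:~$ </computeroutput><userinput>sudo mkdir -p /etc/mercurial-server/keys/users/sam</userinput>
<computeroutput>jay@repository-host:~$ </computeroutput><userinput>sudo cp sam-key.pub /etc/mercurial-server/keys/users/sam/their-workstation</userinput>
<computeroutput>jay@repository-host:~$ </computeroutput><userinput>sudo -u hg /usr/share/mercurial-server/refresh-auth</userinput>
</screen>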
</section>
</section>
<section>
<title>Access control</title>
<para>
Out of the box, mercurial-server supports two kinds of users: "root" users and normal users.  If you followed the steps above, you are a "root" user because your key is under <filename class='directory'>keys/root</filename>, while the other user you gave access to is a normal user since their key is under <filename class='directory'>keys/users</filename>.  Keys that are not in either of these directories will by default have no access to anything.
</para>
<para>
Root users can edit <literal>hgadmin</literal>, create new repositories and read and write to existing ones.  Normal users cannot access <literal>hgadmin</literal> or create new repositories, but they can read and write to any other repository.
</para>
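<para>
As an illustration, after the steps above the two key directories would
look roughly like this. This is a sketch based on the file names used in
the walkthrough; your own names will differ:
</para>
<programlisting>
/etc/mercurial-server/keys/
    root/
        jay/
            my-workstation      (your key: a root user)

keys/                           (in the hgadmin repository)
    users/
        sam/
            their-workstation   (Sam's key: a normal user)
</programlisting>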
<section>
<title>Using access.conf</title>
<para>
mercurial-server offers much more fine-grained access control than this division into two classes of users.  Let's suppose you wish to give Pat access to the <literal>widget</literal> repository, but no other.  We first copy Pat's SSH public key into the <filename
class='directory'>keys/widget/pat</filename> directory in <literal>hgadmin</literal>.  Now mercurial-server knows about Pat's key, but will give Pat no access to anything because the key is not under either <filename
class='directory'>keys/root</filename> or <filename
class='directory'>keys/users</filename>.  To grant this key access, we must give mercurial-server a new access rule, so we create a file in <literal>hgadmin</literal> called <filename>access.conf</filename>, with the following contents:</para>
<programlisting>
    write repo=widget user=widget/**
</programlisting>
<para>
Pat will have read and write access as soon as we add, commit, and push these files.
</para>
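<para>
Put together, the steps might look like the following session. The key
file <filename>~/pat-key.pub</filename> and the name
<filename>pat-workstation</filename> are hypothetical, the session assumes
<filename>access.conf</filename> does not yet exist in
<literal>hgadmin</literal>, and command output is omitted:
</para>
<screen>
<computeroutput>jay@my-workstation:~/hgadmin$ </computeroutput><userinput>mkdir -p keys/widget/pat</userinput>
<computeroutput>jay@my-workstation:~/hgadmin$ </computeroutput><userinput>cp ~/pat-key.pub keys/widget/pat/pat-workstation</userinput>
<computeroutput>jay@my-workstation:~/hgadmin$ </computeroutput><userinput>echo "write repo=widget user=widget/**" &gt; access.conf</userinput>
<computeroutput>jay@my-workstation:~/hgadmin$ </computeroutput><userinput>hg add</userinput>
<computeroutput>jay@my-workstation:~/hgadmin$ </computeroutput><userinput>hg commit -m "Give Pat access to widget"</userinput>
<computeroutput>jay@my-workstation:~/hgadmin$ </computeroutput><userinput>hg push</userinput>
</screen>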
<para>
Each line of <filename>access.conf</filename> has the following syntax:
</para>
<programlisting>
<replaceable>rule</replaceable> <replaceable>condition</replaceable> <replaceable>condition...</replaceable>
</programlisting>
<para>
Blank lines and lines that start with <literal>#</literal> are ignored. <replaceable>rule</replaceable> is one of:
</para>
<itemizedlist>
<listitem>
<para><literal>init</literal>: allow reads, writes, and the creation of new repositories</para>
</listitem>
<listitem>
<para><literal>write</literal>: allow reads and writes</para>
</listitem>
<listitem>
<para><literal>read</literal>: allow only read operations</para>
</listitem>
<listitem>
<para><literal>deny</literal>: deny all requests</para>
</listitem>
</itemizedlist>
<para>
When considering a request, mercurial-server steps through all the rules in <filename>/etc/mercurial-server/access.conf</filename> and then all the rules in <filename>access.conf</filename> in <literal>hgadmin</literal>, looking for the first rule whose conditions all match.  If it does not find such a rule, it denies the request; otherwise it checks whether that rule grants sufficient privilege to allow it.
</para>
<para>
By default, <filename>/etc/mercurial-server/access.conf</filename> has the following rules:
</para>
<programlisting>
    init user=root/**
    deny repo=hgadmin
    write user=users/**
</programlisting>
<para>
These rules ensure that root users can do any operation on any repository, that no other users can access the <literal>hgadmin</literal> repository, and that those with keys in <filename class='directory'>keys/users</filename> can read or write to any repository but not create repositories.
</para>
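<para>
To make the matching order concrete, here is a worked trace. It assumes the
default rules above plus the <literal>widget</literal> rule from the
earlier example in <literal>hgadmin</literal>; the key path
<literal>widget/pat/pat-workstation</literal> and the repository names
<literal>newrepo</literal> and <literal>gadget</literal> are hypothetical.
The listing combines both files for illustration and is not meant to be
copied into either one:
</para>
<programlisting>
    # /etc/mercurial-server/access.conf, consulted first
    init user=root/**
    deny repo=hgadmin
    write user=users/**
    # access.conf in hgadmin, consulted second
    write repo=widget user=widget/**

    # root/jay/my-workstation creates newrepo:
    #   "init user=root/**" matches, and init permits creation: allowed
    # users/sam/their-workstation pushes to hgadmin:
    #   "deny repo=hgadmin" is the first match: denied
    # users/sam/their-workstation creates newrepo:
    #   "write user=users/**" is the first match, but write does not
    #   permit creation: denied
    # widget/pat/pat-workstation pushes to widget:
    #   no rule in /etc matches; the hgadmin rule matches: allowed
    # widget/pat/pat-workstation pulls from gadget:
    #   no rule matches: denied
</programlisting>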
<para>
A condition is a globpattern matched against a relative path. The two most
important conditions are
</para>
<itemizedlist>
<listitem>
<para><code><literal>user=</literal><replaceable>globpattern</replaceable></code>: path to the user's key</para>
</listitem>
<listitem>
<para><code><literal>repo=</literal><replaceable>globpattern</replaceable></code>: path to the repository</para>
</listitem>
</itemizedlist>
<para>
"*" matches within a single directory level only, whereas "**" matches
across as many levels as you want. More precisely, "*" matches zero or more
characters not including "/", while "**" matches zero or more characters
including "/".
</para>
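<para>
A few illustrative patterns follow. Each rule is shown in isolation, the
directory names <literal>projects</literal> and
<literal>contractors</literal> are hypothetical, and the comments assume
that a pattern must match the whole path rather than just a prefix:
</para>
<programlisting>
    # "*" stops at "/": matches the repository projects/widget,
    # but not projects/widget/docs
    write repo=projects/* user=users/**

    # "**" crosses "/": matches both projects/widget and
    # projects/widget/docs
    read repo=projects/** user=contractors/**

    # matches the key users/sam/their-workstation, because "**" can
    # span the "/" between "sam" and "their-workstation"
    write user=users/**
</programlisting>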
</section>
<section>
<title>/etc/mercurial-server and hgadmin</title>
<para>
mercurial-server consults two distinct locations to collect information about what to allow: <filename
class='directory'>/etc/mercurial-server</filename> and its own <literal>hgadmin</literal> repository.  This is useful for several reasons:
</para>
<itemizedlist>
<listitem>
<para>Users may not need the sophistication of access control via Mercurial; for these users updating <filename
class='directory'>/etc/mercurial-server</filename> may offer a simpler route.</para>
</listitem>
<listitem>
<para><filename
class='directory'>/etc/mercurial-server</filename> is suitable for management by some other route, such as with <link
xlink:href="http://reductivelabs.com/products/puppet">Puppet</link>.</para>
</listitem>
<listitem>
<para>If a change to <literal>hgadmin</literal> leaves you "locked out", <filename
class='directory'>/etc/mercurial-server</filename> allows you a way back in.</para>
</listitem>
<listitem>
<para>At install time, all users are "locked out", and so some mechanism to allow some users in is needed.</para>
</listitem>
</itemizedlist>
<para>
Rules in <filename>/etc/mercurial-server/access.conf</filename> take precedence over those in <literal>hgadmin</literal>, and obviously keys in <filename class='directory'>/etc/mercurial-server/keys</filename> cannot be affected by changes to <literal>hgadmin</literal>.
</para>
<para>
We anticipate that once mercurial-server is successfully installed and
working most users will want to use <literal>hgadmin</literal> for most
access control tasks. Once you have the right keys and
<filename>access.conf</filename> set up in <literal>hgadmin</literal>, you
can delete <filename>/etc/mercurial-server/access.conf</filename> and all
of <filename class='directory'>/etc/mercurial-server/keys</filename>,
turning control entirely over to <literal>hgadmin</literal>.
</para>
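<para>
Because the default rules live in
<filename>/etc/mercurial-server/access.conf</filename>, deleting that file
removes them too. A sketch of an <filename>access.conf</filename> for
<literal>hgadmin</literal> that carries the same rules over, assuming you
want to keep the distinction between root and normal users:
</para>
<programlisting>
    # same rules as the defaults shipped in /etc/mercurial-server/access.conf
    init user=root/**
    deny repo=hgadmin
    write user=users/**
</programlisting>
<para>
Under this arrangement your own key would also need to be present under
<filename class='directory'>keys/root</filename> in
<literal>hgadmin</literal>, and both changes pushed, before the copies in
<filename class='directory'>/etc/mercurial-server</filename> are removed;
otherwise you are likely to lock yourself out.
</para>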
<para>
<filename>/etc/mercurial-server/remote-hgrc</filename> is in the
<systemitem>HGRCPATH</systemitem> for all remote access to mercurial-server
repositories. This file contains the hooks that mercurial-server uses for
access control and logging. You can add hooks to this file, but obviously
breaking the existing hooks will disable the relevant functionality and
isn't advisable.
</para>
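<para>
As a sketch of adding a hook of your own, the following fragment uses
Mercurial's standard <literal>[hooks]</literal> syntax. The hook name
<literal>changegroup.notify</literal> and the script path are hypothetical;
giving the hook its own suffix keeps it from replacing any hook that
mercurial-server already defines:
</para>
<programlisting>
[hooks]
# hypothetical extra hook: run a local script after changesets arrive
changegroup.notify = /usr/local/bin/notify-push
</programlisting>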
</section>
<section>
<title>File and branch conditions</title>
<para>
mercurial-server supports file and branch conditions, which restrict an
operation depending on what files it modifies and what branch the work is
on. </para>
<caution>
<para>
The way these conditions work is subtle and can be counterintuitive. Unless
you need what they provide, ignore this section, stick to user and repo
conditions, and then things are likely to work the way you would expect.
</para>
</caution>
<para>
File and branch conditions are added to the conditions against which a rule
matches, just like user and repo conditions; they have this form:
</para>
<itemizedlist>
<listitem>
<para><code><literal>file=</literal><replaceable>globpattern</replaceable></code>: file within the repo</para>
</listitem>
<listitem>
<para><code><literal>branch=</literal><replaceable>globpattern</replaceable></code>: Mercurial branch name</para>
</listitem>
</itemizedlist>
<para>
However, in order to understand what effect adding these conditions will
have, it helps to understand how and when these rules are applied.
</para>
<para>
The rules file is used to make four decisions:
</para>
<itemizedlist>
<listitem>
<para>Whether to allow a repository to be created</para>
</listitem>
<listitem>
<para>Whether to allow any access to a repository</para>
</listitem>
<listitem>
<para>Whether to allow a changeset that is on a particular branch</para>
</listitem>
<listitem>
<para>Whether to allow a changeset that changes a particular file</para>
</listitem>
</itemizedlist>
<para>
When the first two of these decisions are being made, nothing is known
about what files might be changed, and so all file and branch conditions
automatically succeed for the purpose of such decisions. This means that
doing tricky things with file conditions can have counterintuitive
consequences:
</para>
<itemizedlist>
<listitem>
<para>You cannot limit read access to a subset of a repository with a "read"
rule and a file condition: any user who has access to a repository can read
all of it and its full history. Such a rule can only have the effect of
masking a later "write" rule, as in this example:</para>
<programlisting>
   read repo=specialrepo file=dontwritethis
   write repo=specialrepo
</programlisting>
<para>
These rules allow all users to read <literal>specialrepo</literal> and to
write to all of its files, <emphasis>except</emphasis> that any changeset
which writes to <filename>dontwritethis</filename> will be rejected.
</para>
</listitem>
<listitem>
<para>For similar reasons, don't give <literal>init</literal> rules file conditions.</para>
</listitem>
<listitem>
<para>Don't try to deny write access to a particular file on a particular
branch: a developer can write to the file on another branch and then merge
it in. Either deny all writes to the branch from that user, or allow them
to write to all the files they can write to on any branch. In other words,
something like this will have the intended effect:
</para>
<programlisting>
   write user=docs/* branch=docs file=docs/*
</programlisting>
<para>
But something like this will not have the intended effect; it will
effectively allow these users to write to any file on any branch, by
writing to it on the "docs" branch first:
</para>
<programlisting>
  write user=docs/* branch=docs
  write user=docs/* file=docs/*
  read user=docs/*
</programlisting>
</listitem>
</itemizedlist>
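<para>
To deny a group of users all writes to one branch, as the last item above
suggests, a pair of rules can follow the same masking pattern as the
<filename>dontwritethis</filename> example, with a branch condition in
place of the file condition. This is a sketch; the branch name
<literal>release</literal> is hypothetical:
</para>
<programlisting>
   # changesets on the release branch match this rule first; it grants
   # only read access, so they are rejected
   read user=docs/* branch=release
   # changesets on any other branch fall through to this rule
   write user=docs/*
</programlisting>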
</section>
</section>
<section>
<title>How mercurial-server works</title>
<para>
All of the repositories controlled by mercurial-server are owned by a
single user, the <literal>hg</literal> user, which is why all URLs for
mercurial-server repositories start with <literal>ssh://hg@...</literal>.
Each SSH key that has access to the repository has an entry in
<filename>~hg/.ssh/authorized_keys</filename>; this is how the SSH daemon
knows to give that key access. When the user connects over SSH, their
commands are run in a specially crafted restricted shell; this shell knows
which key was used to connect, determines what the user is trying to do,
and checks the access rules to decide whether to allow it.  
</para>
<para>
This restricted shell also ensures that certain Mercurial extensions are
loaded when the user acts on a repository; these extensions check the
access control rules for any changeset that the user tries to commit, and
log all pushes and pulls into a per-repository access log.
</para>
<para>
<command>refresh-auth</command> recurses through <filename
class='directory'>/etc/mercurial-server/keys</filename> and the <filename
class='directory'>keys</filename> directory in the
<literal>hgadmin</literal> repository, creating an entry in
<filename>~hg/.ssh/authorized_keys</filename> for each key it finds. This is
redone automatically whenever a change is pushed to <literal>hgadmin</literal>.
</para>
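<para>
For orientation, an entry of this kind would typically use OpenSSH's forced
command option, roughly as sketched below. The wrapper path and the exact
option list are assumptions rather than the literal output of
<command>refresh-auth</command>, and the key material is elided; inspect
the file generated on your own system for the real form:
</para>
<programlisting>
command="/path/to/restricted-shell root/jay/my-workstation",no-pty,no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa AAAA... jay@my-workstation
</programlisting>
<para>
In a scheme like this, passing the key's path to the forced command is what
lets the restricted shell know which key was used to connect.
</para>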
</section>
<section>
<title>Security</title>
<para>
mercurial-server relies entirely on sshd to grant access to remote users.
As a result, it runs no daemons, installs no setuid programs, and no part
of it runs as root except the install process: all programs run as the user
hg. An attack on mercurial-server can only be started if the attacker
already has a public key in <filename>~hg/.ssh/authorized_keys</filename>;
otherwise sshd will bar the way.
</para>
<para>
No matter what command the user tries to run on the remote system via SSH,
mercurial-server is run. It parses the command line the user asked for, and
interprets and runs the corresponding hg operation itself if access is
allowed, so users can only read and add to history within repositories;
they cannot run any other hg command. In addition, every push and pull is
logged with a datestamp, changeset ID and the key that performed the
operation.
</para>
<para>
However, while the first paragraph holds no matter what bugs
mercurial-server contains, the second depends on the relevant code being
correct; though the entire codebase is short, mercurial-server is a fairly
new program and may harbour bugs. Backups are essential!
</para>
</section>
<section>
<title>Legalese</title>
<para>
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
</para>
<para>
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
</para>
<para>
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc., 51
Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
</para>
</section>
<section>
<title>Thanks</title>
<para>
Thanks for reading this far. If you use mercurial-server, please tell me about
it.
</para>
<para>
Paul Crowley, <email>paul@lshift.net</email>, 2009
</para>
</section>
</article>