Fossil Versus Git

1.0 Don't Stress!

The feature sets of Fossil and Git overlap in many ways. Both are distributed version control systems that store a tree of check-in objects in a local repository clone. In both systems, the local clone starts out as a full copy of the remote parent. New content gets added to the local clone and then later optionally pushed up to the remote, and changes to the remote can be pulled down to the local clone at will. Both systems offer diffing, patching, branching, merging, cherry-picking, bisecting, private branches, a stash, etc.

Fossil has inbound and outbound Git conversion features, so if you start out using one DVCS and later decide you like the other better, you can easily move your version-controlled file content.¹

In this document, we set all of that similarity and interoperability aside and focus on the important differences between the two, especially those that impact the user experience.

Keep in mind that you are reading this on a Fossil website, and though we try to be fair, the information here might be biased in favor of Fossil, if only because we spend most of our time using Fossil, not Git. Ask around for second opinions from people who have used both Fossil and Git.

2.0 Differences Between Fossil And Git

Differences between Fossil and Git are summarized by the following table, with further description in the text that follows.

Git / Fossil
File versioning only / VCS, tickets, wiki, docs, notes, forum, UI, RBAC
Sprawling, incoherent, and inefficient / Self-contained and efficient
Ad-hoc pile-of-files key/value database / The most popular database in the world
Portable to POSIX systems only / Runs just about anywhere
Bazaar-style development / Cathedral-style development
Designed for Linux kernel development / Designed for SQLite development
Many contributors / Select contributors
Focus on individual branches / Focus on the entire tree of changes
One check-out per repository / Many check-outs per repository
Remembers what you should have done / Remembers what you actually did
SHA-2 / SHA-3

2.1 Featureful

Git provides file versioning services only, whereas Fossil adds an integrated wiki, ticketing & bug tracking, embedded documentation, technical notes, and a web forum, all within a single nicely-designed skinnable web UI, protected by a fine-grained role-based access control system. These additional capabilities are available for Git as 3rd-party add-ons, but with Fossil they are integrated into the design. One way to describe Fossil is that it is "GitHub-in-a-box."

For developers who choose to self-host projects (rather than using a 3rd-party service such as GitHub) Fossil is much easier to set up, since the stand-alone Fossil executable together with a 2-line CGI script suffice to instantiate a full-featured developer website. To accomplish the same using Git requires locating, installing, configuring, integrating, and managing a wide assortment of separate tools. Standing up a developer website using Fossil can be done in minutes, whereas doing the same using Git requires hours or days.

Fossil is small, complete, and self-contained. If you clone Git's self-hosting repository, you get just Git's source code. If you clone Fossil's self-hosting repository, you get the entire Fossil website — source code, documentation, ticket history, and so forth.² That means you get a copy of this very article and all of its historical versions, plus the same for all of the other public content on this site.

2.2 Efficient

Git is actually a collection of many small tools, each doing one small part of the job, which can be recombined (by experts) to perform powerful operations. Git has a lot of complexity and many dependencies, so that most people end up installing it via some kind of package manager, simply because the creation of complicated binary packages is best delegated to people skilled in their creation. Normal Git users are not expected to build Git from source and install it themselves.

Fossil is a single self-contained stand-alone executable with hardly any dependencies. Fossil can be run inside a minimally configured chroot jail, from a Windows memory stick, off a Raspberry Pi with a tiny SD card, etc. To install Fossil, one merely puts the executable somewhere in the $PATH. Fossil is straightforward to build and install, so that many Fossil users do in fact build and install "trunk" versions to get new features between formal releases.

Some say that Git more closely adheres to the Unix philosophy, summarized as "many small tools, loosely joined," but we have many examples of other successful Unix software that violates that principle to good effect, from Apache to Python to ZFS. We can infer from this that it is not an absolute principle of good software design. Sometimes "many features, tightly-coupled" works better. What actually matters is effectiveness and efficiency. We believe Fossil achieves this.

Git fails on efficiency once you add to it all of the third-party software needed to give it a Fossil-equivalent feature set. Consider GitLab, a third-party extension to Git wrapping it in many features, making it roughly Fossil-equivalent, though much more resource hungry and hence more costly to run than the equivalent Fossil setup. GitLab's basic requirements are easy to accept when you're dedicating a local rack server or blade to it, since its minimum requirements are more or less a description of the smallest thing you could call a "server" these days, but when you go to host that in the cloud, you can expect to pay about 8⨉ as much to comfortably host GitLab as for Fossil.³ This difference is largely due to basic technology choices: Ruby and PostgreSQL vs C and SQLite.

The Fossil project itself is hosted on a very small VPS, and we've received many reports on the Fossil forum about people successfully hosting Fossil service on bare-bones $5/month VPS hosts, spare Raspberry Pi boards, and other small hosts.

2.3 Durable

The baseline data structures for Fossil and Git are the same, modulo formatting details. Both systems manage a directed acyclic graph (DAG) of Merkle tree / block chain structured check-in objects. Check-ins are identified by a cryptographic hash of the check-in content, and each check-in refers to its parent via its hash.
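The hash-linked structure described above can be illustrated with a minimal Python sketch. This is not the on-disk format of either tool: the `checkin_id` helper, the payload layout, and the choice of SHA3-256 are assumptions made purely for illustration.

```python
import hashlib
from typing import Optional


def checkin_id(content: str, parent_id: Optional[str]) -> str:
    """Return a hex ID derived from the check-in content and its parent's ID.

    Hypothetical payload layout, for illustration only.
    """
    payload = f"parent {parent_id or ''}\n{content}".encode("utf-8")
    return hashlib.sha3_256(payload).hexdigest()


# Build a two-check-in chain: the child embeds the root's ID.
root = checkin_id("initial import", None)
child = checkin_id("fix typo in README", root)

# Altering the root's content changes its ID, which in turn
# changes the ID of every check-in that descends from it.
tampered_root = checkin_id("initial import (edited)", None)
tampered_child = checkin_id("fix typo in README", tampered_root)

ids_differ = tampered_root != root and tampered_child != child
```

Because each ID covers the parent's ID, history cannot be rewritten silently: any change upstream shows up as a different ID downstream.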

The difference is that Git stores its objects as individual files in the .git folder or compressed into bespoke pack-files, whereas Fossil stores its objects in a SQLite database file using a hybrid NoSQL/relational data model of the check-in history. Git's data storage system is an ad-hoc pile-of-files key/value database, whereas Fossil uses a proven, heavily-tested, general-purpose, durable SQL database. This difference is more than an implementation detail. It has important practical consequences.

With Git, one can easily locate the ancestors of a particular check-in by following the pointers embedded in the check-in object, but it is difficult to go the other direction and locate the descendants of a check-in. It is so difficult, in fact, that neither native Git nor GitHub provides this capability short of groveling the commit log. With Git, if you are looking at some historical check-in then you cannot ask "What came next?" or "What are the children of this check-in?"

Fossil, on the other hand, parses essential information about check-ins (parents, children, committers, comments, files changed, etc.) into a relational database that can be easily queried using concise SQL statements to find both ancestors and descendants of a check-in. This is the hybrid data model mentioned above: Fossil manages your check-in and other data in a NoSQL block chain structured data store, but that's backed by a set of relational lookup tables for quick indexing into that artifact store. (See "Thoughts On The Design Of The Fossil DVCS" for more details.)
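As a rough sketch of the relational lookup idea, the following Python example uses the standard `sqlite3` module with a single parent/child link table. The table name `link`, its columns, and the single-letter check-in IDs are hypothetical simplifications, not Fossil's actual schema.

```python
import sqlite3

# Hypothetical, simplified lookup table: one row per parent -> child edge.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE link(parent TEXT, child TEXT)")
db.executemany(
    "INSERT INTO link VALUES (?, ?)",
    [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")],
)


def descendants(db, checkin):
    """Return every check-in reachable by following parent -> child edges."""
    rows = db.execute(
        """
        WITH RECURSIVE d(id) AS (
            SELECT child FROM link WHERE parent = ?
            UNION
            SELECT link.child FROM link JOIN d ON link.parent = d.id
        )
        SELECT id FROM d ORDER BY id
        """,
        (checkin,),
    )
    return [r[0] for r in rows]


def ancestors(db, checkin):
    """Return every check-in reachable by following child -> parent edges."""
    rows = db.execute(
        """
        WITH RECURSIVE a(id) AS (
            SELECT parent FROM link WHERE child = ?
            UNION
            SELECT link.parent FROM link JOIN a ON link.child = a.id
        )
        SELECT id FROM a ORDER BY id
        """,
        (checkin,),
    )
    return [r[0] for r in rows]
```

With the sample data, `descendants(db, "B")` yields C, D, and E, while `ancestors(db, "E")` yields A, B, and D. Both directions are the same query with the roles of the two columns swapped.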

Leaf check-ins in Git that lack a "ref" become "detached," making them difficult to locate and subject to garbage collection. This detached head state problem has caused untold grief for countless Git users. With Fossil, detached heads are simply impossible because we can always find our way back into the block chain using one or more of the relational indices it automatically manages for you.
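To see why a relational index makes leaves easy to find, consider this minimal Python sketch. The `checkin` and `link` tables are hypothetical stand-ins rather than Fossil's real schema. A leaf is any check-in that never appears as a parent, so no named reference is needed to locate it.

```python
import sqlite3

# Hypothetical tables: all known check-ins, plus parent -> child edges.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE checkin(id TEXT PRIMARY KEY);
    CREATE TABLE link(parent TEXT, child TEXT);
    INSERT INTO checkin VALUES ('A'), ('B'), ('C'), ('D');
    INSERT INTO link VALUES ('A', 'B'), ('B', 'C'), ('B', 'D');
    """
)


def leaves(db):
    """Return check-ins that have no children, whether or not they are named."""
    rows = db.execute(
        """
        SELECT id FROM checkin
        WHERE id NOT IN (SELECT parent FROM link)
        ORDER BY id
        """
    )
    return [r[0] for r in rows]
```

Here `leaves(db)` returns C and D: both tips of the graph are found from the index alone.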

This design difference shows up in several other places within each tool. It is why Fossil's timeline is generally more detailed yet clearer than those available in Git front-ends. (Contrast this Fossil timeline with its closest equivalent in GitHub.) It is why Fossil has a "next" special check-in name to complement "prev", while Git has only the cryptic @~ notation, meaning "the parent of HEAD," with no inverse. It is why Fossil has so many built-in status reports to help maintain situational awareness, aid comprehension, and avoid errors.

These differences are due, in part, to Fossil's start a year later than Git: we were able to learn from its key design mistakes.

2.4 Portable

Fossil is largely written in ISO C, almost purely conforming to the original 1989 standard. We make very little use of C99, and we do not knowingly make any use of C11. Fossil does call POSIX and Windows APIs where necessary, but it's about as portable as you can ask given that ISO C doesn't define all of the facilities Fossil needs to do its thing, such as network sockets and file locking. There are certainly well-known platforms Fossil hasn't been ported to yet, but that's most likely due to lack of interest rather than inherent difficulties in doing the port. We believe the most stringent limit on its portability is that it assumes at least a 32-bit CPU and several megs of flat-addressed memory.⁴ Fossil isn't quite as portable as SQLite, but it's close.

Over half of the C code in Fossil is actually an embedded copy of the current version of SQLite. Much of what is Fossil-specific after you set SQLite itself aside is SQL code calling into SQLite. The number of lines of SQL code in Fossil isn't large by percentage, but since SQL is such an expressive, declarative language, it has an outsized contribution to Fossil's user-visible functionality.

Fossil isn't entirely C and SQL code. Its web UI uses JavaScript where necessary.⁵ The server-side UI scripting uses a custom minimal Tcl dialect called TH1, which is embedded into Fossil itself. Fossil's build system and test suite are largely based on Tcl.⁶ All of this is quite portable.

About half of Git's code is POSIX C, and about a third is POSIX shell code. This is largely why the so-called "Git for Windows" distributions (both first-party and third-party) are actually an MSYS POSIX portability environment bundled with all of the Git stuff, because it would be too painful to port Git natively to Windows. Git is a foreign citizen on Windows, speaking to it only through a translator.⁷

While Fossil does lean toward POSIX norms when given a choice — LF-only line endings are treated as first-class citizens over CR+LF, for example — the Windows build of Fossil is truly native.

The third-party extensions to Git tend to follow this same pattern. GitLab isn't portable to Windows at all, for example. For that matter, GitLab isn't even officially supported on macOS, the BSDs, or uncommon Linuxes! We have many users who regularly build and run Fossil on all of these systems.

2.5 Linux vs. SQLite

Fossil and Git promote different development styles because each one was specifically designed to support the creator's main software development project: Linus Torvalds designed Git to support development of the Linux kernel, and D. Richard Hipp designed Fossil to support the development of SQLite. Both projects must rank high on any objective list of "most important FOSS projects," yet these two projects are almost entirely unlike one another, so it is natural that the DVCSes created to support these projects also differ in many ways.

In the following sections, we will explain how four key differences between the Linux and SQLite software development projects dictated the design of each DVCS's low-friction usage path.

When deciding between these two DVCSes, you should ask yourself, "Is my project more like Linux or more like SQLite?"

2.5.1 Development Organization

Eric S. Raymond's seminal essay-turned-book "The Cathedral and the Bazaar" details the two major development organization styles found in FOSS projects. As it happens, Linux and SQLite fall on opposite sides of this dichotomy. Differing development organization styles dictate a different design and low-friction usage path in the tools created to support each project.

Git promotes the Linux kernel's bazaar development style, in which a loosely-associated mass of developers contribute their work through a hierarchy of lieutenants who manage and clean up these contributions for consideration by Linus Torvalds, who has the power to cherry-pick individual contributions into his version of the Linux kernel. Git allows an anonymous developer to rebase and push specific locally-named private branches, so that a Git repo clone often isn't really a clone at all: it may have an arbitrary number of differences relative to the repository it originally cloned from. Git encourages siloed development. Select work in a developer's local repository may remain private indefinitely.

All of this is exactly what one wants when doing bazaar-style development.

Fossil's normal mode of operation differs on every one of these points, with the specific designed-in goal of promoting SQLite's cathedral development model:

Personal engagement: SQLite's developers know each other by name and work together daily on the project.

Trust over hierarchy: SQLite's developers check changes into their local repository, and these are immediately and automatically synchronized up to the central repository; there is no "dictator and lieutenants" hierarchy as with Linux kernel contributions. D. Richard Hipp rarely overrides decisions made by those he has trusted with commit access on his repositories. Fossil allows you to give some users more power over what they can do with the repository, but Fossil only loosely supports the enforcement of a development organization's social and power hierarchies. Fossil is a great fit for flat organizations.

No easy drive-by contributions: Git pull requests offer a low-friction path to accepting drive-by contributions. Fossil's closest equivalent is its unique bundle feature, which requires higher engagement than firing off a PR.⁸ This difference comes directly from the initial designed purpose for each tool: the SQLite project doesn't accept outside contributions from previously-unknown developers, but the Linux kernel does.

No rebasing: When your local repo clone syncs changes up to its parent, those changes are sent exactly as they were committed locally. There is no rebasing mechanism in Fossil, on purpose.

Sync over push: Explicit pushes are uncommon in Fossil-based projects: the default is to rely on autosync mode instead, in which each commit syncs immediately to its parent repository. Because this is a mode, you can turn it off temporarily when needed, such as when working offline. Fossil is still a truly distributed version control system; it's just that its starting default is to assume you're rarely out of communication with the parent repo.

This is not merely a reflection of modern always-connected computing environments. It is a conscious decision in direct support of SQLite's cathedral development model: we don't want developers going dark, then showing up weeks later with a massive bolus of changes for us to integrate all at once. Jim McCarthy put it well in his book on software project management, Dynamics of Software Development: "Beware of a guy in a room."

Branch names sync: Unlike in Git, branch names in Fossil are not purely local labels. They sync along with everything else, so everyone sees the same set of branch names. Fossil's design choice here is a direct reflection of the Linux vs. SQLite project outlook: SQLite's developers collaborate closely on a single coherent project, whereas Linux's developers go off on tangents and occasionally sync changes up with each other.

Private branches are rare: Private branches exist in Fossil, but they're normally used to handle rare exception cases, whereas in many Git projects, they're part of the straight-line development process.

Identical clones: Fossil's autosync system tries to keep each local clone identical to the repository it was cloned from.

Where Git encourages siloed development, Fossil fights against it. Fossil places a lot of emphasis on synchronizing everyone's work and on reporting on the state of the project and the work of its developers, so that everyone — especially the project leader — can maintain a better mental picture of what is happening, leading to better situational awareness.