Source Control Musings
The associated image (you may have guessed) is not of a real book but was created via and is of course a satirical take on the world of version control. Note in particular the poignant pun on the word 'source' in the sub-title - feel free to laugh out loud if you're not already doing so :-)
This post will hopefully pave the way for some more technically oriented content regarding source control systems - specifically Subversion - over the following weeks and months. Source control is close to my heart because a) I feel it is fundamental to software development and is at the center of any good development process and b) in my current role I am somewhat responsible for items of this nature.
In recent months I have become somewhat frustrated with the source control system currently used at the office. While it would take a pretty thorough pitch to sell another versioning system (not to mention the accompanying migration process and user training), it would be negligent not to investigate our options.
The current versioning system we use is Visual SourceSafe 6.0 - a relic from the golden days of software development. Let's be straight: this is most definitely not a bad piece of software. It was developed before Active Directory was invented and before the internet became a factor in organizations' development models. It is simply outdated and, alas, my organization does not have access to SourceSafe 2005 or (more's the pity) Visual Studio Team System. I have no doubt that the latter would be a great step forward, but it is outside of our licensing agreement and unlikely to be an option in the near future.
Visual SourceSafe 6.0 Limitations
- No Atomic Commits. We all know what this means - the possibility of having an inconsistent versioning state after x of y files fail during check-in.
- Size Limitation. The recommendation is that a VSS database should not exceed 5GB and it has been suggested that performance is affected after 3GB. The VSS engine necessitates storing the entire file for each version (of a given file) checked in, rather than using a differential as other versioning systems now do. This means that 5GB is a very realistic issue for organizations with a lengthy development history.
- Connection Limitations. VSS is built on file-sharing and does not play well across large networks - especially the internet. If you have ever attempted to use source control over VPN you know what I'm talking about. Sure, you could work in disconnected mode, sweeping your issues under the rug...
- Client-centric extensibility. While VSS does provide an API for extending its functionality, it is far from easy to add basic (and valuable) functionality to the tool. If you do go to the trouble of developing an add-on, it must be installed at the client level...for every single client. If you miss a single client then any policy enforcement etc. being done by the add-on is broken.
- Lack of LDAP Integration. Management of database access and permissions has to be done internally. While this is a sign of the times rather than a design flaw - it would be nice to group access and permissions in Active Directory, requiring less time configuring VSS.
- Granularity of Privileges. As an administrator on our versioning system, one thing I cannot stand is the grouping of privileges for users. For example, to allow a user to add a new file to a repository, they must also be given delete privileges (note: delete, not destroy).
- Versioning Model. The Lock-Update-Unlock model of source control is a little outdated and does not allow for intuitive concurrent updates of a single file by multiple developers.
- Database Corruption. In the years that my organization has used VSS, there have been zero instances of repository database corruption. However, this is often flagged as a major liability of this software and I am acutely aware of the possibility.
Requirements for a Replacement
- Atomic Commits. The number one VSS limitation is the number one requirement…
- Non-Proprietary. Selling the migration to a new version control system is a tall task for most organizations; therefore it is important that the benefit-to-cost ratio is high. A full-featured versioning system with no financial cost is far more desirable than a pay-per-seat system.
- IDE Integration. SourceSafe is tied into the heart of the Visual Studio development environment, allowing check-ins/updates directly from the IDE without any window switching or additional thought. For a number of our pre-ASP.NET web-based applications (read ASP 3.0), and applications written in non-Microsoft languages which are not packed together into solutions and therefore do not have ties to a particular IDE, a third-party check-in/out GUI is acceptable. However, moving forward in the world of .NET (which most of our modern apps are built in), direct tie-ins to Visual Studio are a must. This one's a deal breaker!
- Documentation. If an organization is going to move to a new system, then they should know everything there is to know about that system. I would expect proprietary systems to include high-quality documentation (no matter how you look at it, you're paying for it) but, when browsing free/open source solutions, documentation is sometimes a necessary if involuntary sacrifice.
- Speed. This one is a bit of a no-brainer, but speed is one of my key concerns moving forward. If developers cannot connect to a repository within a reasonable amount of time, whether at work or at home, then potential work is lost.
- LDAP Integration. As Active Directory matures and the Windows world moves more and more toward integrated security, it would be nice to use the same mechanism for source control permissions as for other applications, without the need to manually add users to the version control system and set specific permissions on a user-by-user basis. The ability to add a new employee to a 'repository access' or 'repository administration' group and have those permissions control his or her interaction with all repositories is a very desirable addition.
- Intuitive GUI. VSS is many things to many people, but one thing most people agree on is that the user interface, though definitely dated, is easy to use and provides simple access to the core functionality of the application. To me a command prompt is not good enough, and when I think of tools like Git and Subversion (taking each as its own isolated entity without additional tools) I wonder why anyone would choose to interact with a version control system using a command prompt…seriously, scripting is great and the ability to control an application through a command prompt is definitely useful. But the command line should not be our primary interaction mechanism, especially considering how well source control lends itself to GUIs, which can represent file and project state with visual cues and so on. Am I alone in thinking that those who still use the command prompt are those who love to defame any product developed in a tool other than emacs :-)
- Repository Migration. Existing repositories must be migratable (yes, I just created a new word…) from their existing source control system - maintaining two (or more) source control systems is not a possibility.
My Thoughts So Far...
I have so far researched and, in a number of cases, evaluated some of the mainstream open source version control systems available. For anyone on the same quest, I suggest you check out Wikipedia's comparison of revision control software, which provides a solid and seemingly up-to-date listing of revision control systems, including prices and feature comparisons. As a quick look-up chart this is awesome.
Immediately I discarded CVS as a number of reports suggested that its codebase is unstable and in fact sparked the creation of Subversion. Git, the revision control system created (and vigorously defended) by Linus Torvalds, had much promise but, at time of writing, lacked a supported and mature Windows interface. In fact, this was the main reason I discarded most systems: command line tools were preferred to GUI apps, meaning the learning curve is closer to that of learning a new language than that of learning to use a new tool.
The system that stood out and has so far weathered my testing is Subversion. Below I'll list what I like and what I do not like...
The Joys of Subversion
- Atomic Commits. It's all or nothing folks, and I'm loving it!
- Price. Free as in Beer!
- Documentation. Literally a book's worth, and it's kept up-to-date. Combined with Subversion's impressive user community, sources of reference should not be a problem.
- GUI Support. Subversion, like most of the other open source tools in its category, does not ship with a GUI. However, additional applications like TortoiseSVN - a Windows shell add-on - and AnkhSVN - a Visual Studio add-on - allow for a pretty comfortable learning curve. There are other tools out there - but these are the two that I use religiously. The ability to interact with the repository (and check files in and out) through Windows Explorer is extremely progressive and, regardless of whether Subversion is adopted by my organization, is something I will continue to use for local files, documentation, side projects etc.
- LDAP Integration. So far I have only scratched the surface on this one, but Subversion's Apache-based installation, coupled with its use of WebDAV, means that logins can be validated using LDAP. While this is something I have successfully tested, I have thus far only run a shallow test case, testing if a user could or could not log into a repository. Next step: Active Directory groups and granularity of privileges.
- Triggers. This will be the subject of one of my next blogs. At first I thought the triggering system in Subversion was a bit of a dirty hack. Pop an executable file (.bat, .exe, etc.) named after one of Subversion's predefined hook events (pre-commit, post-commit, etc.) into the hooks folder in the base of a repository and it will run on that event; i.e. if I pop pre-commit.exe into this folder, the exe will be executed before every commit to said repository. Having used this for a while now, I see its ingenuity - it provides the ability to harness almost any existing technology - the Java programming language, the .NET framework etc. - to marshal repository interaction. The ability to force comments when checking in (and even use Java/Perl/.NET regular expressions to compare them to a template) is awesome!!!
- Speed. I need to test this on a grander scale, but so far accessing files over a network or the internet has been pretty solid. In fact, I only just realized that my PC was not supposed to freeze up when accessing my repository through VPN. Who knew?
- Extras. In the course of my research I spent quite a large amount of time at the Polarion website. They host some pretty nice (and free) applications including 'Importer for SVN', which purportedly supports migration to Subversion from CVS, PVCS, VSS, ClearCase, StarTeam and MKS. Alas, this software failed on a number of our projects and did not deliver the results I was looking for, though this could be due to the nature of the projects themselves (and the sometimes cyclical structure in VSS). More impressive, however, is their 'Web Client for SVN', which is a very pretty and intuitive web frontend for Subversion. Little details, like the ability to download a whole project in zipped format, make this a very useful tool. Further to these I was impressed by SubTrain, Polarion's open source Subversion training, containing enough documentation to train new team members (hypothetically for now) on using the system.
- Versioning Model. For me this is a positive. The versioning model differs from that in VSS as no files are exclusively locked (unless forced) and multiple developers can work on a single file without stepping on each other's toes. When a developer's working copy is out of date, he or she must update before a commit is allowed. If the incoming changes do not conflict, they are merged automatically; if they do, the conflicts are flagged and must be resolved by hand first.
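To make the trigger idea from the list above a little more concrete, here is a minimal sketch of what a comment-forcing pre-commit hook might look like in Python. Treat it as an illustration under assumptions rather than a finished hook: the message template (a ticket id followed by a short description) and the function names are made up for this example. The Subversion-specific parts are the two arguments a pre-commit hook receives (repository path and transaction name), the 'svnlook log -t' call used to read the pending log message, and the convention that a non-zero exit status rejects the commit.

```python
#!/usr/bin/env python
"""Sketch of a Subversion pre-commit hook that rejects weak log messages."""
import re
import subprocess
import sys

# Hypothetical template: a ticket id, a colon, then at least 10 characters.
MESSAGE_TEMPLATE = re.compile(r"^[A-Z]+-\d+: .{10,}", re.DOTALL)


def check_message(message):
    """Return an error string if the message is unacceptable, else None."""
    text = message.strip()
    if not text:
        return "Commit rejected: a log message is required."
    if not MESSAGE_TEMPLATE.match(text):
        return "Commit rejected: use the form 'PROJ-123: what changed and why'."
    return None


def main(argv):
    # Subversion passes the repository path and the transaction name.
    repos, txn = argv[1], argv[2]
    # 'svnlook log -t TXN REPOS' prints the log message of the pending commit.
    raw = subprocess.check_output(["svnlook", "log", "-t", txn, repos])
    error = check_message(raw.decode("utf-8", "replace"))
    if error:
        # Text written to stderr is passed back to the committing user.
        sys.stderr.write(error + "\n")
        return 1  # a non-zero exit status aborts the commit
    return 0


# Only act when invoked the way Subversion invokes a hook: REPOS and TXN.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

On a Windows server the hook file itself needs an executable extension, so a script like this would typically be launched from a small pre-commit.bat wrapper that forwards both arguments. Keeping the message check in its own function means the rule can be tried out without a repository at hand.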
The Negatives of Subversion
- Lack of a Mature Stand-Alone GUI. TortoiseSVN and AnkhSVN are amazing. But I can't help feeling that after 10 years, VSS 6.0's GUI still stands up pretty well in terms of design. Sure, it's ugly, but it has everything right where you need it. Although there are a few attempts to solve this issue (read RapidSVN and SubCommander) I'm not sure that they are mature enough yet to rival VSS's core GUI. SmartSVN appears to be getting pretty close, but its feature-rich 'Professional' version is proprietary, thus ruling it out. Personally, while I appreciate the hard work that goes into such projects, I find it difficult to support proprietary software that is built on top of open source code…
- Lack of a working Migration Tool. Admittedly, this issue may have more to do with the structure of our older projects than with the migration tools themselves, but so far none of the migration applications I have tested (and I have tested pretty much all of those available) were able to migrate a full repository from VSS to SVN. I posted messages on message boards and directly contacted a number of these applications' owners, but to no avail. If all else fails I can program my own migration tool, but I'd much prefer to use a pre-existing tool.
- Granularity of Privileges. Correct me if I am wrong, but by all accounts (including the documentation) it is not possible to allow granular access to projects. Sure, you can provide path-based or repository-based access control, but the only privileges are none, read and read-write. Personally I'd like to disable the deletion or addition of certain files within the repository. For me, this is a sorely missed feature.
- Versioning Model. For me a positive, for others a negative. For some, the ability for multiple users to work concurrently on an individual file is a drawback - sparking headaches at the thought of maintaining and promoting such files.
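One avenue that might partly ease the granularity complaint above is the hook mechanism itself: a pre-commit hook can inspect what a commit changes and reject additions or deletions under chosen paths. The following is a rough sketch of that idea, not something tested against our repositories. The protected paths and function names are hypothetical, and the parsing assumes the 'svnlook changed' output layout as I understand it from the Subversion documentation: an action code in the first column (A for added, D for deleted, U for updated) and the path starting at the fifth column.

```python
#!/usr/bin/env python
"""Sketch: pre-commit hook that blocks adds/deletes under protected paths."""
import subprocess
import sys

# Hypothetical paths that should not gain or lose files.
PROTECTED_PREFIXES = ("trunk/config/", "trunk/build/")
BLOCKED_ACTIONS = {"A": "add", "D": "delete"}


def forbidden_changes(changed_output, prefixes=PROTECTED_PREFIXES):
    """Return (action, path) pairs that add or delete protected paths."""
    found = []
    for line in changed_output.splitlines():
        if len(line) < 5:
            continue  # too short to hold an action code and a path
        # First column is the action; the path starts at the fifth column.
        action, path = line[0], line[4:]
        if action in BLOCKED_ACTIONS and path.startswith(prefixes):
            found.append((BLOCKED_ACTIONS[action], path))
    return found


def main(argv):
    # Subversion passes the repository path and the transaction name.
    repos, txn = argv[1], argv[2]
    # 'svnlook changed -t TXN REPOS' lists the paths touched by the commit.
    raw = subprocess.check_output(["svnlook", "changed", "-t", txn, repos])
    problems = forbidden_changes(raw.decode("utf-8", "replace"))
    for action, path in problems:
        sys.stderr.write("Commit rejected: may not %s %s\n" % (action, path))
    return 1 if problems else 0  # non-zero aborts the commit


# Only act when invoked the way Subversion invokes a hook: REPOS and TXN.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

This is a policy check rather than a true privilege: it applies to every committer alike unless the hook also looks up the author (for example with 'svnlook author'), and it has to be installed in each repository it should protect.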
Conclusion
Frankly, there is none…yet. I figured finding a better version control system would be easy. However, the number of open source projects out there, and the intensity of debate surrounding them, makes it difficult to choose one over another. Subversion is by far my favorite and suits 90% of my requirements. However, I'll wait a while (at least until the release of the next major version of Subversion) before attempting to instigate change in my workplace. If you have any comments or advice, I'd love to hear what you have to say. Experience is definitely a great basis for suggestion...