The Hiltmon

On walkabout in life and technology

Open Source Compiles in an Xcode 5.1 World

So today I needed to work on an older, open-source based C++ application on my Mac and there was no way to compile it under Xcode 5 even though the development tools were installed and working perfectly.

The issue, it seems, is that Xcode 5.1 has finally deprecated and removed a lot of old C++ functionality that is still required by older, popular libraries such as boost and quickfix. It emulates g++ 4.2.1 OK, but is no longer 100% compatible with it. I am quite sure that Apple and the Open Source community will eventually get these libraries to work with the new compiler.

But I needed it now. And no end of futzing with compiler options and paths would work.

Fortunately, you can run Xcode 4 side-by-side with Xcode 5.1 on OS X Mavericks. And Xcode 4 comes with a real g++ 4.2.1 which does contain the deprecated code and compatibility.

To get this to work, download Xcode 4.6.3 from Apple Developer downloads. Once downloaded, drag and drop the Xcode install onto your Desktop (not Applications) and rename it Xcode4. Then drag the renamed application to your Applications folder. You now have Xcode 5.1 (named Xcode) and Xcode 4.6.3 (named Xcode4) side by side.

To make things easier, Apple has provided the xcode-select command to enable you to choose which Xcode install is the one used in system compiles or from the command-line.

To use Xcode 4 and the older g++, just select Xcode 4:

sudo xcode-select -s /Applications/Xcode4.app/Contents/Developer

Running g++ -v gives me:

Using built-in specs.
Target: i686-apple-darwin11
Configured with: /private/var/tmp/llvmgcc42/llvmgcc42-2336.11~182/src/configure --disable-checking --enable-werror --prefix=/Applications/Xcode.app/Contents/Developer/usr/llvm-gcc-4.2 --mandir=/share/man --enable-languages=c,objc,c++,obj-c++ --program-prefix=llvm- --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib --build=i686-apple-darwin11 --enable-llvm=/private/var/tmp/llvmgcc42/llvmgcc42-2336.11~182/dst-llvmCore/Developer/usr/local --program-prefix=i686-apple-darwin11- --host=x86_64-apple-darwin11 --target=i686-apple-darwin11 --with-gxx-include-dir=/usr/include/c++/4.2.1
Thread model: posix
gcc version 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)

To use Xcode 5 and the new LLVM/Clang g++, just select Xcode 5:

sudo xcode-select -s /Applications/Xcode.app/Contents/Developer

Running g++ -v now gives:

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 5.1 (clang-503.0.38) (based on LLVM 3.4svn)
Target: x86_64-apple-darwin13.1.0
Thread model: posix

To simplify the process, I added the following to my ~/.bash_profile:

# Switch xcodes
alias setxcode4="sudo xcode-select -s /Applications/Xcode4.app/Contents/Developer"
alias setxcode5="sudo xcode-select -s /Applications/Xcode.app/Contents/Developer"

Now all I have to do is type

setxcode4

Or

setxcode5

And my password to switch environments.
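
To confirm which toolchain is active after a switch, print the selected developer directory and check the compiler version. For example, after setxcode4:

$ xcode-select -p
/Applications/Xcode4.app/Contents/Developer
$ g++ -v

If the path points at Xcode4.app, g++ -v reports the real gcc 4.2.1 shown earlier; after setxcode5 it points at Xcode.app and reports the Apple LLVM/Clang version instead.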

With both environments installed and a quick command, you too can work on old code using the old g++ compiler, and switch back to Xcode 5 and llvm/clang for newer projects.


Develop Locally, Stage Nearby, Production Anywhere

One of the things that surprised me as I moved back into the land of corporate software is the number of developers I talk to that commute and work in offices and yet develop on remote computers. I thought, as you probably do, that by now all software development would be local, as in on the developer’s own computer.

Instead, it is not uncommon for corporate developers to rely on remote shells or desktop virtualization to access remote development computers and use them to code. It’s slow, unproductive, frustrating and so pre-1990’s.

Aside: Back then, remote development was the norm. Mainframes were too expensive to give to developers and most production platforms were not available in desktop form. You had to develop remotely.

In this post, I want to point out how delusional remote development is in the 21st century and how local development can meet all Corporate needs.

The Remote Development Delusion

There are several reasons given by Corporates for this old-school behavior, all of which are bogus.

bo·gus /ˈbōgəs/ adjective
1. not genuine or true; fake.

  • Security and Access Control: The argument is that companies want to be sure that their developers do not run off with their source code, that nothing is lost when computers get stolen or lost, that only authorized developers access the right parts of the code, and that they can track who did what.

    Bogus because the average developer can still copy the code, or even rewrite it from scratch. Modern source code control systems and good network access control will resolve any security concern. Encrypted hard drives and startup passwords secure computers, even after theft.

  • Big or Sensitive Data: The argument is that the amount of data processed by developers is huge, or the data itself is so sensitive, that they need to develop close to the data.

    Bogus because networks are plenty fast, developer databases do not need to be that huge and storage is cheap, so making local (or nearby) copies of data is simple. Bogus because sensitive data can be changed via scripts, or fake data generated.

  • Cost Savings: The argument is that companies can share development resources such as licenses and installations and can therefore purchase cheaper computers for their developers.

    Bogus because the productivity losses in remote development by far outweigh the cost of a few measly licenses or computers. If a company cannot afford its development tools, it should not be using them. And developer tools these days are cheap or free, as are powerful computers, and are licensed for local development.

  • It’s Policy: The argument is that the company has a policy of remote development.

    Bogus because the policy argument is always bogus. Policies are made-up rules created by business people who generally know nothing about reality, or were written so long ago that no-one has cared to change them. Staff are forced to follow these bogus policies or be fired. It has nothing to do with what is right, with productivity, cost or any other reality.

In each case, the argument put forward for remote development is bogus. Development on remote computers is slow, unproductive, subject to outages, and guaranteed to frustrate good developers and encourage them to leave.

The Local Development Model

The local development model’s mantra is simple:

  1. Develop Locally
  2. Stage Nearby
  3. Production Anywhere

Developers code and compile on their own computers, using locally installed tools. All projects are staged (and continuously integrated) on servers nearby (as in the same office or on the same network). And then code can be deployed anywhere to run in production.

It’s simple: local development makes sense and is the most productive way to develop.

  • It’s responsive, developers press a key and the response is instant.
  • Developers can work anywhere, any time. Give a developer a laptop and they can work from home or at night.
  • Developers can use their favorite tools and productivity enhancers to speed up development.
  • Developers are only constrained by their own abilities, and not by the resources available on shared development machines, network flakiness or the slowness of remote operation.
  • All development tools can and do run locally, and are designed and licensed to do so. Local development is the expected model.

Staging nearby also makes sense.

  • For large and complex code bases you can run continuous integration tools to build and report on errors far quicker than attempting production builds.
  • You can use virtualization to spin up clean staging virtual machines on cheap retail hardware.
  • If a developer does break something, it’s close by and easy to recover. And no other developers are affected.

Only production need be remote, and that can be, and usually is, anywhere.

Secure Local Development for Corporates

But corporates need their controls and security blankets. It turns out that the local development model can easily be made as secure and controlled as needed without punishing developers and throttling productivity.

Let’s take each core remote development argument one at a time.

Security and Access Control

Corporates want to be sure that the code and data is secure and only authorized personnel can access it. Easy:

  • All modern computers and Operating Systems support encrypted hard drives. A password is needed to unlock the hard disk and yet another is needed to log in. A stolen encrypted hard drive is useless, and many systems can be remote wiped.
  • All modern operating systems are built to keep folks out, and only to enable access by authorized users. Strong passwords are good enough, and even biometric access is available.
  • If the data and access are secured, then communication needs to be as well. All systems come with built-in encrypted Virtual Private Networking to secure communications.
  • Access to code can be controlled using an in-house hosted Source Code Control system. Developers can only see the projects that the company wants them to see, and the code is saved, logged, tracked and maintained in-house.

In short, no problem here.

Big or Sensitive Data

Corporates argue that their data is too large, or is so sensitive that it needs to be tightly secured, so development has to happen remotely. We’ve already covered securing code and the same applies to data, so what about data size? Well, assuming the developer needs access to the whole data set, easy:

  • Modern hard drives are cheap and huge. And to be clear, there are exceptionally few corporate databases that cannot fit easily on a modern laptop drive. Big data is real, but rare. And even if the data is bigger than modern laptop drives, desktops with lots of drive bays or even SAN’s are ridiculously cheap.
  • If data is too sensitive, run a script on a copy of the database to remove or change sensitive information. For example, replace all credit card numbers with fake ones. Or use scripts to generate accurate yet fake data for developers. There are lots of tools out there to help do this.
  • If the data is stored in a commercial database product, and the company is so cheap it does not buy licenses for developers, then maybe they cannot run it locally. Fine, but modern networks are fast, and running one additional licensed instance as a development database server nearby, close to staging, accessible via VPN, is cheap.
  • And if the company really wants to be tight, it can create and clone secure virtual machines that developers can spin up and run whenever they need data access.

It is rare that a developer needs access to the entirety of a database, and even rarer that the data does not fit on a local disk. Big data is not a problem. And there are solutions to sensitive data, so that too is not a problem.

Cost Savings

But what about the cost of all these powerful laptops, tool licenses and staging servers? I could point out that maintaining development servers, licensing and the infrastructure to support remote development is just as costly (oh, I just did). But:

  • Computers are cheap, ridiculously so. And for the price you get lots of fast cores, oodles of RAM and massive hard disks. People and their time are expensive. What would you rather spend money on?
  • Modern development tools assume local development and so license terms and conditions expect this. The cost of giving each developer a licensed copy versus running the same licenses centrally is exactly the same.
  • And then there is the free and open source option. Instead of using expensive proprietary tools, try using open source languages that all the cool companies are using. The cost of getting support on Open Source is negligible and the products are being used by millions.

Choosing between needing more developers on cheaper computers and having fewer, more productive developers on better ones is easy when you look at the cost. Local development is cheaper.

Policies

And then there is the legacy of policy. A business that is too rigid to change in the face of changing technologies, costs and realities has no right to survive. A business that is constrained by policy cannot evolve.

If a company wants to hang on to their developers, and get development done quicker and better, local development is the way to go. And if the only thing that’s holding the company back is a few silly written rules, er policies, burn them.

Develop Locally, Stage Nearby, Production Anywhere

Back in the 1970’s, 1980’s and early 1990’s, remote development was the norm. Because it had to be. Mainframes and server platforms were expensive and unavailable on desktops. Computers were not secure and networks were unavailable or slow. And the tools were designed for this model.

The downsides were the same as modern remote development: key press delays, slowdowns as other users took too many resources, layers of security and technology needed to enable remote access and all sorts of flakiness and problems. The upside, well, it was the only way to develop.

But since then we’ve moved on. A computer on every desk (and now lap and pocket) is the norm. Servers and server platforms can run on laptops as well as on big iron in production. Computers are secure and fast, encrypted networking is ubiquitous.

There is no downside to local development. The last 20 years have been spent making local development a reality. The upside is huge: faster, more productive developers who can work anytime, anywhere.

There is no downside to nearby (or even local) staging. The upside is also huge. Better, faster builds, quicker identification of issues, and better testing all result.

Welcome to the 21st century. Develop Locally, Stage Nearby, Production Anywhere. Remote development’s time has passed.


From Tool Maker to Tool User

Abstract: Programming is still seen as a single-language, single-platform tool-maker role. That has not been true for a long time. Over the years, the platforms, tools and libraries available have multiplied. We’re tool users now, where a programmer can pick up a new platform with ease, and use the tools available to create complex, reliable products a lot quicker. That which was impossible then is commonplace now.

The role of the programmer has changed over the years, from a certified single-language, single-platform tool-maker into a tool-user who has mastered many application styles, technologies, use cases and platforms. And when you think about it, the tremendous change in the tools we can use and master has steered us here. But we, the industry and the job market, still view programmers as singular tool-makers (just look at most programming job ads).

Heck, I still see myself as a tool-maker. But that has not been true for a long time. I realize now that I am a tool-user that creates useful products. To use a carpentry analogy, I no longer make the plane, I use a plane and other tools to make tables.

As an exercise, I decided to look at what I do and the tools I use in 2014 against 10 years ago, and the 10 before that, and, since I am old enough, 10 years before that too. I wanted to find out when we stopped being tool-makers and became tool-users. I wanted to understand how we can do so much today that we could not do then.

Before you read on, think about your own experiences, the tools you made or used and what you use today. When did you change?

1984

I was close to finishing High School and had a Sinclair ZX81. On it, I could create simple BASIC programs, mainly utilities and games. There was no IDE, no libraries, no database and the command-line environment was rudimentary. If I needed anything, I had to create it first.

1 task, 1 platform, 1 language, 0 libraries, 0 IDEs, 0 databases.

100% tool maker, 0% tool user.

1994

I was working for a Systems Integration company. As a firm, we were engaged to write those large, complex core applications for large complex clients such as utilities and government departments. I, however, worked on one part of one product for weeks at a time.

Our desktops were Intel PCs running newfangled Windows, enabling us to run multiple emulated VT-100 terminal sessions to the hosts we were developing on. The platform was UNIX, the code written in C (and some C++, lex and yacc), glued by shell scripts, the editor was vi, and the database was this new one called Oracle.

Being able to see multiple VT-100 terminal sessions on one screen was leading edge, a huge improvement over the teletypes from before and a huge productivity boost. And using a database meant that data storage and retrieval were written for us. But the tools were rudimentary. We needed to write everything, from the protocols between systems to the libraries that processed the data.

1 task, 1 platform, 2 languages, 1 library, 1 IDE, 1 database.

90% tool maker, 10% tool user.

2004

I was working at a Finance firm, designing and developing the platform for their business. My time was allocated to a project at a time, with interruptions for support.

It was a Windows shop. Windows on the desktop, Windows on the server. But code was being written in several languages, C# and HTML for the web applications, C++ for the core mathematics, and Perl to glue it all together. My IDE was Visual Studio for the C# and C++, UltraEdit for Perl. We were using the standard C# and C++ libraries from Microsoft, and the magnificent variety of tools available on CPAN for Perl. And the database was SQL Server, back when it still acted like the Sybase it was.

On the surface, the changes between 1994 and 2004 seem small, but they were huge. Fundamentally, we were still using a C-like language, still writing scripts and still using a database. But Web Services libraries built into C# meant that writing server interactions was so much faster and easier. XML serialization made integration easier. Perl could do a whole bunch more, and a lot quicker, than shell scripts given the CPAN libraries. And the IDE, with integrated syntax coloring, multiple panes, integrated debugging and a built-in database manager, was a dream compared to vi.

But the biggest change was the availability and capability of the libraries at hand. Need to read a CSV? There was a library for that. Need to talk to a server? The protocol was written and included in the platform. As programmers, we had moved to being more tool users than makers because the tools had mostly been written. And I do not think we, or the industry, had realized it yet.

1 task, 1 platform, 3 languages, 3 libraries, 2 IDEs, 1 database.

20% tool maker, 80% tool user.

2014

I work at another Finance firm during the day and run my own business in my spare time. I am developing yet another internal platform as well as web applications and native mobile applications for clients. The sheer variety of work is brilliant.

Our client platforms include Windows, OS X, Linux, iOS and Android. Our servers are mostly Linux with a few Windows Servers as needed. Code is being written in Ruby, C++, JavaScript, Python, R and of course Objective-C. The IDEs in use include Xcode, Visual Studio, vim, Sublime Text and the lovely TextMate 2. And when it comes to libraries, we use a lot, including Rails, Sinatra, JSON, NumPy, Pandas, STL, Apple’s Foundation Kit and more. And our databases run on SQL Server, PostgreSQL, MongoDB and Redis.

Again, the fundamentals look the same since 2004. Still working the web services angle, C-like languages and scripting tools. But so much more productive. There is a library and a platform and a tool for everything now, no need to write your own. The time taken from idea to delivery has been halved and halved again. It’s all about choosing the right tools and using them effectively.

3 tasks, 5 platforms, 6+ languages, 10+ libraries, 4 IDEs, 4 databases.

0% tool maker, 100% tool user.

From Maker to User

When I started working we had to be proficient in one platform, one language and one database. It took weeks and months to go from requirements to design to delivery because we had to write everything ourselves and our tools were simple. Support and maintenance were hard. We were tool-makers.

In contrast, these days we regularly switch between languages, platforms, and tools to create web, native and hybrid products. The time taken from idea to delivery is days or weeks because it often involves finding the right library, learning how to use it and coding the use case we need. The platforms that we deploy contain so much functionality that most of the work is done for us. And we can do so much more in the same amount of time. We have become tool-users.

Back then, the tool-maker had to be able to program at a low level and know all the nuances of the platform. Now the tool-user needs to know what tools and libraries are out there, and to learn and use them, oftentimes platform-agnostically. Aside: But we still hire as if we are looking for tool-makers, singular platform/language programmers.

We still need the tool-makers, the ones who create the platforms, languages, libraries and IDEs. But the majority of us have lost that skill, if we ever had it. The point is that, using these tools, we tool-users create products that are better, faster, more reliable and easier to extend than ever before, and we produce them so much quicker.

And because we use these tools, we can do a lot more every day and create a lot more products. Just look at the scope of work and the numbers above. Back then we worked on a single component of a single product for weeks at a time. Now, we work on a portfolio of products simultaneously.

Back then, as tool-makers the work we do now was not possible. Now as tool-users, it is what we do all the time.


Easy SSH to Local Linux VM

TL;DR: Run avahi on the VM, ssh to the VM name dot local.

Many of the Simple C++ Projects I work on, whether I use Xcode or not, land up needing to be recompiled on a Linux Virtual Machine prior to being deployed to the production Linux server. I do this because I want to be sure that I know what needs to be set up in the Linux environment, that the compile succeeds without errors in a production-like environment, and that the code works properly before deploying.

I spin up these virtual machines all the time. But connecting to them to copy the code over and perform these compiles is a hassle because their IP addresses change all the time. Aside: Setting a static IP address does not work because I often clone these virtual machines to try different settings, which requires a new IP address to be set up and remembered manually. Also, static IP addresses sometimes conflict on different networks.

But there is a solution. avahi.

The Problem in Detail

I spin up Linux VM’s in my local VMware Fusion using the exact same Linux distribution as production. Since these are never to be exposed to the wide world out there, I set the network to “Share with my Mac”, which makes it a local VM. I could use “Bridged Networking” but this problem then recurs when I work from home or a coffee shop.

I yum install all the packages as per production, even set up a deploy user. I then need to ssh into that deploy user on that VM as if I were ssh-ing into production to build and test the code.

The problem is that each time the Linux VM boots, or I use it outside the office, it gets a new DHCP IP address. Which means that I need to find out what the IP address is every time before I can SSH in. Too many steps:

  • Log in
  • Run an ifconfig
  • Find the IP address (172.16.112.141 this time)
  • Press ⌃⌘ to release the mouse
  • Open a terminal locally
  • ssh deploy@172.16.112.141 this time!

What a Pain! Next time I use this VM, I’ll have to perform the same dance.

The Solution in Detail

Run avahi on the VM.

When you build the Linux VM (I use CentOS 6), first set the host name to a unique name. Edit /etc/sysconfig/network as root and set the HOSTNAME attribute:

NETWORKING=yes
HOSTNAME=witch.noverse.local

You may need to restart some distributions after doing this.

Then execute the following commands as root to install avahi

$ yum -y install avahi
...
$ service avahi-daemon start
...
$ chkconfig avahi-daemon on

This installs the avahi daemon, starts it and sets it to start on every reboot.

What the avahi daemon does is publish the VM’s basename (the first part of the hostname, before the first dot) on the ZeroConf (or, to use Apple’s word, Bonjour) network.

Which means you can always see it as basename.local. So, instead of ssh-ing to a new IP address every time, just ssh to the basename and add a .local. For example, this works for the above Linux VM:

ssh deploy@witch.local

Then, no matter where you are or what IP address the VM gets, you can always access it by the same name. You can even add this to your .ssh/config file as a shortcut, which never changes!
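
As a minimal sketch, the matching ~/.ssh/config entry for the VM above could be:

Host witch
    HostName witch.local
    User deploy

With that in place, ssh witch is all you need to type, no matter which IP address the VM picked up this time.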

I have been totally surprised how much fiddling and time this simple trick has saved me.


I Choose Not to Be Anonymous

A simple premise.

  1. I always post from my own domains or accounts, all of which are traceable back to me.
  2. I sign and add my true byline, which is me, to all my work.
  3. I always use my name, hiltmon, when commenting or responding on other sites or services.
  4. Anyone can always get in touch with me via this web site, email, Twitter, App.Net, Facebook, Google+, LinkedIn and other services; I am not hiding.
  5. There is only one hiltmon, and it is I. Google me.

I choose not to be anonymous. Here’s why.

There has been an ongoing debate on the internet for years whether anonymous posting or commenting is a good or a bad thing. Debating the goodness or badness of it is a distraction; its existence and people’s choices are the real issues.

I think it’s a choice not to be anonymous, but not all of us have the same choices. Anonymity is absolutely necessary if the poster is being discriminated against, living under some kind of martial law, a victim of a crime, a political refugee or someone who, by posting, will be incarcerated or killed. These folks should have access to anonymous posting so that their stories and the truth can get out. So that we, the rest of the internet, can learn about it, and maybe do something about it. We should not stand by on principle while they may suffer or die for nothing.

But for all other cases, I believe that anonymous posting is not a good idea, a bad choice. Consider this, you read a blog post by an anonymous author. How do you know if they are credible? How do you know if the post is truthful or spin or outright lies? How can you be sure that the post is not some corporate marketing placement, an agenda or a scam? If it’s anonymous, you have no idea. If the post has a byline, you can check the source and determine credibility for yourself.

Anonymous commenting on the other hand is rife with abuse, and that abuse bothers me too. The trolling, the hate, the stalking, the flame wars all work because the commenter assumes they cannot be traced, and so they can “get away” with things. It’s childish and immature. It defaces sites, devalues other people’s work, and shows complete disrespect. There are places for this sort of behavior, Hacker News and Reddit come to mind, but nowhere else. Aside: I am very pleased that the commenters on my site are rarely anonymous and do not abuse the service.

It takes courage to put your name on things the world can see, copy and save for later. It takes integrity to stand behind your own comments and writing. It takes self-respect to put yourself out there and respect others.

I cannot (and have no right to) tell folks what to do on the internet, but maybe I can set an example.

I choose not to be anonymous.


Magical Migrations

As I am writing this, I am using magical migrations to perform a full data transplant across three servers, each with 100+GB of data, with a single command. Yep, I’m chilling and writing a blog post while replacing the entire data scaffolding of my firm as if it were just another cold Saturday. Magical indeed.

TL;DR: Rails migrations and Capistrano deploys create a stable, reliable, testable, reversible and no-fault way to manage and execute production database changes, no matter the platform size or complexity.

When I started at the current firm, I knew I would be designing and developing a wholly new, proprietary platform. At the time, all I knew was that it would be big and complex and that I did not yet have a handle on what it would be.

But I did know these to be true (for that matter, knowing that you do not know something is knowledge in itself):

  • I did not know the complete requirements at the start, and most likely would not know the complete requirements in the middle.
  • I would not get the database design and architecture right the first time, and probably not the second time either.
  • Things will change. The business, requirements and architecture will change and continue to do so. Which means I needed to be flexible in choosing what to do and how to do things.
  • I will have more than one database server, many in fact. Some as backups or read-slaves, some as replicas, some as islands, and some for testing and modeling.
  • I need to be able to track, manage and automate the myriad of changes to all of these servers using tools because there is no way I could do it in my head.
  • I would probably be the only person doing this for the first year or so.

In the past, the way I have done this was to create separate database projects for each database and then create SQL files to save the queries and data definition commands to change these databases. Then, in preparation for deploy, we’d back up the production databases to a staging area, then manually run the SQL to migrate the database and then run a test suite to see if we got it all. If it all worked, we’d then spend a weekend doing this on the production servers. More often than not, something had been forgotten, not placed in the database project, or not updated as things changed, and we’d leave production deploy alone until the staging model was rebuilt and tried again.

And that was better than the older ad-hoc method of just having a dedicated Database Administrator patch the databases manually on deployment days (which is how it worked in the dark past and still does in a lot of firms).

I needed a way to automate all database changes, a way that was integrated in my development workflow and a way to deploy these changes with ease.

The solution: Rails migrations and Capistrano deploys.

Aside: Fortunately I am using Rails for a few of the web servers on the platform which made choosing Rails migrations easy. But I am also running a bunch of Sinatra servers for web services, Python programs for analytics, C++ programs for high-speed work and looking at Node.js and GoLang for future projects that all access our databases. And who knows what else will access them in the future.

For the main database, I created a single master Ruby on Rails project. I then share that model between all my Rails projects, see my Rails Tricks – Sharing the Model. All other projects just assume the current schema. For other databases, more Rails projects, several of which have no web interface at all, or are in support of a non-Ruby project.

Creating and Managing Database Schema Migrations

But let’s focus on the main database.

In my environment, this runs on a pair of servers in a remote data center for production and on an additional database server used for analytics. Development is done locally, as is staging. All three of these main servers are accessible from the same code base, so they all need to have the exact same schema. And all have over 100GB of business-critical data in them, so I cannot screw up deploys and changes.

All database changes are in Rails migrations.

All of them. No exceptions.

Create a new table, it’s a migration. Add a column, another migration. New index, another migration. Add seed data, a migration.

As a developer then, it’s easy to use and test. Create and edit a migration (or a Rails model for new tables) and rake db:migrate to apply the changes. Not perfect? rake db:rollback, correct the model or migration and rake db:migrate again.
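
For example, a typical edit-and-test cycle looks something like this (the migration name here is illustrative):

$ rails generate migration AddCouponTypeToSecurities coupon_type:string
$ rake db:migrate      # apply the change to the development database
$ rake db:rollback     # undo the last migration if it is not right
$ rake db:migrate      # apply it again once corrected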

I never, ever use SQL directly on the database to make schema changes. It violates the protocol. All changes are in migrations, all of them.

A few Rails migration tips:

  • Create all tables as Rails models, which generates the migrations for you. It adds a bit of code to the project, but makes it easy to test the database or create web views of the data later. For example:
$ rails generate model Table description:string price:decimal{12-2}
  • Do not be afraid to use raw SQL in migrations. That which Rails migrations cannot do, can be done with SQL in migrations. I use it all the time to transform or backup data. For example:
...
# Fixup codes
execute "UPDATE securities SET coupon_type = 'FIX' WHERE coupon_type = 'LKD'"
execute "UPDATE securities SET coupon_type = 'FIX' WHERE coupon_type = 'ADJ'"
execute "UPDATE securities SET coupon_type = 'FIX' WHERE coupon_type = 'SPC'"
...
  • Make all up migrations reversible. This is easy as Rails takes care of most of these as long as the migrations are non-destructive. For destructive migrations, such as when you are moving data to new tables or removing columns, I create temporary tables or dump files to save the data being deleted, and reverse these in the down part. Only when I am very sure that these changes are fixed and deployed to production do I create migrations to drop these temporary tables – these being the only non-reversible migrations I have. For example, in an up migration for a changing table, I first back it up:
def up
  execute %Q{
    CREATE TABLE ex_table (
      id       INTEGER NOT NULL,
      column_1 character varying(255),
      ...
      CONSTRAINT ex_table_pkey UNIQUE(id)
    );
  }

  execute %Q{
    INSERT INTO ex_table (id, column_1, ...)
    SELECT id, column_1, ...
    FROM table;
  }

  remove_column :table, :column_1
end

In some cases, I even dump data to files to enable reverses or just to have backup copies. For example, in a down (reverse) migration where I dropped a table in the up, I may have:

def up
  # Save the data for Justin Case :)
  execute "COPY (SELECT cusip, price_date, price, yield FROM historical_prices) TO '/tmp/hp.csv' WITH CSV;"
  drop_table :historical_prices
end

def down
  create_table :historical_prices, { id: false } do |t|
    t.string :cusip, limit: 9
    t.date :price_date
    t.decimal :price, precision: 12, scale: 4
    t.decimal :yield, precision: 12, scale: 6
  end

  # Get it back
  execute "COPY historical_prices (cusip, price_date, price, yield) FROM '/tmp/hp.csv' WITH CSV;"

  add_index :historical_prices, :cusip
  add_index :historical_prices, :price_date
end
  • One big negative of Rails migrations when creating tables is that they create an id field automatically. This is useful if you are running a Rails web app or using ActiveRecord. But for tables that have no need of these, or are being accessed by applications that do not need id columns, here’s how to get rid of it: just add { id: false } to the create_table line, as in:
class CreateNoIDTable < ActiveRecord::Migration
  def change
    create_table :no_id_table, { id: false } do |t|
      t.string :my_key, limit: 9
      t.string :my_text

      t.timestamps
    end
  end
end

You can also get rid of the Rails created_at and updated_at columns by commenting out the t.timestamps code.

  • Seed data in migrations. Sometimes you need to put data into tables when creating them, for example when creating code-to-string reference tables. Put the data into the migration and have the migration load it. For example, using Rails model create! calls:
class CreatePurposeClasses < ActiveRecord::Migration
  def up
    create_table :purpose_classes do |t|
      t.string :class_code, limit: 4
      t.string :class_name, limit: 32

      t.timestamps
    end

    add_index :purpose_classes, :class_code, unique: true

    # Populate
    PurposeClass.create!(class_code: 'AUTH', class_name: 'Authority')
    PurposeClass.create!(class_code: 'BAN', class_name: 'Bond Anticipation Note')
    PurposeClass.create!(class_code: 'BLDG', class_name: 'Building')
    ...

Deploying Database Schema Migrations

I have been using Capistrano for years to deploy Rails applications. And it works well. I use it now to deploy my main database Rails project that contains all the necessary migrations to all servers.

It takes just one command to send the code over and one more to make all the changes and I can be sure all my staging and production databases are perfect.

To deploy:

$ cap production deploy

To migrate:

$ cap production deploy:migrate

I prefer to do this in two steps in case the initial deploy fails, in which case Capistrano rolls it back safely without affecting the database. I do worry about Mr Murphy.

In order to choose which server and database to migrate, I use Capistrano’s multistage extension. Each database gets a stage. For example, I have the following at the top of my main project’s config/deploy.rb file:

# Enable multi-stage support
set :stages, %w(staging analytics production)
set :default_stage, "production"
require 'capistrano/ext/multistage'

I then have separate files in the config/deploy/ folder for each server (stage) that sets the server names and roles. For example the config/deploy/analytics.rb file sets the :db role to the analytics database server, and the config/deploy/production.rb file sets :db to the production server.
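
As a sketch, a stage file needs little more than the server and its roles. The hostname below is an assumption; yours will be your analytics server:

# config/deploy/analytics.rb
role :db, "analytics.example.com", :primary => true
set :rails_env, "production"

Everything else comes from the shared config/deploy.rb, so adding another database is just a matter of adding another stage to the list and another small file like this one.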

I can then easily run:

$ cap analytics deploy:migrate
$ cap production deploy:migrate

I really rely on Capistrano’s ability to roll back code deploy errors, and on Rails migrations’ ability to roll back database migration errors, to prevent massive failure situations.

The Benefits

I get a lot of benefits from using Rails migrations and Capistrano deploys (Magic Migrations):

  • Database management is part of my development process, not an add-on, separate process to be scheduled and managed.
  • I do not need to ‘remember’ the state of databases.
  • I do not need to document changes, I have a log of them in the migration files.
  • I do not need to insert data into tables after migrations as the seed data is included.
  • One command to deploy and one to migrate. I get my weekends back.
  • If anything goes wrong, it can be undone and rolled back as if nothing has happened.
  • I just know that each database has the latest and correct schema for all my systems and can access them with confidence.

So here we are: I’m running one of these right now, one that is significantly changing the core architecture of my firm’s platform, and instead of doing the work manually, or staring helplessly at the process, or even having to worry if it fails, I’m writing this post.


View on an Insider as CEO for Microsoft

Today, Microsoft announced Satya Nadella as its new CEO, a 22-year veteran of the business and a true Insider. I wish him the best of luck and success. But is choosing an Insider as CEO good for a mature Microsoft or bad?

I think it’s a bad call, based on the documented history of other mature companies and my own selfish (and very positive) wishes for the firm.

And then there’s this, which makes no Outsider sense and means nothing to anyone, from his first letter to employees:

I believe over the next decade computing will become even more ubiquitous and intelligence will become ambient. The coevolution of software and new hardware form factors will intermediate and digitize — many of the things we do and experience in business, life and our world.

Ambient? Coevolution? Intermediate? Huh?

Bill Gates’ “a PC on every desktop” was a far better start.

Why an Insider could be good for Microsoft

Well, it is possible. I think the biggest reason an Insider is good for Microsoft is that the Microsoft customer has been happy with “The Microsoft Way” since Windows 95 and is loath to change. This has been proven by the measurable disaster that is Windows 8 Metro and Microsoft’s response in 8.2 bringing back the old ways. An Insider understands the Microsoft customer’s comfort zone and will work within these constraints, whereas an Outsider would likely challenge this status quo in order to assert their vision. The vast majority of Microsoft’s customers do not want change, nor, typically, do Insiders.

On the other side of the business, Investors and Wall Street treat Microsoft as a blue-chip stock. They expect stable revenues, stable growth, stable product lines and a stable management team. An Insider delivers here too, an Outsider may shake things up too much. These folks too hate change.

With an Insider at the helm, we’ll get more of the Microsoft same. And that, friends, is what Microsoft customers, users and investors want. And that could be good for Microsoft.

In short: Insiders bring less change, and more customer comfort. Beige is safe.

Why an Insider is bad for Microsoft

If you follow the five stages of business growth1 (Existence, Survival, Success, Take-Off and Resource Maturity), Microsoft has certainly hit maturity. Which means it’s at the top, flat part of the S-curve where growth stagnates and begins to turn down (see Technology Life Cycle). If nothing is done, the business will slowly die, as lots of other companies have done in the past.

The only way to grow a mature business is to research and develop a new strategy or product-line and ride up a new S-curve. Insiders, traditionally, have been more worried about maintaining market share and existing product lines and are averse or blind to new strategies. This is not good for Microsoft. Outsiders bring new ideas, new research and the will to try new things. Usually they encourage and create the magical “innovation” thingamajig that creates new S-curves and grows businesses. Without this drive, Microsoft will slowly shrivel away.

An Outsider also comes in without belief, history or baggage. An Insider commonly believes the internal Cargo Cult view of the business, which is regularly different from reality. They carry the baggage of years of politics, inefficiencies and compromises that led to the current stagnation of the business. And they unintentionally wear blinders to the truth, faults and opportunities because these things do not appear in their limited field of vision. They are, after all, only human. Which is bad for Microsoft. An Outsider comes in with no such preconceptions, no history and fresh new ideas, no blinders. An Outsider can see the current business faults as their field of vision is not limited in any way. And they can, and are expected to, fix them.

In short: Outsiders bring change, more growth and that is good for a mature business. Beige is boring, old and belongs in the past, time for a new color palette.

Which will it be? Good or Bad?

Will the new CEO try to squeeze the most out of the current S-curve, or grow new S-curves in spite of his Insider status? Will he have the insight to see beyond Insider blindness, and then the courage and opportunity to research and chase new S-curves? And will the customer base, investor base and organization help or hinder?

We’ll see. Time will tell.

In my humble opinion, however, an Insider was the wrong choice. No matter how amazing Nadella is (and this author assumes he is seriously good), he carries Insider baggage, Insider views and Insider tendencies. And this will be bad for a large, mature business with no new S-curves to grow on, a lot of stable and falling S-curve businesses facing stiff competition, and a whole bunch of legacy customers and baggage to carry forward.

Then again, I am not part of the Microsoft customer majority. I seek innovation and change and cool new technologies, not more of the same. I want Microsoft to change and grow and leverage the amazing talents it has. I want it to compete and shake up the status quo. I just don’t see an Insider making that happen.

And thustly begins the end of Microsoft.



  1. The Five Stages of Business Growth, Churchill and Lewis, Harvard Business Review, May-June 1983.

First They Came For…

A Modern Version

First they came for the record stores
and I didn’t speak out
because I used iTunes online.

Then they came for the bookstores
and I didn’t speak out
because I used Kindle online.

Then they came for the technology stores
and I didn’t speak out
because I used Amazon online.

Then they came for the grocery stores
and I didn’t speak out
because I used FreshDirect online.

Then they came for the restaurants
and I didn’t speak out
because I used SeamlessWeb online.

Then they came for the shoe stores
and I didn’t speak out
because I used Zappos online.

Then they came for the home-wares stores
and I didn’t speak out
because I used Soap online.

Then they came for the stationery stores
and I didn’t speak out
because I used Staples online.

Then they came for the drug stores
and I didn’t speak out
because I used CVS online.

Then they came for the furniture stores
and I didn’t speak out
because I used IKEA online.

Then they came for the newspapers
and I didn’t speak out
because I used a web browser online.

Then they came for the coffee shops
and I didn’t speak out
because I make my own.

And then there was nowhere to go
and nothing left to do.

The Hiltmon 2014

The Original Version

First they came for the Socialists, and I did not speak out
Because I was not a Socialist.

Then they came for the Trade Unionists, and I did not speak out
Because I was not a Trade Unionist.

Then they came for the Jews, and I did not speak out
Because I was not a Jew.

Then they came for me
and there was no one left to speak for me.


More TextMate 2 Basics

Previously, I wrote about the TextMate 2 Basics that I use all the time, and I recommend you read that post first. This post follows up with a mish-mash of more tools, ideas and tricks that I also use surprisingly frequently.

Text Tools

TextMate 2 tries to reduce the number of keystrokes you need to make to generate good code. There are many ways it does this but today I want to talk about auto-pairs and completions.

Auto-pairing

Auto-pairing is when you type in a character that should be paired in code, and the editor ensures the paired character is also inserted. For example, pressing the ( key inserts a ) as well (and where the caret is between the paired characters). This works well for all brace keys ([, ( and {) and quote (' and ") characters.

But TextMate takes it to the next level:

  • Wrap Selected: Select some text and press the opening key in an auto-pair set. Instead of replacing the selection with the key pressed like other editors do, TextMate wraps that selection in the correct pair.

    For example, to correctly bracket an expression such as a / b + c, select the b + c bit using ⌥⌘← and hit ( to get a / (b + c).

  • String Interpolation: In Ruby, you can interpolate a variable’s value in a string using the #{} construct. TextMate is aware of the context and if you press the # key inside a string that can be interpolated, TextMate uses auto-pairing to wrap the selection in the interpolation braces.

    For example, to convert user_name in the following line puts "Name: user_name", select user_name and press # to get puts "Name: #{user_name}".

    Note also that if the string does not support interpolation (a single quoted string), pressing # inserts a # character only. Smart.

Of course the big problem with auto-pairing is that in most other editors, you need to then navigate past the closing pair character to continue working. In TextMate, if you type the closing character manually, it knows, and just moves your caret along without duplicating the close. Or you can use ⌘↩ to go to a new line, leaving the closed pairs behind, or ⌘→ to navigate over all the closes.

Tab and Esc Completions

TextMate has two kinds of completions, “tab” (⇥) completions and “esc” (⎋) completions.

Tab completions were invented in TextMate and have improved in TextMate 2. Tab completions operate by typing in a few letters and pressing the tab (⇥) key. TextMate attempts to match the characters before the cursor to the tab completions available for that language or context, and if a match is found, it puts the completion in. For example, in Ruby, typing def⇥ will insert def function_name\n\nend and highlights function_name for you to overtype.

You can find the currently available tab completions in the cog menu at the bottom for each language. The best way to learn them is to see what is available and then start using them. For example, in Ruby, I always use cla⇥ to create Ruby classes, mod⇥ to create Ruby modules and ea⇥ for quick each loops. I strongly recommend you check out and learn the tab completions for your favorite languages in TextMate. You will save a ton of keystrokes.

Note that TextMate is aware at all times of the language context you are in. Which means that different language completions are available in different parts of a code file. For example, in a Rails .erb file, HTML completions are available unless you are inside a <% ... %> construct, in which case, Ruby tab completions work.

Esc completion saves you keystrokes within a code file by completing function and variable names that exist in the current file. To get project level completions, you need to look at ctags which I intend to cover in a future post.

Start typing a function or variable name that already exists and press the esc (⎋) key to see the first recommended match. Press ⎋ again to find another match or cycle through the choices.

Just remember, in completions, tab (⇥) is for code, esc (⎋) is for names.

The File Browser

The File Browser has changed completely in TextMate 2 and it took me a while to get used to it. This is because the new file browser is more like a Finder window than the old project file manager.

To switch between the editor and the File Browser without using the mouse, hit ⌃⌘⇥ (Control-Command-Tab). You can then use the arrow keys to navigate the tree. Use ⌘↓ to open the selected file (just like Finder). When in the editor, to reveal the current file in the file browser, hit ⌃⌘r (“Reveal”).

If you mouse instead (like I do), single-click the file icon to open it in the editor. Single-clicking the name just selects it (just like Finder). Double-clicking the file name opens the file too. Once you get used to single-clicking the icon, opening files becomes so much easier.

Editor Tabs

The tabs at the top are also smart in TextMate 2. If you have the file browser open and request a file that is in the same tree as the currently open folder, it opens in a new tab. If not, it opens in a new window. TextMate therefore opens tabs or windows depending on the context of the file, automatically determining project membership. This even works when you use mate <filename> from the command-line.

We all open a lot of tabs as we work, especially in Rails development. And it’s a pain to close each tab individually. If you ⌘-click on a tab close button, all other saved tabs will close.

You can also drag a tab out of the tab bar (or double-click on it) to move it to a new window. If you want it back, use the Window / Merge All Windows menu. Currently this merges all windows into one, here’s hoping they use the smart logic for creating tabs and windows to find a way to merge to project-based windows someday.

Fonts and Themes

Since we all have different tastes and preferences, TextMate comes with a lovely set of built-in themes which are fully customizable. You can change the theme from the View / Theme menu.

To install a new theme, download a .tmTheme file and double-click it. Then look for the new theme on the View / Theme menu. I use my own CombinedCasts.tmTheme theme (See Multiple Themes in TextMate 2 – The Hiltmon) but there are hundreds out there to choose from.

In my case, I have set TextMate to be the default editor for all my script file formats, from .sh to .rb and .py. I also use QuickLook a lot to browse code files before opening them. If TextMate is the default for a file, its QuickLook generator is used, and it now renders code files using your selected theme. A nice touch.

One recent change in TextMate 2 has been the way it accesses and uses fonts. I highly recommend using the View / Fonts / Show Fonts menu at least once after changing a theme to select the font and size that you prefer. Some old themes use incorrect names for fonts and the new model guesses incorrectly. If you have a global .tm_properties file and set your font there, make sure the name is correct there too.
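
For reference, the relevant keys in a global ~/.tm_properties file are fontName and fontSize. The values below are just an example; pick whatever you prefer:

fontName = "Menlo"
fontSize = 13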

Script Writing Tips

If you use TextMate to write Shell, Ruby or Python scripts, use the environment string (or “shebang” #!) to make things easier to run:

  • Start each file with the right environment string. This helps the shell know what runtime to use to execute the file. Type env⇥ at the top of the file and TextMate will insert the correct environment “shebang” into the file. It will also mark the file as executable on first save. So instead of running ruby my_script.rb, you can just type my_script.rb on the command line to run it (see the example after this list).
  • Speaking of the command line, hitting ⌃⇧O (the letter O) will open a fresh terminal session in the current folder of the file. But if you prefer the TextMate output window, ⌘r in TextMate will launch the environment specified in the file’s “shebang” and display the output in the output window.
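
For example, typing env⇥ at the top of a new Ruby file inserts the shebang line below; the rest of this little script is just a placeholder body:

#!/usr/bin/env ruby
# TextMate marks this file executable on first save, so it can be run
# directly as ./my_script.rb instead of ruby my_script.rb
puts "Running under Ruby #{RUBY_VERSION}"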

Quick Keyboard Tips

Three additional keyboard tricks I use a lot:

  • ⌃": Toggles between single-quotes (''), double quotes("") and quoted strings (%Q{}). I use this a lot in Ruby as I normally create strings without interpolation (single-quotes) and then need to change it later. It works on the quotes surrounding the cursor.
  • ⌃_: Toggles the selected name between CamelCase (HomeBrew), underscores (home_brew) and nerdCase (homeBrew), useful when you use the wrong convention in code (especially if you code in multiple languages).
  • ⌃⌥⌘V: Brings up TextMate’s clipboard history. Now you can copy, copy, copy, then switch files and paste, paste, paste.

Some Useful Bundle Tips

Source Code Control

If you use git, mercurial or subversion, install the matching bundle. You then get the following benefits:

  • TextMate shows the current status of each file in the file browser. You can see which files have uncommitted changes instantly.
  • The ⌘y key brings up a menu of Source Code Control commands that you can use instead of leaving the editor and using the command-line. Options include seeing the changes to be committed, amending commits, viewing branches and of course, the ability to commit changes. The git bundle even has a config option to enable you to change your global settings.

The TODO Bundle

I don’t know about you, but I often leave TODO and HACK comments in my code to remind me about things, and then forget about them. The TODO bundle in TextMate 2 searches the current project tree for comments with TODO, FIXME, CHANGED and RADAR and displays them in the output window. Just hit ⌃⇧T. You’ll never forget again.
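
For example, ordinary comments like these, anywhere in the project tree, will show up in that listing:

# TODO: Handle the case where the price feed returns no rows
# FIXME: This date parse breaks on two-digit years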

To add your own markers, go to the cog menu, find the TODO bundle and click Preferences…. Add a new marker and replace the regular expression to catch the rest of the comment. Hit Done, restart TextMate and run TODO again to see your new marker matches.

SQL Bundle

I only found this one recently, but it comes in handy a lot. I write a lot of code that queries PostgreSQL databases, which means I write a lot of SQL. Many of the processes and views I write contain embedded SQL statements.

Before finding this bundle, I used to have to copy the SQL from TextMate into NaviCat, test and run it, then copy it back. And I could never be sure that I got it all until I ran the program.

The SQL bundle supports MySQL and PostgreSQL only, but it works rather well. Start by setting up your connections in Cog Menu / SQL / Preferences…. Then, to test a SQL statement, just select it in TextMate and hit ⌃⇧Q. If TextMate picks the wrong database, open Cog Menu / SQL / Preferences… and highlight the correct database then click Done. ⌃⇧Q will run the SQL against the correct database now.

The output window is also a Database Browser, so you can use it to browse databases, their tables and see what their fields are named while coding.

The SQL bundle is pretty basic, but for quick and dirty views and tests it works great.

Fin

So those are some more TextMate 2 features that I use all the time. Hopefully you found a few more gems in there that could help you out.

If you missed the first post, check out TextMate 2 Basics.

If you have any awesome TextMate 2 features or keys you cannot live without, please share them in the comments.


Reminder: Update Your Tools

At the start of every new year, I spend the time to update all my tools and recompile all my products with them. At the cost of a few hours (or at worst, days) of work, I get rid of most of the maintenance hassles I faced before I started doing this, and gain the benefits of all the new features and performance from updated tools.

I understand that, for all of us, updating our tools is surprisingly hard. Things that used to work now fail, incompatibilities have to be addressed and it takes time, perceived as unproductive time, to do. And then there is the risk that updating production servers will cause downtime for no valid reason. It’s easy to feel that we should wait for the next release, or when we have some mythical down-time, or when we’re forced to do so as the versions of our tools become obsolete. And it’s hard to justify updating tools when there’s real work to be done and no current perceived benefit in changing tools.

On the other hand, updating our tools now means we get the performance and feature benefits of these new tools now. We get the incompatibilities from last year out of the way, meaning that the products we make with these tools can move forward more easily (we’re less likely to get stuck with a version dependency). We get to use newer and better versions of our libraries. Our servers get updated, run better and require less maintenance. We get to learn and practice the updated tools ideas and technologies. And the start of the year is the best time to do this as it’s one of the slowest business times.

For me, that’s actually quite a lot of work, but so worth it. Amongst other things, I have:

  • Upgraded to Ruby 2.1.0 as it’s the latest and fastest.
  • Moved our Python code to 3.3.
  • Moved all our Rails applications to 4.0.2 as that is current.
  • Updated to the latest SASS, Rake, Sinatra, Node.JS and other tools.
  • Updated all my Homebrew installs (brew upgrade). This entailed a non-trivial upgrade of PostgreSQL from 9.1 to 9.2 (I had to export and import all my databases; see the sketch after this list).
  • Updated all my CentOS servers and their tools to the new PostgreSQL, Ruby and Rails.
  • Moved all our Windows C++ and C# applications to Visual Studio 2010 from 2008 (we run one version behind on all Microsoft products – been burned before)
  • Recompiled all our C++ projects using the latest clang and gcc on each platform.
  • Updated all Objective-C applications to the latest Xcode and recompiled with ARC and the latest libraries.
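
For the Homebrew and PostgreSQL items above, the refresh boils down to a handful of commands. This is only a sketch (the dump file name is whatever you choose, and the 9.1 to 9.2 move also needs a fresh initdb and server restart between the upgrade and the import):

$ pg_dumpall > ~/pg91_backup.sql          # dump the 9.1 databases first
$ brew update && brew upgrade             # refresh Homebrew and all installed formulae
$ psql -d postgres -f ~/pg91_backup.sql   # import the dump into the new 9.2 server
$ brew cleanup                            # clear out old versions once everything works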

What did I get out of all of this work? On the surface, nothing much. The same old applications doing the same old things. But each and every one of them is now ready to take advantage of new tools, new technologies and new features in 2014. And most seem to run just a little better and a little faster and use a tad less memory than before. And best of all, I have peace of mind: I know there are no legacy dependencies holding me back.

There is no good time to do this, so do it now. Get it over with for the year. Update your tools now. Or you never will.
