The Hiltmon

On walkabout in life and technology

Social Network Precedents for Connecting and Friending

Like most of us, I am active on a bunch of social networks. But I do not connect with or friend everyone on every network. I think we all want some semblance of control over what we share and with whom we share it. To help me decide, I created my precedents: a set of guidelines as to whom I will connect with on which network and what I share there.

Social networks to me are all about news and conversations. Business, friendship, tech nerdery, scotch, jokes, embarrassing personal photos, interesting articles, they all have their place and their network. And the precedents help me decide which and where.

If you have requested friendship on a social network and not received a response, chances are you did not fit the profile in my precedents. I assume that if I have requested the same from you and heard nothing, I did not fit the profile of your precedents. This does not mean we should not be connected, just that my preferred network may differ from yours. We should find the networks where our precedents meet.

So, for the record, here are my precedents:

  • LinkedIn is for business, and business alone, my professional network. The businesses in question are my consulting business (Noverse) or my hedge fund employer. I prefer to have exchanged emails via my business account, had a face-to-face meeting, or had at least one professional business phone call before connecting with someone on LinkedIn. I do not accept connections from strangers on LinkedIn, even though many people say that doing so is “good for business”.

  • I use Facebook and Foursquare for social friends and family only. No business allowed, this one is for fun. I share my personal experiences, make silly jokes and stay in touch with far-flung family and friends. In order to make the cut here, we need to have enjoyed at least one social occasion, at least one drink, or be blood relatives.

  • Twitter, App.Net, Pinboard, and Game Center (as Hiltmon) are my public networks, and I will connect with anyone, anytime on any of these services. I am most active on Twitter, and I find it’s a great way to get news and interesting articles, and to get in touch with folks anytime. I have met and conversed with some amazing people on Twitter and look forward to new conversations there.

  • Although this site is not a true social network, I use Disqus comments here, and it alerts me via email when a new comment is posted. I am happy to talk to anyone via the comments here, but prefer to take longer discussions offline onto Twitter or email.

  • I am also somewhat active on Flickr and Instagram, and I have a Skype account. The precedent for connecting on these is that I have a connection with you on another service first, and have had at least one conversation before we connect on these networks.

  • Since we all have to, I too have a Google+ account, but no one else seems to use it. So far I have connected to all who request, but so few do. I still have no idea how, when, where or why to use it, but it’s there, in case.

  • And then there is email, old school, anywhere, anytime with anyone. I am always up for a good email. But there’s no guarantee of a response. See my about page on where to email me.

Of course, there are folks I never agree to connect with, including recruiters, spammers, cold callers, advertisers, anonymous users and the like. I wish to engage and converse on social networks, not be sold to, advertised to or hassled on them. I get enough of that elsewhere. I’m sure you all feel the same way.

So there they are, the precedents I use to select who I connect with on which network.

You, dear reader, you I do want to connect with. Pick a social network where our precedents meet, send a message, and say hello. And if there is a network I am not on, and I fit your precedents to connect there, let me know about it and I will sign up.


Rebooting OmniFocus

I’ve been using OmniFocus forever to record and track my personal and professional actionable to-dos and ideas. But over the past year, I have been using it less and less, getting less and less tracked and done, and it’s all my fault.

You see, I started to experiment with what could be done with OmniFocus and messed up the whole concept of actually getting things done.

My primary experiment was to create scripts to automatically load actions in from my project files and to merge in tasks from the company-wide Asana. The big idea was that I could save time and effort by automating task entry and assignment, and let the meat-bag (that’s me) process and review these tasks. If I could spend less time creating tasks and more time performing tasks, I would be more productive.

What a disaster.

The automation dutifully created more tasks than I could handle. The Asana merge, which only imported projects and tasks assigned to me, created even more tasks. Before I knew it, my OmniFocus database was loaded with more tasks than I could ever get ahead of. And that was off-putting.

And these tasks were not mine. The tasks imported from Asana felt like they were created by other people, because they were. Sure, they needed to be done, and needed to be tracked in Asana. All good. But they did not feel like my tasks.

And these tasks also created more work for me because the Asana projects and contexts did not match how I work, and the task descriptions were confusing. There is a huge difference between how you write tasks for yourself and how others write tasks for you. And then there are the tasks that are tagged in Asana as assigned to me but are really like the carbon copy (cc) in email, just for my information, yet they too came into OmniFocus as tasks for me to do. And that made me want to use OmniFocus less.

I no longer felt I owned my own task list.

Time for a reboot.

Here’s what I did:

  1. Backed Up the Old Database: I backed up the old database using File / Back Up Database… just in case I messed things up.
  2. Exported the Old Database: Since I wanted to see what was in the old database and decide which tasks to copy over, I needed it in another format. So I exported the old database using File / Export… and selected Plain Text (TaskPaper). I then opened this file in BBEdit.
  3. Nuked the Old Database: I exited OmniFocus and deleted the omnifocus.ofocus file in ~/Library/Application Support/OmniFocus.
  4. Restarted and Reset Sync: When I relaunched OmniFocus, it created a new database and wanted to download my old database from the OmniSync server. I hit cancel on that dialog and clicked File / Replace Server Database… to reset sync. Kudos to the Omni developer who created that dialog, the explanation of what to do if I wanted to reset was very clear.
  5. Re-created the Projects My Way: I re-created my personal and business folders, my personal and business single-action lists and my bills action that integrates with Hazel. I then re-created the projects in each folder the way I think about them.
  6. Selected and Re-created Tasks My Way: I then went through the massive TaskPaper-formatted file in BBEdit (a sample of the format follows below) and either rewrote or pasted in my tasks my way. In this way, I got rid of the duplicates, the copies and the confusing ones, and only added back the ones I want to and need to do, in a way I will want to do them, in the projects they belong to.
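
For reference, a TaskPaper export is plain text that looks something like this (a simplified, hypothetical sample; the project, tasks and tags here are illustrative):

Personal:
	- Pay the electricity bill @context(Errands) @due(2014-03-30)
	- Fix the blog footer @context(Mac)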

It took a rainy Saturday morning to manually reboot my OmniFocus. But now I have clean and tidy, works-my-way projects, contexts and tasks in my OmniFocus database.

I own my own tasks again.

As for the tasks in Asana, well, I get an email whenever a task is assigned to me. And that I can process using OmniFocus’s quick entry if the task is a real one. But this time, it will be mine.

I foresee a very productive week ahead.

Next task: Sign up for the OmniFocus 2 Beta.


It’s That Process That Is the Magic

CRINGLY: What’s important to you in the development of a product?

JOBS: One of the things that really hurt Apple was after I left, John Sculley got a very serious disease.

It’s the disease – I’ve seen other people get it too – it’s the disease of thinking that a really great idea is 90% of the work.

And if you just tell all these other people, you know, “Here’s this great idea,” then of course they can go off and make it happen.

And the problem with that is, that there is just a tremendous amount of craftsmanship in between a great idea and a great product.

And as you evolve that great idea, it changes and grows.

It never comes out like it starts, because you learn a lot more as you get into the subtleties of it and you also find there are tremendous trade-offs that you have to make.

There are just certain things you can’t make electrons do. There are certain things you can’t make plastic do or glass do or factories do or robots do.

And as you get into all these things, designing a product is keeping five thousand things in your brain – these concepts – and fitting them all together and kind of continuing to push to fit them together in new and different ways to get what you want.

And every day you discover something new that is a new problem or a new opportunity to fit these things together a little differently.

And it’s that process that is the magic.

Robert X. Cringely’s 1995 interview with Steve Jobs (Unedited)

100% true then. 19 years later, still 100% true. Copied here so I never forget it.


How I Use OS X Tools to Build a Linux-only Product

OS X really is a developer’s dream platform: a solid UNIX core on which almost everything compiles, and a brilliant graphical environment hosting the most amazing developer tools.

It’s the “almost” in the above sentence that has recently tripped me up.

I have a new vendor library that does not compile or run on OS X (yet!). As this is a big project which I will be working on for months, I want to set up my development environment so that I am comfortable and productive.

The simple solution is to spin up a Linux VM with a GUI or a set of vim shells, and code away. With this simple solution I can tune the environment to match production, compile and build the product using this vendor library and know that it works.

But to use this simple solution, I need to learn different keyboard shortcuts, copy and paste are weird, there’s no integration with my current platform, and the code is locked inside the VM. It violates my preferred Develop Locally, Stage Nearby, Production Anywhere model. In short, doable but sufficiently different to make me pause.

The real issue here is not one of which platform is better or worse, the problem is me. I am working in parallel on a bunch of other OS X-only projects using the same toolset in an environment that I have tuned and practiced on for years. Do I want to build the muscle memory and productivity tools on yet another platform, or can I somehow make the current set work? Can I deal with the frustration of having to switch working contexts, keyboard shortcuts and tools to do this? And do I have the time to set it up and develop the productivity mindset?

The answers are all no. I am building core product for a new business, I have deadlines, and each hour I spend setting up or learning a new platform is an hour not spent creating the products we need.

It would be much nicer if I could use my tried and true, native OS X tools and yet still compile and build this pure-Linux product.

So I found a way that works for me.

Sure, it’s a case of a few hacks and hassles, but this is the one and only long-term Linux project I have among a lot of other OS X projects (many of which deploy to Linux but can be developed locally).

My chosen solution is to do everything as usual on OS X, using the usual tools, local folders and standards, and run a micro-VM with mounted shared folders and ssh to handle the compile and run part (the one and only one thing I cannot do on OS X). I get all the productivity I am used to from OS X and its tools, and still build and run on a production-compatible system.

In Summary

Here’s how it works. Keep in mind that the source code is local, only compile and run happens on a local micro-VM:

  • I launch VMware Fusion, which resumes the minimal CentOS 6.5 VM that matches production (no GUI, minimal Linux install, nothing else running). Then hide it away.
  • I use a shortcut key in iTerm 2 to open an ssh session to this VM and use a bash alias to cd to the mounted, shared folder where the code on my computer can be found.
  • I launch Xcode, open the project locally as usual, and start programming.
  • I compile and build in the ssh terminal. This is the only step different from all other projects.
  • I do everything else using local tools on OS X.

And I am maximally productive because I have not really had to change a thing about how I work.

In Detail

The VM (called ‘Witch’) runs on my laptop’s installation of VMware Fusion. Since I use minimal Linux installs on our production servers, I did the same here. All I added were the Developer Tools and the dependent libraries needed, using the standard yum install process. And, of course, I copied over the vendor library and set up ssh access.

The settings for the VM in VMware are such that networking is local to my laptop (nice and safe), and it mounts my shared folders automatically so it can get to the code.

And that’s the secret. The VM “sees” the code as local to it while I see it as local to me.

These shared folders can be found in /mnt/hgfs by default (you need to make sure VMware Tools are installed and running).
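
A quick check from inside the VM confirms the shares are mounted (the share name here is just an example from my setup):

$ ls /mnt/hgfs
Projects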

I also set an alias in the VM’s .bash_profile that enables me to cd to my project folder easily:

alias cdsc='cd /mnt/hgfs/Projects/Client/ProjectName/'

In iTerm 2, I created a profile with a shortcut key (in this case ⌃⌘W for “Witch”) to launch an ssh terminal session to this VM using my preferred development login.

And finally, I created the Xcode project using my Xcode and the Simple C++ Project Structure and Xcode 4 Code Completion for External Build Projects standards.

Code in Xcode, ⌘⇥ to the shell to compile and run, ⌘⇥ back to code some more. And if I use the Xcode compiler, well, linking fails as the vendor library is not compatible, but all the other IDE features work just fine so compilation errors are easy to detect.
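
The compile-and-run step in that ssh session is nothing special. As a sketch, assuming a make-based build as in my Simple C++ Project Structure, it is just:

$ cdsc    # the alias above, jumps to the shared project folder
$ make    # compile and link with the production toolchain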

Quick Start this Environment

Since I switch between many projects every day, I automated it such that I can get this environment up in seconds. When I want to work on this project, I:

  • Launch VMware Fusion using Alfred, which resumes the VM automatically.
  • Launch iTerm 2 (on my system that’s ⌃⌘T) then press ⌃⌘W to open the SSH terminal in a tab. I then type cdsc to change to the project folder.
  • Open Xcode, and off I go.

Benefits

The benefits to me are many:

  • I get to use the tools I use on all other projects for this one, thereby maximizing my own productivity.
  • I get to build and run the program in its natural environment so I know it will compile and run in production.
  • It’s all local, on my hard drive, accessible to all my tools which means I can work on this project anywhere, anytime on my laptop, and it gets backed up via Time Machine (and of course pushed to a git server).
  • I do not have to remember which context or environment I am in and the keys and shortcuts that match, I simply remain in an OS X context.
  • And when the vendor sends me an OS X version of the library, it will just work without me changing a thing. I will just have to press ⌘B to build and run it from Xcode.

I think that I did this mainly because I have so many other projects I work on every day in OS X. I did not want this one to be any different. Instead of learning something new or different, I chose to bend this project to my will. If I did more of these pure Linux projects, it would obviously be better to set up, configure and make productive a true Linux development environment. But right now I need to be writing code and shipping product, not learning new things.

So I found a way to use my OS X tools and shortcuts and productivity aids to build a very native Linux-only product.

Maybe some of the ideas and tricks in this post can help you with your second platform development.


Open Source Compiles in an Xcode 5.1 World

So today I needed to work on an older, open-source based C++ application on my Mac and there was no way to compile it under Xcode 5 even though the development tools were installed and working perfectly.

The issue, it seems, is that Xcode 5.1 has finally removed a lot of old, deprecated C++ stuff that is still required by older, popular libraries such as boost and quickfix. It emulates g++ 4.2.1 OK, but is no longer 100% compatible with it. I am quite sure that Apple and the Open Source community will eventually get these libraries to work with the new compiler.

But I needed it now. And no end of futzing with compiler options and paths would work.

Fortunately, you can run Xcode 4 side-by-side with Xcode 5.1 on OS X Mavericks. And Xcode 4 comes with a real g++ 4.2.1 which does contain the deprecated code and compatibility.

To get this to work, download Xcode 4.6.3 from Apple Developer downloads. Once downloaded, drag and drop the Xcode install onto your Desktop (not Applications) and rename it Xcode4. Then drag the renamed application to your Applications folder. You now have Xcode 5.1 (named Xcode) and Xcode 4.6.3 (named Xcode4) side by side.

To make things easier, Apple has provided the xcode-select command to enable you to choose which Xcode install is the one used in system compiles or from the command-line.
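
To check which installation is currently selected, xcode-select can also print the active path (the output shown is the Xcode 5 default):

$ xcode-select --print-path
/Applications/Xcode.app/Contents/Developer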

To use Xcode 4 and the older g++, just select Xcode 4:

sudo xcode-select -s /Applications/Xcode4.app/Contents/Developer

Running g++ -v gives me:

Using built-in specs.
Target: i686-apple-darwin11
Configured with: /private/var/tmp/llvmgcc42/llvmgcc42-2336.11~182/src/configure --disable-checking --enable-werror --prefix=/Applications/Xcode.app/Contents/Developer/usr/llvm-gcc-4.2 --mandir=/share/man --enable-languages=c,objc,c++,obj-c++ --program-prefix=llvm- --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib --build=i686-apple-darwin11 --enable-llvm=/private/var/tmp/llvmgcc42/llvmgcc42-2336.11~182/dst-llvmCore/Developer/usr/local --program-prefix=i686-apple-darwin11- --host=x86_64-apple-darwin11 --target=i686-apple-darwin11 --with-gxx-include-dir=/usr/include/c++/4.2.1
Thread model: posix
gcc version 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)

To use Xcode 5 and the new LLVM/Clang g++, just select Xcode 5:

sudo xcode-select -s /Applications/Xcode.app/Contents/Developer

Running g++ -v now gives:

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 5.1 (clang-503.0.38) (based on LLVM 3.4svn)
Target: x86_64-apple-darwin13.1.0
Thread model: posix

To simplify the process, I added the following to my ~/.bash_profile:

# Switch xcodes
alias setxcode4="sudo xcode-select -s /Applications/Xcode4.app/Contents/Developer"
alias setxcode5="sudo xcode-select -s /Applications/Xcode.app/Contents/Developer"

Now all I have to do is type

setxcode4

Or

setxcode5

And my password to switch environments.

With both environments installed and a quick command, you too can work on old code using the old g++ compiler, and switch back to Xcode 5 and llvm/clang for newer projects.


Develop Locally, Stage Nearby, Production Anywhere

One of the things that surprised me as I moved back into the land of corporate software is the number of developers I talk to that commute and work in offices and yet develop on remote computers. I thought, as you probably do, that by now all software development would be local, as in on the developer’s own computer.

Instead, it is not uncommon for corporate developers to rely on remote shells or desktop virtualization to access remote development computers and use them to code. It’s slow, unproductive, frustrating and so pre-1990s.

Aside: Back then, remote development was the norm. Mainframes were too expensive to give to developers and most production platforms were not available in desktop form. You had to develop remotely.

In this post, I want to point out how delusional remote development is in the 21st century and how local development can meet all Corporate needs.

The Remote Development Delusion

There are several reasons given by Corporates for this old-school behavior, all of which are bogus.

bo·gus /ˈbōgəs/ adjective
1. not genuine or true; fake.

  • Security and Access Control: The argument is that companies want to be sure that their developers do not run off with their source code, or that nothing is lost when computers get stolen or lost, that only authorized developers access the right parts of the code and that they can track who did what.

    Bogus because the average developer can still copy the code, or even rewrite it from scratch. Modern source code control systems and good network access control will resolve any security concern. Encrypted hard drives and startup passwords secure computers, even after theft.

  • Big or Sensitive Data: The argument is that the amount of data processed by developers is huge, or the data itself is so sensitive, that they need to develop close to the data.

    Bogus because networks are plenty fast, developer databases do not need to be that huge and storage is cheap, so making local (or nearby) copies of data is simple. Bogus because sensitive data can be changed via scripts, or fake data generated.

  • Cost Savings: The argument is that companies can share development resources such as licenses and installations and can therefore purchase cheaper computers for their developers.

    Bogus because the productivity losses in remote development by far outweigh the cost of a few measly licenses or computers. If a company cannot afford its development tools, it should not be using them. And developer tools these days are cheap or free, as are powerful computers, and are licensed for local development.

  • It’s Policy: The argument is that the company has a policy of remote development.

    Bogus because the policy argument is always bogus. Policies are made-up rules created by business people who generally know nothing about reality, or were written so long ago that no one has cared to change them. Staff are forced to follow these bogus policies or be fired. It has nothing to do with what is right, productivity, cost or any other reality.

In each case, the argument put forward for remote development is bogus. Development on remote computers is slow, unproductive, subject to outages, and guaranteed to frustrate good developers and encourage them to leave.

The Local Development Model

The local development model’s mantra is simple:

  1. Develop Locally
  2. Stage Nearby
  3. Production Anywhere

Developers code and compile on their own computers, using locally installed tools. All projects are staged (and continuously integrated) on servers nearby (as in the same office or on the same network). And then code can be deployed anywhere to run in production.

It’s simple, local development makes sense and is the most productive way to develop.

  • It’s responsive, developers press a key and the response is instant.
  • Developers can work anywhere, any time. Give a developer a laptop and they can work from home or at night.
  • Developers can use their favorite tools and productivity enhancers to speed up development.
  • Developers are only constrained by their own abilities, and not by the resources available on shared development machines, network flakiness or the slowness of remote operation.
  • All development tools can and do run locally, and are designed and licensed to do so. Local development is the expected model.

Staging nearby also makes sense.

  • For large and complex code bases you can run continuous integration tools to build and report on errors far quicker than attempting production builds.
  • You can use virtualization to spin up clean staging virtual machines on cheap retail hardware.
  • If a developer does break something, it’s close by and easy to recover. And no other developers are affected.

Only production need be remote, and that can and usually is anywhere.

Secure Local Development for Corporates

But corporates need their controls and security blankets. It turns out that the local development model can easily be made as secure and controlled as needed without punishing developers and throttling productivity.

Let’s take each core remote development argument one at a time.

Security and Access Control

Corporates want to be sure that the code and data is secure and only authorized personnel can access it. Easy:

  • All modern computers and Operating Systems support encrypted hard drives. A password is needed to unlock the hard disk and yet another is needed to log in. A stolen encrypted hard drive is useless, and many systems can be remotely wiped.
  • All modern operating systems are built to keep folks out, and only to enable access by authorized users. Strong passwords are good enough, and even biometric access is available.
  • If the data and access are secured, then communication needs to be as well. All systems come with built-in encrypted Virtual Private Networking to secure communications.
  • Access to code can be controlled using an in-house hosted Source Code Control system. Developers can only see the projects that the company wants them to see, and the code is saved, logged, tracked and maintained in-house.

In short, no problem here.

Big or Sensitive Data

Corporates argue that their data is too large, or is so sensitive that it needs to be tightly secured so development has to happen remotely. We’ve already covered securing code, the same applies to data, so what about data size? Well, assuming the developer needs access to the whole data set, easy:

  • Modern hard drives are cheap and huge. And to be clear, there are exceptionally few corporate databases that cannot fit easily on a modern laptop drive. Big data is real, but rare. And even if the data is bigger than modern laptop drives, desktops with lots of drive bays or even SANs are ridiculously cheap.
  • If data is too sensitive, run a script on a copy of the database to remove or change sensitive information. For example, replace all credit card numbers with fake ones (see the sketch after this list). Or use scripts to generate accurate yet fake data for developers. There are lots of tools out there to help do this.
  • If the data is stored in a commercial database product, and the company is so cheap it does not buy licenses for developers, then maybe they cannot run it locally. Fine, but modern networks are fast, and running one additional license as a development database server nearby, close to staging, accessible via VPN, is cheap.
  • And if the company really wants to be tight, it can create and clone secure virtual machines that developers can spin up and run whenever they need data access.
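
As a sketch of the scrubbing idea above (the database, table and column names are hypothetical; the replacement value is just a well-known test card number), a single pass over a PostgreSQL copy might be:

$ psql dev_copy -c "UPDATE customers SET card_number = '4111111111111111'"

Run it against the development copy, never against production.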

It is rare that a developer needs access to the entirety of a database, and even rarer that the data does not fit on a local disk. Big data is not a problem. And there are solutions to sensitive data, so that too is not a problem.

Cost Savings

But what about the cost of all these powerful laptops, tool licenses and staging servers? I could point out that maintaining development servers, licensing and the infrastructure to support remote development is just as high cost (oh, I just did). But:

  • Computers are cheap, ridiculously so. And for the price you get lots of fast cores, oodles of RAM and massive hard disks. People, and people’s time, are expensive. What would you rather spend money on?
  • Modern development tools assume local development, and so license terms and conditions expect this. The cost of giving each developer a licensed copy versus running the same licenses centrally is exactly the same.
  • And then there is the free and open source option. Instead of using expensive proprietary tools, try using open source languages that all the cool companies are using. The cost of getting support on Open Source is negligible and the products are being used by millions.

Choosing between paying expensive developers to limp along on cheaper, shared computers and having fewer, more productive developers on good local machines is easy when you look at the cost. Local development is cheaper.

Policies

And then there is the legacy of policy. A business that is too rigid to change in the face of changing technologies, costs and realities has no right to survive. A business that is constrained by policy cannot evolve.

If a company wants to hang on to their developers, and get development done quicker and better, local development is the way to go. And if the only thing that’s holding the company back is a few silly written rules, er policies, burn them.

Develop Locally, Stage Nearby, Production Anywhere

Back in the 1970’s, 1980’s and early 1990’s, remote development was the norm. Because it had to be. Mainframes and server platforms were expensive and unavailable on desktops. Computers were not secure and networks were unavailable or slow. And the tools were designed for this model.

The downsides were the same as modern remote development: key press delays, slowdowns as other users took too many resources, layers of security and technology needed to enable remote access and all sorts of flakiness and problems. The upside, well, it was the only way to develop.

But since then we’ve moved on. A computer on every desk (and now lap and pocket) is the norm. Servers and server platforms can run on laptops as well as on big iron in production. Computers are secure and fast, encrypted networking is ubiquitous.

There is no downside to local development. The last 20 years have been spent making local development a reality. The upside is huge. Faster, anytime, anywhere, productive developers result.

There is no downside to nearby (or even local) staging. The upside is also huge. Better, faster builds, quicker identification of issues, and better testing all result.

Welcome to the 21st century. Develop Locally, Stage Nearby, Production Anywhere. Remote development’s time has passed.


From Tool Maker to Tool User

Abstract: Programming is still seen as a single-language, single-platform tool-maker role. That has not been true for a long time. Over the years, the platforms, tools and libraries available have multiplied. We’re tool-users now, and a programmer can pick up a new platform with ease and use the tools available to create complex, reliable products a lot quicker. That which was impossible then is commonplace now.

The role of the programmer has changed over the years, from being a certified single-language, single-platform tool-maker into a tool-user who has mastered application styles, technologies, use cases and platforms. And when you think about it, the tremendous change in the tools we can use and master has steered us here. But we, the industry and the job market still view programmers as singular tool-makers (just look at most programming job ads).

Heck, I still see myself as a tool-maker. But that has not been true for a long time. I realize now that I am a tool-user that creates useful products. To use a carpentry analogy, I no longer make the plane, I use a plane and other tools to make tables.

As an exercise, I decided to look at what I do and the tools I use in 2014 against 10 years ago, and the 10 years before that, and, since I am old enough, the 10 years before that too. I wanted to find out when we stopped being tool-makers and became tool-users. I wanted to understand how we can do so much today that we could not do then.

Before you read on, think about your own experiences, the tools you made or used and what you use today. When did you change?

1984

I was close to finishing high school and had a Sinclair ZX81. On it, I could create simple BASIC programs, mainly utilities and games. There was no IDE, no libraries, no database and the command-line environment was rudimentary. If I needed anything, I had to create it first.

1 task, 1 platform, 1 language, 0 libraries, 0 IDEs, 0 databases.

100% tool maker, 0% tool user.

1994

I was working for a Systems Integration company. As a firm, we were engaged to write those large, complex core applications for large complex clients such as utilities and government departments. I, however, worked on one part of one product for weeks at a time.

Our desktops were Intel PCs running newfangled Windows, enabling us to run multiple emulated VT-100 terminal sessions to the hosts we were developing on. The platform was UNIX, the code written in C (and some C++, lex and yacc), glued by shell scripts, the editor was vi, and the database was this new one called Oracle.

Being able to see multiple VT-100 terminal sessions on one screen was leading edge, a huge improvement over the teletypes from before and a huge productivity boost. And using a database meant that data storage and retrieval were written for us. But the tools were rudimentary. We needed to write everything, from the protocols between systems to the libraries that processed the data.

1 task, 1 platform, 2 languages, 1 library, 1 IDE, 1 database.

90% tool maker, 10% tool user.

2004

I was working at a Finance firm, designing and developing the platform for their business. My time was allocated to a project at a time, with interruptions for support.

It was a Windows shop. Windows on the desktop, Windows on the server. But code was being written in several languages, C# and HTML for the web applications, C++ for the core mathematics, and Perl to glue it all together. My IDE was Visual Studio for the C# and C++, UltraEdit for Perl. We were using the standard C# and C++ libraries from Microsoft, and the magnificent variety of tools available on CPAN for Perl. And the database was SQL Server, back when it still acted like the Sybase it was.

On the surface, the changes between 1994 and 2004 seem small, but they were huge. Fundamentally, we were still using a C-like language, still writing scripts and still using a database. But Web Services libraries built into C# meant that writing server interactions was so much faster and easier. XML serialization made integration easier. Perl could do a whole bunch more, and a lot quicker, than shell scripts given the CPAN libraries. And the IDE was a dream compared to vi, with integrated syntax coloring, multiple panes, integrated debugging and a built-in database manager.

But the biggest change was the availability and capability of the libraries at hand. Need to read a CSV? There was a library for that. Need to talk to a server? The protocol was written and included in the platform. As programmers, we had moved to being more tool-users than makers because the tools had mostly been written. And I do not think we, or the industry, had realized it yet.

1 task, 1 platform, 3 languages, 3 libraries, 2 IDEs, 1 database.

20% tool maker, 80% tool user.

2014

I work at another Finance firm during the day and run my own business in my spare time. I am developing yet another internal platform as well as web applications and native mobile applications for clients. The sheer variety of work is brilliant.

Our client platforms include Windows, OS X, Linux, iOS and Android. Our servers are mostly Linux with a few Windows Servers as needed. Code is being written in Ruby, C++, JavaScript, Python, R and of course Objective-C. The IDE’s in use include Xcode, Visual Studio, vim, Sublime Text and the lovely TextMate 2. And when it comes to libraries, we use a lot, including Rails, Sinatra, JSON, NumPy, Pandas, STL, Apple’s Foundation Kit and more. And our databases run on SQL Server, PostgreSQL, MongoDB and Redis.

Again, the fundamentals look the same since 2004. Still working the web services angle, C-like languages and scripting tools. But so much more productive. There is a library and a platform and a tool for everything now, no need to write your own. The time taken from idea to delivery has been halved and halved again. It’s all about choosing the right tools and using them effectively.

3 tasks, 5 platforms, 6+ languages, 10+ libraries, 4 IDEs, 4 databases.

0% tool maker, 100% tool user.

From Maker to User

When I started working we had to be proficient in one platform, one language and one database. It took weeks and months to go from requirements to design to delivery because we had to write everything ourselves and our tools were simple. Support and maintenance were hard. We were tool-makers.

In contrast, these days we regularly switch between languages, platforms, and tools to create web, native and hybrid products. The time taken from idea to delivery is days or weeks because it often involves finding the right library, learning how to use it and coding the use case we need. The platforms that we deploy contain so much functionality that most of the work is done for us. And we can do so much more in the same amount of time. We have become tool-users.

Back then, the tool-maker had to be able to program at a low level and know all the nuances of the platform. Now the tool-user needs to know what tools and libraries are out there, to learn them and to use them, oftentimes platform-agnostically. Aside: But we still hire as if we are looking for tool-makers, singular platform/language programmers.

We still need the tool-makers, the ones who create the platforms, languages, libraries and IDE’s. But the majority of us have lost that skill if we ever had it. The point is that using these tools, the products we tool-users create are better, faster, more reliable, easier to extend than ever before and we produce them so much quicker.

And because we use these tools, we can do a lot more every day and create a lot more products. Just look at the scope of work and the numbers above. Back then we worked on a single component of a single product for weeks at a time. Now, we work on a portfolio of products simultaneously.

Back then, as tool-makers the work we do now was not possible. Now as tool-users, it is what we do all the time.


Easy SSH to Local Linux VM

TL;DR: Run avahi on the VM, ssh to the VM name dot local.

Many of the Simple C++ Projects I work on, whether I use Xcode or not, land up needing to be recompiled on a Linux Virtual Machine prior to being deployed to the production Linux server. I do this because I want to be sure that I know what needs to be set up in the Linux environment, that the compile succeeds without errors in a production-like environment, and that the code works properly before deploying.

I spin up these virtual machines all the time. But connecting to them to copy the code over and perform these compiles is a hassle because their IP addresses change all the time. Aside: Setting a static IP address does not work because I often clone these virtual machines to try different settings, which requires a new IP address to be set up and remembered manually. Also, static IP addresses sometimes conflict on different networks.

But there is a solution. avahi.

The Problem in Detail

I spin up Linux VMs in my local VMware Fusion using the exact same Linux distribution as production. Since these are never to be exposed to the wide world out there, I set the network to “Share with my Mac” which makes it a local VM. I could use “Bridged Networking” but this problem then recurs when I work from home or a coffee shop.

I yum install all the packages as per production, even set up a deploy user. I then need to ssh into that deploy user on that VM as if I were ssh-ing into production to build and test the code.

The problem is that each time the Linux VM boots, or I use it outside the office, it gets a new DHCP IP address. Which means that I need to find out what the IP address is every time before I can SSH in. Too many steps:

  • Log in
  • Run an ifconfig
  • Find the IP address (172.16.112.141 this time)
  • Press ⌃⌘ to release the mouse
  • Open a terminal locally
  • ssh deploy@172.16.112.141 this time!

What a Pain! Next time I use this VM, I’ll have to perform the same dance.

The Solution in Detail

Run avahi on the VM.

When you build the Linux VM (I use CentOS 6), first set the host name to a unique name. Edit /etc/sysconfig/network as root and set the HOSTNAME attribute:

NETWORKING=yes
HOSTNAME=witch.noverse.local

You may need to restart networking (or reboot) on some distributions after doing this.
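
On CentOS 6 you can also apply the new name immediately, as root, without a full reboot:

$ hostname witch.noverse.local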

Then execute the following commands as root to install avahi:

$ yum -y install avahi
...
$ service avahi-daemon start
...
$ chkconfig avahi-daemon on

This installs the avahi daemon, starts it and sets it to start on every reboot.

What the avahi daemon does is publish the VM’s basename (the first part of the hostname before the dot) on the ZeroConf (or, to use Apple’s word, Bonjour) network.

Which means you can always see it as basename.local. So, instead of ssh-ing to a new IP address every time, just ssh to the basename and add a .local. For example, this works for the above Linux VM:

ssh deploy@witch.local

Then, no matter where you are or what IP address the VM gets, you can always access it by the same name. You can even add this to your .ssh/config file as a shortcut, which never changes!
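
For example, a minimal ~/.ssh/config entry (matching the deploy user above) shortens the whole thing to just ssh witch:

Host witch
    HostName witch.local
    User deploy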

I have been totally surprised how much fiddling and time this simple trick has saved me.


I Choose Not to Be Anonymous

A simple premise.

  1. I always post from my own domains or accounts, all of which are traceable back to me.
  2. I sign and add my true byline, which is me, to all my work.
  3. I always use my name, hiltmon, when commenting or responding on other sites or services.
  4. Anyone can always get in touch with me via this web site, email, Twitter, App.Net, Facebook, Google+, LinkedIn and other services, I am not hiding.
  5. There is only one hiltmon, and it is I. Google me.

I choose not to be anonymous. Here’s why.

There has been an ongoing debate on the internet for years over whether anonymous posting or commenting is a good or a bad thing. Debating the goodness or badness of it is a distraction; its existence and people’s choices are the real issues.

I think it’s a choice not to be anonymous, but not all of us have the same choices. Anonymity is absolutely necessary if the poster is being discriminated against, living under some kind of martial law, a victim of a crime, a political refugee or someone who, by posting, will be incarcerated or killed. These folks should have access to anonymous posting so that their stories and the truth can get out. So that we, the rest of the internet, can learn about it, and maybe do something about it. We should not stand by on principle while they may suffer or die for nothing.

But for all other cases, I believe that anonymous posting is not a good idea, a bad choice. Consider this, you read a blog post by an anonymous author. How do you know if they are credible? How do you know if the post is truthful or spin or outright lies? How can you be sure that the post is not some corporate marketing placement, an agenda or a scam? If it’s anonymous, you have no idea. If the post has a byline, you can check the source and determine credibility for yourself.

Anonymous commenting on the other hand is rife with abuse, and that abuse bothers me too. The trolling, the hate, the stalking, the flame wars all work because the commenter assumes they cannot be traced, and so they can “get away” with things. It’s childish and immature. It defaces sites, devalues other people’s work, and shows complete disrespect. There are places for this sort of behavior, Hacker News and Reddit come to mind, but nowhere else. Aside: I am very pleased that the commenters on my site are rarely anonymous and do not abuse the service.

It takes courage to put your name on things the world can see, copy and save for later. It takes integrity to stand behind your own comments and writing. It takes self-respect to put yourself out there and respect others.

I cannot (and have no right to) tell folks what to do on the internet, but maybe I can set an example.

I choose not to be anonymous.


Magical Migrations

As I am writing this, I am using magical migrations to perform a full data transplant across three servers, each with 100+GB of data, with a single command. Yep, I’m chilling and writing a blog post while replacing the entire data scaffolding of my firm as if it were just another cold Saturday. Magical indeed.

TL;DR: Rails migrations and Capistrano deploys create a stable, reliable, testable, reversible and no-fault way to manage and execute production database changes, no matter the platform size or complexity.

When I started at the current firm, I knew I would be designing and developing a wholly new, proprietary platform. At the time, all I knew was that it would be big and complex and that I did not yet have a handle on what it would be.

But I did know these to be true (for that matter, knowing that you do not know something is knowledge in itself):

  • I did not know the complete requirements at the start, and most likely would not know the complete requirements in the middle.
  • I would not get the database design and architecture right the first time, and probably not the second time either.
  • Things will change. The business, requirements and architecture will change and continue to do so. Which means I needed to be flexible in choosing what to do and how to do things.
  • I will have more than one database server, many in fact. Some as backups or read-slaves, some as replicas, some as islands, and some for testing and modeling.
  • I need to be able to track, manage and automate the myriad of changes to all of these servers using tools because there is no way I could do it in my head.
  • I would probably be the only person doing this for the first year or so.

In the past, the way I did this was to create separate database projects for each database and then create SQL files to save the queries and data definition commands to change these databases. Then, in preparation for deploy, we’d back up the production databases to a staging area, manually run the SQL to migrate the database and then run a test suite to see if we got it all. If it all worked, we’d spend a weekend doing this on the production servers. More often than not, something had been forgotten, not placed in the database project, or not updated as things changed, and we’d leave the production deploy alone until the staging model was rebuilt and tried again.

And that was better than the older ad-hoc method of just having a dedicated Database Administrator patch the databases manually on deployment days (which is how it worked in the dark past and still does in a lot of firms).

I needed a way to automate all database changes, a way that was integrated in my development workflow and a way to deploy these changes with ease.

The solution: Rails migrations and Capistrano deploys.

Aside: Fortunately I am using Rails for a few of the web servers on the platform which made choosing Rails migrations easy. But I am also running a bunch of Sinatra servers for web services, Python programs for analytics, C++ programs for high-speed work and looking at Node.js and GoLang for future projects that all access our databases. And who knows what else will access them in the future.

For the main database, I created a single master Ruby on Rails project. I then share that model between all my Rails projects, see my Rails Tricks – Sharing the Model. All other projects just assume the current schema. For other databases, more Rails projects, several of which have no web interface at all, or are in support of a non-Ruby project.

Creating and Managing Database Schema Migrations

But let’s focus on the main database.

In my environment, this runs on a pair of servers in a remote data center for production and on an additional database server used for analytics. Development is done locally, as is staging. All three of these main servers are accessible from the same code base, so they all need to have the exact same schema. And all have over 100GB of business-critical data in them, so I cannot screw up deploys and changes.

All database changes are in Rails migrations.

All of them. No exceptions.

Create a new table, it’s a migration. Add a column, another migration. New index, another migration. Add seed data, a migration.

As a developer then, it’s easy to use and test. Create and edit a migration (or a Rails model for new tables) and rake db:migrate to apply the changes. Not perfect? rake db:rollback, correct the model or migration and rake db:migrate again.
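
The day-to-day loop is just:

$ rake db:migrate     # apply pending migrations
$ rake db:rollback    # undo the most recent migration if it is not right
$ rake db:migrate     # apply again after fixing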

I never, ever use SQL directly on the database to make schema changes. It violates the protocol. All changes are in migrations, all of them.

A few Rails migration tips:

  • Create all tables as Rails models, which generates the migrations. It adds a bit of code to the project, but makes it easy to test the database or create web views of the data later. For example:
$ rails generate model Table description:string price:decimal{12-2}
  • Do not be afraid to use raw SQL in migrations. That which Rails migrations cannot do, can be done with SQL in migrations. I use it all the time to transform or backup data. For example:
...
# Fixup codes
execute "UPDATE securities SET coupon_type = 'FIX' WHERE coupon_type = 'LKD'"
execute "UPDATE securities SET coupon_type = 'FIX' WHERE coupon_type = 'ADJ'"
execute "UPDATE securities SET coupon_type = 'FIX' WHERE coupon_type = 'SPC'"
...
  • Make all up migrations reversible. This is easy as Rails takes care of most of these as long as the migrations are non-destructive. For destructive migrations, such as when you are moving data to new tables or removing columns, I create temporary tables or dump files to save the data being deleted, and reverse these in the down part. Only when I am very sure that these changes are fixed and deployed to production do I create migrations to drop these temporary tables – these being the only non-reversible migrations I have. For example, in an up migration for a changing table, I first back it up:
def up
  execute %Q{
     CREATE TABLE ex_table (
         id     INTEGER     NOT NULL,
         column_1 character varying(255),
         ...
     CONSTRAINT ex_table_pkey UNIQUE(id)
 );}

    execute %Q{
     INSERT INTO ex_table (id, column_1, ...)
     SELECT id, column_1, ...
     FROM table;
 }

  remove_column :table, :column_1
end

In some cases, I even dump data to files to enable reverses or just to have backup copies. For example, in a down (reverse) migration where I dropped a table in the up, I may have:

def up
  # Save the data for Justin Case :)
  execute "COPY (SELECT cusip, price_date, price, yield) FROM historical_prices) TO '/tmp/hp.csv' WITH CSV;"
  drop_table :historical_prices
end
  
def down
    create_table :historical_prices, { id: false } do |t|
      t.string :cusip, limit: 9
      t.date :price_date
      t.decimal :price, precision: 12, scale: 4
      t.decimal :yield, precision: 12, scale: 6
    end

    # Get it back
    execute "COPY historical_prices (cusip, price_date, price, yield) FROM '/tmp/hp.csv' WITH CSV;"

    add_index :historical_prices, :cusip
    add_index :historical_prices, :price_date
end
  • One big negative of Rails migrations when creating tables is that they create an id field automatically. This is useful if running a Rails web app or using ActiveModel. But for tables that have no need of these, or are being accessed by applications that do not need id columns, here’s how to get rid of it: Just add { id: false } to the create_table line as in:
class CreateNoIDTable < ActiveRecord::Migration
  def change
      create_table :no_id_table, { id: false } do |t|
          t.string :my_key, limit: 9
          t.string :my_text

          t.timestamps
      end
  end
end

You can also get rid of the Rails created_at and updated_at columns by commenting out the t.timestamps code.

  • Seed data in migrations. Sometimes you need to put data into tables when creating them, for example when creating code-to-name reference tables. Put the data into the migration and have the migration load it. For example, using Rails model creates:
class CreatePurposeClasses < ActiveRecord::Migration
  def up
      create_table :purpose_classes do |t|
          t.string :class_code, limit: 4
          t.string :class_name, limit: 32

          t.timestamps
      end
      
      add_index :purpose_classes, :class_code, unique: true

      # Populate
      PurposeClass.create!(class_code: 'AUTH', class_name: 'Authority')
      PurposeClass.create!(class_code: 'BAN', class_name: 'Bond Anticipation Note')
      PurposeClass.create!(class_code: 'BLDG', class_name: 'Building ')
      ...

Deploying Database Schema Migrations

I have been using Capistrano for years to deploy Rails applications. And it works well. I use it now to deploy my main database Rails project that contains all the necessary migrations to all servers.

It takes just one command to send the code over and one more to make all the changes, and I can be sure all my staging and production databases are perfect.

To deploy:

$ cap production deploy

To migrate:

$ cap production deploy:migrate

I prefer to do this in two steps in case the initial deploy fails; Capistrano then rolls it back safely without affecting the database. I do worry about Mr Murphy.

In order to choose which server and database to migrate, I use Capistrano’s multistage extension. Each database gets a stage. For example, I have the following at the top of my main project’s config/deploy.rb file:

# Enable multi-stage support
set :stages, %w(staging analytics production)
set :default_stage, "production"
require 'capistrano/ext/multistage'

I then have separate files in the config/deploy/ folder for each server (stage) that sets the server names and roles. For example the config/deploy/analytics.rb file sets the :db role to the analytics database server, and the config/deploy/production.rb file sets :db to the production server.
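
As a sketch (the host name here is hypothetical), a Capistrano 2 stage file such as config/deploy/analytics.rb mostly just points the roles at the right server:

# config/deploy/analytics.rb
role :db, "analytics.example.com", :primary => true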

I can then easily run:

$ cap analytics deploy:migrate
$ cap production deploy:migrate

I really rely on Capistrano’s ability to roll back code deploy errors, and on Rails migrations’ ability to roll back database migration errors, to prevent massive failure situations.

The Benefits

I get a lot of benefits from using Rails migrations and Capistrano deploys (magical migrations):

  • Database management is part of my development process, not an add-on, separate process to be scheduled and managed.
  • I do not need to ‘remember’ the state of databases.
  • I do not need to document changes, I have a log of them in the migration files.
  • I do not need to insert data into tables after migrations as the seed data is included.
  • One command to deploy and one to migrate. I get my weekends back.
  • If anything goes wrong, it can be undone and rolled back as if nothing has happened.
  • I just know that each database has the latest and correct schema for all my systems and can access them with confidence.

So here we are: I’m running one of these right now that is significantly changing the core architecture of my firm’s platform, and instead of manually doing the work, or staring helplessly at the process, or even having to worry if it fails, I’m writing this post.
