The Hiltmon

On walkabout in life and technology

Magical Migrations

As I am writing this, I am using magical migrations to perform a full data transplant across three servers, each with 100+GB of data, with a single command. Yep, I’m chilling and writing a blog post while replacing the entire data scaffolding of my firm as if it were just another cold Saturday. Magical indeed.

TL;DR: Rails migrations and Capistrano deploys create a stable, reliable, testable, reversible and no-fault way to manage and execute production database changes, no matter the platform size or complexity.

When I started at the current firm, I knew I would be designing and developing a wholly new, proprietary platform. At the time, all I knew was that it would be big and complex and that I did not yet have a handle on what it would be.

But I did know these to be true (for that matter, knowing that you do not know something is knowledge in itself):

  • I did not know the complete requirements at the start, and most likely would not know the complete requirements in the middle.
  • I would not get the database design and architecture right the first time, and probably not the second time either.
  • Things would change. The business, requirements and architecture would change and continue to do so, which meant I needed to be flexible in choosing what to do and how to do things.
  • I would have more than one database server, many in fact. Some as backups or read-slaves, some as replicas, some as islands, and some for testing and modeling.
  • I needed to be able to track, manage and automate the myriad of changes to all of these servers using tools, because there was no way I could do it in my head.
  • I would probably be the only person doing this for the first year or so.

In the past, the way I did this was to create a separate database project for each database, then write SQL files to save the queries and data definition commands that changed those databases. Then, in preparation for a deploy, we’d back up the production databases to a staging area, manually run the SQL to migrate the staging copies, and then run a test suite to see if we got it all. If it all worked, we’d spend a weekend doing the same on the production servers. More often than not, something had been forgotten, not placed in the database project, or not updated as things changed, and we’d leave the production deploy alone until the staging model was rebuilt and tried again.

And that was better than the older ad-hoc method of just having a dedicated Database Administrator patch the databases manually on deployment days (which is how it worked in the dark past and still does in a lot of firms).

I needed a way to automate all database changes, a way that was integrated in my development workflow and a way to deploy these changes with ease.

The solution: Rails migrations and Capistrano deploys.

Aside: Fortunately I am using Rails for a few of the web servers on the platform, which made choosing Rails migrations easy. But I am also running a bunch of Sinatra servers for web services, Python programs for analytics, C++ programs for high-speed work, and looking at Node.js and GoLang for future projects that all access our databases. And who knows what else will access them in the future.

For the main database, I created a single master Ruby on Rails project. I then share that model between all my Rails projects, see my Rails Tricks – Sharing the Model. All other projects just assume the current schema. For other databases, more Rails projects, several of which have no web interface at all, or are in support of a non-Ruby project.

Creating and Managing Database Schema Migrations

But let’s focus on the main database.

In my environment, this runs on a pair of servers in a remote data center for production and on an additional database server used for analytics. Development is done locally, as is staging. All three of these main servers are accessible from the same code base, so they all need to have the exact same schema. And all have over 100GB of data in them which is business critical and so I cannot screw up deploys and changes.

All database changes are in Rails migrations.

All of them. No exceptions.

Create a new table, it’s a migration. Add a column, another migration. New index, another migration. Add seed data, a migration.

As a developer then, it’s easy to use and test. Create and edit a migration (or a Rails model for new tables) and rake db:migrate to apply the changes. Not perfect? rake db:rollback, correct the model or migration, and rake db:migrate again.
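
For the flavor of it, here is a minimal sketch of such a migration (the table and column names here are hypothetical). A change migration like this is automatically reversible, so rake db:rollback can undo it:

class AddCouponTypeToSecurities < ActiveRecord::Migration
  def change
    # Add a coupon type code and index it for lookups
    add_column :securities, :coupon_type, :string, limit: 3
    add_index :securities, :coupon_type
  end
end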

I never, ever use SQL directly on the database to make schema changes. It violates the protocol. All changes are in migrations, all of them.

A few Rails migration tips:

  • Create all tables as Rails models, which generate migrations. It adds a bit of code to the project, but makes it easy to test the database or create web views of the data later. For example:
$ rails generate model Table description:string price:decimal{12-2}
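
For reference, a generate call like that writes a migration along these lines (a sketch; Rails puts it in a timestamped file under db/migrate/):

class CreateTables < ActiveRecord::Migration
  def change
    create_table :tables do |t|
      t.string :description
      t.decimal :price, precision: 12, scale: 2

      t.timestamps
    end
  end
end
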
  • Do not be afraid to use raw SQL in migrations. That which Rails migrations cannot do, can be done with SQL in migrations. I use it all the time to transform or backup data. For example:
...
# Fixup codes
execute "UPDATE securities SET coupon_type = 'FIX' WHERE coupon_type = 'LKD'"
execute "UPDATE securities SET coupon_type = 'FIX' WHERE coupon_type = 'ADJ'"
execute "UPDATE securities SET coupon_type = 'FIX' WHERE coupon_type = 'SPC'"
...
  • Make all up migrations reversible. This is easy, as Rails takes care of most of them as long as the migrations are non-destructive. For destructive migrations, such as when you are moving data to new tables or removing columns, I create temporary tables or dump files to save the data being deleted, and reverse these in the down part. Only when I am very sure that these changes are fixed and deployed to production do I create migrations to drop these temporary tables – these being the only non-reversible migrations I have. For example, in an up migration for a changing table, I first back it up:
def up
  execute %Q{
    CREATE TABLE ex_table (
      id INTEGER NOT NULL,
      column_1 character varying(255),
      ...
      CONSTRAINT ex_table_pkey UNIQUE(id)
    );
  }

  execute %Q{
    INSERT INTO ex_table (id, column_1, ...)
    SELECT id, column_1, ...
    FROM table;
  }

  remove_column :table, :column_1
end
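
The matching down migration is then just the reverse, restoring the column from the backup table (a sketch using the same hypothetical table and column names):

def down
  add_column :table, :column_1, :string

  execute %Q{
    UPDATE table
    SET column_1 = ex_table.column_1
    FROM ex_table
    WHERE ex_table.id = table.id;
  }

  drop_table :ex_table
end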

In some cases, I even dump data to files to enable reverses or just to have backup copies. For example, where I drop a table in the up migration, I dump its data to a file so that the down (reverse) migration can restore it:

def up
  # Save the data for Justin Case :)
  execute "COPY (SELECT cusip, price_date, price, yield FROM historical_prices) TO '/tmp/hp.csv' WITH CSV;"
  drop_table :historical_prices
end

def down
  create_table :historical_prices, { id: false } do |t|
    t.string :cusip, limit: 9
    t.date :price_date
    t.decimal :price, precision: 12, scale: 4
    t.decimal :yield, precision: 12, scale: 6
  end

  # Get it back
  execute "COPY historical_prices (cusip, price_date, price, yield) FROM '/tmp/hp.csv' WITH CSV;"

  add_index :historical_prices, :cusip
  add_index :historical_prices, :price_date
end
  • One big negative of Rails migrations when creating tables is that they create an id field automatically. This is useful if running a Rails web app or using ActiveModel. But for tables that have no need of it, or that are accessed by applications that do not need id columns, here’s how to get rid of it: just add { id: false } to the create_table line, as in:
class CreateNoIDTable < ActiveRecord::Migration
  def change
    create_table :no_id_table, { id: false } do |t|
      t.string :my_key, limit: 9
      t.string :my_text

      t.timestamps
    end
  end
end

You can also get rid of the Rails created_at and updated_at columns by commenting out the t.timestamps code.

  • Seed data in migrations. Sometimes you need to put data into tables when creating them, for example when creating code-to-string reference tables. Put the data into the migration and have the migration load it. For example, using Rails model create! calls:
class CreatePurposeClasses < ActiveRecord::Migration
  def up
    create_table :purpose_classes do |t|
      t.string :class_code, limit: 4
      t.string :class_name, limit: 32

      t.timestamps
    end

    add_index :purpose_classes, :class_code, unique: true

    # Populate
    PurposeClass.create!(class_code: 'AUTH', class_name: 'Authority')
    PurposeClass.create!(class_code: 'BAN', class_name: 'Bond Anticipation Note')
    PurposeClass.create!(class_code: 'BLDG', class_name: 'Building')
    ...

Deploying Database Schema Migrations

I have been using Capistrano for years to deploy Rails applications. And it works well. I use it now to deploy my main database Rails project that contains all the necessary migrations to all servers.

It takes just one command to send the code over and one more to make all the changes, and I can be sure all my staging and production databases are perfect.

To deploy:

$ cap production deploy

To migrate:

$ cap production deploy:migrate

I prefer to do this in two steps in case the initial deploy fails, in which case Capistrano rolls it back safely without affecting the database. I do worry about Mr Murphy.

In order to choose which server and database to migrate, I use Capistrano’s multistage extension. Each database gets a stage. For example, I have the following at the top of my main project’s config/deploy.rb file:

# Enable multi-stage support
set :stages, %w(staging analytics production)
set :default_stage, "production"
require 'capistrano/ext/multistage'

I then have separate files in the config/deploy/ folder for each server (stage) that sets the server names and roles. For example the config/deploy/analytics.rb file sets the :db role to the analytics database server, and the config/deploy/production.rb file sets :db to the production server.
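
A stage file needs only a few lines. As a sketch (the server name here is hypothetical), config/deploy/analytics.rb might contain:

# config/deploy/analytics.rb
set :rails_env, "production"
role :app, "analytics.example.com"
role :web, "analytics.example.com"
role :db,  "analytics.example.com", primary: true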

I can then easily run:

$ cap analytics deploy:migrate
$ cap production deploy:migrate

I really rely on Capistrano’s ability to roll back code deploy errors, and on Rails migrations’ ability to roll back database migration errors, to prevent massive failure situations.

The Benefits

I get a lot of benefits from using Rails migrations and Capistrano deploys (Magic Migrations):

  • Database management is part of my development process, not an add-on, separate process to be scheduled and managed.
  • I do not need to ‘remember’ the state of databases.
  • I do not need to document changes; I have a log of them in the migration files.
  • I do not need to insert data into tables after migrations as the seed data is included.
  • One command to deploy and one to migrate. I get my weekends back.
  • If anything goes wrong, it can be undone and rolled back as if nothing has happened.
  • I just know that each database has the latest and correct schema for all my systems and can access them with confidence.

So here we are: I’m running one of these right now, one that is significantly changing the core architecture of my firm’s platform, and instead of manually doing the work, staring helplessly at the process, or worrying whether it will fail, I’m writing this post.


View on an Insider as CEO for Microsoft

Today, Microsoft announced Satya Nadella, a 22-year veteran of the business and a true Insider, as its new CEO. I wish him the best of luck and success. But is choosing an Insider as CEO good for a mature Microsoft or bad?

I think it’s a bad call, based on the documented history of other mature companies and my own selfish (and very positive) wishes for the firm.

And then there’s this, which makes no Outsider sense and means nothing to anyone, from his first letter to employees:

I believe over the next decade computing will become even more ubiquitous and intelligence will become ambient. The coevolution of software and new hardware form factors will intermediate and digitize — many of the things we do and experience in business, life and our world.

Ambient? Coevolution? Intermediate? Huh?

Bill Gates’ “a PC on every desktop” was a far better start.

Why an Insider could be good for Microsoft

Well, it is possible. I think the biggest reason an Insider is good for Microsoft is that the Microsoft customer has been happy with “The Microsoft Way” since Windows 95 and is loath to change. This has been proven by the measurable disaster that is Windows 8 Metro and Microsoft’s response in 8.1, bringing back the old ways. An Insider understands the Microsoft customer’s comfort zone and will work within these constraints, whereas an Outsider would likely challenge this status quo in order to assert their vision. The vast majority of Microsoft’s customers do not want change, nor, typically, do Insiders.

On the other side of the business, Investors and Wall Street treat Microsoft as a blue-chip stock. They expect stable revenues, stable growth, stable product lines and a stable management team. An Insider delivers here too; an Outsider may shake things up too much. These folks too hate change.

With an Insider at the helm, we’ll get more of the Microsoft same. And that, friends, is what Microsoft customers, users and investors want. And that could be good for Microsoft.

In short: Insiders bring less change, and more customer comfort. Beige is safe.

Why an Insider is bad for Microsoft

If you follow the five stages of business growth [1] (Existence, Survival, Success, Take-Off and Resource Maturity), Microsoft has certainly hit maturity. Which means it’s at the top, flat part of the S-curve where growth stagnates and begins to turn down (see Technology Life Cycle). If nothing is done, the business will slowly die, as lots of other companies have done in the past.

The only way to grow a mature business is to research and develop a new strategy or product-line and ride up a new S-curve. Insiders, traditionally, have been more worried about maintaining market share and existing product lines and are averse or blind to new strategies. This is not good for Microsoft. Outsiders bring new ideas, new research and the will to try new things. Usually they encourage and create the magical “innovation” thingamajig that creates new S-curves and grows businesses. Without this drive, Microsoft will slowly shrivel away.

An Outsider also comes in without belief, history or baggage. An Insider commonly believes the internal Cargo Cult view of the business which is regularly different to reality. They carry the baggage of years of politics, inefficiencies and compromises that led to the current stagnation of the business. And they unintentionally wear blinders to the truth, faults and opportunities because these things do not appear in their limited field of vision. They are, after all, only human. Which is bad for Microsoft. An Outsider comes in with no such preconceptions, no history and fresh new ideas, no blinders. An Outsider can see the current business faults as their field of vision is not limited in any way. And they can, and are expected, to fix them.

In short: Outsiders bring change, more growth and that is good for a mature business. Beige is boring, old and belongs in the past, time for a new color palette.

Which will it be? Good or Bad?

Will the new CEO try to squeeze the most out of the current S-curve, or grow new S-curves in spite of his Insider status? Will he have the insight to see beyond Insider blindness, and then the courage and opportunity to research and chase new S-curves? And will the customer base, investor base and organization help or hinder?

We’ll see. Time will tell.

In my humble opinion, however, an Insider was the wrong choice. No matter how amazing Nadella is (and this author assumes he is seriously good), he carries Insider baggage, Insider views and Insider tendencies. And this will be bad for a large, mature business with no new S-curves to grow on, a lot of stable and falling S-curve businesses facing stiff competition, and a whole bunch of legacy customers and baggage to carry forward.

Then again, I am not part of the Microsoft customer majority. I seek innovation and change and cool new technologies, not more of the same. I want Microsoft to change and grow and leverage the amazing talents it has. I want it to compete and shake up the status quo. I just don’t see an Insider making that happen.

And thusly begins the end of Microsoft.



  1. The Five Stages of Business Growth, Churchill and Lewis, Harvard Business Review, May-June 1983.

First They Came For…

A Modern Version

First they came for the record stores
and I didn’t speak out
because I used iTunes online.

Then they came for the bookstores
and I didn’t speak out
because I used Kindle online.

Then they came for the technology stores
and I didn’t speak out
because I used Amazon online.

Then they came for the grocery stores
and I didn’t speak out
because I used FreshDirect online.

Then they came for the restaurants
and I didn’t speak out
because I used SeamlessWeb online.

Then they came for the shoe stores
and I didn’t speak out
because I used Zappos online.

Then they came for the home-wares stores
and I didn’t speak out
because I used Soap online.

Then they came for the stationery stores
and I didn’t speak out
because I used Staples online.

Then they came for the drug stores
and I didn’t speak out
because I used CVS online.

Then they came for the furniture stores
and I didn’t speak out
because I used IKEA online.

Then they came for the newspapers
and I didn’t speak out
because I used a web browser online.

Then they came for the coffee shops
and I didn’t speak out
because I make my own.

And then there was nowhere to go
and nothing left to do.

The Hiltmon 2014

The Original Version

First they came for the Socialists, and I did not speak out
Because I was not a Socialist.

Then they came for the Trade Unionists, and I did not speak out
Because I was not a Trade Unionist.

Then they came for the Jews, and I did not speak out
Because I was not a Jew.

Then they came for me
and there was no one left to speak for me.


More TextMate 2 Basics

Previously, I wrote about the TextMate 2 Basics that I use all the time, and I recommend you read that post first. This post follows up with a mish-mash of more tools, ideas and tricks that I also use surprisingly frequently.

Text Tools

TextMate 2 tries to reduce the number of keystrokes you need to make to generate good code. There are many ways it does this but today I want to talk about auto-pairs and completions.

Auto-pairing

Auto-pairing is when you type in a character that should be paired in code, and the editor ensures the paired character is also inserted. For example, pressing the ( key inserts a ) as well (and where the caret is between the paired characters). This works well for all brace keys ([, ( and {) and quote (' and ") characters.

But TextMate takes it to the next level:

  • Wrap Selected: Select some text and press the opening key in an auto-pair set. Instead of replacing the selection with the key pressed like other editors do, TextMate wraps that selection in the correct pair.

    For example, to correctly bracket an expression such as a / b + c, select the b + c bit using ⌥⌘← and hit ( to get a / (b + c).

  • String Interpolation: In Ruby, you can interpolate a variable’s value in a string using the #{} construct. TextMate is aware of the context and if you press the # key inside a string that can be interpolated, TextMate uses auto-pairing to wrap the selection in the interpolation braces.

    For example, to convert user_name in the following line puts "Name: user_name", select user_name and press # to get puts "Name: #{user_name}".

    Note also that if the string does not support interpolation (a single quoted string), pressing # inserts a # character only. Smart.

Of course the big problem with auto-pairing is that in most other editors, you need to then navigate past the closing pair character to continue working. In TextMate, if you type the closing character manually, it knows, and just moves your caret along without duplicating the close. Or you can use ⌘↩ to go to a new line, leaving the closed pairs behind, or ⌘→ to navigate over all the closes.

Tab and Esc Completions

TextMate has two kinds of completions, “tab” (⇥) completions and “esc” (⎋) completions.

Tab completions were invented in TextMate and have improved in TextMate 2. Tab completions operate by typing in a few letters and pressing the tab (⇥) key. TextMate attempts to match the characters before the cursor to the tab completions available for that language or context, and if a match is found, it puts the completion in. For example, in Ruby, typing def⇥ will insert def function_name\n\nend and highlights function_name for you to overtype.

You can find the currently available tab completions in the cog menu at the bottom for each language. The best way to learn them is to see what is available and then start using them. For example, in Ruby, I always use cla⇥ to create Ruby classes, mod⇥ to create Ruby modules and ea⇥ for quick each loops. I strongly recommend you check out and learn the tab completions for your favorite languages in TextMate. You will save a ton of keystrokes.

Note that TextMate is aware at all times of the language context you are in. This means that different language completions are available in different parts of a code file. For example, in a Rails .erb file, HTML completions are available unless you are inside a <% ... %> construct, in which case Ruby tab completions work.

Esc completion saves you keystrokes within a code file by completing function and variable names that exist in the current file. To get project-level completions, you need to look at ctags, which I intend to cover in a future post.

Start typing a function or variable name that already exists and press the esc (⎋) key to see the first recommended match. Press ⎋ again to find another match or cycle through the choices.

Just remember, in completions, tab (⇥) is for code, esc (⎋) is for names.

The File Browser

The File Browser has changed completely in TextMate 2 and it took me a while to get used to it. This is because the new file browser is more like a Finder window than the old project file manager.

To switch between the editor and the File Browser without using the mouse, hit ⌃⌘⇥ (Control-Command-Tab). You can then use the arrow keys to navigate the tree. Use ⌘↓ to open the selected file (just like Finder). When in the editor, to reveal the current file in the file browser, hit ⌃⌘r (“Reveal”).

If you mouse instead (like I do), single-click the file icon to open it in the editor. Single-clicking the name just selects it (just like Finder). Double-clicking the file name opens the file too. Once you get used to single-clicking the icon, opening files becomes so much easier.

Editor Tabs

The tabs at the top are also smart in TextMate 2. If you have the file browser open and request a file that is in the same tree as the currently open folder, it opens in a new tab. If not, it opens in a new window. TextMate therefore tabs or windows depending on the context of the file, automatically determining project membership. This even works when you use mate <filename> from the command-line.

We all open a lot of tabs as we work, especially in Rails development. And it’s a pain to close each tab individually. If you ⌘-click on a tab close button, all other saved tabs will close.

You can also drag a tab out of the tab bar (or double-click on it) to move it to a new window. If you want it back, use the Window / Merge All Windows menu. Currently this merges all windows into one, here’s hoping they use the smart logic for creating tabs and windows to find a way to merge to project-based windows someday.

Fonts and Themes

Since we all have different tastes and preferences, TextMate comes with a lovely set of built-in themes which are fully customizable. You can change the theme from the View / Theme menu.

To install a new theme, download a .tmTheme file and double-click it. Then look for the new theme on the View / Theme menu. I use my own CombinedCasts.tmTheme theme (See Multiple Themes in TextMate 2 – The Hiltmon) but there are hundreds out there to choose from.

In my case, I have set TextMate to be the default editor for all my script file formats, from .sh to .rb and .py. I also use QuickLook a lot to browse code files before opening them. If TextMate is the default for a file, its QuickLook generator is used and it now renders code files using your selected theme. A nice touch.

One recent change in TextMate 2 has been the way it accesses and uses fonts. I highly recommend using the View / Fonts / Show Fonts menu at least once after changing a theme to select the font and size that you prefer. Some old themes use incorrect names for fonts and the new model guesses incorrectly. If you have a global .tm_properties file and set your font there, make sure the name is correct there too.
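
For example, a global ~/.tm_properties might pin the font explicitly (the font name and size here are just an illustration):

fontName = "Menlo"
fontSize = 13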

Script Writing Tips

If you use TextMate to write Shell, Ruby or Python scripts, use the environment string (or “shebang” #!) to make things easier to run:

  • Start each file with the right environment string. This helps the shell know what runtime to use to execute the file. Type env⇥ at the top of the file and TextMate will insert the correct environment “shebang” into the file (see the sketch after this list). It will also mark the file as executable on first save. So instead of running ruby my_script.rb, you can just type my_script.rb on the command line to run it.
  • Speaking of the command line, hitting ⌃⇧O (the letter O) will open a fresh terminal session in the current folder of the file. But if you prefer the TextMate output window, ⌘r in TextMate will launch the environment specified in the file’s “shebang” and display the output in the output window.
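
As promised, a quick sketch of the result (the file name and contents are hypothetical): a Ruby script saved from TextMate starts with the line env⇥ inserts, and because TextMate marked it executable on save, it runs directly from the shell.

#!/usr/bin/env ruby

puts "Hello from my_script.rb"

$ ./my_script.rb
Hello from my_script.rb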

Quick Keyboard Tips

Three additional keyboard tricks I use a lot:

  • ⌃": Toggles between single-quotes (''), double quotes("") and quoted strings (%Q{}). I use this a lot in Ruby as I normally create strings without interpolation (single-quotes) and then need to change it later. It works on the quotes surrounding the cursor.
  • ⌃_: Toggles the selected name between CamelCase (HomeBrew), underscores (home_brew) and nerdCase (homeBrew), useful when you use the wrong convention in code (especially if you code in multiple languages).
  • ⌃⌥⌘V: Brings up TextMate’s clipboard history. Now you can copy, copy, copy, then switch files and paste, paste, paste.

Some Useful Bundle Tips

Source Code Control

If you use git, mercurial or subversion, install the matching bundle. You then get the following benefits:

  • TextMate shows the current status of each file in the file browser. You can see which files have uncommitted changes instantly.
  • The ⌘y key brings up a menu of Source Code Control commands that you can use instead of leaving the editor and using the command-line. Options include seeing the changes to be committed, amending commits, viewing branches and of course, the ability to commit changes. The git bundle even has a config option to enable you to change your global settings.

The TODO Bundle

I don’t know about you, but I often leave TODO and HACK comments in my code to remind me about things, and then forget about them. The TODO bundle in TextMate 2 searches the current project tree for comments with TODO, FIXME, CHANGED and RADAR and displays them in the output window. Just hit ⌃⇧T. You’ll never forget again.

To add your own markers, go to the cog menu, find the TODO bundle and click Preferences…. Add a new marker and replace the regular expression to catch the rest of the comment. Hit Done, restart TextMate and run TODO again to see your new marker matches.

SQL Bundle

I only found this one recently, but it comes in handy a lot. I write a lot of code that queries PostgreSQL databases, which means I write a lot of SQL. Many of the processes and views I write contain embedded SQL statements.

Before finding this bundle, I used to have to copy the SQL from TextMate into NaviCat, test and run it, then copy it back. And I could never be sure that I got it all until I ran the program.

The SQL bundle supports MySQL and PostgreSQL only, but it works rather well. Start by setting up your connections in Cog Menu / SQL / Preferences…. Then, to test a SQL statement, just select it in TextMate and hit ⌃⇧Q. If TextMate picks the wrong database, open Cog Menu / SQL / Preferences… and highlight the correct database then click Done. ⌃⇧Q will run the SQL against the correct database now.

The output window is also a Database Browser, so you can use it to browse databases, their tables and see what their fields are named while coding.

The SQL bundle is pretty basic, but for quick and dirty views and tests it works great.

Fin

So those are some more TextMate 2 features that I use all the time. Hopefully you found a few more gems in there that could help you out.

If you missed the first post, check out TextMate 2 Basics.

If you have any awesome TextMate 2 features or keys you cannot live without, please share them in the comments.


Reminder: Update Your Tools

At the start of every new year, I spend the time to update all my tools and recompile all my products with them. At the cost of a few hours (or at worst, days) of work, I get rid of most of the maintenance hassles I used to face and gain the benefits of all the new features and performance from updated tools.

I understand that, for all of us, updating our tools is surprisingly hard. Things that used to work now fail, incompatibilities have to be addressed and it takes time, perceived as unproductive time, to do. And then there is the risk that updating production servers will cause downtime for no valid reason. It’s easy to feel that we should wait for the next release, or for some mythical down-time, or until we’re forced to act as the versions of our tools become obsolete. And it’s hard to justify updating tools when there’s real work to be done and no current perceived benefit in changing tools.

On the other hand, updating our tools now means we get the performance and feature benefits of these new tools now. We get the incompatibilities from last year out of the way, meaning that the products we make with these tools can move forward more easily (we’re less likely to get stuck with a version dependency). We get to use newer and better versions of our libraries. Our servers get updated, run better and require less maintenance. We get to learn and practice the updated tools ideas and technologies. And the start of the year is the best time to do this as it’s one of the slowest business times.

For me, that’s actually quite a lot of work, but so worth it. Amongst other things, I have:

  • Upgraded to Ruby 2.1.0 as it’s the latest and fastest.
  • Moved our Python code to 3.3.
  • Moved all our Rails applications to 4.0.2 as that is current.
  • Updated to the latest SASS, Rake, Sinatra, Node.JS and other tools.
  • Updated all my Homebrew installs (brew upgrade). This entailed a non-trivial upgrade of PostgreSQL from 9.1 to 9.2 (I had to export and import all my databases).
  • Updated all my CentOS servers and their tools to the new PostgreSQL, Ruby and Rails.
  • Moved all our Windows C++ and C# applications to Visual Studio 2010 from 2008 (we run one version behind on all Microsoft products – been burned before).
  • Recompiled all our C++ projects using the latest clang and gcc on each platform.
  • Updated all Objective-C applications to the latest Xcode and recompiled with ARC and the latest libraries.

What did I get out of all of this work? On the surface, nothing much. The same old applications doing the same old things. But each and every one of them is now ready to take advantage of new tools, new technologies and new features in 2014. And most seem to run just a little better and a little faster and use a tad less memory than before. And best of all I have peace of mind, I know there are no legacy dependencies that are holding me back.

There is no good time to do this, so do it now. Get it over with for the year. Update your tools now. Or you never will.


Python: It’s a Trap

TL;DR: Python is the best language for quantitative scripting (because of its libraries). But it’s a trap. Almost all programmers and libraries require Python 2 when Python 3 is out and way better. Choosing to use an old language for a new project traps us into supporting old languages, libraries and platforms. And that’s not smart. The intent of this post is to point out this odd stagnation and encourage the migration and adoption of Python 3, or, as has happened in the past, we’ll have to look elsewhere for a solution.

Over the past few weeks, I have been looking at switching to the Python programming language for quantitative analytics development at work. On the surface, Python looks perfect for this – it has the best quantitative libraries (e.g. numpy, scipy and pandas), it is the language taught in finance schools and it is a beautiful, fast and reliable scripting language to work with.

It’s a Trap!

Admiral Ackbar

In my research I felt there was something wrong with the picture I was seeing, something on the periphery, in the corner of my eye. And then I saw it: the stats on Python 2.7 usage and compatibility vs Python 3.3 usage and compatibility make no sense.

As a non-pythonista, my expectation was that after five years, Python 3.x would be the most used and most compatible platform. After 5 years, all libraries should have moved over, all new code should be written in it, all new technologies compatible with it. The reality is that less than 2% [1] of all work is done in Python 3 even after 5 years! Python 2 is still 98% of the game. WTF?

That’s where the trap lies.

An old language stagnates, increases support burden and cost, and slows growth. People sticking with old languages are forced to use old libraries, old tools and run on old platforms. The reason languages update is to make them better, faster and to eliminate bad language design decisions. Picking an old platform for a new project traps you into long term support and maintenance of that old platform. That’s the trap.

All this has happened before. All this will happen again.

Pythia

We’ve been there before. Back in the early 2000s, I was a Perl programmer. Perl was the scripting language at the time, the one with the best libraries (see CPAN), the one with the latest tools and the best compatibility. But as the naughts progressed, Perl started to stagnate as a language. First, they started talking about a non-compatible Perl 6 that never happened, and then the community left for Ruby and Python because they were happening. Perl, its ecosystem and its community stagnated. 10 years later, Perl 5 is still with us. But I believed then that sticking with Perl beyond the mid-2000s would have been a bad idea. And it was for those who stayed.

Back then, the Python 2 community and ecosystem was dynamic and growing, as was the Ruby one. Back then, I chose Ruby over Python by a hair, and it was a great decision for me. Since I started with Ruby, it has gone through three revisions (1.8, 1.9 and now 2.0) and the libraries and community have grown with it. Even when an incompatible version, 1.9, came out, the speed at which the community adopted it was tremendous. Ruby 2.1 came out a few days ago and already way more than 2% of the community is using it.

Other languages too have gone through changes as significant as the Python 3 move over the same period. C#, one of the most popular corporate programming platforms, went from 2.0 to 3.0 to 3.5 to 4.0, all incompatible revisions, yet most new C# code requires 4.0. C++ went to C++11 over this time and the most popular C++ compilers happily compile it.

On the other hand, other languages too have remained weirdly behind, just like Python 2. Java went to 7 but no-one seems to write for it. Oh, they all install Java 7 (in order to get rid of those infernal update dialogs) but still run 6 in production.

It is possible that the reason some languages stay behind and others grow is because of the way the change happened. Ruby, C# and C++ (Clang) all have compilers that not only point out the incompatibilities with old code, but have ways to fix them easily. Maybe the Java compiler and the Python one do not.

Or maybe the reason they stay behind is because they are too well embedded in corporate infrastructure and too much code has been written in them to make it viable to change. That sure explains Java and Python, but does not explain C#.

Or maybe it’s because these languages are taught in schools and the education system is slow to adopt the new version. Or maybe the community around these languages is more conservative, with an “it ain’t broke, don’t fix it” kind of attitude. Or maybe it’s just too popular.

I do not know why people have stuck with Python 2.

I do know that Python 3 has been out for five years and is about to go through its fourth revision. I do know that it improves the way the language works in a myriad of ways, removes some really bad stuff, is stable, mature and runs brilliantly. And yet it has not been adopted.

Since I do not know why everyone is still on Python 2, I cannot offer the right solution. Only some ideas:

  • If the reason people still use Python 2 is the community, then the community should start 2014 by installing Python 3.3 and using that. Put pressure on the library developers to get their products up to speed.
  • If the reason is because your distribution installs an old version, pressure your vendor to put the new version in as default.
  • If the reason is legacy code, spend the time now to fix and upgrade. Or you will get deeper and deeper into the trap.
  • And if the reason is the language provider, then python.org should stop supporting the Python 2 stream. It works for other languages and it will force all users to upgrade.

I still think Python is currently the best language for quantitative analytics and development of ideas, but it is also a trap. The trap is that we really need to use the old version to do what we need done. It’s like requiring users to run Windows XP when Windows 8 is out and mature. This trap is holding everyone back.

If the trap remains, and it seems to be the case, then choosing Python 2 as our development and production platform for new projects is nuts. I can, and probably will, choose Python 3.3 for now, as numpy and pandas should work, but I worry that other libraries I may wish to use will remain incompatible. And I will rewrite in C++ for production to avoid the trap.

Unfortunately, no real alternative exists for quantitative scripting. R is way too slow, and few know MATLAB. Java and C++ are just too heavy and require too much effort.

Or, if someone could make a numpy and pandas for Ruby (or even Go or Javascript), I’d use that in a New York minute.



  1. “Looking at download statistics for the Python Package Index, we can see that Python 3 represents under 2% of package downloads. Worse still, almost no code is written for Python 3.” in About Python 3

Can’t Innovate Anymore, My Ass

Over the past few days I have seen so many references to the Quartz post 2013 was a lost year for tech that it has become a meme. “Innovation is dead”, “2013 proves it”, “RIP Innovation”, it screams.

Rubbish!

2013 was a great year for innovation in the tech space.

in·no·vate /ˈinəˌvāt/ verb
1. make changes in something established, esp. by introducing new methods, ideas, or products.

Remember, innovation is not about creating something completely new (that’s invention), it is about making changes to established products that significantly improve their abilities or experience. And we got that in spades in 2013.

Innovative Hardware

Apple continued its well-known strategy of continuous innovation across its product line. The new iPad Air is the iPad they have been trying to make since they released the original one: thin, light, fast, with mind-bending battery life. The design and concept of the new Mac Pro is insane, a small portable powerhouse, the first true computer built around OpenCL. And the iPhone went 64-bit, so we now have a portable computer in our hands more powerful than all the computers on all the Space Shuttles combined.

Intel leaped forward with its Haswell line of CPUs. This innovative design provided us for the first time in 2013 with the ability to run for more than 10 hours on our laptops without recharging, and without penalizing speed or core count.

Even Microsoft innovated, it took the seriously limited Surface tablet concept of 2012 and made one that may look the same on the outside, but actually works this time, the Surface Pro 2. And their new Xbox One platform wipes the floor with their previous platform as evidenced by first day sales.

Even beleaguered RIM, er Blackberry Ltd, innovated by finally releasing the QNX based Blackberry 10, the next generation Blackberry. And it really is a good one.

Google let its innovation leak out of the labs with Google Glass on the developer program. No innovation, my ass, even SNL had a sketch on it.

And let’s not forget the sensational Pebble Watch or the magnificent, innovative Leap Motion.

Innovative Software

Since I use Macs and iOS all day at work and at home, I am really only exposed to innovations in Apple ecosystem software, so I’ll point some out here. Would love to know what innovations happened in other platform software too.

Apple released not one but two innovative Operating Systems this year, Mavericks and iOS 7. Mavericks may be the best release of OS X yet, with its innovative power saving techniques, interactive notifications, the ability to send maps to your iPhone, and they finally figured out how to handle multiple monitors on Macs. iOS 7 is also a remarkably innovative release, not because of the new look or its innovative ability to make folks queasy, but because it is the first 64-bit mobile operating system, and it uses the antennae in such a way as to maximize connection and minimize battery.

And in apps, just to name the first few that come to mind:

  • Marked 2 came out. It used to be a visualizer for Markdown documents, now it’s a publishing ecosystem.
  • Acorn 4 turned the old Acorn on its head and is the best Photoshop alternative out there.
  • Kaleidoscope 2 was released with innovative image diffing. Who did that before?
  • Napkin changed the way we annotate screenshots with its innovative zoom bubbles.
  • Omnigraffle 6 proved that you can innovate on perfection.
  • Ulysses III’s innovations turned the text editor on its head.

And let’s not forget all those developers who took the opportunity when updating their applications for Mavericks or iOS 7 to add innovative features instead of just re-skinning.

“Can’t innovate anymore, my ass”.

When Apple’s Phil Schiller said that on stage at WWDC, he was taking a poke at the tech press and the memes that scream innovation is dead. The only place innovation is truly dead is where they have published its obituary.


Writers: Thank You

I want to say thank you to a bunch of writers, to people who do what they do because they love doing it, from an anonymous reader who reads and appreciates what they do every day.

They may never see this, but, in no particular order, a huge thank you from me to you all.

Tech and Opinions

Matt Gemmell at Matt Gemmell is owed free drinks at my house forever. He may be well known for his tech, tweets and speech making, but the depth, emotion and soulfulness of his recent writing has touched us more deeply than he knows.

Brett Terpstra on his blog at BrettTerpstra.com shares some brilliant, mad-scientist scripts and automation ideas. I use his Slogger and Markdown products all the time, and am always surprised by the creativity, breadth and depth of his ideas.

John Gruber at Daring Fireball is the original and still the best. He has an opinion, but more importantly, he justifies his opinion with clearly written thought and argument. And he never claims to be right. May I write as well as him one day.

Federico Viticci and friends review products at MacStories and it’s clear that they love what they do. Instead of just reviewing products, they use them, push them hard and render clear opinions on them.

I came late to Shawn Blanc’s blog at Shawn Blanc. And boy is it good. Shawn walks a personal journey with tech, and helps us all connect with it.

Marco Arment writes at Marco.org and pulls no punches. He has an intelligent and opinionated style, but never lets his personality override the facts that he ensures are totally, nerdily correct. Aside: Is nerdily a word? It is now.

Stephen Hackett writes at 512 Pixels and seems to point out all the cool stuff others miss. And for his Josiah, #GOJGO.

The first indie Mac app I ever purchased was Brent Simmons’ NetNewsWire. He writes tech at inessential.com: weblog and I learn something new from each post.

In the same vein, Craig Hockenberry occasionally writes tech at furbo.org and when he drops a post, it hits with a resounding thud. Many a time his posts have saved me when I have gotten stuck on the same issues.

The mysterious Dr Drang writes at And now it’s all this. You never know what to expect next, a script, an analysis or an opinion piece. But they all make you think.

Ben Brooks hosts The Brooks Review and often has an alternative view of things. Which I like. Maybe too many of us follow the herd, Ben does not, and clearly presents his view.

News

The Beard, a.k.a. Jim Dalrymple presents The Loop with all the latest Apple news. It’s where I hear about most things first. And I just love it when Jim drops in with his own opinion pieces. I have a special supply of Heineken should he ever drop by.

Rene Ritchie heads the team at iMore, and I seem to be reading them more and more. I suppose that’s because they are the anti-Verge.

Of all the big companies that write about tech, only one makes this list today, Ars Technica. Their in-depth articles are usually well balanced and they avoid the link-bait fluff pieces that seem to fill the tech press these days. Let’s hope that does not change.

Maciej Ceglowski probably had no idea that the Pinboard: popular bookmarks page would be of any use, but it is. Whenever I am looking for something new that I know I will like, I check this page out. It’s like a good Hacker News because of the kind of people who use Pinboard.

Shopping

I hate shopping, but surprised myself this year by finding two recommendation sites that I actually like. The Sweet Setup is new, and recommends apps, sorting the wheat from the chaff; and The Wirecutter goes through all the tech specs and mad model numbers to find the best devices so we do not have to.

Thank you all for inspiring, entertaining, informing and sharing. You may never hear from us, your audience, from me, your reader, so hear this now: thank you, we love and appreciate what you do.


Note: I wanted to keep the list short, being part of a long list makes it less special. And I wanted to focus on individuals or small businesses (except for one), doing what they love, not what they are being paid to do.


Cannot Find File

“Can you please send me that file again? I cannot find it”

“I’ll have to get back to you, I don’t know where I saved the file I need.”

I hear this every day. Even now, in 2013, more than 20 years after hierarchical file systems became the core and the norm of computing, users still struggle to understand something as simple as knowing the location of their own files in their own computer systems.

I blame the complexity of the past and I have only seen one promising idea that has a way to go but may solve this issue.

The Cause

The hierarchical file system is something that ordinary, non-geek users do actually understand, once they get to it. They do get the concept of folders and the concept of placing files in these folders. The issue is in the complexity of finding the starting folder to work with.

All current Operating Systems have an issue here.

Windows sucks because of drive letters. Instead of just giving users a clean file system like Unix, Microsoft followed the CP/M and MS-DOS model of assigning letters to disk drives, something that users struggle to comprehend. What the heck is A: vs C: vs D: vs J:? Before users can even think of which folder the file may be in, they need to think in terms of which drive letter may contain the folder. Yet some letters are CD Drives that cannot be written to, but which? Some are inscrutably empty, but which? Some only appear at work, but which? It’s too confusing, so users give up and just let the system save where it wills, thereby losing any chance of finding the file later.

Unix sucks because of the /mnt file system. Instead of having inscrutable drive letters, users are faced with inscrutable paths. Worse, these mounts are usually shared between users on the same system. Which means that other users’ mount points exist to confuse the current user. And even worse, ordinary users may have to use actual command-line commands to access files remotely, using tools like scp. Which means that before the user can get to the folder to start looking, they need to remember which computer, server or mount the folder is on. It does not end well.

And OS X sucks these days because of iCloud. Apple may have solved the /mnt mess with its GUI display of volumes mounted on the desktop, then taken two steps back by introducing iCloud. Now the user has to remember which application (even worse than a drive letter or mount path) they used to create the darn document before they can figure out the folder in order to find the file. Assuming they even created the file themselves.

In short, there is too much for the user to know before they can even get to the file system to find files.

The Workarounds

Search tools such as Spotlight on OS X or Windows Desktop search attempt to resolve this mess by indexing files on all available drives. But regular users do not name or label their files properly, or know how to phrase searches in a way that these engines can find what they need. So search engines may be good, but do not work as expected and are ignored.

Some users have worked around this using their email clients. If a file is in their recent emails, they can find it based on who sent it and when. Go beyond a few weeks, however, and it is lost there too.

Other users jam everything onto their desktops. Because that’s the only place they can find things again. And wonder why their computers get so slow.

Document management applications do exist, but your average user does not have them. Web served file stores, like SharePoint or Google Drive all provide search and folders too, but again, most users don’t even think to look there first.

Even the workarounds require more user effort.

The Promising Option

In reality, the majority of computer users hit save, name the file and hope for the best, that they can find the file again whenever it is needed.

The current model requires them to make a bunch of decisions when saving: which drive, which cloud, which mount point, which server and then which folder. It’s too hard and too confusing. So they don’t. And I believe that they shouldn’t have to make these decisions.

The option that holds the most promise in my mind is the technology behind Dropbox. Imagine for a moment that the default save location, the documents folder, on every device on every Operating System was actually a shared Dropbox-like root folder. Then have the File Open and File Save dialogs all default to this starting point. All files, all user folders, all in one place. Which leaves only one concept for users to understand, folder trees, something they do get!

Add a new drive to your computer? It should invisibly enable more storage in this documents folder, or show the folders it contains as folders. Go to work? The shared folders you have access to should appear there, as folders. Heck, work from home and they should still be there (invisibly routed over secure VPNs without user intervention).

From a user’s perspective, all their folders and all the folders they have access to appear all in one place. As folders, with their files in them. So much simpler.

How do we get there?

Most users do not have a Dropbox account, and will never get one. And old-school IT managers want people to use their server drives for saving, compliance, control and backups. And I know not of a Dropbox-like sync engine that just works within companies and homes.

So, if Operating System vendors added a common sync framework, or maybe even just bought Dropbox for its technology and then shared it amongst themselves, it could work.

Imagine this.

A new user gets a company computer with the operating system of their choice. IT has already set up their corporate dropbox. All the user needs to do is save and load files to their default documents folder and the network (Dropbox) takes care of sync, shared folders and backup. Corporate shared folders appear as user folders, with special icons.

At home, if the user gets a Time Capsule, server or networked drive, it can broadcast itself as a home documents/dropbox server. The first time the user’s computer sees the device on the network, it asks if the user wants to back up to it and it then takes care of syncing documents to itself. All done. If users want to share files at home, they can set up a shared folder in their Documents folder and the dropbox-like home or work server takes care of routing.

From this perspective, all users have to do is open and save files in the default location in the folders they know. No need to think about drive letters, applications or mount paths. Just folders and files. And their data.

I believe the technology exists to do this, but sadly, we cannot make it happen. The Operating System vendors need to bring it into existence, integrate it into their offerings and make it seamless and accessible to users and network administrators alike, wrapping it in necessary security using the reliability of the Dropbox model. For users, it should “just work”.

Here’s hoping it will happen for regular users in 2014. It cannot happen soon enough for those that work with and support them. Then maybe “I cannot find a file” can go the way of the floppy disk.


Managing Multiple Projects’ Assembly Notes in One

The problem I have is that I regularly need to open a specific set of Markdown documents scattered all over my file system while I work and have them available at my fingertips. But I need a different set open when working at home and available the same way. The best solution I have found is to use BBEdit projects to manage these document sets. Here’s why and how to do it.

The Scenario

I work across several projects during the day at work and across a different set of projects at night. I would like to have all their Assembly Notes and TODO documents together in one application for reference and update.

But each project I work on follows my preferred Project Folder Layout in the file system, which means these files are scattered all over the place. Each project has an Assembly Notes document and a TODO document stored in its documents folder, but I want to work with them in one place.

The pain point comes in when I switch projects and need to update the assembly notes or TODO for that project, then get back to another project. Finding, opening and closing these scattered files all day is not optimal.

Also, storing these files in a central location does not work for me as these are all for different clients and are stored in different repositories. I want the documentation to remain part of each project so it’s easy to zip, back up, copy, sync, commit and share.

The Solution

Since all these files are plain text Markdown files and I usually have them open in BBEdit, I use BBEdit projects to manage each set that I need. During the day I open the work project in BBEdit that contains references to all my work project assembly notes and TODO documents no matter where they are. At night, I open the Noverse BBEdit project which contains references to all current Noverse project assembly notes and TODOs.

And the best part is that BBEdit has a separate section in File / Open Recent for just these project files, so it’s very quick to switch sets.

To set this up, create a new Project File in BBEdit from the File / New menu. Then drag and drop the assembly notes files and TODO files into the project area at the top left of BBEdit from all the various project folders. Then do the same for the next set.

At the start of the work day, open the work BBEdit project and all your project assembly notes will be available in BBEdit at your fingertips. At the end of the day, close that project window and open the evening Project file and all those files will now be available in the same manner.

In my case, BBEdit is always running, with one window containing the current set of assembly notes and the other being used as my hammer tool for file and data processing.
