how to run shell commands from ruby if you care about their output or if they failed

January 31, 2016

Recently I made a new gem called shell whisperer which you might find useful for when your ruby programs need to run shell commands.

Let’s say you want to write a script which prints a summary of the current directory. The desired output is:

There are 213 files in this git repo.
Last commit: fixed a typo

There are two questions we need to ask the shell in order to print this output.

First question: how many files are there in this git repo?

First answer: we can ask git to list the files in the repo, and pipe the list to the word counting command to get the answer:

git ls-files | wc -l

Second question: what is the last commit?

Second answer: we can ask git for the log, limited to the most recent commit, and formatted to include just the first line:

git log -1 --pretty="%s"

So far so good, but how do we run these commands from Ruby?

The language provides two ways that I’m aware of:

# backtick style
`git ls-files | wc -l`

Or:

# system style
system 'git ls-files | wc -l'

What is the difference? I don’t want to go into all of the nuances (see this SO post for that and more) but I’ll share how I think of the difference:

if you use the backtick (`) style, the return value is whatever was output by the command – but only STDOUT, not STDERR, so you’ll miss error messages
if you use the system style, the output from the command will go to STDOUT, as if you had run puts and output some text, and the return value will signify whether the command failed or not
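To make the difference concrete, here is a tiny sketch that uses harmless commands (echo, true and false, which I'm assuming exist on your system, as they do on most unix-like ones) instead of git:

```ruby
# Backticks: the command's STDOUT comes back as a String.
captured = `echo hello`

# system: the command's output goes straight to the terminal, and the return
# value is true on success, false on a non-zero exit status, and nil if the
# command could not be run at all.
succeeded = system('true')
failed = system('false')

puts captured.inspect  # "hello\n"
puts succeeded.inspect # true
puts failed.inspect    # false
```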

So our program might look something like this:

count = `git ls-files`.each_line.count
message = `git log -1 --pretty="%s"`.chomp
puts "There are #{count} files in this git repo."
puts "Last commit: #{message}"

And this is okay.

The issue becomes: well, what do you do if you care that the command might fail? The system style allowed for checking the return value to see whether it succeeded or failed, but there’s a reason we’re not using the system style: we care about capturing the output of the command. So with the backtick style, we can capture the output, but (seemingly) we can’t capture the successfulness.

Well, we can, it’s just a little awkward:

count = `git ls-files`.each_line.count
raise 'list failed somehow' unless $?.success?
message = `git log -1 --pretty="%s"`.chomp
raise 'message failed somehow' unless $?.success?
puts "There are #{count} files in this git repo."
puts "Last commit: #{message}"

Which, OK, kind of cool, but what if we want to know why it failed?

This is possible:

list_or_failure_reason = `git ls-files 2>&1`
raise list_or_failure_reason unless $?.success?
count = list_or_failure_reason.each_line.count
message_or_failure_reason = `git log -1 --pretty="%s" 2>&1`.chomp
raise message_or_failure_reason unless $?.success?
puts "There are #{count} files in this git repo."
puts "Last commit: #{message_or_failure_reason}"

Let me attempt to explain this. The 2>&1 part means that we want the STDERR stream to be directed to the STDOUT stream, so that we’ll capture either one (or both). This gives us access to the reason the command failed, if it failed, but still gives us access to the output if it succeeds.
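If you'd rather not lean on $? and shell redirection, Ruby's standard library has Open3.capture2e, which hands back the combined STDOUT and STDERR along with the exit status. Here is a minimal sketch of the same capture-or-raise pattern. The run_or_raise name is made up for illustration, and this is not necessarily how any particular gem does it internally:

```ruby
require 'open3'

# Runs a shell command and returns its combined STDOUT and STDERR.
# Raises a RuntimeError carrying that output if the command exits non-zero.
# (run_or_raise is a hypothetical helper name.)
def run_or_raise(command)
  output, status = Open3.capture2e(command)
  raise output unless status.success?

  output
end

puts run_or_raise('echo hi')
```

Calling raise with a String produces a RuntimeError whose message is that string, so the failure reason travels along with the exception.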

I found myself doing this in multiple places, so I decided to wrap this pattern up in a tiny gem, which allows you to instead write your program like this:

require 'shell_whisperer'
count = ShellWhisperer.run('git ls-files').each_line.count
message = ShellWhisperer.run('git log -1 --pretty="%s"').chomp
puts "There are #{count} files in this git repo."
puts "Last commit: #{message}"

If any of the commands fail, that error message will be re-raised as a ShellWhisperer::CommandFailed exception, so you can handle that as you please.

The node.js community seems to be all about tiny modules, and I think that idea is very cool, and I’m hoping to find more opportunities to do that with Ruby.

how to re-draw the line you just printed in Ruby, like to make a progress bar

December 14, 2015

Here’s something I learned recently. Let’s say you have a program that is going to take a long time, and you want to mark the progress over time. You can print out some information like this:

tasks = Array.new(1000)
tasks.each.with_index do |task, index|
  sleep rand(0..0.1) # (something slow)
  percentage = (index + 1) / tasks.count.to_f
  puts "#{(percentage * 100).round(1)}%"
end

Which looks kinda like this:

[image: progress bar before picture]

Which is, let’s say, serviceable, but not, let’s say, beautiful. It stinks that it printed out all those lines when it didn’t really need to. I would rather it had sort of animated while it went. But how is this done?

This is one of those questions that’s itched at the back of my mind for a while and which, when I finally googled it, was a bit disappointing. It’s just another escape character, like \n (which prints a new line). It’s \r, the carriage return, which I now think of as “the backspace to the beginning of the line” magic character.
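Here is about the smallest sketch of the trick I can think of. The overwrite_line helper is a made-up name; it pads the new text with spaces, on the assumption that you don't want leftovers from a longer previous message peeking out:

```ruby
# Redraws the current terminal line with new text and returns its length.
# previous_length pads with spaces so a shorter message fully covers a
# longer one. (overwrite_line is a hypothetical helper.)
def overwrite_line(text, previous_length: 0, io: $stdout)
  io.print "\r#{text.ljust(previous_length)}"
  io.flush # print does not add a newline, so push the text out right away
  text.length
end
```

You would call it in a loop, feeding each call's return value in as the next call's previous_length, and finish with a plain puts to move to a fresh line.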

Armed with this knowledge and some clunky math we can write something like this:

begin
  tasks = Array.new(1000)
  tasks.each.with_index do |task, i|
    width = `tput cols`.to_i
    sleep rand(0..0.1) # (something slow)
    percentage = (i + 1) / tasks.count.to_f
    summary = "#{(percentage * 100).round(1)}% ".rjust("100.0% ".length)
    remaining_chars_for_progress_bar = width - summary.length - 2
    chunks = (percentage * remaining_chars_for_progress_bar).ceil
    spaces = remaining_chars_for_progress_bar - chunks
    bar = "\r#{summary}[#{ '#' * chunks }#{' ' * spaces}]"
    print bar
  end
rescue Interrupt
  system "say 'I was almost done, jeez'" if RUBY_PLATFORM.include?("darwin")
end

[image: progress bar after gif]
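The clunky math is easier to poke at when it's pulled out of the loop. This is a sketch of the same calculation as a standalone method; progress_line is a name I'm introducing, and the width is passed in rather than read from tput so it can be tried with any number:

```ruby
# Builds one frame of the progress bar for a given fraction (0.0 to 1.0)
# and terminal width. (progress_line is a hypothetical helper.)
def progress_line(percentage, width)
  summary = "#{(percentage * 100).round(1)}% ".rjust('100.0% '.length)
  bar_width = width - summary.length - 2 # 2 columns for the [ and ]
  chunks = (percentage * bar_width).ceil
  spaces = bar_width - chunks
  "\r#{summary}[#{'#' * chunks}#{' ' * spaces}]"
end

print progress_line(0.5, 27)
puts
```

With a width of 27, the summary takes 7 columns and the brackets take 2, which leaves 18 for the bar, so half way is 9 # characters and 9 spaces.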

Probably you shouldn’t use this – there’s a very nice gem called ruby-progressbar which will work across platforms and lets you customize some things. But it’s nice information to have, because now you can do things like this:

[image: barnyard]

I’ll leave it as an exercise to the reader how to write this one.

how to tell ruby how to compare numbers to your object with coerce

November 9, 2015

Let’s say you have some object that represents some numeric idea:

class CupsOfCoffeePerDay
  include Comparable

  MY_LIMIT = 3

  def initialize(num)
    @num = num
  end

  def <=>(other)
    @num <=> other
  end

  def risky?(threshold: MY_LIMIT)
    self > threshold
  end
end

CupsOfCoffeePerDay.new(4).risky? #=> true
CupsOfCoffeePerDay.new(4) > 5 #=> false

This object takes in a number and wraps it, and then extends it with some domain-specific logic. Specifically, it represents the idea that there is a threshold to how many cups of coffee an individual can have per day before it becomes risky.

It’s neat that we’re able to compare our custom ruby object to a plain number. All we had to do was include Comparable and then implement the <=> method (also known as “the spaceship operator”) to define how we’d like our object to compare to numbers. We’d like to expose the internal num value, and use that when comparing.

The neat thing is that we get all the comparing methods for free.
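For a sense of what "for free" covers, here is a sketch with a separate, made-up Temperature class that only ever compares itself to other temperatures. Defining <=> once is enough for the operators, between?, and the sorting-related methods on Array:

```ruby
# Temperature is a hypothetical class, used only to show what Comparable adds.
class Temperature
  include Comparable

  attr_reader :degrees

  def initialize(degrees)
    @degrees = degrees
  end

  # The one method we write ourselves; Comparable builds the rest on it.
  def <=>(other)
    degrees <=> other.degrees
  end
end

cold = Temperature.new(40)
mild = Temperature.new(65)
hot = Temperature.new(90)

puts mild > cold                   # true
puts mild.between?(cold, hot)      # true
puts [hot, cold, mild].min.degrees # 40
```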

We’re not quite done yet, though. Watch what happens when we try to do this:

CupsOfCoffeePerDay.new(4) > CupsOfCoffeePerDay.new(5)

I get this error when I run the program:

app.rb:27:in `>': comparison of CupsOfCoffeePerDay with CupsOfCoffeePerDay failed (ArgumentError)
        from app.rb:27:in `<main>'

What’s happening here?

we create two objects
we ask one object if it’s greater than the second object
our implementation refers to the wrapped number value (num, which is just a Fixnum) and asks it if it’s greater than this second object
the fixnum complains that it doesn’t know how to compare itself to some random object

And, fair enough. From the point of view of the number, it has no idea what cups of coffee per day even means.

We could change our implementation to accommodate this use-case:

class CupsOfCoffeePerDay
  include Comparable

  MY_LIMIT = 3

  def initialize(num)
    @num = num
  end

  def <=>(other)
    if other.is_a?(CupsOfCoffeePerDay)
      num <=> other.num
    else
      num <=> other
    end
  end

  def risky?(threshold: MY_LIMIT)
    self > threshold
  end

  protected

  attr_reader :num
end

Note that we had to add those last few lines to make it easier to access the num from outside an instance of CupsOfCoffeePerDay.

This is not bad.

That attribute is marked as protected because so far we can only imagine it being necessary to be used by other instances of CupsOfCoffeePerDay, for the sake of comparison.

(I remember having a long and horrified conversation with a coworker when neither of us could come up with a scenario where you would use protected over private, but it turns out that this is precisely the situation where you would.)

But look what happens when you try this:

4 > CupsOfCoffeePerDay.new(5)

Or this:

[
  CupsOfCoffeePerDay.new(4),
  3,
  CupsOfCoffeePerDay.new(1)
].sort

When I try these, I get errors like this:

app.rb:32:in `>': comparison of Fixnum with CupsOfCoffeePerDay failed (ArgumentError)
        from app.rb:32:in `<main>'

Is there anything we can do to avoid these errors? I think one strong argument is that we shouldn’t try to. Rather, we should audit our system and make sure that we never mix-and-match our types. If we can do that, that’s probably for the best.

Except… this is Ruby, and Ruby always has another trick up its sleeve.

Check it:

class CupsOfCoffeePerDay
  include Comparable

  MY_LIMIT = 3

  def initialize(num)
    @num = num
  end

  def <=>(other)
    if other.is_a?(CupsOfCoffeePerDay)
      num <=> other.num
    else
      num <=> other
    end
  end

  def risky?(threshold: MY_LIMIT)
    self > threshold
  end

  def coerce(other)
    [CupsOfCoffeePerDay.new(other), self]
  end

  protected

  attr_reader :num
end

There’s not a ton of documentation about this. I only found it by luck. I was looking to understand how Ruby numbers do their comparisons, and I opened up pry (with pry-doc installed), and started exploring:

$ gem install pry pry-doc
$ pry
> 4.pry
(4)> show-source >
From: numeric.c (C Method):
Owner: Fixnum
Visibility: public
Number of lines: 17

static VALUE
fix_gt(VALUE x, VALUE y)
{
    if (FIXNUM_P(y)) {
        if (FIX2LONG(x) > FIX2LONG(y)) return Qtrue;
        return Qfalse;
    }
    else if (RB_TYPE_P(y, T_BIGNUM)) {
        return FIX2INT(rb_big_cmp(rb_int2big(FIX2LONG(x)), y)) > 0 ? Qtrue : Qfalse;
    }
    else if (RB_TYPE_P(y, T_FLOAT)) {
        return rb_integer_float_cmp(x, y) == INT2FIX(1) ? Qtrue : Qfalse;
    }
    else {
        return rb_num_coerce_relop(x, y, '>');
    }
}

At this point, I thought oh no! C!

But like, this is so cool: this is the implementation of the greater than method in numbers in Ruby, and it’s totally discoverable if you open pry and ask it to show-source.

I don’t really know C, but if I squint, I can tell that this is doing something kind of reasonable. It seems to be checking the type of the second value (the one you’re comparing the current value to) and doing something different based on the type. The final branch of logic is when the type is unknown. Bingo. Our CupsOfCoffeePerDay type is definitely unknown. In that case, it calls rb_num_coerce_relop. Unfortunately, when I asked pry to show-source rb_num_coerce_relop it didn’t know how.

Thankfully, it printed the filename this source code can be found in, so I went to the ruby source code and searched for a file called numeric.c. Within that, I searched for the rb_num_coerce_relop function. It takes in the two objects (the CupsOfCoffeePerDay and the number) and the operator (>). Its source looks like this:

VALUE
rb_num_coerce_relop(VALUE x, VALUE y, ID func)
{
    VALUE c, x0 = x, y0 = y;

    if (!do_coerce(&x, &y, FALSE) ||
	NIL_P(c = rb_funcall(x, func, 1, y))) {
	rb_cmperr(x0, y0);
	return Qnil;		/* not reached */
    }
    return c;
}

What does that do? It looks like it coerces the two types to be the same type, and then calls the > function on the first one, passing the second one. (Again: squinting).

So do_coerce is where the interesting part happens. I’ll just link to it because it’s pretty long. But the cool thing in it is that it checks whether the object on the right-hand side of the comparison (our CupsOfCoffeePerDay) implements a coerce method, and if it does, it calls it. So then it becomes a game of figuring out how to write a coerce method and finding out, via stack overflow (of course), how it works: it takes in the object on the left-hand side (the plain number), and it’s expected to return an array of two compatible objects, with the converted number first and the receiver itself second.

So. Now that we know about coerce, our objects can be really simple, but they can still be compared bidirectionally.
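Here is a condensed sketch of the class (risky? left out for brevity) with the comparisons that used to fail, to show that the number on the left now finds its way to our coerce method:

```ruby
class CupsOfCoffeePerDay
  include Comparable

  def initialize(num)
    @num = num
  end

  def <=>(other)
    if other.is_a?(CupsOfCoffeePerDay)
      num <=> other.num
    else
      num <=> other
    end
  end

  # Called by the number when it meets an operand it does not recognize.
  # Returns [the number wrapped as our type, self] so both sides match.
  def coerce(other)
    [CupsOfCoffeePerDay.new(other), self]
  end

  protected

  attr_reader :num
end

puts(4 > CupsOfCoffeePerDay.new(5)) # false
puts(6 > CupsOfCoffeePerDay.new(5)) # true

# Mixed arrays can be sorted too, whichever side <=> happens to be called on.
sorted = [CupsOfCoffeePerDay.new(4), 3, CupsOfCoffeePerDay.new(1)].sort
puts sorted[1] # 3
```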

begin rescue else

October 20, 2015

Quick ruby tip kinda post.

Today I learned, this is a valid Ruby program:

begin
  raise if [true, false].sample
rescue
  puts "failed"
else
  puts "did not fail"
end

I’m used to using else after an if, but not after a rescue. This is like saying “do this thing. if it fails, do this other thing. if it doesn’t fail, do this other, other thing.”
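To see the order in which the branches run, here is a small sketch that records each step in an array. It also adds an ensure clause, which is not part of the example above but runs either way. The attempt method is just a made-up name:

```ruby
# Returns the list of branches that ran, in order.
# (attempt is a hypothetical method, for illustration only.)
def attempt(fail_on_purpose)
  steps = []
  begin
    steps << :tried
    raise 'boom' if fail_on_purpose
  rescue StandardError
    steps << :rescued   # only when the begin block raised
  else
    steps << :succeeded # only when the begin block did not raise
  ensure
    steps << :cleaned_up # always
  end
  steps
end

p attempt(true)  # [:tried, :rescued, :cleaned_up]
p attempt(false) # [:tried, :succeeded, :cleaned_up]
```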

Huh!

(Via rails)

how to log all input in your pry rails console

October 14, 2015

Many Rubyists use and love the pry gem for adding breakpoints to their programs and inspecting objects. Super useful. Some others use the pry-rails gem to use the pry REPL in place of irb for the rails console.

Let’s say you want to log all of the activity that occurs in your rails console. This could be a nice security thing. Maybe you’re just nostalgic for old times.

Pry has something called an “input object”, which you can override in your configuration. The object’s responsibility is to feed ruby code to Pry, line by line. By default, it uses the Readline module. I don’t know a ton about readline, but I gather that it’s wrapping some standard unix program, which means it sort of feels natural. For example, you can input Control+l and it will clear the screen; gets.chomp doesn’t do that kind of thing.

So, Readline is great. We want to use it. We just kind of want to wrap it. So let’s see what that looks like.

First: where do we actually put our configuration?

You can put a .pryrc file in the root of your project. You can even put Ruby code in that file. I think that’s the official way to do it. But I don’t know… it doesn’t get syntax highlighting because it doesn’t have a .rb file extension… I put my configuration in a Rails initializer named config/initializers/pry.rb, and that works fine too.

class LoggingReadline
  delegate :completion_proc, :completion_proc=, to: Readline

  def readline(prompt)
    Readline.readline(prompt).tap do |user_input|
      logger.info(user_input)
    end
  end

  private

  def logger
    @logger ||= Logger.new('log/console.log')
  end
end

Pry.config.input = LoggingReadline.new

The important thing for custom input objects is that they implement the readline method. The method takes in a string that holds the current user prompt, and it is expected to return a string that holds the next line of Ruby code for Pry to evaluate.

If pry is a REPL (read evaluate print loop), the custom input object assumes the responsibility of the first letter, and that’s it.

It doesn’t strictly need to ask the user for input. It could just return some nonsense.
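For example, here is a sketch of an input object that never talks to the user at all and just replays a canned list of lines. ScriptedInput is a made-up class, and I'm assuming that returning nil when the list runs out is a reasonable way to signal the end, since that is what Readline.readline returns at end of input:

```ruby
# Feeds a fixed list of lines to whoever calls readline, one per call.
# (ScriptedInput is a hypothetical class, for illustration only.)
class ScriptedInput
  def initialize(lines)
    @lines = lines.dup
  end

  # The prompt is ignored because nobody is being prompted.
  def readline(_prompt)
    @lines.shift
  end
end

# Assumed usage, in the same spot as the configuration above:
# Pry.config.input = ScriptedInput.new(['1 + 1', 'exit'])
```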

But, this one does. We can summarize what it does as: ask the dev for a line of input, then log it to a file before returning it to pry for EPL-ing.

There’s one line that’s kind of strange:

delegate :completion_proc, :completion_proc=, to: Readline

What’s that about?

Well, I’ve learned, it’s just kind of a necessary thing to make sure your custom input object seamlessly behaves like the default pry input behavior. Let me explain.

Readline, by default, has some strategy for tab completing when you start to write something, and then press tab. That strategy is a proc object. The default one has something to do with irb I guess?

$ irb
>> Readline.completion_proc
=> #<Proc:0xb9964ce0@/home/max/.rubies/2.2.3/lib/ruby/2.2.0/irb/completion.rb:37>

But! When starting pry, it has a different completion proc!

$ pry
[1] pry(main)> Readline.completion_proc
=> #<Proc:0xb8a0c25c@/home/max/.gem/ruby/2.2.3/gems/pry-0.10.2/lib/pry/repl.rb:177>

But when you provide a custom input object, pry doesn’t replace the completion proc on readline because you seem not to even be using it, so why bother? But in this case we totally are using it, we’re just wrapping it.
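The delegate macro comes from ActiveSupport, so it is only available because this file lives in a Rails app. Roughly speaking, it writes a pair of forwarding methods for you. Here is a hand-written sketch of that, using a stand-in FakeReadline module (hypothetical, so the example runs without Readline or Rails):

```ruby
# Stand-in for Readline: a module with a completion_proc accessor.
# (FakeReadline and WrappedInput are hypothetical names.)
module FakeReadline
  class << self
    attr_accessor :completion_proc
  end
end

class WrappedInput
  # Roughly what
  #   delegate :completion_proc, :completion_proc=, to: FakeReadline
  # would define: reads and writes pass straight through to the target.
  def completion_proc
    FakeReadline.completion_proc
  end

  def completion_proc=(new_proc)
    FakeReadline.completion_proc = new_proc
  end
end

input = WrappedInput.new
input.completion_proc = proc { |fragment| [fragment] }
puts FakeReadline.completion_proc.equal?(input.completion_proc) # true
```

So when something sets the completion proc on the wrapper, it lands on the wrapped module, which is the behavior the delegation line is there to provide.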

At first, I thought this was a bug with Pry, and I opened an issue to complain about it, but while writing this blog post I realized that it’s kind of not a bug, and this delegation approach is probably fine.

fake tools

October 11, 2015

Last week, Flatiron School launched a new online learning program called “Learn Verified”. The launch was accompanied by a letter from the founders, Avi and Adam. This section jumped out to me:

Real Tools

You can’t learn real skills with fake tools. As much as you can learn in a simulation, you can’t become a competent surgeon without picking up a scalpel or pilot without stepping into an airplane. Yet, online learning platforms today teach people using in-browser, simulated coding tools (often referred to as REPLs) and multiple choice quizzes which, while helpful, can never bring a student to the level of competency required of a professional software engineer.

Learn requires students to use the same tools and workflows that professional software engineers use on the job. From the start, students work in their terminals using git-based workflows. They’re taught to master the craft using the tools of the trade.

On the most recent episode of metaphor loop, we talked about the different styles of learning. One of them was kinesthetic learning, which we can say is like “hands on” learning. I hadn’t heard the term until Vaidehi told me about it and I realized I identified with it. I think kinesthetic learners are the same ones who will identify with Learn Verified’s emphasis on using “real tools”, because they’ll get to get their hands on the material they’re learning in a more direct, free-to-explore way. That market of learners has been underserved by the existing solutions, and I wonder if Flatiron will be able to pull off an online learning environment just for them.

I do kind of chafe at the idea that “you can’t learn real skills with fake tools”, though. It feels like a pretty inflammatory position to take. I’m reminded of Lost’s great “Don’t tell me what I can’t do!” catchphrase. Aren’t some people different from other people? And not to be all metaphysical here, but aren’t all tools kind of fake tools? Where do you draw the line? idk.

participate in society

October 3, 2015

If there’s one piece of advice I feel comfortable giving, it’s this: participate in society. Find it in your introverted self to join in.

When you hear vaguely that you ought to be using zsh instead of bash, but when you try it you’re not sure why it’s better, stick with it. When you hear that oh-my-zsh is a good way to manage your zsh configuration and you think the name is dumb and don’t want to use it, get over yourself, because months later you’ll find yourself desperately googling to see if anyone else is using zsh in tmux on el capitan and experiencing a weird behavior where option-backspace isn’t working (it’s supposed to delete backwards a full word), but only when in tmux, and you can’t really find anything, just throw out your zsh configuration and use the thing that people use. You won’t regret it.

Anyway, that’s just my one piece of advice.

announcing metaphor loop

September 18, 2015

Last month I started a podcast but I didn’t really publicize it because I wasn’t sure I was going to keep doing it. I put it on this site, hidden down in the footer, and I put it on iTunes, and I tweeted a few cryptic things like:

there’s a slight chance I’m editing a podcast pic.twitter.com/TGoacn3LCj
— Max Jacobson (@maxjacobson) August 23, 2015

Today I released the second episode, which makes it real enough that I’m ready to share it.

metaphor loop

(Click on the art to visit the show homepage, which has subscription links)

It’s about “how we think about programming”. I started out with an agenda, which was to argue that figurative language is the best tool for teaching code, or something like that, and I’m finding that I’m not sure what I think anymore, but I’m excited to keep exploring the ways people build understanding by interviewing programmers about what goes on in their heads.

The first episode was an interview with my old friend Corey Mendell. I had a really good time recording and editing it and the few people I shared it with seemed to like it too. I was immediately addicted.

I don’t think I have it in me to do it on a weekly basis like a lot of podcasts. Today I’m sharing the second episode, approximately a month after the first. So, monthly? Maybe.

This one features Vaidehi Joshi who I don’t even really know, but whose blog I really like. I’m happy with how this one turned out and think you’ll like it.

a few details

I’m not planning to have ads. I want this to be a fun, pure endeavor that makes no money. I’m not tracking subscriber counts or anything like that. I don’t want to know.

I’m licensing it under a creative commons license, because

why not?
I wanted to include some CC-licensed music, and it had a “share-alike” clause
so why not?

By the way: part of the reason I made this is that I really like listening to podcasts and my ego would often hear them and think “why don’t they invite me on?” One thing I’m realizing is that inviting people on is kind of hard, because you don’t know if they’ll want to do it and maybe they’ll think it’s dumb? So: if you listen to some episodes and think you’d like to be on the show, let me know, and we’ll at least have a phone call and if it feels like a show, it’ll be a show.

hardscrabble.net/metaphorloop

pro tip open firefox tabs in background

September 8, 2015

One good thing to know if you’re a firefox person: visit about:config and poke around, configuring things.

Here’s what happened today: I was watching a youtube video while browsing Twitter via Tweetbot. I clicked a link, which opened a new tab, pushing my video into the background. I diligently clicked the video’s tab to bring it back to the foreground so I could continue passively watching it while browsing twitter.

Then I clicked another link, and instinctively clicked the video’s tab to bring it back into the foreground again.

By the third time I did this, I realized I really wished there was a setting to automatically open tabs in the background. I tried googling it, but wasn’t really finding anything. So I checked about:config and searched through for “background”. The screen is a list of every configuration you can control. Many of them are boolean attributes, which can be toggled by simply double clicking the attribute.

I saw one, browser.tabs.loadDivertedInBackground;false and thought “hmm, maybe?” At this point, I’m not certain there’s even a configuration that does this, but I try toggling it… and … click a link from a tweet… and…

It did what I wanted. Sweet.

eighty character lines

September 5, 2015

Last month we talked about RuboCop, which analyzes your Ruby code and nitpicks it. One of its most difficult to follow suggestions is to keep your lines of code no longer than 80 characters.

The creator of rubocop, bbatsov, explained his perspective on his blog:

We should definitely have a limit – that’s beyond any doubt. It’s common knowledge that humans read much faster vertically, than horizontally. Try to read the source code of something with 200-character lines and you’ll come to acknowledge that.

I’m totally on board with the short lines train. For me, it only gets tricky when dealing with nested stuff (examples to follow) which add a lot of space to the left of the first character of code. For example:

module MyGreatGem
  module SomeOtherNamespace
    module OmgAnotherNamespace
      module LolYeahOneMore
        class SomethingGreat
          class SomethingOk
            class MyGreatClass
              def initialize
                puts "OMG I only have 64 characters to express something on " \
                     "this line! And now it's more like 'these lines' haha"
              end
            end
          end
        end
      end
    end
  end
end

Often strings are the first thing to get chopped up, as in that example.

The only approach I thought of to deal with that is to organize my code differently to not use many nested namespaces. That’s probably not the worst idea, honestly, but I’m writing this post to share an interesting style I observed in the wild (read: on github) that takes a whole nother approach:

# Excerpted from:
# https://github.com/net-ssh/net-sftp/blob/ebf5d5380cc533b69b308baa2e396e4a18abc900/lib/net/sftp/operations/dir.rb
module Net; module SFTP; module Operations
  class Dir
    attr_reader :sftp

    def initialize(sftp)
      @sftp = sftp
    end
  end
end; end; end

Huh! That’s a style I hadn’t seen before. RuboCop has many complaints about it, and I don’t totally love the style, but it’s a very novel and neat way to do it, and it certainly frees up columns to spend on your code if you’re planning to stick to an 80 character limit.

One possible alternative is to define your namespaced class using this shorthand:

class Net::SFTP::Operations::Dir
  attr_reader :sftp

  def initialize(sftp)
    @sftp = sftp
  end
end

If you do that, you get 2 extra characters on each line. Sweet!

One problem: it sort of doesn’t work, at least not in the same way.

If you just look at that example, and imagine that you’re the Ruby interpreter trying to figure out what this code means, how are you supposed to know whether Net, SFTP, and Operations are supposed to be classes or modules? You have to already know by them being previously defined. If they haven’t been defined yet, you are well within your rights to raise a NameError (which is what Ruby does) to complain that this constant hasn’t been defined yet, rather than try to guess.

Both of the earlier longhand examples were explicitly explaining what the type of each namespace constant is. That pattern works whether you’re defining the module or class in that moment, or “opening” a previously defined module or class to add something new to it. This shorthand, while optimal for line length, only works when opening previously defined constants.
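Here is a sketch of that difference with made-up names (Outer, Inner and Thing are placeholders, not from any real library). The compact form fails while the namespaces are missing and works once they exist:

```ruby
# Nothing named Outer exists yet, so the compact form has nothing to open.
begin
  class Outer::Inner::Thing; end
rescue NameError => e
  puts "#{e.class}: #{e.message}"
end

# The longhand form declares each namespace and says what kind it is.
module Outer
  module Inner; end
end

# Now the compact form works, because Outer and Inner are already defined.
class Outer::Inner::Thing
  def greeting
    'hi'
  end
end

puts Outer::Inner::Thing.new.greeting # hi
```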

One downside of this approach is that, by relying on all of the namespaces being predefined, it becomes harder to test this class in isolation (it’s probably possible to do it through some gnarly stubbing but, harder). You’re also introducing some requirements about the order in which the files from your code need to be loaded, which feels kind of fragile.

One possible upside comes to mind. When you follow the typical pattern of writing out all the namespace modules and classes, you introduce some room for error: what if in one file you write class Operations by mistake (instead of module Operations)? You’ll get a TypeError telling you that Operations is not a class. With the shorthand there’s no kind to restate, so you can’t make that mistake. But that error is not too bad, honestly.
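That mismatch is easy to reproduce. A minimal sketch, using a bare top-level Operations module as a stand-in for the real namespace:

```ruby
# Defined as a module first...
module Operations; end

# ...then reopened with the wrong keyword by mistake.
begin
  class Operations; end
rescue TypeError => e
  puts "#{e.class}: #{e.message}"
end
```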

I think 80 is usually enough but if you’re doing too many contortions to stay in that box, try like 90 or 100, you’re still a good person.