Business Intelligence: Quick 'n Dirty Harvest Report Emails

Preface

As VP of Engineering at Drupal Connect, I managed a number of large-scale, concurrent projects. The projects would have anywhere from 2-8 developers and last for months, running between, say, 500 and several thousand hours. The budgets were 5 and 6 figures. If you have 4 developers on a project, you can easily burn 150-200 hours in a week. And of course, you are always trying to balance the burn rate against the timeline and any client dependencies.

The main point is that there was a lot to focus on and a lot happening at once. Spending 30 minutes every day logging into Harvest and clicking through a bunch of projects or a bunch of employee time entries just wasn't feasible.

The good news is that the tools below aren't tied to this specific use case. Most web apps have an API with a decent client in a handful of different languages. You could just as easily email daily reports from Freshbooks, Freckle, or Basecamp.

The plan

The goal is to hit the Harvest API, grab select up-to-date project data, summarize the information, and send out an email. As a small implementation detail, we are going to store our projects and users in local YAML files. Lastly, it's important to note that we will need to get info from both /people and /projects.

TL;DR

Sample code is on GitHub, ready for you to start hacking away.

On to the code

So Harvest has a REST API and a Ruby lib. Actually, they have had a few Ruby libraries, the newest of which is a bit more complicated than we need. Essentially, we just need a wrapper that handles the HTTP connection, authentication, and request/response cycle. For the sake of our example, this old client works fine. Once you fill in your credentials and require the file, you should be able to access Harvest like:

  # create a dictionary of users {id => "first name last name"} and print
  # note that the response body is JSON per our client wrapper, so don't forget to require 'json'
  # http://www.getharvest.com/api/people
  harvest = Harvest.new
  response = harvest.request "/people", :get
  people = Hash.new
  JSON.parse(response.body).each do |p|
    # the TRUE constant was removed in newer Rubies, so just test the value directly
    if p["user"]["is_active"]
      people[p["user"]["id"]] = "#{p["user"]["first_name"]} #{p["user"]["last_name"]}"
    end
  end
  p people

Since staff change is a fairly rare occurrence and we want to run this file daily/weekly, we can write the output to a YAML file.

  # remember to require 'yaml'
  File.open('users.yml', 'w') { |f| f.write people.to_yaml }
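
As a quick sanity check, here is a minimal sketch of the YAML round trip. The names are made up and it writes to a temp directory rather than the real users.yml. What it is meant to show is that integer ids come back as integer keys, so lookups by id should keep working after a reload.

```ruby
require 'yaml'
require 'tmpdir'

# made-up sample data standing in for the people hash built above
people = { 101 => "Ada Lovelace", 103 => "Grace Hopper" }

# write the hash out and read it straight back, inside a throwaway directory
loaded = Dir.mktmpdir do |dir|
  path = File.join(dir, 'users.yml')
  File.write(path, people.to_yaml)
  YAML.load_file(path)
end

puts loaded[101]
```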

All of this is pretty straightforward, and we can encapsulate it in an object and a method.
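
One way to make that easier to test is to pull the parsing into its own method. This is only a sketch: active_people is a hypothetical name, and the sample payload is invented to mirror the fields used in the snippet above, not copied from a real Harvest response.

```ruby
require 'json'

# Hypothetical helper: turn the raw /people JSON body into {id => "First Last"},
# skipping inactive users. Field names mirror the snippet above.
def active_people(json_body)
  JSON.parse(json_body).each_with_object({}) do |record, people|
    user = record["user"]
    next unless user["is_active"]
    people[user["id"]] = "#{user["first_name"]} #{user["last_name"]}"
  end
end

# invented sample payload standing in for response.body
sample_body = <<~BODY
  [
    {"user": {"id": 101, "first_name": "Ada", "last_name": "Lovelace", "is_active": true}},
    {"user": {"id": 102, "first_name": "Charles", "last_name": "Babbage", "is_active": false}}
  ]
BODY

people = active_people(sample_body)
p people
```

With the parsing separated out, the method on the object only has to make the request, call the helper, and write the YAML.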

We can create a similar method for projects:

  def update_projects_db
    # note we are now using an instance variable for @harvest
    # http://www.getharvest.com/api/projects
    response = @harvest.request "/projects", :get
    projects = Hash.new
    JSON.parse(response.body).each do |p|
      projects[p["project"]["id"]] = p["project"]["name"]
    end
    File.open('projects.yml', 'w') { |f| f.write projects.to_yaml }
  end

At this point, we have two methods we should only need to call on an ad hoc basis to update our (local) project and user YAML files. Our object that encapsulates them should load the YAML on instantiation and as alluded to above, create a Harvest instance variable. The skeleton for something useful might look like:

  class Bigbro
    def initialize
      @harvest = Harvest.new
      #load YAML databases
      # File.exists? was removed in newer Rubies; File.exist? works everywhere
      @projects = YAML.load_file("projects.yml") if File.exist?("projects.yml")
      @users = YAML.load_file("users.yml") if File.exist?("users.yml")
    end
  
    def update_users_db
      # see above
    end
  
    def update_projects_db
      # see above
    end
  
    # print a project summary for a given range
    def range_summary project_id, start_date, end_date
      # see below
    end
  
    # print a weekly project summary
    def weekly_summary project_id
      # see below
    end
  
    # print a project total to-date
    def project_total project_id, end_date = nil
      # see below 
    end
  end
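
One thing to watch in the skeleton: if a YAML file doesn't exist yet, the instance variable stays nil and later lookups like @projects[project_id] will blow up. Here is a small sketch of a loader that falls back to an empty hash instead. load_db is a made-up name, not part of the original code, and the example uses a temp directory and invented data.

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical helper: load a YAML "database", or return {} when the file
# is missing or empty
def load_db(path)
  return {} unless File.exist?(path)
  YAML.load_file(path) || {}
end

missing = nil
loaded = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, 'projects.yml')
  missing = load_db(path)                # file not written yet
  File.write(path, { 1234 => "project blah" }.to_yaml)
  loaded = load_db(path)                 # file now exists
end

p missing
p loaded
```

In initialize you could then write @projects = load_db("projects.yml") and the first run, before either update method has been called, would start with empty hashes.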

The meat of our range_summary method could look like this:

  def range_summary project_id, start_date, end_date 
    response = @harvest.request "/projects/#{project_id}/entries?from=#{start_date}&to=#{end_date}", :get
    entries = JSON.parse(response.body)
    total, users = 0, Hash.new(0)
  
    entries.each do |e|
      users[e["day_entry"]["user_id"]] += e["day_entry"]["hours"].to_f
      total = total + e["day_entry"]["hours"].to_f
    end
    users.each { |k,v| puts "#{@users[k]} had #{v.to_i} hours" }
    puts "PROJECT SUMMARY #{@projects[project_id]} #{start_date} - #{end_date} TOTAL: #{total.to_i}"
  end

So we will get output on each user's hours for a given range as well as the total number of project hours.
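
If you want to poke at the aggregation without hitting the API, the summing can be sketched as a pure function over already-parsed entries. summarize_entries is a hypothetical name and the entries below are invented; only the "day_entry", "user_id", and "hours" keys come from the code above.

```ruby
# Hypothetical helper: hours per user plus a grand total, from parsed entries
def summarize_entries(entries)
  per_user = Hash.new(0.0)
  entries.each do |e|
    entry = e["day_entry"]
    per_user[entry["user_id"]] += entry["hours"].to_f
  end
  [per_user, per_user.values.sum(0.0)]
end

# invented sample data in the same shape as JSON.parse(response.body)
entries = [
  { "day_entry" => { "user_id" => 101, "hours" => 2.5 } },
  { "day_entry" => { "user_id" => 102, "hours" => 4.0 } },
  { "day_entry" => { "user_id" => 101, "hours" => 1.5 } }
]

per_user, total = summarize_entries(entries)
per_user.each { |id, hours| puts "user #{id} had #{hours} hours" }
puts "TOTAL: #{total}"
```

Note that range_summary prints with to_i, which drops fractional hours (7.75 becomes 7). If that matters for your reports, rounding to a decimal place or two is one alternative.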

The weekly_summary would then just be a glorified wrapper around range_summary:

  def weekly_summary project_id
    # remember to require 'date'
    # covers the seven days (Monday through Sunday) ending on the most recent Sunday
    past_sunday = Date.today - Date.today.wday
    previous_monday = past_sunday - 6
    range_summary project_id, previous_monday.to_s, past_sunday.to_s
  end
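
Date arithmetic is easy to get subtly wrong, so here is a sketch of the same calculation from weekly_summary as a standalone function that takes the date as a parameter, which makes it checkable against a known day. previous_week_range is a hypothetical name.

```ruby
require 'date'

# Hypothetical helper: the seven-day window ending on the most recent Sunday,
# returned as [from, to] strings. Same arithmetic as weekly_summary.
def previous_week_range(today = Date.today)
  past_sunday = today - today.wday   # wday is 0 on Sunday
  [(past_sunday - 6).to_s, past_sunday.to_s]
end

# 2013-04-17 was a Wednesday
from, to = previous_week_range(Date.new(2013, 4, 17))
puts "#{from} to #{to}"   # prints "2013-04-08 to 2013-04-14"
```

When the script runs on a Sunday, that Sunday itself is the end of the range, so a partial day may be included depending on when cron fires.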

And it might also be nice to get a simple project total to-date. The code for that would look like:

  def project_total project_id, end_date = nil
    end_date = Date.today.to_s if end_date.nil?
    response = @harvest.request "/projects/#{project_id}", :get
    project = JSON.parse(response.body)
    
    # Harvest was founded in 2006 so this captures all data from any project
    response = @harvest.request "/projects/#{project_id}/entries?from=2006-01-01&to=#{end_date}", :get
    entries = JSON.parse(response.body)
    to_date = 0
    entries.each { |e| to_date += e["day_entry"]["hours"].to_f }
    
    puts "PROJECT TOTAL #{@projects[project_id]} #{to_date.to_i}/#{project["project"]["budget"]} through #{end_date}"
  end

And lastly we would call our code like:

  # http://en.wikipedia.org/wiki/O%27Brien_(Nineteen_Eighty-Four)
  obrien = Bigbro.new
  # initial runs would call obrien.update_projects_db and obrien.update_users_db
  obrien.weekly_summary 1234
  obrien.range_summary 1234, '2013-01-01', '2013-01-31'
  obrien.project_total 1234, '2013-04-15'
  # OR on an array of projects
  [1234, 2345, 3456, 4567].each do |id|
    obrien.project_total id
  end

Our last steps are to write a little bash script that calls Ruby and pipes the output into mutt or mail, and then to call that script from a cron job.

  # maybe something like
  # obrien.rb
  require File.join(File.dirname(__FILE__), 'bigbro')
  
  obrien = Bigbro.new
  [
  1234, # project blah 
  2345, # project bar
  3456, # project baz
  4567 # project foo
  ].each do |id|
    obrien.project_total id
  end
  
  # daily-reports.sh
  #!/bin/bash
  # obviously going to be different if you have rbenv or rvm and you need to have outgoing mail set up — postfix FTW
  cd /opt
  ruby obrien.rb | mutt -s "[DAILY PROJECTS REPORT]" -- tim@example.com
  
  #crontab
  # crontab fields are minute then hour, so this runs daily at 8:30
  30 8 * * * /full/path/to/daily-reports.sh

Conclusion

Obviously we could make our code nicer, use a legit local database with a flag for projects to include, etc. I also wrote a little reminder script that pinged devs to put in their hours each day (it was opt-in, promise). I toyed with the idea of creating chargeability reports so we could help devs hit annual goals, and it was just as easy to set up a nightly script for one of our admins to summarize hours per user each day. You get the idea: this is the kind of thing you could throw together in a little Sinatra app in an afternoon and have awesome custom reporting with some easy JavaScript graphs (D3, Highcharts, or even Google Charts).

But even without getting fancy, we have a quick 'n dirty solution that can make your life easier by saving you time and focusing your attention on the projects or employees that need it most.