Loudon & Company


Business Intelligence: Quick 'n Dirty Harvest Report Emails

Tuesday, April 23rd, 2013


As VP of Engineering at Drupal Connect, I managed a number of large-scale, concurrent projects. The projects would have anywhere from 2 to 8 developers and last for months, running from about 500 to several thousand hours. The budgets were 5 and 6 figures. If you have 4 developers on a project, you can easily burn 150-200 hours in a week. And of course, you are always trying to balance the burn rate against the timeline and any client dependencies.

But the main point here is that there was a lot to focus on and a lot happening at once. Spending 30 minutes every day logging into Harvest and clicking through a bunch of projects or employee time entries just wasn't feasible.

The good news is that the tools below aren't tied to this specific use case. Most web apps have an API with a decent client in a handful of different languages. You could just as easily email daily reports from Freshbooks, Freckle, or Basecamp.

The plan

The goal is to hit the Harvest API, grab select up-to-date project data, summarize the information, and send out an email. As a small implementation detail, we are going to store our projects and users in local YAML files. Lastly, it's important to note that we will need to get info from both /people and /projects.


Sample code is on GitHub, ready for you to start hacking away.

On to the code

So Harvest has a REST API and a Ruby lib. Actually, they have had a few Ruby libraries, the newest of which is a bit more complicated than we need. Essentially, we just need a wrapper that handles the HTTP connection, authentication, and request/response cycle. For the sake of our example, this old client works fine. Once you fill in your credentials and require the file, you should be able to access Harvest like:

  # create a dictionary of users {id => "first name last name"} and print
  # note that the response body is JSON per our client wrapper, so don't forget to require 'json'
  # http://www.getharvest.com/api/people
  require 'json'
  harvest = Harvest.new
  response = harvest.request "/people", :get
  people = Hash.new
  JSON.parse(response.body).each do |person|
    # keep active users only
    if person["user"]["is_active"]
      people[person["user"]["id"]] = "#{person["user"]["first_name"]} #{person["user"]["last_name"]}"
    end
  end
  p people
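
If you don't have the old client handy, the wrapper described above is small enough to sketch yourself. This is a minimal sketch, not the actual library: the class name SimpleHarvest, its constructor arguments, and the build_request helper are made up for illustration, and it assumes HTTP Basic authentication against a per-account harvestapp.com subdomain with JSON headers.

```ruby
require 'net/http'
require 'uri'

# Hypothetical stand-in for the old Harvest client. The class name,
# constructor arguments, and subdomain scheme are assumptions.
class SimpleHarvest
  def initialize(subdomain, email, password)
    @base = URI("https://#{subdomain}.harvestapp.com")
    @email = email
    @password = password
  end

  # Build (but do not send) a request with auth and JSON headers.
  def build_request(path, method = :get)
    klass = { :get => Net::HTTP::Get, :post => Net::HTTP::Post }.fetch(method)
    req = klass.new(path)
    req.basic_auth(@email, @password)
    req['Accept'] = 'application/json'
    req['Content-Type'] = 'application/json'
    req
  end

  # Send the request over HTTPS and return the Net::HTTPResponse.
  def request(path, method = :get)
    Net::HTTP.start(@base.host, @base.port, :use_ssl => true) do |http|
      http.request(build_request(path, method))
    end
  end
end
```

With something like this in place, SimpleHarvest.new('yourcompany', 'you@example.com', 'secret').request("/people", :get) would return a response whose body you can hand to JSON.parse, which is the same calling shape used in the rest of this post.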

Since staff changes are fairly rare and we want to run the report daily or weekly, we can cache the people hash in a YAML file.

  # remember to require 'yaml'
  File.open('users.yml', 'w') { |f| f.write people.to_yaml }
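
One detail worth checking is that the integer user ids survive the trip through YAML, since we look names up by the integer ids that come back in the JSON. Here is a small sketch with made-up names and ids; the yaml_round_trip helper is hypothetical and writes to a temp directory so it won't touch your real users.yml.

```ruby
require 'yaml'
require 'tmpdir'

# Write a hash to a YAML file in a temp directory and read it back.
# Hypothetical helper, only here to demonstrate the round trip.
def yaml_round_trip(data)
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'users.yml')
    File.open(path, 'w') { |f| f.write data.to_yaml }
    YAML.load_file(path)
  end
end

# Made-up sample data standing in for the hash built from /people.
people = { 101 => "Ada Lovelace", 102 => "Grace Hopper" }
loaded = yaml_round_trip(people)

puts loaded[101]        # names are still keyed by integer id
puts loaded == people   # the loaded hash matches what we wrote
```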

All of this is pretty straightforward and we can encapsulate it in an object and a method.

We can create a similar method for projects:

  def update_projects_db
    # note we are now using an instance variable for @harvest
    # http://www.getharvest.com/api/projects
    response = @harvest.request "/projects", :get
    projects = Hash.new
    JSON.parse(response.body).each do |p|
      projects[p["project"]["id"]] = p["project"]["name"]
    end
    File.open('projects.yml', 'w') { |f| f.write projects.to_yaml }
  end

At this point, we have two methods we should only need to call on an ad hoc basis to update our (local) project and user YAML files. Our object that encapsulates them should load the YAML on instantiation and as alluded to above, create a Harvest instance variable. The skeleton for something useful might look like:

  class Bigbro
    def initialize
      @harvest = Harvest.new
      # load YAML databases
      @projects = YAML.load_file("projects.yml") if File.exist?("projects.yml")
      @users = YAML.load_file("users.yml") if File.exist?("users.yml")
    end

    def update_users_db
      # see above
    end

    def update_projects_db
      # see above
    end

    # print a project summary for a given range
    def range_summary project_id, start_date, end_date
      # see below
    end

    # print a weekly project summary
    def weekly_summary project_id
      # see below
    end

    # print a project total to-date
    def project_total project_id, end_date = nil
      # see below
    end
  end

The meat of our range_summary method could look like this:

  def range_summary project_id, start_date, end_date
    response = @harvest.request "/projects/#{project_id}/entries?from=#{start_date}&to=#{end_date}", :get
    entries = JSON.parse(response.body)
    # users defaults to 0 so we can accumulate hours per user id
    total, users = 0, Hash.new(0)
    entries.each do |e|
      users[e["day_entry"]["user_id"]] += e["day_entry"]["hours"].to_f
      total = total + e["day_entry"]["hours"].to_f
    end
    users.each { |k,v| puts "#{@users[k]} had #{v.to_i} hours" }
    puts "PROJECT SUMMARY #{@projects[project_id]} #{start_date} - #{end_date} TOTAL: #{total.to_i}"
  end

So we will get output on each user's hours for a given range as well as the total number of project hours.
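
If you want to poke at the aggregation without hitting the API, you can pull it into its own method and feed it canned data. This is a sketch: the summarize method and the SAMPLE_ENTRIES payload are made up, with the payload shaped the way the code above expects (an array of day_entry hashes).

```ruby
require 'json'

# Made-up payload in the shape the code above expects:
# an array of {"day_entry" => {"user_id" => ..., "hours" => ...}}.
SAMPLE_ENTRIES = <<~PAYLOAD
  [
    {"day_entry": {"user_id": 101, "hours": 3.5}},
    {"day_entry": {"user_id": 102, "hours": 2.0}},
    {"day_entry": {"user_id": 101, "hours": 4.25}}
  ]
PAYLOAD

# Sum hours per user and overall, without touching the network.
def summarize(entries)
  per_user = Hash.new(0.0)
  entries.each do |e|
    per_user[e["day_entry"]["user_id"]] += e["day_entry"]["hours"].to_f
  end
  [per_user, per_user.values.sum]
end

per_user, total = summarize(JSON.parse(SAMPLE_ENTRIES))
per_user.each { |id, hours| puts "user #{id}: #{hours} hours" }
puts "total: #{total}"
```

Keep in mind that to_i truncates, so 7.75 hours for user 101 would print as 7 in range_summary. If fractional hours matter for your reports, round(2) is a reasonable swap.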

The weekly_summary would then just be a glorified wrapper around range_summary:

  def weekly_summary project_id
    # remember to require 'date'
    day_of_week = Date.today.wday
    past_sunday = (Date.today - day_of_week).to_s
    # six days before that Sunday is the Monday that starts the week
    previous_monday = (Date.today - day_of_week - 6).to_s
    range_summary project_id, previous_monday, past_sunday
  end
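
Date math is easy to get off by one, so it can help to pull the range calculation into a method that takes the date as an argument and try it against a known day. This is a sketch with a made-up method name (last_week_range). It uses the same arithmetic as weekly_summary: back up to the most recent Sunday, then six more days to the Monday before it.

```ruby
require 'date'

# Return [monday, sunday] as ISO date strings for the most recent
# Monday-to-Sunday week ending on or before the given date.
def last_week_range(today = Date.today)
  sunday = today - today.wday   # wday is 0 on Sunday, so a Sunday maps to itself
  monday = sunday - 6
  [monday.to_s, sunday.to_s]
end

# Tuesday, April 23rd, 2013
puts last_week_range(Date.new(2013, 4, 23)).inspect
# prints ["2013-04-15", "2013-04-21"]
```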

And it might also be nice to get a simple project total to-date. The code for that would look like:

  def project_total project_id, end_date = nil
    end_date = Date.today.to_s if end_date.nil?
    response = @harvest.request "/projects/#{project_id}", :get
    project = JSON.parse(response.body)
    # Harvest was founded in 2006 so this captures all data from any project
    response = @harvest.request "/projects/#{project_id}/entries?from=2006-01-01&to=#{end_date}", :get
    entries = JSON.parse(response.body)
    to_date = 0
    entries.each { |e| to_date += e["day_entry"]["hours"].to_f }
    puts "PROJECT TOTAL #{@projects[project_id]} #{to_date.to_i}/#{project["project"]["budget"]} through #{end_date}"
  end
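
Since the whole point is watching burn against budget, you might also want a percentage in that line. Here is a hypothetical helper (budget_status is a made-up name) that does the arithmetic. It assumes the budget value is in hours and that it can be missing or zero, in which case it skips the percentage rather than dividing by zero.

```ruby
# Hypothetical helper: format hours burned against an hours budget.
# Assumes budget may be nil or zero for projects without one.
def budget_status(hours, budget)
  if budget.nil? || budget.to_f.zero?
    return "#{hours.round(1)} hours (no budget set)"
  end
  pct = (hours / budget.to_f * 100).round
  "#{hours.round(1)}/#{budget.to_f.round(1)} hours (#{pct}% of budget)"
end

puts budget_status(150.0, 600)   # prints 150.0/600.0 hours (25% of budget)
puts budget_status(12.5, nil)    # prints 12.5 hours (no budget set)
```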

And lastly we would call our code like:

  # http://en.wikipedia.org/wiki/O%27Brien_(Nineteen_Eighty-Four)
  obrien = Bigbro.new
  # initial runs would call obrien.update_projects_db and obrien.update_users_db
  obrien.weekly_summary 1234
  obrien.range_summary 1234, '2013-01-01', '2013-01-31'
  obrien.project_total 1234, '2013-04-15'
  # OR on an array of projects
  [1234, 2345, 3456, 4567].each do |id|
    obrien.project_total id
  end

Our last steps are to write a little bash script that calls ruby and pipes the output into mutt or mail and call this bash script from a cronjob.

  # maybe something like
  # obrien.rb
  require File.join(File.dirname(__FILE__), 'bigbro')
  obrien = Bigbro.new
  [
    1234, # project blah
    2345, # project bar
    3456, # project baz
    4567  # project foo
  ].each do |id|
    obrien.project_total id
  end

  # daily-reports.sh
  # obviously going to be different if you have rbenv or rvm and you need to have outgoing mail set up (postfix FTW)
  cd /opt
  ruby obrien.rb | mutt -s "[DAILY PROJECTS REPORT]" -- tim@example.com

  # crontab entry: every day at 8:30am (the fields are minute, then hour)
  30 8 * * * /full/path/to/daily-reports.sh


Obviously we could make our code nicer, use a legit local database with a flag for which projects to include, etc. I also wrote a little reminder script that pinged devs to put in their hours each day (it was opt-in, promise). I toyed with the idea of creating chargeability reports so we could help devs hit annual goals, and it was just as easy to set up a nightly script for one of our admins that summarized hours per user each day. You get the idea: this is the kind of thing you could throw together in a little Sinatra app in an afternoon and have awesome custom reporting with some easy JavaScript graphs (D3, Highcharts, or even Google Charts).

But even without getting fancy, we have a quick 'n dirty solution that can make your life easier by saving you time and focusing your attention on the projects or employees that need it most.