Getting SQL Results That Are Distinct Across Two Columns

So this is a weird issue I just came across. Here's an example table schema:

mysql> describe queues;
+--------------+---------------+
| Field        | Type          |
+--------------+---------------+
| id           | int(11)       |
| customer_id  | mediumint(9)  |
| request_time | decimal(12,0) |
| item_id      | smallint(6)   |
+--------------+---------------+

mysql> select * from queues;
+------+-------------+--------------+---------+
| id   | customer_id | request_time | item_id |
+------+-------------+--------------+---------+
| 6829 |       15066 | 201704161118 |       1 |
| 6872 |       15066 | 201704161118 |       2 |
| 6875 |       15066 | 201704161118 |      26 |
| 6880 |       15066 | 201704161118 |       8 |
| 6881 |       15066 | 201704161118 |      15 |
| 6930 |       15077 | 201704161942 |       6 |
| 8683 |       14625 | 201704171412 |      10 |
+------+-------------+--------------+---------+

In my example, I might have the same customer requesting multiple items at the same time. I want to display all the items they have requested in the same line. That means I want to get a list of all the unique customers and request times combined. Yes, this isn't the *greatest* example because this table should probably be designed in a different way, but stick with me!

If I only want customer_id and request_time, that is pretty simple.

mysql> SELECT DISTINCT customer_id, request_time FROM queues;
+-------------+--------------+
| customer_id | request_time |
+-------------+--------------+
|       15066 | 201704161118 |
|       15077 | 201704161942 |
|       14625 | 201704171412 |
+-------------+--------------+

However, in my case, I need the queue id to do additional queries. That's where it gets just a smidge more complicated! Instead of just a simple DISTINCT, I've got to count the distinct records and then use HAVING to actually limit it.

mysql> SELECT *, 
COUNT(DISTINCT customer_id, request_time) as unique_orders 
FROM queues 
GROUP BY customer_id, request_time 
HAVING unique_orders >= 1;
+------+-------------+--------------+---------+
| id   | customer_id | request_time | item_id |
+------+-------------+--------------+---------+
| 6829 |       15066 | 201704161118 |       1 |
| 6930 |       15077 | 201704161942 |       6 |
| 8683 |       14625 | 201704171412 |      10 |
+------+-------------+--------------+---------+

Not too difficult, but I did go through a few different variations before getting to this result. I wanted it to work, but SELECT id, DISTINCT(customer_id, request_time) definitely does not!
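One thing worth noting about the query above: the rows are already grouped by customer_id and request_time, so the distinct count is 1 for every group and it is the GROUP BY that collapses the rows. SELECT * combined with GROUP BY also depends on the server's SQL mode. Here is a rough sketch of an alternative that names an aggregate for every non-grouped column and puts all the requested items on one line. It is not the original query: it assumes SQLite through Python's sqlite3 module rather than MySQL, and the MIN(id) and GROUP_CONCAT choices (plus the first_id and item_ids aliases) are mine.

```python
import sqlite3

# Sample rows copied from the queues table shown earlier.
rows = [
    (6829, 15066, 201704161118, 1),
    (6872, 15066, 201704161118, 2),
    (6875, 15066, 201704161118, 26),
    (6880, 15066, 201704161118, 8),
    (6881, 15066, 201704161118, 15),
    (6930, 15077, 201704161942, 6),
    (8683, 14625, 201704171412, 10),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queues ("
    "id INTEGER, customer_id INTEGER, request_time INTEGER, item_id INTEGER)"
)
conn.executemany("INSERT INTO queues VALUES (?, ?, ?, ?)", rows)

# One row per (customer_id, request_time) pair. MIN(id) picks a queue id
# explicitly, and GROUP_CONCAT gathers every requested item into one string.
query = """
SELECT MIN(id) AS first_id,
       customer_id,
       request_time,
       GROUP_CONCAT(item_id) AS item_ids
FROM queues
GROUP BY customer_id, request_time
ORDER BY first_id
"""
orders = conn.execute(query).fetchall()
for order in orders:
    print(order)
```

MySQL has its own GROUP_CONCAT, so the same shape of query should carry over, but I have only sketched it against SQLite here.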

Squashing Commits & Keeping Your History Clean

I'm a big fan of committing early and often. However, if you are anything like me, that means your commit history looks something like this:

added feature
fixed typo
oops another typo
added tests
fix failing test
fix another failing test
UGH TYPO

Fine for me alone, but not a great reference for the rest of the team when they try to figure out WTF I was doing a month or a year later. I've already written about how to rebase and squash commits before, so I won't cover that again. I do want to go a little more into why it's important to do so. Each commit message should reflect a distinct piece of work done. What I need to do now is rebase and change my commits to be more like this:

Added endpoint to return list of components
Added unit tests for component index endpoint

Now, if someone does a git blame, they can get the full context of what I was doing, not just a one-character typo change. It's also worth expanding out your messaging and putting more context in the description. Every team has their own style and rules, but, personally, this is my normal git workflow and I'm a huge fan.

Finding Unseen Unicode/ASCII Characters in Ruby

This morning, I thought I was losing my mind. I'm writing a little web app (mostly Angular) that makes API calls. I know the API works, but for some reason, the calls from my app to the API were getting a 500 error in response. I tailed the API logs to see an "ArgumentError: argument out of range". However, the only thing that happened on this line was the date parsing. I open up the Rails console and start debugging. First I type out the date that isn't working. It works. Then I copy and paste from my browser. Failure.

irb(main):027:0> "2017-02-13T13:12:51Z".to_time(:utc)
=> 2017-02-13 13:12:51 UTC
irb(main):028:0> "2017‑02‑13T13:12:51Z".to_time(:utc)
ArgumentError: argument out of range

As you can see above, they look IDENTICAL. One of my coworkers suggested that I check the ASCII value of each character. Lucky for me, Ruby makes this easy.

"2017‑02‑13T13:12:51Z".each_byte do |c|
  puts c
end
==>
50
48
49
55
226
128
145
48
50
226
128
145
49
51
84
49
51
58
49
50
58
53
49
90

If you look at a chart of ASCII characters and values, you can see that 127 is the end of the standard characters. My fifth character starts with 226. I know that the pattern of 226, 128, 145 repeats twice and in the same spot as the dash. Looking at a UTF-8 encoding table, I can see that set of characters represents the non-breaking hyphen, which is definitely breaking my API call. Mystery #1 of the morning? Solved.
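For anyone debugging the same thing outside of Ruby, here is a rough Python sketch of the same check. It is not what I ran that morning; the find_non_ascii helper is a name made up for this example. Instead of reading a UTF-8 table by hand, it asks the standard unicodedata module for the name of every character above 127 and shows the bytes behind it.

```python
import unicodedata


def find_non_ascii(text):
    """Return (index, character, Unicode name, UTF-8 bytes) for each non-ASCII character."""
    found = []
    for index, char in enumerate(text):
        if ord(char) > 127:  # 127 is the last standard ASCII value
            name = unicodedata.name(char, "UNKNOWN")
            found.append((index, char, name, list(char.encode("utf-8"))))
    return found


# The date string with U+2011 in place of the two ordinary hyphens.
suspect = "2017\u201102\u201113T13:12:51Z"
for hit in find_non_ascii(suspect):
    print(hit)
```

Both hits come back named NON-BREAKING HYPHEN with the bytes 226, 128, 145, which matches the pattern in the each_byte output.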

Building A Community For Beginners

Originally given as a talk at PyCaribbean on February 18, 2017. Modified slightly for the web.

I started programming in Python six years ago and have been doing Ruby development for the past five years. When I would go to user groups as a new developer, it was very intimidating. People seemed to know each other, and I wasn't sure who to ask for help. In the end, I stopped going and just created my own group. But many people get so discouraged that they decide programming isn’t for them.

That’s the reason I believe that building a community for beginners is so important. Let me explain why. We want to ensure that our communities are open to beginners because we need to expand and diversify. The more diverse our community is, the more diverse our teams will be. According to the Harvard Business Review, "working with people who are different from you may challenge your brain to overcome its stale ways of thinking and sharpen its performance." I also think that everyone should be able to learn to program. Programming shouldn’t be limited only to people who were privileged enough to learn to code in grade school. No matter their age, gender, or background, if someone wants to join our community, we should be open to helping them learn. Having a community to help you learn should not be open only to people willing to pay thousands of dollars for a code school. I started PyLadies Boston almost four years ago with the express intention of bringing more women into the Python community. I am also involved with Boston Ruby Women, leading weekly study sessions where I answer any questions that people bring me. More on these and a few others as we move on…

How can you do your best to make sure your group is open to beginners? First, let's talk about new groups. As I mentioned, existing groups can still benefit from many of these ideas, so don't totally zone out if you already have a group that you run.

1. Make sure the way you describe your group and events is beginner inclusive

Ex. “no problem too big or small” “good for people of all levels”

2. Be clear about what knowledge and skills are required

Be clear about what level of knowledge is required. People will often underestimate themselves, so keep this in mind when describing what is needed. Ex. “basic Python required, should have mostly completed a tutorial like Learn Python The Hard Way”

3. Find out what your local community needs

This is important. Every community needs different things. With PyLadies, we have a large community of academics and scientists, so there's a huge desire for tutorials and code reviews. With Boston Ruby Women, we have a lot of recent boot camp grads, so we spend a lot of time talking about interviews and finding jobs.

4. Ask for feedback all the time.

Every year, I have an anniversary party and ask everyone who attends for feedback on the past year (what did you like, what did you not like) and for suggestions for the future. This regular evaluation of PyLadies has led us to have new types of events that I would never think of on my own and to get rid of ones that I thought would be successful that weren’t.

5. Try out different types of events for the whole group. Depending on your local community, some may work better for you than others. Here are some event types that I've had success with:

  1. Presentation Nights are the standard, but often there's an idea that you have to be an expert to give a presentation. Make it clear when asking for presentations that you are open to presentations about beginner projects.
  2. Lightning Talks are a great way to get people to do their first public talk. One of the ways that I have encouraged people is to say that, while it should be related, if you have a hobby that you want to share with everyone, we'd love to hear a lightning talk on it. One of the members of PyLadies ended up doing her first presentation on bird-watching, and it was a huge hit!
  3. Tutorials are always successful. They give experienced people a chance to share their knowledge in a meaningful way and beginners a chance to learn a new skill or toolset. However, with tutorials it's key to allow for extra time in the beginning, or before the event, to get set up. Even if you give clear instructions and ask people to set up prior, believe me, you will still likely need extra time.
  4. Mob Programming is where the whole group looks at the same problem and tries to solve it together. We started running events that were combination mob programming and code reviews, and they have been a blast. Everyone can participate: even with limited programming knowledge, you can get an idea of what kinds of problems other people are facing.
  5. Host separate beginner-focused events. These events will draw out people who are still too intimidated to go to the main meetings. We regularly have women show up to our beginner events who almost never go to the main group because they don't feel they are ready, despite my encouragement.
    1. Study groups can help people teach each other. At PyLadies, we try to have study groups every week and have a mentor each time. However, we've also encouraged our members to start study groups in their neighborhoods as well and have had a ton of success with that. I've used these to target people who are just starting to learn to code. If you are trying to provide mentors at study groups, it can be a challenge. One way to sell it to your more experienced members is that it's a way to both share their knowledge and improve their understanding of fundamentals. I have one woman who comes every week who always challenges me and makes me go deeper into the language than I had before.
    2. Mentor sessions are similar to study groups but more focused on career growth. These target people who know how to code and are looking to enter the industry. Job hunting as a junior is often very discouraging, and it helps to have regular meetings with someone who tells you that you can do it. Also, by getting to know a larger number of junior developers, you make it easier to find great developers who just haven't been given a chance yet. Through these groups, I've gotten two people hired at both Akamai and a previous company. Frequently, it is harder to get to know people in a larger group setting. Having a smaller subset like a mentor session can help your more experienced members get to know the individuals who are just getting started.

So that's some of the basics for starting a beginner-focused group, but what if you are currently running a group? Here are some suggestions that have been successful at bringing more beginners into both Boston Python and Boston RB. I'll start with the simplest:

  1. Send an email to everyone who joins with a message that emphasizes that everyone is welcome, no matter their level of programming experience and let them know what they can expect to happen at your events. With Meetup, you can write a message once and automatically send it to every new member.
  2. If you can, have someone greet people as they walk in. Ideally, it will be one of the organizers, someone who is there regularly. This individual should do their best to get to know the people who have just joined. It will give all newcomers a friendly face each time they return and someone who is familiar with their level.
  3. For presentation nights, ensure that there are regular talks that are suitable for beginners. These talks do not have to be about ‘how to write a for loop,’ but more ‘here's a problem, this is how I solved it,’ with less emphasis on pure programming. Organizers I spoke to said they got large influxes of new sign-ups for nights when they had multiple speakers from a variety of fields.
  4. For project nights, a few suggestions:
    1. Have a couple of beginner tables and, if you can, have a few experienced programmers to staff them and help new people work through issues.
    2. Do introductions at the beginning. Have everyone introduce themselves and mention what they are working on. You can ask experienced people to raise their hands if they are willing to be available for help throughout the night. It can be time-consuming, but it will help build community and create opportunities for people to collaborate.
    3. Reassure people that it's ok if they are not working on a project. You can have people raise their hands at the beginning if they are looking to collaborate on a project.

Ok, last but not least: running workshops and finding the best way to teach people to code.

  1. Running workshops is, personally, one of the biggest challenges as an organizer. Here's a rundown of some of the problems and some suggestions on how to deal with them.
    1. Space: given that a workshop is at least 6 hours long, you can't run one on weeknights. Therefore, most businesses won't want to host. However, you should try reaching out to local universities and community colleges - even better if you have someone in your group who works at one.
    2. Volunteers are a challenge at any time, but getting people to give away their Saturday (plus maybe their Friday night) is another problem entirely. Expect at least a couple of individuals to bail last minute, so have a backup plan. Make sure to have a few more volunteers than you think you need and be prepared to present if someone who is supposed to present doesn't show.
    3. Content is probably the easiest if you are doing a Django or Rails workshop since there are already full tutorials for both aimed at a weekend time frame. If you want to run a workshop for either of those, check out DjangoGirls, RailsGirls, and RailsBridge. If you want to do a workshop for a different language or framework, consider still looking at those for examples of what you should include, and adapt them for the framework that you want to cover. If considering a workshop on a language, review the material covered by the Boston Python Workshops. Though the materials are in Python, you could adapt them to fit other languages.
    4. Food - it's important to provide at least lunch when you have people stuck in a room for a full day. You can reach out to local companies who use the language or framework that you are teaching and get someone to provide food. Usually, they'll also want to send a volunteer for the workshop too so they can have someone to represent their company. 
    5. Continued engagement is probably the biggest challenge. When people come to a workshop, make sure they know what the next steps are. When a RailsBridge Boston workshop occurs, they always make sure there is a Boston RB project night the week after so people can keep learning. You could also have lightning talks soon after and encourage people to talk about problems that they want to solve or applications they want to build.
  2. There is no "one best way" for teaching people how to code. However, I have had more success with some methods than with others.
    1. Doesn’t work:
      1. Class style setting that builds on itself week after week potentially works if people are paying for it. However, if you are like me and just trying to provide a free service to your community, do not choose this option. I did this when I first started PyLadies because there was a demand for beginner classes. I held classes for just two hours every other weekend. I had a fantastic turnout the first week - 30 people showed up and were super engaged. The second week was still good - 20 people. Then it started dropping drastically. By the fifth week, it was just me and my co-organizer.
      2. Just giving a text tutorial (like Learn Code the Hard Way), with no support. With no support group or place to reach out for help, when people get to a tight spot, they can assume that they just aren't cut out for programming and quit. There's still a stigma that you have to be good at math to be a programmer, and some non-technical people think that only geniuses can program (I have been told that I must be super smart because I'm a developer). Often it's just a matter of seeing the right example for a concept to make sense. Just because someone has trouble learning using one resource doesn't mean they couldn't learn using another.
    2. Does work:
      1. Short one-off tutorials on basic programming concepts that don't build on each other. You can't necessarily do a ton of these since most of programming does require knowing other concepts. But you can teach the idea of object-oriented programming without involving a significant amount of code. There are also other languages that you can learn the basics of in a two hour period - SQL being my favorite, but HTML also being a possibility. The goal is to share knowledge, so get creative!
      2. Having beginner focused events where people can bring questions from any tutorial they choose. As I mentioned above, this is an essential part of PyLadies Boston. I always suggest my two favorite tutorials, but if someone learns better another way (say a MOOC or videos), then they can use those and I will still be there to answer questions. I also try to make it clear in all communication that I am always available by email. Unless you have a group that is 5K plus people, this is not as big of a deal as you might think. I make myself available to about 1500 people through the groups I run and countless more through my website, yet I maybe get an email a week max. It will give people a lifeline if they need it, but it will not take up too much of your time.

These are my recommendations on how to build a community for beginners. However you involve yourself, being part of a space where everyone is welcome to learn is a valuable and rewarding experience that can really make a difference to someone just starting out.

On Building, Then Leaving, Community

When I moved up to Boston, I felt very lonely. I had my partner, sure, but that's never quite enough. I tried going to existing meetups but I found that they were too large and I still felt isolated. I had been part of a PyLadies group in Atlanta, so I decided to start one in Boston. From the very first meeting, I was energized by the women who came. They were all so excited about the group and the possibilities that it pushed me to spend more time organizing, where I might have otherwise said I was too busy. Every event we had, no matter how small or large, gave me energy and life. It was so gratifying to be able to give people who had never spoken in public before a platform to share their knowledge.

A few months after the founding of PyLadies Boston, I heard about a Ruby women's study group that had formed. Since I was by then working in Rails, I joined hoping to get to know more people in the Ruby community. That group moved from a mailing list to Meetup a few months after I joined and became Boston Ruby Women. That group has also been a source of good in my life. Every month, I'm able to help junior developers not lose confidence during tough job searches. Every month, I talk to brilliant women who routinely give wonderful advice on how to deal with all the bullshit that life throws. The group is always up for a table flipping conversation and I love it so much for that.

I am excited to be moving to Pittsburgh and to start a new chapter in my life. But I'm brokenhearted to leave these two communities. I have faith that I'll be able to meet rad and awesome women in my new home, but I know it will take time. Thank you to everyone who has helped me grow over the past four years. Y'all mean so much to me.

Creating Awesome Documentation With Yard

This semester I have been taking a computer architecture class. Overall, it's been pretty fun because I was given three projects and allowed to do them in the language of my choice. I chose Ruby. I'm pretty proud of these projects, so I decided to post them all to GitHub. If you are interested in the actual code, you can find it here. While I was doing this, I realized I needed to up my average documentation game. I needed the grader, who didn't know Ruby, to be able to easily understand what I was doing and why I was doing it. For the first two projects, I just wrote up documentation in a relatively reasonable way, and they were able to read through the code comments to see how it worked.

That's when I found YARD. YARD uses markup (I used markdown) and tags to create delightful HTML docs. What were just comments in my code turned into this, with almost no extra effort. I'd heard of it before, but I hadn't had a project that was worth massive documentation. YARD made the documentation a delight. You create the necessary documentation by using markdown, so your README functions as the homepage for your docs. Then, within each class, you use tags to explain params, return values, and add notes and examples. Here is an example from my MIPSDisassembler project:

class MipsDisassembler
  # Creates a new instance of MipsDisassembler
  # @param array_of_instructions [Array] the array of string instructions
  # @param starting_address [String] starting address for instructions, should be a string hexadecimal value
  # @param is_hex [Boolean] true if array of instructions is in hex, false if in binary
  # @note Each object in array_of_instructions should be a string representation of either binary or hexadecimal number
  # @return [MipsDisassembler] a new MipsDisassembler object
  # @example Create an object
  #    mips = MipsDisassembler.new(["0x022DA822", "0x8EF30018", "0x12A70004"], "7A060", true)
  def initialize(array_of_instructions, starting_address, is_hex)
    @instructions = array_of_instructions
    @starting_address = starting_address
    @is_hex = is_hex
  end

  # Takes binary/hex instructions and starting address and returns an array of MIPS instructions.
  # @note Determines if r-format or i-format and parses accordingly.
  # @return [Array] an array of MIPS instructions, human-readable
  # @example Disassemble instructions
  #    mips.disassemble => ["7A060 sub $21 $17 $13", "7a064 lw $19, 24 ($23)", "7a068 beq $7, $21, address 0x7a07c"]
  def disassemble
    disassemble_instructions(@instructions, @starting_address, @is_hex)
  end

  # write MIPs instructions to file
  # @note File "mips_results.txt" will be created in same directory as code
  # @return [void]
  def output_to_file
    File.open("mips_results.txt", "w") do |f|
      in_file = ""
      disassemble.each { |instruction| in_file << instruction + "\n" }
      f.write(in_file)
    end
  end
end
 

You can see the result of this code here as well as the image below. The result is easy to navigate documentation that you can share with anyone. It also gives the ability to see the source code of each method inline, so you don't have to go far to see the actual code behind public methods that you would want to use. I know I'm a bit of a dork, but I seriously loved putting this documentation together and I'm hoping it made my code just a bit more accessible.

Fun with Binary & Hex in Ruby!

While I haven't written a coding post in three months, I swear I do code every day. Recently, I started taking night classes again. This semester, I'm taking Computer Architecture and Data Structures with Java. My first project in Computer Architecture was to build a MIPS disassembler. I decided to use Ruby, which ended up bringing up some unique issues, mostly because Ruby does not have a short variable type within its Numeric class. In Java, the short type is a 16-bit signed two's complement integer. Ruby does not use primitive types because everything has to be an object. No short object == no short type. Also, while binary and hexadecimal numbers can be converted easily to decimal in Ruby, they are initially strings. What does this mean? It means that in addition to the other parts of the translation, I also had to convert from hex to binary and from binary to signed decimal. I'll probably share all of my code in the future, but for now, here's a walkthrough of those two functions:

Translating to binary

This was pretty simple. I just had to use sprintf and it immediately converted the hexadecimal numbers into binary. Only one hitch! I needed the leading zeros (if there were any), so I had to use rjust to make sure it was a 32 bit binary by padding it to the left with 0s.

# takes an array of hex and returns an array of binary
# (32bit, including leading zeros)
def translate_to_binary(array_of_hex)
  array_of_binary = []
  array_of_hex.each do |num|
    array_of_binary << sprintf("%b", num).rjust(32, '0')
  end
  array_of_binary
end
  

Converting to a signed integer

Since I couldn't just cast as a short, I had to use two's complement. With two's complement, I knew that if the integer version of the 16-bit binary was greater than or equal to 2^15, then it was actually a negative number, and subtracting 2^16 gives the real value. Otherwise, it was correct as is.

def convert_to_signed_binary(binary)
  binary_int = binary.to_i(2)
  if binary_int >= 2**15
    return binary_int - 2**16
  else
    return binary_int
  end
end

Boom! I hope this helps someone else who might've had the same trouble I did at first. I'll go into the program in full after my whole class has actually submitted theirs. 😛

SQL is the best!

I gave a SQL tutorial at PyLadies Boston last night and it was pretty fun. We used sqlite3 (which is definitely my least favorite DBMS, but it does come installed on pretty much every Linux/Unix machine by default and is the default for Django, so I decided it was the best tool for this particular job). Giving a tutorial on something I use daily and have used consistently for 7 years was a bit weird because I did forget a few things; it didn't even cross my mind that people wouldn't know them. For example: I initially neglected to mention that every statement needs a semicolon at the end and that you can't mix quotes (no " with '). Considering that was the bulk of all the issues, I'm feeling pretty successful right now! Take a look at the full tutorial below and let me know what you think.

The Mental Impact of Tech Interviews

I just watched this excellent talk by Zack Zlotnik (given at Code & Supply in Pittsburgh). I think every developer involved in the interview process should listen to this talk. Even if you aren't involved in the interview process, if you have interviewed anywhere, this is a great talk to watch.

Two slides in particular really stuck with me:

In my experience as both an interviewee and an interviewer, all of these points are 100% true. Usually, about half way through my job search, I'll feel worthless and stupid, sure that no one will ever hire me. I've had an interviewer interrupt me midway through a white boarding problem and tell me that I wasn't quite the level they were looking for. Could I do the job they were asking? Definitely. I know I have routinely performed well in every job I've been given. However, white boarding routinely terrifies me and I've only gotten slightly more relaxed the more I've done it. I've always done better in a pairing session or, heck, just coding on a laptop in front of people. I've seen people hired through whiteboard interviews who are not good at their jobs. Zack has some good suggestions on how to improve the interview process that I think everyone should take into consideration.

Hey, I was interviewed!

What was it like to ramp up at that first job? Did you find you had some blind spots that your job filled in? Were certain things more or less important than you anticipated?
It was tough. I didn't get a lot of support from the other engineers, so I had to learn a lot on my own. If I ever ran into issues, usually they would just take the ticket themselves instead of helping me learn. At one point, I had almost no work, so I started taking online classes and working through programming books. I had never realized how important it was to have a team that supports you, which is definitely something I've looked for in every job since. Even as I move along in my career, I still want to be able to talk to my coworkers about issues that I am having without feeling like I'm wasting their time. However, I still feel like it wasn't a waste. It still enabled me to list myself as a software developer, have some actual production code that I could show, and really get my foot in the door. It's great to have a great job, but sometimes, especially if it's your first in a field, any job will do if you have limited choices.

Plotting Data From A CSV with Matplotlib

After I got all that data from the logs, my boss wanted it in a nice graph. First of the active user numbers, then the top 15 users. I knew that, despite having never used Matplotlib, it would still take me less time to learn it than any of my other options. I was able to get my script running and plotting correctly in less than two hours, so I felt pretty good about that. However, I had a few nested for loops and I wasn't a big fan. Enter the crowd-sourced code review! My friend Jenny was able to come up with a cool alternative to my solution that I ended up using. She utilized plot_date to sort the dates/data, which really helped (I was doing all sorts of crazy fun things).

So here's an example of what active_users.csv looked like:

system,au1,au30,date
jira,5,20,2016-06-09
confluence,16,23,2016-06-09
jira,8,22,2016-06-10
confluence,18,26,2016-06-10
jira,10,22,2016-06-11
confluence,18,26,2016-06-11
jira,11,23,2016-06-12
confluence,19,27,2016-06-12
jira,13,24,2016-06-13
confluence,19,28,2016-06-13
jira,8,24,2016-06-14
confluence,10,28,2016-06-14
jira,9,26,2016-06-15
confluence,15,30,2016-06-15
jira,15,26,2016-06-16
confluence,20,30,2016-06-16

The biggest problem was determining how to store the data in the program in a way that could be easily plotted. End solution? A dictionary of arrays. Or more precisely, a dictionary of a dictionary of arrays. With each line, we appended each data point to the matching array, which meant that a given date had the same index as its data. And boom! It works!

import matplotlib.pyplot as plt
import csv
from datetime import datetime

active_users = {
  'jira': {
    'dates': [],
    'au1': [],
    'au30': []
  },
  'confluence': {
    'dates': [],
    'au1': [],
    'au30': []
  }
}

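with open('active_users.csv') as csvfile:
  active_users_csv = csv.reader(csvfile)
  next(active_users_csv)  # skip the header row shown in the sample file
  for system, au1, au30, date in active_users_csv:
    active_users[system]['dates'].append(datetime.strptime(date, '%Y-%m-%d'))
    # convert the counts to int so they plot as numbers, not as text labels
    active_users[system]['au1'].append(int(au1))
    active_users[system]['au30'].append(int(au30))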

plt.plot_date(active_users['jira']['dates'], active_users['jira']['au1'], label='jira au1', color='orange', fmt='-')
plt.plot_date(active_users['jira']['dates'], active_users['jira']['au30'], label='jira au30', color='red', fmt='-')
plt.plot_date(active_users['confluence']['dates'], active_users['confluence']['au1'], label='confluence au1', color='green', fmt='-')
plt.plot_date(active_users['confluence']['dates'], active_users['confluence']['au30'], label='confluence au30', color='blue', fmt='-')
plt.legend(bbox_to_anchor=(0., 1.02, 1., .102), loc=3,
           ncol=4, borderaxespad=0.)
plt.show()

Resulting graph of AU1 and AU30 numbers
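As a rough sketch of the same "dictionary of a dictionary of arrays" idea, the loading step could also be written so that systems don't have to be listed up front. This assumes the CSV layout shown above (with its header row). The load_active_users helper and the in-memory sample are made up for illustration; they are not part of the original script.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def load_active_users(csvfile):
    """Group rows by system into parallel lists (hypothetical helper)."""
    # Each new system gets its own dict of empty lists on first access.
    data = defaultdict(lambda: {'dates': [], 'au1': [], 'au30': []})
    # DictReader consumes the header row and keys each row by column name.
    for row in csv.DictReader(csvfile):
        series = data[row['system']]
        series['dates'].append(datetime.strptime(row['date'], '%Y-%m-%d'))
        series['au1'].append(int(row['au1']))
        series['au30'].append(int(row['au30']))
    return data


# A few rows copied from the sample above, read from memory instead of a file.
sample = io.StringIO(
    "system,au1,au30,date\n"
    "jira,5,20,2016-06-09\n"
    "confluence,16,23,2016-06-09\n"
    "jira,8,22,2016-06-10\n"
)
active_users = load_active_users(sample)
print(active_users['jira']['au1'])
```

Because every value for a row is appended in the same pass, index n in 'dates' always lines up with index n in 'au1' and 'au30', which is what the plotting calls rely on.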

Ok, so now that graph #1 is done, I had to graph the top 15 users over the past week and their usage patterns. First off, here's an example of the data I was working with:

User,Date,Request Count
jsmith,2016-06-20,12
kthrace,2016-06-20,1
shastings,2016-06-20,11
sbristow,2016-06-20,3
jmccoy,2016-06-20,3
akoni,2016-06-20,9
gmorrison,2016-06-20,4
pfisher,2016-06-20,18
ndrake,2016-06-20,10
lbriscoe,2016-06-20,7
egreen,2016-06-20,13
crubirosa,2016-06-20,20
avanburen,2016-06-20,2
mlogan,2016-06-20,18
ckincaid,2016-06-20,11
rcurtis,2016-06-20,21
jfontana,2016-06-20,16
clupo,2016-06-20,5
kbernard,2016-06-20,7

Obviously, with our actual prod data, there were thousands of users... so a few more lines to loop through. The first problem was to put the data into a format I could use. Since even a top user might not use the system at all one day (say a Sunday), I couldn't use a simple dictionary; this time I had to utilize defaultdict. Defaultdict enabled me to create a dictionary of users where the value was (by default) an array of 7 zeros (representing usage for the past 7 days). After that, I was able to loop through the file for each day. To get the file names, I had to start with yesterday's date and go backwards. The date still gets appended to the 'dates' array, but the big change is in users: instead of appending the data to an array, I insert it into the index that matches that day.
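To make the defaultdict trick concrete, here is a minimal sketch with made-up rows; the usage name and the (day_index, user, request_count) tuples are only for illustration. The point is that a user who is missing on a given day simply keeps a zero in that slot.

```python
from collections import defaultdict

DAYS = 7
# Every unseen user starts with a fresh list of seven zeros.
usage = defaultdict(lambda: [0] * DAYS)

# Hypothetical (day_index, user, request_count) rows; index 0 is yesterday.
rows = [(0, 'jsmith', 12), (0, 'pfisher', 18), (2, 'jsmith', 4)]
for day_index, user, request_count in rows:
    # Insert at the index for that day instead of appending.
    usage[user][day_index] = request_count

print(usage['jsmith'])
print(usage['pfisher'])
```

The lambda matters here: it builds a new list for each user, so the seven-slot arrays are never shared between users.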

So now that I have a dictionary of dates and users, I have all that I need to determine the top 15 users of the week. I create another dictionary that has the users as keys and sums up their total requests from the array and sets that as the value. Once I do that, I sort it, end up with a list of (user, total) tuples, reverse it, then slice off the top 15. At that point, I just need to loop through my sorted_active list and then plot each user's data! Though I did have one final (much smaller) problem: I had to find 15 matplotlib colors that I could use and distinguish. I created my array of colors and added a counter to each loop so I could add a unique color to each user. Success!

import matplotlib.pyplot as plt
import csv
from datetime import datetime, timedelta, date
from collections import defaultdict
import operator
from itertools import islice
import re

active_users = {
  'jira': {
    'dates': [],
    'users': defaultdict(lambda: [0] * 7)
  },
  'confluence': {
    'dates': [],
    'users': defaultdict(lambda: [0] * 7)
  }
}

active_users_weekly = {
  'jira': {},
  'confluence': {}
}
current_date = date.today()
day = timedelta(days=1)

for i in range(0,7):
  current_date = current_date - day
  current_date_txt = current_date.strftime('%Y-%m-%d')
  for system in ['jira', 'confluence']:
    active_users[system]['dates'].append(current_date)
    with open("active_users_{0}_{1}.csv".format(system,current_date_txt)) as csvfile:
      users = csv.reader(csvfile)
      next(users, None)  # skip the "User,Date,Request Count" header row
      for user, log_date, request_count in users:
        active_users[system]['users'][user][i] = int(request_count)

sorted_active = {'jira': {}, 'confluence': {}}
for system in ['jira', 'confluence']:
  for user, request_list in active_users[system]['users'].items():
    active_users_weekly[system][user] = sum(request_list)
  sorted_active[system] = sorted(active_users_weekly[system].items(), key=operator.itemgetter(1))
  sorted_active[system].reverse()
  sorted_active[system] = list(islice(sorted_active[system],15))

fig = plt.figure()
jira = fig.add_subplot(211)
jira.set_title('JIRA')
conf = fig.add_subplot(212)
conf.set_title('Confluence')
colors = ['red', 'gold', 'darkorange', 'green', 'turquoise', 'dodgerblue', 'navy', 'darkviolet', 'violet', 'pink', 'darkslategrey', 'silver', 'blue', 'lime', 'orange']

i = 0
for user, request_num in sorted_active['jira']:
  jira.plot_date(active_users['jira']['dates'], active_users['jira']['users'][user], label=user, fmt='-', color=colors[i])
  i += 1

i = 0
for user, request_num in sorted_active['confluence']:
  conf.plot_date(active_users['confluence']['dates'], active_users['confluence']['users'][user], label=user, fmt='-', color=colors[i])
  i += 1

jira.legend(bbox_to_anchor=(0., 1.1, 1., .102), loc='lower center',
           ncol=8, borderaxespad=0.)
conf.legend(loc='upper center', bbox_to_anchor=(0.5,-0.1), ncol=8)
plt.show()

Oooh pretty colors! Also, fake data makes for a bad graph.
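The sort, reverse, and slice steps in the script above could probably be collapsed into one call, and the manual color counter replaced with zip. This is only a sketch with made-up totals (weekly_totals, top_users, and styled are hypothetical names), not a drop-in replacement for the script.

```python
import heapq

# Hypothetical weekly totals per user.
weekly_totals = {'jsmith': 40, 'pfisher': 75, 'rcurtis': 90, 'clupo': 12}
colors = ['red', 'gold', 'darkorange']

# nlargest returns (user, total) pairs, highest total first.
top_users = heapq.nlargest(3, weekly_totals.items(), key=lambda item: item[1])

# zip pairs each user with a color, so no manual counter is needed.
styled = [(user, color) for (user, _total), color in zip(top_users, colors)]
print(styled)
```

In the real script the 3 would be 15, and each (user, color) pair would feed one plot_date call.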

Parsing Logs With Ruby

I used to write log parsing scripts all the time with Python. That's basically how I got started programming. In the past few years, I've been working almost entirely on application development. Recently my boss wanted to get the unique number of active users in a day and the unique number of active users within the past 30 days, in addition to an actual list of each of those users with the number of requests that they made to the system.

There were a few issues. I had to keep in mind that there were sometimes anonymous requests, which I didn't want to add to my list since it didn't serve my purpose. I also noticed that there were some lines that didn't seem to contain a request at all, so I also had to account for that. Let's all hop on the regex party train!

Thanks to the magic of regex, I can very quickly tell whether or not a line is a request from an anonymous user and skip that line. I can also skip a line if there is no IP address at the beginning, which usually occurs when a request spans multiple lines. I also have a regex to match the date format that our logs are using. If there's a date, then I grab it; if not, then I can skip that line too.

Now I'm left with only valid lines. SWEET! But what do I do with them? Since I need to keep track of both users by day (along with number of requests) and the numbers of unique users across the day and past 30 days, a hash will definitely be my best friend. All of the information in this case will go into a hash called all_days. Here is the basic format:

all_days = { date => { user_name => num_requests } }

Now that we have valid lines, we can start populating this hash. If the current date exists as a key, then we increase the given user's request count. If it doesn't, then we instantiate the user/request_count hash with a default value of zero. That's actually the bulk of the work. After that it's just a matter of counting and generating a list of users for the past 30 days, then uniquing that list.

If anyone has any comments on how I could make this better, I would welcome the feedback!

Complete script:

require 'date'
require 'fileutils'
au1 = 0
au30 = 0
EMPTYLINE = /- -/
IP_AT_START = /^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/
DATE = /(3[01]|[12][0-9]|0[1-9])\/(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\/[0-9]{4}/
month_users = []
all_days = {}
Dir.glob("/path/to/logs/access.log*") do |filename|
  File.foreach(filename) do |line|
    next if line.match(EMPTYLINE)
    next unless line.match(IP_AT_START)
    date_match = line.match(DATE)
    next unless date_match
    user_name = line.split(" ")[2]
    date = Date.parse date_match[0]
    if all_days[date]
      all_days[date][user_name] += 1
    else
      all_days[date] = Hash.new(0)
      all_days[date][user_name] = 1
    end
  end
end
# fetch with a default avoids a NoMethodError when today has no requests yet
au1 = all_days.fetch(Date.today, {}).count
all_days.each do |date, day_users|
  month_users += day_users.keys if date >= Date.today - 30
end
au30 = month_users.uniq.count
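For comparison, here is a rough sketch of the same per-day tally in Python. The log lines are made up and assume a layout where the user name is the third whitespace-separated field, as in the script above. The IP pattern is deliberately simplified, and the tally function is a hypothetical name.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

ANONYMOUS = re.compile(r'- -')
IP_AT_START = re.compile(r'^\d{1,3}(?:\.\d{1,3}){3}')  # simplified IP check
DATE = re.compile(r'\d{2}/[A-Z][a-z]{2}/\d{4}')


def tally(lines):
    """Return {date: Counter({user_name: num_requests})} for valid lines."""
    all_days = defaultdict(Counter)
    for line in lines:
        # Skip anonymous requests and continuation lines without an IP.
        if ANONYMOUS.search(line) or not IP_AT_START.match(line):
            continue
        date_match = DATE.search(line)
        if not date_match:
            continue
        day = datetime.strptime(date_match.group(0), '%d/%b/%Y').date()
        user_name = line.split()[2]
        all_days[day][user_name] += 1
    return all_days


# Made-up log lines in an assumed access-log layout.
sample = [
    '10.0.0.1 - jsmith [16/Jun/2016:10:00:00 +0000] "GET /a HTTP/1.1" 200',
    '10.0.0.1 - jsmith [16/Jun/2016:10:05:00 +0000] "GET /b HTTP/1.1" 200',
    '10.0.0.2 - - [16/Jun/2016:10:06:00 +0000] "GET /c HTTP/1.1" 200',
    '    continuation of a multi-line request',
    '10.0.0.3 - pfisher [17/Jun/2016:09:00:00 +0000] "GET /a HTTP/1.1" 200',
]
all_days = tally(sample)
for day, users in sorted(all_days.items()):
    print(day, dict(users))
```

Nesting a Counter inside a defaultdict removes the "does this date exist yet?" branch, since both levels create their default on first access.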

Getting Your First Job As A Software Developer

Preface

This post is aimed primarily at people who are transitioning to another industry, not new grads. New grads will probably get something from this post, but they are not the primary audience. With that said, this post will discuss primarily getting a job as a web developer since, not only is that what I have experience with, web dev jobs are usually the ones that junior developers can get.

Getting the Interview

In some ways, this can be the most daunting part. I'm going to break this up and hopefully help you to see that you (yes YOU!) have what it takes to apply.

Getting Your Resume Ready

One thing I always tell people is to make their existing jobs as technical as possible. For example, when I was applying to my first software developer job, I made my job at Home Depot (which was a PM job) sound like I did way more development. I did all the things that I put on my resume, but I emphasized the more relevant tasks and did not mention that my day to day was mostly working in Excel. Did you optimize a process at work? That shows how you can problem solve! Did you write a script to help you analyze data? Definitely add that. Also, writing actual code is not the only skill needed by a developer. Chris Doyle, CTO of Pretty Quick, does a good job of summing this up (slightly paraphrasing from tweets):

There are many valuable dev skills besides code, many of which you probably possess! Start there.[1] Also, your dev skills may be relatively small, but it doesn't mean they aren't already useful.[2] A junior developer sent me a cover letter identifying a potential UX improvement in my site, saying "This is something I could help you with."[3] That was such a concrete demonstration of initiative! They were an immediate favorite who was ultimately hired.[4]

For new developers, I expect to invest in their training, so it's really about "are they going to multiply or waste effort".[5] I do consider their current ability and have a low minimum bar, but gaining confidence in trajectory is much more important.[6]

What are these other valuable dev skills? Chris has created a whole list of developer competences. Don't worry if you look at that list and feel like you are missing a few. However, this list should help you decide what to include on your resume. For example, if you are currently in a customer support role, you are probably very good at suggesting possible causes for bugs! For more of my thoughts on resumes, see my post on the topic.

Creating Your Portfolio

This part of applying to jobs becomes less critical as you have more relevant experience. For example, a visit to my github or gitlab pages would make one think that I never coded! False... all my code is just proprietary and I invest my time in PyLadies vs OSS. Works for me because I have former employers and coworkers who will back up the quality of my code. If you are starting out, you need to show it. As Carlos Alonso said, as a junior developer, "your public code is the most important part of your CV." Dustin King recommends "writing a small game or other fun demo and putting it up online with code." Your project doesn't have to be huge, but you should have a friend QA it and make sure that there aren't glaring bugs. My project ended up being a choose your own adventure game that was part of going through Zed Shaw's Learn Python The Hard Way. However, I would recommend going a step further and building your own web app prior to applying. See the end of my "Getting Started As A Developer" post for more suggestions on how to get started with that.

Finding Jobs To Apply To

How do you even find jobs that are available? There are many great job boards out there, my favorite being Stack Overflow Jobs. But don't stick with just one. Search any job board that you can find and, as Christian Steinert says "try not to only consider common tech companies. Others (like financial services) might offer interesting stuff." At this point, most companies have at least a few software developers on staff. If there's a company you love, look at their job board. If it looks like they have a few technical jobs, reach out! Being passionate about a company's mission/product can often get you pretty far. As an example, if I lived in San Francisco, I would be hounding Betabrand until they hired me. If you have extensive experience in a field, look at companies with that focus so that your past experience can be even more beneficial. For example, if you are a biologist by training and looking to switch to development, you might find a place like AddGene to be a good fit. I, personally, have also enjoyed working with recruiters, like Talener in Boston. At bare minimum, they will get you in front of a ton of companies and get you in for interviews. Even if none of those pan out, it will be good practice and you will be better prepared when you find a job you are really excited about. Also, don't forget to milk your own personal connections! Do you know anyone who works at a company that's hiring devs? Contact them, see if they can meet you for coffee (buy them a coffee), and talk to them about what it's like to work at their company and their hiring process.

Now that you have a long list of job postings, you may be starting to notice that they all seem to say that you need experience with all these different languages, maybe one of which you have used... how can you ever be ready? GOOD NEWS! Most job postings are wish lists. Yes, even the parts that say "Required" are often negotiable. If a job sounds interesting, you should apply. Let whoever is reviewing your resume determine whether you are qualified. The only phrase that should give you pause is "Senior". If a job posting is looking for a "Senior Application Developer" or something like that, the hiring manager is unlikely to hire a junior person instead. Even so, if it's a company that you are really passionate about, reaching out will not hurt.

Acing the Interview

Before

One way to prepare is to work through a list of common interview questions and maybe a few exercism problems. Almost every interview I've had has also asked questions about SQL. Given that every job I have had has required me to use SQL in one form or another, I would recommend learning a bit before applying to any job. If you aren't up to speed yet, work through exercise 12 of Learn SQL The Hard Way. It looks like most of the lessons probably aren't too time consuming and it will be well worth your time!

During

I wrote my own "Job Search Retrospective" last September which talks about some of the things that I did in my most recent round of interviews that made me feel a lot more confident. As a junior, the most important thing to remember is that it's ok to say "I don't know" or "I don't know, but I was reading about this recently and I think it is [x]". An employer who you actually want will not be expecting you to be super knowledgeable about development at this point. Stay calm and just make sure all your awesome qualities are on display. Most importantly, don't forget to ask questions!

After

If you aren't sure that you did very well during the interview, don't despair! You still have time to make a good impression. Going back to Chris Doyle, he had a person who sent in a refactored version of a coding exercise they did during an interview. This is such a great demonstration of initiative and also that you continue to consider and think about your solution even after you have "solved" the problem. It is good to send an email to the interviewer and thank them. That's a perfect place to add in "and I've been thinking about that problem that we did and I think the solution could be improved by [x]".

Starting the Job

Getting the Offer

The only advice that Mr. Sam Phippen threw out really struck a chord with me:

Charge. More. People entering the industry consistently underestimate their reasonable salary band by about 20%.

This hit me because, as of recently, I was underpaid by about 25%. To get an idea of what other places are paying, look at sites like Indeed and the recent Stack Overflow Developer Survey. If you find out after the fact that you have undervalued yourself, you can fix it, but it takes a while. I speak from experience.

Doing the Work

Never feel bad about asking questions. Take advantage of the senior devs that you work with and learn as much as you can from them. If your company supports pairing, pair program as often as you can. You will learn so much more so much faster that way. Also, take charge of a project or a feature. That doesn't mean you have to do all the work, just that you are taking responsibility for making sure it gets past the finish line. Along those lines, @codepaintsleep says:

Don't get stuck with grunt work just because you're junior. Push your knowledge when there's more experienced people to help! Also, don't write off grunt work as grunt work. Learn from everything!

To give an example of how you can learn from everything, I am covering for a coworker and working support this week. I am not doing any actual coding. However, the amount that I have learned about how our system works and what our users want in just a week is incredible. Learn. From. EVERYTHING.

Postface

I hope you got something out of this. If you think I missed something, feel free to comment on the post or contact me.

Why You Should Go To Conferences

Maybe I'm the wrong person to write this. After all, I only go to one or two conferences a year because I can't quite afford to travel as much as some people do. I check out some local conferences and that's about it. However, I'm inspired to write this because I just got back from RailsCamp East Coast and it was AMAZING. It reminded me of the same reason I love Burlington Ruby Conference and my first PyCon. Conferences (or in the case of RailsCamp, a retreat) set aside a few days to learn some new things and also to spend some time with other developers, building relationships. In a way, it's the best networking you will ever do. Going to loads of meetups and making a passing acquaintance with a lot of people might do you some good. Really getting to know a few people over a few days will do you a lot of good.

OMG RAINBOW (waterfalls near the Ashokan Center in Olivebridge, NY)

I don't mean to dissuade anyone from going to meetups... I have made so many wonderful friends through PyLadies Boston that I would absolutely vouch for and that I've helped get jobs. However, those relationships have formed over the course of years and sometimes you don't have that much time. If you are a junior developer who is trying to get a job, one of the best things you could do is go to a conference and do some heavy-duty bonding. If you can do a talk, that's even better. Anything to show how interested you are in whatever language/field you want to work in. Going to conferences won't guarantee a job, but it will likely increase the number of people who are willing to recommend you to other employers and increase your chances of getting a better job.

If you are like me and a more experienced developer who is not looking for a job, conferences are still beneficial. I love my job, but I'm not going to be there for the rest of my career. When I am looking for a job, now I know even more awesome people who I would love to work with who also know me. This increases my chances of finding a job I actually enjoy since I have spent a significant amount of time with all of these people and have a better sense of who they are and what they value.

TL;DR - Go to conferences (and RailsCamp). They are fun and valuable.

Getting Started As A Developer

Through my work with PyLadies Boston, I have been asked quite a few times how to get started with development. I'm going to try to write it all down here.

So you want to become a software developer?

Awesome! It's a pretty fun (albeit sometimes frustrating) gig and the pay is pretty decent too. Just be patient... it's not super easy and sometimes it'll get difficult. It's worth it though, so stick with it.

Step 1: Pick a language

Don't spend too long on this step! I would recommend either Python or Ruby as good beginner languages. The syntax is relatively similar to English, so it's not too hard to read code from early on. Also, these are two languages that are widely used at actual companies! Ruby is a fan favorite of startups and Python has a huge following in the scientific/academic communities. If you want to further progress into web development, I would recommend Ruby because, in my opinion, the documentation and tutorials available for Rails are much better (and in some cases easier to understand) than the docs/tutorials for Django.

Either way: don't think too hard about it. You just need to pick one. Once you learn one, you can always, much more easily, learn another.

Step 2: Pick a method

There are a load of resources out there. One I recommend is Zed Shaw's Learn Code the Hard Way (for Ruby, Python, SQL, and C). There's also How To Think Like A Computer Scientist (for Python), along with plenty of others. If you prefer a book, I can recommend both Deitel's How To Program (Java) and Pine's Learn To Program (Ruby, also a web tutorial!). The world of programming books/tutorials is your oyster! Just pick a learning style that you like and stick with it. If videos are your thing, Codeschool has excellent video tutorials.

What I do not recommend: while Codecademy can be good for trying to decide what language to use, I do not recommend it for learning. Codecademy is software (what you will be building) and software has bugs. What you don't want to be spending time on is trying to figure out if the bug is yours or Codecademy's. If you think that sounds crazy, I have had Python code that I've run locally with no errors that gets a random error on Codecademy. Plus, one of the most difficult parts is installation and setup. You miss that with Codecademy. If this is your tool of choice, you have been warned.

Step 3: Give it some time

Try to dedicate some amount of time every day. 10 minutes when you first get in to work? 30 minutes when you get home? Doesn't matter. The more time you can dedicate, the faster you will progress, but the important thing is to make it a habit so you stick with it. Most of these resources have forums that you can utilize if you run into problems. If they don't, then you can also use StackOverflow. If you google for your error message, you will probably get a result on StackOverflow. Check it out and see if you can fix your bug. Once you get past the basics, give yourself a challenge by trying some exercism.io problems. They have problems for almost all languages and your submissions will actually get code reviewed!

Step 4: Level up!

You have a solid foundation! Time to take it to the next level! And by that I mean web development. Is that the only route you can go? Nope! But I'm a web developer, so that's what I actually have experience in. Also, I have the most experience in Python and Ruby, so those are the languages that I'll have the most links for. If anyone has some next level topics for non-web developers, put them in the comments! Or link to your own post. Depending on what you started with, here are some resources:

Ruby:

  • Michael Hartl's Rails Tutorial - This is the best Rails tutorial out there. I'd almost argue that it's the best web dev tutorial out of any language.
  • CodeSchool's Rails For Zombies - If you prefer videos, Rails For Zombies is corny, but pretty great. And the first course is free!
  • Sinatra - a microframework for Ruby. If you really want to dig in and try to learn how things work, using a microframework that doesn't enable all the bells and whistles by default is awesome. 

Python:

  • Tracy Osborn's Hello Web App - Awesome book series made to teach non-programmers web development through Django
  • Getting Started With Django - Short video series. Starts you after the official Django tutorial
  • Django Book - The official Django tutorial. I'm hoping it's been updated since I tried to go through it because it was a bit buggy then.
  • Flask - a microframework for Python. Also see this tutorial
  • Lynn Root's NewCoder.io - Not web dev, but definitely a level up. Lynn has written tutorials on APIs, web scraping, data visualization, GUIs, and networks. These are great if one of these topics is of interest to you.
  • Daniel and Audrey Roy Greenfield's Two Scoops of Django - this is not really a beginner book. More an "after your first app" book. But this is one of the best programming books I have ever read, so I absolutely had to add it to this list.

Java:

  • Play Framework - As far as I can tell, this is the most popular web framework for Java. Their own documentation contains a solid amount of good tutorials to get you up and running fast.

Step 5: Build something!

This is absolutely the hardest step. Why? Because it requires you to actually be a little imaginative and think of something that you want to create. To start, you can create a website (either a personal site or a landing page for your project) on Github Pages. It's free and super easy to get started! As far as picking a project, there are shortcuts if your brain is a bit fried and you can't think of anything. There are lists of coding projects that you can pick from. You can also contribute to open source. Whatever you choose, the important thing is to keep working at it. Even senior developers are still constantly improving their skills, so you will constantly be learning at all stages of your career.

A Quick Post About Resumes

PyLadies Boston recently had a mock interview night and with that I offered to review resumes. I got a few takers, did some reviews, and now I have some thoughts.

  1. If you are randomly switching fonts, please have a good reason for it. It is distracting (in a bad way) if you go from a serif to a sans serif for seemingly no reason.
  2. If you are writing your job duties as a bullet-point list, please make sure each point covers a single thing. I can't seem to think of a great way to say that, but let me give an example. If one of your points is: Improved test coverage by 10%, organized tech talks, and implemented a code quality standard - then those should really be elaborated upon if possible and definitely split into three separate points.
  3. Make sure your resume reflects the skills of the job you are looking for. That doesn't mean that, if you have been a research scientist who now wants to be a full-time programmer, you have to ignore your past history. However, you do have to highlight different things. For example, how did you analyze a set of data? Did you use Python? What libraries did you use?
  4. Along the same lines: If you are applying for your first programming job and don't have any related experience, you really need a projects section that lists the programming projects you have worked on and any open source you have done, along with descriptions. You know you can do the job, but if you don't put proof that you can code, the internal recruiter/HR person is going to throw your resume out.
  5. For skills section: if you are including it, please make sure they are relevant! If you are applying for programming jobs, you do not need to include photoshop. Also, definitely do not include the Office Suite... familiarity with Office or similar software is assumed if you know how to use a computer (which is also assumed if you are applying for programming jobs).
  6. White space is your friend! Definitely don't jam everything together. Separating out sections, careful use of bold fonts and color, and horizontal lines can really help draw the reader's attention to wherever you want it to go.

Sorry some of those were a bit of a ramble, but these are all things I have seen recently on resumes. A resume is often the first look that many people have into your professional life, so you want it to represent you in the best way possible. If you have any questions, feel free to comment!

Calculating Age... in Java 8

I'm doing a bit more Java now that I'm taking a Java class. With that is coming a lot of "oh that should be easy... wait, there's not a really simple way to accomplish this???". First example of this: determining someone's age.

import java.time.LocalDate;

public class Age {
  private int birthYear, birthMonth, birthDay, age;

  public Age(int birthYear, int birthMonth, int birthDay){
    this.birthYear = birthYear;
    this.birthMonth = birthMonth;
    this.birthDay = birthDay;
    this.age = getAge(birthYear, birthMonth, birthDay);
  }

  public int getAge(int birthYear, int birthMonth, int birthDay){
    LocalDate fullBirthday = LocalDate.of(birthYear, birthMonth, birthDay);
    LocalDate now = LocalDate.now();
    long daysSinceBirth = now.toEpochDay() - fullBirthday.toEpochDay();
    return (int) (daysSinceBirth/365);
  }
}

LocalDate is new to Java 8. It is modeled on the Joda-Time library, and the Java folks seem to have added the bulk of that functionality directly into Java. Sweet! What does this allow us to do? LocalDate creates an object that represents a date and has quite a verbose API. Since we're calculating someone's age, we are going to need an object that represents their birthday and an object that represents today (in this case fullBirthday and now). If we convert these both to Epoch Days, which is the number of days since 1970-01-01, we can just subtract one from the other and divide by 365 to get the age. Not too hard... but it did take a second to come up with it... I was a bit surprised that it seemed like I couldn't actually subtract dates. Ruby has spoiled me... (One caveat: dividing by 365 ignores leap days, so the result can come out a year too high in the days just before a birthday. Period.between(fullBirthday, now).getYears() from java.time gives the exact answer.)
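The days-divided-by-365 arithmetic drifts by one day for every leap year lived, which is easy to see with a quick sketch. This one is in Python rather than Java, and the dates and both function names are made up for illustration. It compares the day-count approach with a calendar-based one.

```python
from datetime import date


def age_by_days(born, today):
    """Approximate age: whole days since birth divided by 365."""
    return (today - born).days // 365


def age_by_calendar(born, today):
    """Exact age: year difference, minus one if the birthday hasn't come yet."""
    before_birthday = (today.month, today.day) < (born.month, born.day)
    return today.year - born.year - int(before_birthday)


born = date(2000, 1, 1)
today = date(2023, 12, 28)  # four days before the 24th birthday
print(age_by_days(born, today), age_by_calendar(born, today))
```

Six leap days have passed between those two dates, so the day count reaches 24 * 365 a few days early and the first function reports 24 while the person is still 23.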

Polymorphic Routes

I just started classes (working toward the CS certificate at BU Met) and my new big project at work is porting over a ton of code from Rails 2 to Rails 4, so I’m sure I’m about to have tons to write about. For today, here’s something I somehow just found out about: polymorphic routes in Rails.

What are polymorphic routes? Let’s say you want to have a partial that is used for quite a few different models. Every model you have has a show page for individual instances of that model and each show page has an edit link. So instead of creating a new page for each, the view you have reads in a generic @object and then you can use polymorphic routes to generate the path for the edit link! In this example, I’ll have the @object represent an instance of the Article class. Like so:

edit_polymorphic_path(@object)

results in:

edit_article_path(@object)

I’m pretty surprised I haven’t seen this yet, but now I’m glad that I have! This is pretty cool :D