His content is friggin awesome. I’m hooked.
Alex Kehayias is MVP’ing a newspaper app. I’m curating a paper called Future Prototype, which is a collection of articles new and old which point towards our cyberpunk future.
Cohort Analysis in Django
I wrote a simple reusable app for doing cohort analysis in Django. Cohort analysis is an incredible tool for deflating your ego. Since it compares apples to apples you will see if you are actually helping, hurting or, more than likely, making no impact at all on your users with the iterations you’ve been making on your awesome web app.
What is Cohort Analysis?
Cohort analysis involves segmenting your users into smaller groups so you can compare measurements across them. For example, a cohort could be the users who signed up for your app in a given week. So all the people who joined last week are one group, everyone who joined two weeks ago is another, etc. Now we can analyze actions that the cohort has taken in relation to the week they joined. For example, “Users who joined the week of 2/1 performed 10% more of action x in their first week than the users who joined the week of 1/25.” You can then think about the changes you made and correlate them back to the results. This prevents things like traffic spikes or that PR you did from messing up your numbers. You’ll be able to see over time if you are actually getting any better rather than looking at aggregate statistics of the activity of all your users. If you try hard enough with any aggregate stat you can make it look like a hockey stick. But that’s bullshit because you have no idea if the changes you made are the cause of that improvement when you look in the aggregate.
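To make the weekly grouping concrete, here is a minimal, framework-free sketch in Python. The sample signups, the `cohort_week` and `cohort_rates` helpers, and the "performed action x in the first week" flag are all made up for illustration; this is not the code inside django-cohorts.

```python
from collections import defaultdict
from datetime import date, timedelta


def cohort_week(signup_date):
    """Return the Monday that starts the week the user signed up in."""
    return signup_date - timedelta(days=signup_date.weekday())


def cohort_rates(users):
    """Group users by signup week and return the percentage of each
    cohort that performed the action in their first week.

    `users` is a list of (signup_date, did_action) tuples (hypothetical data).
    """
    totals = defaultdict(int)
    actives = defaultdict(int)
    for signup_date, did_action in users:
        week = cohort_week(signup_date)
        totals[week] += 1
        if did_action:
            actives[week] += 1
    return {week: 100.0 * actives[week] / totals[week] for week in totals}


# Hypothetical sample data: two cohorts, one week apart
users = [
    (date(2012, 1, 23), True),
    (date(2012, 1, 25), False),
    (date(2012, 1, 27), False),
    (date(2012, 1, 28), False),
    (date(2012, 1, 30), True),
    (date(2012, 2, 1), True),
    (date(2012, 2, 3), False),
    (date(2012, 2, 4), False),
]
rates = cohort_rates(users)
for week in sorted(rates):
    print("%s: %.1f%%" % (week, rates[week]))
```

Each cohort is judged only against its own size, so a traffic spike in one week changes how many users land in that cohort but not how the percentages compare.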
Cohort Analysis in Django
So how do we do this in Django? I wrote a very simple, reusable app called django-cohorts. The objective is to set up a simple framework for running cohort analysis defined by the segments (cohorts) you want and the metrics you want in a dashboard that lives in the admin section. It’s raw right now, but it’s a start.
How does it work?
I’ve done the base cohorts by week, which is pretty common, but you can override that. In metrics.py, create a class with functions that will return a percentage. It will send you the users and you do some db lookup for certain actions the user took. After you’ve hooked it up to your url conf, you can then see the analysis with a dropdown of the metrics that are available.
Looking for Contributors
If anyone wants to help make this more sophisticated please let me know. Feel free to fork it on GitHub.
A Note about Cohort Analysis
Since cohort analysis typically looks at usage over time, you need to have some data for it to be useful. You can’t identify patterns that are significant without at least a few weeks of data. Just saying.
My Journey on the StartupBus
As you may know, I rode the NYC StartupBus last year with a kickass crew of hackers, designers, and hustlers. The Red Bull was plentiful, the sleep was not. I can say without a doubt that the StartupBus had a bigger impact on me than any single tech-related event.
My Story
Things had been going pretty slow at BeanSprout and I have a regular impulse to do ridiculous things that involve transportation vehicles and technology. I met with Justin, who was running that year’s NYC bus, and before you know it I was on a bus bound for SXSW.
I pitched my idea, once on the bus, about creating a cheat sheet for pop culture, originally positioned as “Girlfriend Notes.” Essentially, shit your girlfriend expects you to know about (amazingly I don’t have a girlfriend) that you don’t give a shit about watching/reading, e.g. last night’s Glee episode. Somehow I convinced three other people to join me and that’s how Whadimiss (as in “What did I miss?”) was born (we are no longer maintaining it). It evolved into an editorial style daily email that summarized major categories of pop culture in witty, bite sized chunks. I still love the idea, and people found it very entertaining (Charlie Sheen was the pop topic at the time), but automating it isn’t possible to do well given the current technology (summarizing large chunks of text using NLP).
The Buspreneurs
It takes a certain kind of person to get on a bus for over 2 days, hack, launch at SXSW and party. I want to be around people like that all day, every day. The crew that was assembled for our bus had the highest quality tech people (by tech I don’t mean coders, just people who are involved in tech) I’ve ever been around. I still keep in touch with the majority of the bus and they’ve helped me along my journey in learning to code. That was the best part of the whole experience. Oh and I drank a lot too at SXSW, in parties that I was never on the list for (thanks Adrianne <3).
Inspiration
I’ve seen some shit on that bus. Market ready websites, mobile apps, a startup getting thrown a check for angel funding, all built in 2 days that destroy 70% of what I’ve seen other people create in months. When you work at the speed of the StartupBus there really is no going back. If things take me more than a week to do in real life, I think back to the bus, ignore the smells, and think about what’s possible when you push it. Life is a hackathon, and if you ain’t hackin’ you ain’t lastin’.
Most Useful Things To Learn in Python for Beginners
If you’re just getting started in Python there are some things you should learn that will get you productive right away. In my experience with learning from no prior technical experience, I’ve found that 90% of what you’re doing is manipulating data structures to do what you want. Here are the things I would recommend you learn right away so you can do some damage.
List Comprehensions
Once you’ve learned the “for” loop for iterating through a list of items, there is a much easier way to write out a loop called a “list comprehension.” It’s like a shorthand for loop that keeps your code clean and very readable.
# This is a for loop that is pretty common
new_list = []
for item in list_of_items:
    if item.property > 15:
        new_list.append(item)
# This is the same thing with a list comprehension
new_list = [i for i in list_of_items if i.property > 15]
This reduced 4 lines into a simple one liner. It’s cleaner, more readable, and does the same thing. Read up on them. In Python 2.7 and up you can also do this with dictionaries!
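As a quick illustration of that last point, a dictionary comprehension follows the same pattern with a `key: value` pair up front. The `scores` data and the word list below are invented for the example.

```python
# Hypothetical data: names mapped to scores
scores = {"ann": 22, "bob": 9, "cat": 17}

# Keep only the entries with a score above 15
high_scores = {name: score for name, score in scores.items() if score > 15}

# Build a dictionary from a list: word -> length of the word
lengths = {word: len(word) for word in ["list", "dict", "tuple"]}

print(high_scores)
print(lengths)
```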
Datetime
Sometimes you need to manipulate time (whoa). There is a built in package in python for just that. If you need to figure out last week’s date or subtract time or figure out what day of the week it is 1000 days from now then datetime has you covered.
# What was last week's date?
from datetime import datetime, timedelta
today = datetime.today()
last_week = today - timedelta(days=7)
print(last_week.strftime("%A %m/%d %Y"))
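The other question from above, which day of the week falls 1000 days out, works the same way. A small sketch; the fixed start date is an arbitrary choice so the result is reproducible (swap in `date.today()` for the real thing).

```python
from datetime import date, timedelta

start = date(2012, 1, 1)                 # a fixed, arbitrary start date
future = start + timedelta(days=1000)    # 1000 days later

# %A formats the full weekday name
print(future.strftime("%A %m/%d %Y"))
```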
Functions or Don’t Copy and Paste
If you look at code and you’re like, “Man I really just need that code I used over there in here,” then you need to learn to use functions. Code reuse is a skill that you will need because it will be easier to read, maintain, and change. General rule of thumb, if you need to do something more than once, make a function out of it. Don’t copy and paste, it’s a disaster waiting to happen.
# Basic function example
def add(x, y):
    # x, y are arguments to the function
    # If the function is called with x,y
    # the values are available in this namespace
    # Do some stuff to the arguments x, y
    z = x + y
    # Return back the value of the stuff you processed
    return z
There’s a Package for that
Python is very much a “batteries included” language in that there are built in packages for handling most common problems that will come up. There’s time and date, file io, url tools, and the list goes on. If it’s not already in the core of Python then there is probably a library someone has written that does what you want. Check out PyPI or go through GitHub. Your problem is almost certainly not unique and someone has probably solved it already.
I’ll probably add some more to this post, but I’m running low on coffee…
Appification of Web Design
My awesome web designer friend made a really interesting comment the other day when we were discussing a project. He said that there’s this “appification” of web design that’s going on right now. What he meant is that more and more websites are following this trend of looking more like an app you might find on the iPhone or Android. Maybe not in the exact UI sense, but in the layout and visual cues that make it look like an interface for content rather than a website. This has its pros and cons.
Why interface design thinking is good
When thinking about building an interface you typically work from the ground up. What functionality is present, what is most important, and how do I make it obvious that you can use x, y, z? We see more and more websites that are more like applications than static pieces of content. There is more functionality here today than there was yesterday and there will be even more tomorrow.
Why it’s bad
Not everyone’s website should be approached like they’re building some masterful intuitive interface. Last time I checked, websites are pretty user friendly. They get me what I want when I want it and I don’t need to learn an interface to get it. While it may be trendy to make your website look like an interface for a spaceship, is that really what your users need? Probably not. Not every website needs to be dressed up like that, but I feel that less appropriate websites will take design cues from apps. How many interfaces need to exist before it’s just uncomfortable to navigate the web? How many times will designers reinvent the interface wheel?
L.A.E. (Let’s appify everything)
It’s a trend. Design experiences not interfaces. Work from the ground up not shiny cool interface down.
How to override the save of a Django form like a pro
Use the “super” function to use the save method built into the form plus the extra things you need. Example:
from django import forms
# MediaContent is a model from your own app
class DebateMediaForm(forms.ModelForm):
    def __init__(self, *args, **kwargs):
        super(DebateMediaForm, self).__init__(*args, **kwargs)
        self.fields['image'].required = True
        self.fields['source'].label = "Image Source"
    class Meta:
        model = MediaContent
        fields = (
            'image',
            'source',
        )
    def save(self, user, debate):
        # commit=False builds the model instance without hitting the db yet
        media = super(DebateMediaForm, self).save(commit=False)
        media.added_by = user
        media.related_debate = debate
        media.content_type = "P"
        media.save()
        return debate
This is really handy for dealing with a ModelForm where you only want to have the user fill out a couple fields, but additional background information is required. In this case I need to specify who added the media. That’s something you don’t want your users to have to fill out, but you still want the luxury of using a model form because of the beautiful built in save function. Just super that shit and move on :)
A/B Split Testing with Django
Dear internet startups, there’s no excuse not to A/B test your shit. You will learn horrible things about yourself. For example, that “really cool feature that you added that is so awesome people are going to be glued to my app,” doesn’t actually do anything to improve your core metrics. In fact it may be hurting them. Since it’s hard to test apples to apples you are flying blind.
Don’t worry, I’ve done that too. It’s dumb now that I look back and I should have started testing everything sooner. I thought it would be too hard to do, too hard to implement, and would slow down my lightning development skills (ha!). It’s worth the up front investment because you can help correlate the things you do to lasting results.
A/B test with django-lean
I don’t use Google’s wonderful website optimizer. It doesn’t help me do feature a/b tests. If you want to test the copy on your homepage then fine use GWO, but if you want to test that sweet feature you just added good luck trying to do that with optimizer. I use django-lean on just about all my projects to do real A/B tests that use a chi squared analysis to measure for lasting improvement changes. It makes it very easy to make experiments out of chunks of functionality you are using and figure out if what you are doing is moving the needle. Oh and you can run multiple experiments at the same time (that’s not multivariate testing, that just means multiple unique experiments).
Modifications
I only use the “experiments” app that’s in django-lean and remove everything else. I haven’t needed the other stuff in there and it’s not documented.
Setup
I put my modified version of django-lean into my project folder rather than installing it as a dependency in site-packages. I’ve made changes to it and probably will make more so I’d rather treat it like a project app. Follow their wiki to add the urls to your url conf and syncdb/migrate.
It was really tricky figuring out just how to enroll visitors/users into an experiment. Once I figured it out I made a wrapper for it that can be added to any view that you want people to be added to an experiment. For me that’s just about every view. Here’s my decorator that wraps views to enroll the user in an experiment if they are not already.
# Note that I modified the paths from django_lean.experiments to experiments
from experiments.utils import WebUser
def set_experiment_user(target):
    '''Decorator for setting the WebUser for use with ab split testing
    assumes the first argument is the request object'''
    def wrapper(*args, **kwargs):
        request = args[0]
        WebUser(request).confirm_human()
        return target(*args, **kwargs)
    return wrapper
Now we can use this in our views like so:
# My wrapper is in lib.utils, but put it wherever you want
from django.views.generic.simple import direct_to_template
from lib.utils import set_experiment_user
@set_experiment_user
def home(request):
    '''Your view here as per usual'''
    return direct_to_template(request, 'index.html', locals())
Now anyone that goes to home will be enrolled in whatever active experiments I have going on.
We also need to add the following to our templates. A script that attempts to figure out if you are a real visitor and not a bot and boilerplate stuff for the experiments:
<script language="javascript" src="{% get_static_prefix %}javascripts/experiments.js" type="text/javascript"></script>
{% include "experiments/include/experiment_enrollment.html" %}
Creating an experiment
Experiments have two paths, control and test. In your templates you use the experiments tags to show chunks of code that are for the test or for the control in the experiment.
{% load experiments %}
{% experiment experiment_name control %}
This is shown when the visitor is in the control group
{% endexperiment %}
{% experiment experiment_name test %}
This is shown when the visitor is in the test group
{% endexperiment %}
In your views you can route to the right code like this:
if Experiment.test("experiment_name", WebUser(request)):
    # do some stuff knowing they are in the test group
    pass
Recording goals
We need to record the key actions the user makes so that we can compare them in the A/B test. These should be aligned with the core measurements you need to make to know what your users are doing. Don’t track goals that make no difference in figuring out if you are better engaging your customers. Focus on actions; for TimeoutDebate that means votes, opinions, and number of debate views. You add goals through the admin and can call them whatever you like. To record that a user has performed the action you do this:
from experiments.models import GoalRecord, Experiment
from experiments.utils import WebUser
# In your view somewhere
GoalRecord.record("vote create", WebUser(request))
# This records the "vote create" goal for that experiment participant
Also you can do this using a tracking pixel that points to your django-lean url route (not the admin one). Simply add an img tag with the source of a url that points to your goal and it will record it just like the above. This is useful for putting it inside ajax interactions or other places that are not controlled by your django views.
Engagement Calculator
This is an interesting calculation that gets added to the A/B report that is generated by django-lean. You basically write a calculation that returns an arbitrary number based on actions a user has taken that measures overall engagement with your system. I don’t like the example on the django-lean wiki because it doesn’t reuse the goals recorded to calculate the user’s engagement score. This example reuses it and awards points accordingly:
# Defines the engagement score for experiments
from experiments.models import GoalRecord
class EngagementScoreCalculator(object):
    def calculate_user_engagement_score(self, anonymous_visitor, start_date, end_date):
        """
        Defines a user's engagement based on the actions they take.
        Points are awarded for certain actions.
        """
        # Sets the weighting for the engagement score
        weighting = {
            "signup": 10,
            "debate create": 0,
            "debate view": 1,
            "opinion create": 10,
            "vote create": 5,
        }
        # Get all the tracking goals completed by them over the time period
        goals_completed_set = GoalRecord.objects.filter(
            created__range=(start_date, end_date),
            anonymous_visitor=anonymous_visitor)
        # Start every recorded goal name at zero points
        tracking_goals = {}
        for i in GoalRecord.objects.all():
            tracking_goals[i.goal_type.name] = 0
        # Count all the goals
        for i in goals_completed_set:
            if i.goal_type.name in tracking_goals:
                try:
                    # Add the weighting for each completed goal
                    tracking_goals[i.goal_type.name] += weighting[i.goal_type.name]
                except KeyError:
                    # Goals without a weighting score nothing
                    tracking_goals[i.goal_type.name] = 0
        days_in_period = (end_date - start_date).days + 1
        # Sum up the total
        total = sum(tracking_goals.values())
        engagement_score = float(total) / days_in_period
        return engagement_score
Generate reports and make decisions
Update your experiment reports using the django-lean management command “python manage.py update_experiment_reports”. Now when you look in the admin (mine is /admin/django-lean) you will see a list of experiments. When you click on one you will see the report with all the lovely statistical analysis. It even puts a check mark next to items that have a confidence level above 95% (that means it is 95% confident that the difference between the test version and the control is real and not just noise).
To make decisions, make sure you have a large enough sample size (number of people enrolled in your experiment in both the test and control group) and look at the confidence intervals, improvement, and engagement score. If it’s not conclusive just keep measuring it. Green is generally good, red is generally bad when it comes to the experiment report. The key here is that you’re making the decision based on real data, not your gut. You can disable your experiment at any time in the admin and “promote” the winning version. Just make sure you clean up your templates and views so it doesn’t get super cluttered with experiments.
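For a feel of what a chi squared check on an experiment involves, here is a minimal sketch of the Pearson chi-squared statistic for a 2x2 table (converted or not, control or test). It is a textbook calculation under my own assumptions, not a copy of what django-lean runs internally; the function name and the sample numbers are hypothetical.

```python
def chi_squared_2x2(control_conversions, control_total,
                    test_conversions, test_total):
    """Pearson chi-squared statistic for a 2x2 table
    (converted / did not convert, control / test)."""
    a = control_conversions
    b = control_total - control_conversions
    c = test_conversions
    d = test_total - test_conversions
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        # An empty row or column means there is nothing to compare
        return 0.0
    return float(n) * (a * d - b * c) ** 2 / denominator


# Critical value of chi-squared with 1 degree of freedom at p = 0.05
CRITICAL_95 = 3.841

# Hypothetical numbers: 100 of 1000 control users converted vs. 150 of 1000
statistic = chi_squared_2x2(100, 1000, 150, 1000)
significant = statistic > CRITICAL_95
print(statistic, significant)
```

A statistic above the critical value corresponds to clearing the 95% bar. With small samples the statistic stays low even when the raw percentages look very different, which is why sample size comes first.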
Early thoughts on Clojure after one weekend
I started learning Clojure this weekend, a Lisp language that’s built for concurrency. It makes you approach solving problems in a functional way. This is all new to me coming from Python, an object-oriented programming language. However, since I don’t have any formal background in CS all of this stuff is new to me anyway :), but I’ve gotten used to doing things the Python way.
To sum it up, you build things with lots of functions in Clojure compared to Python where you group things in categories and leverage inheritance. When I first started learning to code last year, I approached it in a very functional way. Why? Because it’s natural to think that way. The principles of the language really speak to me.
Concise
This has to be one of the most concise languages I’ve seen so far. I’m a big fan of using shorthand syntax in Python and things like list comprehensions, lambdas, etc. Clojure (and Lisps in general) make dealing with data extremely concise while maintaining readability. It’s like going from a machine gun to a rocket launcher.
No way this can be your first language
Clojure is completely unapproachable for those who are learning to code for the first time (given the current state of resources). All of the tutorials I’ve found are very… academic in explaining the language without providing examples that illustrate the common expressions used in Clojure. Setting up the dev environment alone was difficult (just use Leiningen and use a Mac), and for those unfamiliar with Java in general, there are lots of Java-isms mentioned in documentation, examples, etc. that I just don’t get. There is a really amazing community behind Clojure and I will contribute what I can to help intro people to it.
I need a problem to solve
When I was learning Python, I learned really fast because I was working on solving my own problems. While Clojure is more of a leisure learning activity for me, I really need to find a problem to solve with it. Clojure looks awesome for data analysis and algorithms, but if I need to make a simple web app, I know I can bang it out in 3 days (I’m boasting) in Python/Django. I’m going to take my own advice and find a project that’s in Clojure’s wheelhouse that scratches my own itch.
Overall
Clojure feels like moving from a coal power plant to a nuclear one. I feel like a scientist or a mathematician about to launch a rocket when I use it rather than a foreman at a construction site building an office building. Ridiculous analogies aside, Clojure has incredible potential. I’ll save my thoughts about the metaphysical philosophy I’m uncovering about Lisps for another time, but that’s my primary reason for learning it…
Infinite Scroll with Django
Twitter style infinite scroll using Django as your backend is actually pretty simple. I did it for the TimeoutDebate archive page. You can break it down into these simple steps:
- Scrolling past a certain point triggers a jquery ajax event
- Django responds with a json object
- Insert the new data into the document, reload the scroll handler
That’s really not much different than any other ajax type of interaction. There’s no black magic (there rarely is) although I did need to look up how to actually capture this event. I found a good article about infinite scroll with django here, but ended up rolling my own backend handler.
This is how you capture the event of scrolling past a certain point (via Palewire, the article I mentioned):
$(document).ready(function(){
    $(window).bind('scroll', loadOnScroll);
});
// Scroll globals
var pageNum = 1; // The latest page loaded
var hasNextPage = true; // Indicates whether to expect another page after this one
// loadOnScroll handler
var loadOnScroll = function() {
    // If the current scroll position is past our cutoff point...
    if ($(window).scrollTop() > $(document).height() - ($(window).height()*2)) {
        // temporarily unhook the scroll event watcher so we don't call a bunch of times in a row
        $(window).unbind('scroll', loadOnScroll);
        // execute the load function below that will visit the JSON feed and stuff data into the HTML
        loadItems();
    }
};
var loadItems = function() {
    // If the next page doesn't exist, just quit now
    if (hasNextPage === false) {
        return false;
    }
    // Update the page number
    pageNum = pageNum + 1;
    // Configure the url we're about to hit
    $.ajax({
        url: '',
        data: {page_number: pageNum},
        dataType: 'json',
        success: function(data) {
            // Update global next page variable
            hasNextPage = true;//.hasNext;
            // Loop through all items
            for (var i in data) {
                // Do something with your json object response, e.g.
                // build markup from data[i] and pass it to
                // $("#newItems").before(...)
            }
        },
        error: function(data) {
            // When I get a 400 back, fail safely
            hasNextPage = false;
        },
        complete: function(data, textStatus){
            // Turn the scroll monitor back on
            $(window).bind('scroll', loadOnScroll);
        }
    });
};
For your HTML markup all you need to add is an anchor div with an id of “newItems” and include the infinite scroll script above.
And this is my view that handles the initial page and the ajax infinite loading:
import json
from django.core.paginator import Paginator, InvalidPage
from django.http import HttpResponse, HttpResponseBadRequest
from django.views.generic.simple import direct_to_template
# Debate and serialize_debates come from my own project
def debate_archive(request):
    user = request.user
    debates = [i for i in Debate.objects.all().order_by("-id") if i.opinion_set.count() >= 2]
    paginator = Paginator(debates, 10)
    if request.method == 'GET':
        if request.is_ajax():
            if request.GET.get('page_number'):
                # Paginate based on the page number in the GET request
                page_number = request.GET.get('page_number')
                try:
                    page_objects = paginator.page(page_number).object_list
                except InvalidPage:
                    return HttpResponseBadRequest(mimetype="application/json")
                # Serialize the paginated objects
                resp = serialize_debates(page_objects)
                return HttpResponse(json.dumps(resp), mimetype='application/json')
    debates = paginator.page(1).object_list
    return direct_to_template(request, 'debate_archive.html', locals())
I reuse Django’s built in Paginator class to know that I am serving back a certain subset of the entire list of objects. It’s like pagination without pagination that happens automatically as you scroll! When I’m out of items I respond with a 400 which calls the error function in the ajax request which turns off the infinite scroll.
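Stripped of Django, the server side boils down to slicing a list by page number and signalling when the pages run out. The sketch below is a hypothetical stand-in for the role `Paginator` plays here, not Django’s implementation; `get_page` and `PAGE_SIZE` are names invented for the example.

```python
PAGE_SIZE = 10


def get_page(items, page_number, page_size=PAGE_SIZE):
    """Return the items on a 1-based page, or None when the page is
    out of range (the view turns that case into a 400 response)."""
    try:
        page_number = int(page_number)  # GET parameters arrive as strings
    except (TypeError, ValueError):
        return None
    start = (page_number - 1) * page_size
    if page_number < 1 or start >= len(items):
        return None
    return items[start:start + page_size]


debates = list(range(1, 26))      # 25 fake debate ids
print(get_page(debates, "2"))     # second page
print(get_page(debates, "4"))     # past the end
```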
A couple things I noticed:
- If you only add a small piece of additional content, this method may result in a ton of ajax requests. This has to do with the event being bound to the user reaching a certain area by scrolling. Since not much additional length was added it keeps triggering the event when scrolling. I’m sure you can find a way around that.
- You don’t need a separate url conf to set this up, just handle it in the same view using the is_ajax() function to determine if it’s from the infinite scroll. Add some data to the GET ajax request and parse it in the Django view. Keeps it all very tidy in my opinion.
This is what powers the infinite scroll found here: http://www.timeoutdebate.com/debate/all/
My Mom just Debugged My Web App
I’m so impressed and embarrassed that my mom debugs my code…
Hi,
Remember that when I tried to update my goal progress it didn’t work on Firefox? To learn Phthon I have converted one of my php script to Python script and found out that it works on other browsers but not on Firefox. I googled - apparently Firefox needs the content type for Python script to execute properly. Is your “update progress” page python script? If it is put this line to your script before you display anything: print “Content-type: text/html”.
debugging Mom
So my Mom sent me this email… She’s actually completely right. I forgot to respond with the proper content type in the ajax call that updates goal progress in GoalSay. Her older version of Firefox does not allow that and so the functionality doesn’t work on her laptop. Let’s hear it for mothers who can code!
P.S. My mom is also a computer scientist who Perls and Phps all the time.
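For the curious, this is roughly what her advice looks like in a plain Python CGI-style script: a header line, a blank line, then the body. It is a minimal sketch with a made-up `cgi_response` helper and payload, not GoalSay’s code; in a Django view you would set the content type on the response object instead.

```python
import json


def cgi_response(body, content_type="text/html"):
    """Build the raw output of a CGI script: header line, a blank
    line, then the body."""
    return "Content-Type: %s\n\n%s" % (content_type, body)


# An HTML page
print(cgi_response("<p>Progress updated</p>"))

# A JSON reply for an ajax call (the payload is made up)
print(cgi_response(json.dumps({"progress": 40}), "application/json"))
```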

