A Picture’s Worth A Thousand Senators: Staring Into The Gaping Ideological Chasm That Divides Congress

In this post, I’m going to introduce you to a cool-looking graph, tell you what it means, and give the technical details of its generation—all because I think America might care. Here we go.

Introduction

Politics in America are hopelessly partisan, and all of the bickering serves only to cripple our nation at a moment of crisis when decisive action is called for. You know it. I know it. Barack Obama and John Boehner know it. Your grandma knows it.

Or do we know it? The belief that American politics has become more polarized in recent decades is widespread. But is there any evidence for it? While I make no attempt to provide a complete explanation for this disturbing trend in our nation’s governance, in this post I present some work that I believe provides an answer—a resounding confirmation that, according to at least one view of the situation, the politics of the United States are now more deeply divided than ever.

Though this work was done in collaboration with Michael Dimond as part of an advanced data mining course (CS676) at BYU, I believe I am the sole author of the portions of our report excerpted below.

The Cool-Looking Graph

Here’s the pretty picture:

[Figure: United States Senate Legislator Similarity Network, 1789-2011. Similarity threshold 0.75, curved edges, node colors and sizes set by betweenness centrality, 10,000 px wide.]

Bask in its glory—and be grateful, because that thing took a lot of work! Make sure to click on the image to see the full-sized version. (It will open in a new window/tab.)

What It Means

The above graph is a visual representation of the United States Senate across 222 years of legislative history. It is, in essence, a social network of senators across time—who voted like whom, what cliques and factions formed, etc. In other words, retroactive Facebook for America’s past politicians? No, that’s going too far….

Anyway, here’s how to interpret the graph. Each node (circle) represents a senator. An arc is drawn between two nodes if the two senators at the endpoints voted on the same roll call at least once and voted the same way on more than 75% of the roll calls held in the sessions they shared. Size and color of nodes indicate their centrality (a measure of importance) in the network. Scanning from left (1789) to right (2011), a few trends emerge:

The height of the graph increases. Much of this can be attributed to the increase in the number of states, from 13 to 50, meaning the number of senators serving simultaneously increased by 74.
The graph alternates between unity and polarization. Visually, unity looks like a single “stream” of nodes, whereas polarization is the graph splitting into two components that move in slightly different directions.
In recent decades, the height of the graph has continued to increase in spite of the number of senators being fixed at 100 since 1959. I assert that this corresponds to the phenomenon of increased polarization between the two parties.

I am interested in whether the flow of the graph can be correlated with developments in the American two-party system. Feel free to let me know your thoughts on that. For those wishing to play with the graph data, it’s available here.

Technical Details

This stuff gets pretty computer sciencey, so only read on if you really want to nerd out.

Data

The graph is generated using an aggregated and sanitized version of the THOMAS congressional data from govtrack.us. This amounts to 2.1 GiB of primarily XML-encoded congressional data from the 1st to the 112th Congress. The data includes a record of votes by all legislators on all roll calls since the 1st Congress, as well as party affiliation.

Social Graph Inference

Let $latex L$ be the set of all legislators and $latex S$ be the set of all sessions of Congress. We define a legislator-to-legislator similarity function $latex \sigma : L \times L \rightarrow [0,1]$ that returns a similarity score for every pair of legislators who ever voted on the same roll call:

[latex size=3]
\sigma(l_{1},l_{2})=\frac{SameVotes(l_{1},l_{2})}{PossibleVotes(l_{1},l_{2})} \\
\\
\phantom{\sigma(l_{1},l_{2})}=\frac{\sum_{s \in S : l_{1} \in s \wedge l_{2} \in s} \sum_{r\in Rolls(s)} \beta\left [vote(l_{1},r)=vote(l_{2},r) \right ]}{\sum_{s \in S : l_{1} \in s \wedge l_{2} \in s} |Rolls(s)|}
[/latex]

where

$latex Rolls(s)$ returns the set of all roll calls (votes) occurring in session $latex s$;
$latex \beta[x]$ is an indicator function returning 1 when $latex x$ is true, 0 otherwise;
$latex vote(l,r)$ returns the vote cast by legislator $latex l$ on roll $latex r$; and
$latex l \in s$ is true iff legislator $latex l$ served in congressional session $latex s$.
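
To make the formula concrete, here is a minimal Python sketch of $latex \sigma$. It is an illustration under stated assumptions, not necessarily the code used for the project. The in-memory layout (`sessions`, `rolls`) is hypothetical and is not the govtrack.us schema. The formula does not say what happens when a legislator cast no vote on a roll, so the sketch assumes a missing vote never counts as a match. It also returns `None` for pairs that shared no roll calls at all.

```python
from typing import Dict, Optional, Set

# Hypothetical layout (an assumption, not the govtrack.us format):
#   sessions[s] -> set of legislators who served in session s
#   rolls[s]    -> {roll_id: {legislator: vote}}
Sessions = Dict[int, Set[str]]
Rolls = Dict[int, Dict[str, Dict[str, str]]]


def similarity(l1: str, l2: str, sessions: Sessions, rolls: Rolls) -> Optional[float]:
    """SameVotes / PossibleVotes over the sessions both legislators served in."""
    same = 0
    possible = 0
    for s, members in sessions.items():
        if l1 not in members or l2 not in members:
            continue  # only sessions shared by both legislators count
        for votes in rolls.get(s, {}).values():
            possible += 1  # denominator: every roll call in a shared session
            v1 = votes.get(l1)
            v2 = votes.get(l2)
            # Assumption: a missing vote is never treated as agreement.
            if v1 is not None and v1 == v2:
                same += 1
    if possible == 0:
        return None  # no shared roll calls, so no score is defined
    return same / possible


# Toy data: A and B share two sessions, C only served in the first.
sessions = {1: {"A", "B", "C"}, 2: {"A", "B"}}
rolls = {
    1: {"r1": {"A": "Yea", "B": "Yea", "C": "Nay"},
        "r2": {"A": "Nay", "B": "Yea", "C": "Nay"}},
    2: {"r3": {"A": "Yea", "B": "Yea"},
        "r4": {"A": "Yea", "B": "Yea"}},
}

print(similarity("A", "B", sessions, rolls))  # 0.75 (3 of 4 rolls agree)
print(similarity("A", "C", sessions, rolls))  # 0.5  (1 of 2 rolls agree)
```

Note that the denominator counts every roll call in a shared session, as in the formula, so absences lower a pair's score.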

We use this similarity measure to construct a legislator affinity graph as follows:

Let $latex G=(V,E)$ be an undirected graph with a set of vertices $latex V$ and a set of weighted edges $latex E$, such that

$latex V=\{Vertex(l) : l \in L\}$ and
$latex E=\{Edge(l_{1}, l_{2}, \sigma(l_{1},l_{2})) : (l_{1},l_{2}) \in L \times L \wedge \sigma(l_{1},l_{2}) > \theta\}$

where

$latex Vertex(l)$ yields the vertex associated with a given legislator $latex l$;
$latex Edge(l_{1},l_{2},w)$ yields an undirected edge with weight $latex w$ and endpoints $latex Vertex(l_{1})$ and $latex Vertex(l_{2})$; and
$latex \theta \in [0,1]$ is a minimum similarity threshold.
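
The graph definition above can be sketched in a few lines of Python. This is an illustration only: `build_affinity_graph` and the `sigma` callable are hypothetical names, and the sketch assumes `sigma` returns `None` for pairs with no score. It also skips self-pairs, although the formal definition over $latex L \times L$ does not exclude them.

```python
from itertools import combinations
from typing import Callable, Dict, Iterable, Optional, Set, Tuple

Edge = Tuple[str, str]


def build_affinity_graph(
    legislators: Iterable[str],
    sigma: Callable[[str, str], Optional[float]],
    theta: float = 0.75,
) -> Tuple[Set[str], Dict[Edge, float]]:
    """Return (V, E): one vertex per legislator, one weighted edge per similar pair."""
    vertices = set(legislators)
    edges: Dict[Edge, float] = {}
    # Undirected graph: visit each unordered pair once, in sorted order.
    for l1, l2 in combinations(sorted(vertices), 2):
        score = sigma(l1, l2)
        # Strict inequality, matching sigma > theta in the definition of E.
        if score is not None and score > theta:
            edges[(l1, l2)] = score
    return vertices, edges


# Toy similarity scores, keyed by sorted pair (made-up numbers).
toy_scores = {("A", "B"): 0.9, ("A", "C"): 0.75, ("B", "C"): 0.4}
V, E = build_affinity_graph("ABC", lambda a, b: toy_scores.get((a, b)))
print(E)  # {('A', 'B'): 0.9}
```

The A-C pair sits exactly at the threshold and is dropped because the comparison is strict. Looping over all pairs is quadratic in the number of legislators, which is fine for a sketch but worth keeping in mind for the full data set.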

Rendering

In practice, the above $latex \theta$ must be set high (I used 0.75) to prevent the number of edges from being excessively large. Once the graph was constructed, it was loaded into Gephi, a graph visualization tool. Betweenness centralities were computed, nodes were sized and colored, and a force-directed layout algorithm was applied. I then manually rotated the graph so that earlier senators are located on the left and more recent senators on the right, to give the effect of a rough historical timeline. I exported this as an SVG file, then loaded it in the Inkscape vector graphics program. With the benefit of 16 GB of RAM, I coaxed Inkscape into rendering a 20,000-pixel-wide PNG image of the graph. Finally, I scaled it down to 10,000 pixels wide in GIMP for web distribution.
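
For readers curious what "betweenness centrality" measures, here is a small self-contained sketch in Python. It is not Gephi's code. It uses Brandes' algorithm and assumes unweighted shortest paths on an undirected graph, ignoring the similarity weights, which may differ from the settings used for the picture. The adjacency-dict input format is a choice made for this example.

```python
from collections import deque
from typing import Dict, Set


def betweenness(adj: Dict[str, Set[str]]) -> Dict[str, float]:
    """Unnormalized betweenness centrality for an undirected, unweighted graph.

    adj must be symmetric and contain every vertex as a key.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths to each vertex.
        order = []                      # vertices in order of discovery
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        paths = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        paths[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:         # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: c / 2.0 for v, c in bc.items()}


# A three-node chain: B lies on the only path between A and C.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
print(betweenness(chain))  # {'A': 0.0, 'B': 1.0, 'C': 0.0}
```

A senator with high betweenness sits on many shortest paths between other senators, which is why such nodes tend to appear where the graph's "streams" join or split.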

Acknowledgements

Thanks to Christophe Giraud-Carrier for teaching the class for which this graph was generated, and to Michael Dimond who, though not directly working on this portion of our project, was nevertheless an excellent collaborator. And to my friend who convinced me to finally finish this post.

Posted

October 13, 2011

data mining, politics

Josh Hansen

Comments

3 responses to “A Picture’s Worth A Thousand Senators: Staring Into The Gaping Ideological Chasm That Divides Congress”

Joel

October 13, 2011

One tweak that occurred to me was using color to represent party, which makes the graph more interesting to me.

Here’s recent history with nodes colored:

http://imageshack.us/photo/my-images/189/senaterecent.png/

And here’s what the whole graph looks like (I couldn’t quite match your layout):

http://imageshack.us/photo/my-images/834/senatealltime.png/

Here’s the gexf with party affiliations in it (I used the data from govtrack.us and identified each politician with his most frequent party.)

http://www.sendspace.com/file/b7ph58

Jay Dugger

November 29, 2011

Thank you for posting this graph of your work. At first glance I don’t see a substantial split for the American Civil War. That surprises me.

austin

August 15, 2012

at first glance i notice there are republicans before the republican party was founded
