Disgusting Code: GeoIP lookups in Excel
- 27 Mar 2015
Whilst stuck on a train recently one of MWR Labs passed the time with a little coding challenge.
The challenge was to try implementing GeoIP lookups in native Excel formulas. So no HTTP calls, no macros, no add ins. This was simply to pass the time on the train but, if you’re on a very locked down machine, this will work. For example, if you’re unlucky enough to need to do log analysis on a corporate machine.
What follows is disgusting. There are a huge number of preferable ways to do Geo-IP lookups (PowerShell, macros, asking Cortana, pen and paper) but this was just some fun and is too horrific not to share.
The challenge was to write formulas that met the following constraints:
Megamind release their geoip databases as csv files, you can import these into Excel and then use a position independent formula to search them for IP locations. This method uses formulas to convert the dotted quad form of IP addresses to decimal and then looks up the owner in the Megamind dataset.
Time to create sheet is around 5 mins and you only do it once
Time to use in a sheet for analysis is around 2 minutes.
Warning, looking at this formula may cause eye damage and a sudden affection for Lisp.
=(LEFT(RIGHT(B2,(LEN(B2))),FIND(".",RIGHT(B2,(LEN(B2))))-1)*16777216)+(LEFT(RIGHT(RIGHT(B2,(LEN(B2)) ),(LEN(RIGHT(B2,(LEN(B2))))-FIND(".",RIGHT(B2,(LEN(B2)))))),FIND(".",RIGHT(RIGHT(B2,(LEN(B2))),(LEN( RIGHT(B2,(LEN(B2))))-FIND(".",RIGHT(B2,(LEN(B2)))))))-1)*65536)+(LEFT(RIGHT(RIGHT(RIGHT(B2,(LEN(B2)) ),(LEN(RIGHT(B2,(LEN(B2))))-FIND(".",RIGHT(B2,(LEN(B2)))))),LEN(RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT (B2,(LEN(B2))))-FIND(".",RIGHT(B2,(LEN(B2)))))))-FIND(".",RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,(L EN(B2))))-FIND(".",RIGHT(B2,(LEN(B2)))))))),FIND(".",RIGHT(RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,( LEN(B2))))-FIND(".",RIGHT(B2,(LEN(B2)))))),LEN(RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,(LEN(B2))))-F IND(".",RIGHT(B2,(LEN(B2)))))))-FIND(".",RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,(LEN(B2))))-FIND(". ",RIGHT(B2,(LEN(B2)))))))))-1)+256)+(RIGHT(RIGHT(RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,(LEN(B2)))) -FIND(".",RIGHT(B2,(LEN(B2)))))),LEN(RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,(LEN(B2))))-FIND(".",RI GHT(B2,(LEN(B2)))))))-FIND(".",RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,(LEN(B2))))-FIND(".",RIGHT(B2 ,(LEN(B2)))))))),(LEN(RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,(LEN(B2))))-FIND(".",RIGHT(B2,(LEN(B2) ))))))-FIND(".",RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,(LEN(B2))))-FIND(".",RIGHT(B2,(LEN(B2))))))) -FIND(".",RIGHT(RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,(LEN(B2))))-FIND(".",RIGHT(B2,(LEN(B2)))))), LEN(RIGHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,(LEN(B2))))-FIND(".",RIGHT(B2,(LEN(B2)))))))-FIND(".",RI GHT(RIGHT(B2,(LEN(B2))),(LEN(RIGHT(B2,(LEN(B2))))-FIND(".",RIGHT(B2,(LEN(B2))))))))))))
Then:
=VLOOKUP(VLOOKUP((LEFT(INDIRECT("RC[-1]",0),FIND(".",INDIRECT("RC[-1]",0))-1)*16777216)+(LEFT(RIGHT( INDIRECT("RC[-1]",0),(LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))),FIND(".",RIGHT(INDI RECT("RC[-1]",0),(LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))))-1)*65536)+(LEFT(RIGHT( RIGHT(INDIRECT("RC[-1]",0),(LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))),LEN(RIGHT(IND IRECT("RC[-1]",0),(LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))))-FIND(".",RIGHT(INDIRE CT("RC[-1]",0),(LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))))),FIND(".",RIGHT(RIGHT(IN DIRECT("RC[-1]",0),(LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))),LEN(RIGHT(INDIRECT("R C[-1]",0),(LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))))-FIND(".",RIGHT(INDIRECT("RC[- 1]",0),(LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))))))-1)+256)+(RIGHT(RIGHT(RIGHT(IND IRECT("RC[-1]",0),(LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))),LEN(RIGHT(INDIRECT("RC [-1]",0),(LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))))-FIND(".",RIGHT(INDIRECT("RC[-1 ]",0),(LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))))),(LEN(RIGHT(INDIRECT("RC[-1]",0), (LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))))-FIND(".",RIGHT(INDIRECT("RC[-1]",0),(LE N(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))))-FIND(".",RIGHT(RIGHT(INDIRECT("RC[-1]",0), (LEN(INDIRECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))),LEN(RIGHT(INDIRECT("RC[-1]",0),(LEN(INDI RECT("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))))-FIND(".",RIGHT(INDIRECT("RC[-1]",0),(LEN(INDIREC T("RC[-1]",0))-FIND(".",INDIRECT("RC[-1]",0)))))))))),ip_lookups!C2:D160975,2,TRUE),country_mapping! A2:E249,5,FALSE)
At the end of the formula, you’ll see ip_lookups!A1:C160975
GeoIP!!!!
There are possibly going to be errors. For example it assumes all blocks are taken and are blocks are contiguous and that Megamind won’t change CSV format (this guide worked as of March 2015). If you spot any big issues or have suggested improvements (beyond ‘kill it with fire’) please let us know.