
Example: White House Visitor Access Records

timrdf edited this page Sep 23, 2011 · 60 revisions

http://www.whitehouse.gov/briefing-room/disclosures/visitor-records has been providing links to data files describing visits to the White House. Those links have changed over the years, and the corresponding data provided has also changed. To deal with these changes, we cached the data to create the following versions (listed according to date of retrieval). The seven versions of the dataset are available from RPI's SVN.

  • 0310
  • 0510
  • 0810
  • 0910
  • 2009-2010
  • 0511
  • 2011-Aug-26

Version: 0310

This is the earliest version that we saved. We don't know exactly where we got it, other than that we followed the link that was listed at http://www.whitehouse.gov/briefing-room/disclosures/visitor-records. We weren't capturing any provenance at the time.

The headers in the data file are listed below. We're pretty sure that WhiteHouse-WAVES-Key-1209.txt is a copy of the documentation available in March 2010.

NAMELAST
NAMEFIRST
NAMEMID
UIN
BDGNBR
ACCESS_TYPE
TOA
POA
TOD
POD
APPT_MADE_DATE
APPT_START_DATE
APPT_END_DATE
APPT_CANCEL_DATE
Total_People
LAST_UPDATEDBY
POST
LastEntryDate
TERMINAL_SUFFIX
visitee_namelast
visitee_namefirst
MEETING_LOC
MEETING_ROOM
CALLER_NAME_LAST
CALLER_NAME_FIRST
CALLER_ROOM
description
RELEASE_DATE

Version: 0510

We retrieved http://www.whitehouse.gov/files/disclosures/visitors/WhiteHouse-WAVES-Released-0510.csv on 2010-07-08 and got this.

When comparing the 0310 headers to the 0510 headers, the only change is a capitalization/underscore tweak (RELEASE_DATE became Release Date):

NAMELAST								NAMELAST
NAMEFIRST								NAMEFIRST
NAMEMID									NAMEMID
UIN									UIN
BDGNBR									BDGNBR
ACCESS_TYPE								ACCESS_TYPE
TOA									TOA
POA									POA
TOD									TOD
POD									POD
APPT_MADE_DATE								APPT_MADE_DATE
APPT_START_DATE								APPT_START_DATE
APPT_END_DATE								APPT_END_DATE
APPT_CANCEL_DATE							APPT_CANCEL_DATE
Total_People								Total_People
LAST_UPDATEDBY								LAST_UPDATEDBY
POST									POST
LastEntryDate								LastEntryDate
TERMINAL_SUFFIX								TERMINAL_SUFFIX
visitee_namelast							visitee_namelast
visitee_namefirst							visitee_namefirst
MEETING_LOC								MEETING_LOC
MEETING_ROOM								MEETING_ROOM
CALLER_NAME_LAST							CALLER_NAME_LAST
CALLER_NAME_FIRST							CALLER_NAME_FIRST
CALLER_ROOM								CALLER_ROOM
description								description
RELEASE_DATE							    |	Release Date
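
The side-by-side listing above flags a header whose spelling changed while its meaning apparently did not. A minimal sketch of how such renames could be detected automatically is shown here. The function names `normalize` and `renamed_headers` are hypothetical (they are not part of our tooling), and the sketch assumes both header lists have the same length and column order.

```python
def normalize(header):
    """Lowercase and drop spaces/underscores so naming tweaks compare equal."""
    return header.strip().lower().replace("_", "").replace(" ", "")


def renamed_headers(old, new):
    """Pair headers by position and return those that differ only in spelling.

    Assumes (for illustration) that old and new list the same columns
    in the same order.
    """
    return [
        (o, n)
        for o, n in zip(old, new)
        if o != n and normalize(o) == normalize(n)
    ]


# The last three headers of 0310 and 0510, as listed above.
old = ["CALLER_ROOM", "description", "RELEASE_DATE"]
new = ["CALLER_ROOM", "description", "Release Date"]
renamed = renamed_headers(old, new)
print(renamed)
```

The same normalization would also treat the later description/Description change as a rename rather than as a new column.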

Version: 0810

Although the data file's name contains the string 0827 (presumably a month and day), we used 0810 to be consistent with the "monthyear" "convention" that they had been following. Having set a path with the first steps, they changed it on the next step of the journey!

We retrieved http://www.whitehouse.gov/files/disclosures/visitors/WhiteHouse-WAVES-Released-0827.csv on 2010-09-12 and got this.

When comparing the headers from 0510 to the headers of 0810, we see that they reverted to the older CAPS_UNDERSCORE naming for Release Date and decided that a capital Description looked nicer. It looks like we didn't save a copy of the header documentation this time, probably because the headers didn't change enough.

NAMELAST								NAMELAST
NAMEFIRST								NAMEFIRST
NAMEMID									NAMEMID
UIN									UIN
BDGNBR									BDGNBR
ACCESS_TYPE								ACCESS_TYPE
TOA									TOA
POA									POA
TOD									TOD
POD									POD
APPT_MADE_DATE								APPT_MADE_DATE
APPT_START_DATE								APPT_START_DATE
APPT_END_DATE								APPT_END_DATE
APPT_CANCEL_DATE							APPT_CANCEL_DATE
Total_People								Total_People
LAST_UPDATEDBY								LAST_UPDATEDBY
POST									POST
LastEntryDate								LastEntryDate
TERMINAL_SUFFIX								TERMINAL_SUFFIX
visitee_namelast							visitee_namelast
visitee_namefirst							visitee_namefirst
MEETING_LOC								MEETING_LOC
MEETING_ROOM								MEETING_ROOM
CALLER_NAME_LAST							CALLER_NAME_LAST
CALLER_NAME_FIRST							CALLER_NAME_FIRST
CALLER_ROOM								CALLER_ROOM
description							    |	Description
Release Date							    |	RELEASE_DATE

Version: 0910

I really wish I knew who thought, asked, and executed the following thoughts:

No changes to the headers (once you deal with the tabs):

NAMELAST								NAMELAST
NAMEFIRST								NAMEFIRST
NAMEMID									NAMEMID
UIN									UIN
BDGNBR									BDGNBR
ACCESS_TYPE								ACCESS_TYPE
TOA									TOA
POA									POA
TOD									TOD
POD									POD
APPT_MADE_DATE								APPT_MADE_DATE
APPT_START_DATE								APPT_START_DATE
APPT_END_DATE								APPT_END_DATE
APPT_CANCEL_DATE							APPT_CANCEL_DATE
Total_People								Total_People
LAST_UPDATEDBY								LAST_UPDATEDBY
POST									POST
LastEntryDate								LastEntryDate
TERMINAL_SUFFIX								TERMINAL_SUFFIX
visitee_namelast							visitee_namelast
visitee_namefirst							visitee_namefirst
MEETING_LOC								MEETING_LOC
MEETING_ROOM								MEETING_ROOM
CALLER_NAME_LAST							CALLER_NAME_LAST
CALLER_NAME_FIRST							CALLER_NAME_FIRST
CALLER_ROOM								CALLER_ROOM
Description								Description
RELEASE_DATE								RELEASE_DATE

Version: 2009-2010

We retrieved http://www.whitehouse.gov/files/disclosures/visitors/WhiteHouse-WAVES-Released-1210.zip on 2010-12-29 and got this, which uncompressed to this.

This appears to be some aggregate of the previous releases.

No header changes from 0910:

NAMELAST								NAMELAST
NAMEFIRST								NAMEFIRST
NAMEMID									NAMEMID
UIN									UIN
BDGNBR									BDGNBR
ACCESS_TYPE								ACCESS_TYPE
TOA									TOA
POA									POA
TOD									TOD
POD									POD
APPT_MADE_DATE								APPT_MADE_DATE
APPT_START_DATE								APPT_START_DATE
APPT_END_DATE								APPT_END_DATE
APPT_CANCEL_DATE							APPT_CANCEL_DATE
Total_People								Total_People
LAST_UPDATEDBY								LAST_UPDATEDBY
POST									POST
LastEntryDate								LastEntryDate
TERMINAL_SUFFIX								TERMINAL_SUFFIX
visitee_namelast							visitee_namelast
visitee_namefirst							visitee_namefirst
MEETING_LOC								MEETING_LOC
MEETING_ROOM								MEETING_ROOM
CALLER_NAME_LAST							CALLER_NAME_LAST
CALLER_NAME_FIRST							CALLER_NAME_FIRST
CALLER_ROOM								CALLER_ROOM
Description								Description
RELEASE_DATE								RELEASE_DATE

Version: 0511

We retrieved http://www.whitehouse.gov/files/disclosures/visitors/WhiteHouse-WAVES-Released-0511.zip on 2011-05-27 and got this, which uncompressed to this.

Government censorship! They removed CALLER_ROOM in this release.

NAMELAST								NAMELAST
NAMEFIRST								NAMEFIRST
NAMEMID									NAMEMID
UIN									UIN
BDGNBR									BDGNBR
ACCESS_TYPE								ACCESS_TYPE
TOA									TOA
POA									POA
TOD									TOD
POD									POD
APPT_MADE_DATE								APPT_MADE_DATE
APPT_START_DATE								APPT_START_DATE
APPT_END_DATE								APPT_END_DATE
APPT_CANCEL_DATE							APPT_CANCEL_DATE
Total_People								Total_People
LAST_UPDATEDBY								LAST_UPDATEDBY
POST									POST
LastEntryDate								LastEntryDate
TERMINAL_SUFFIX								TERMINAL_SUFFIX
visitee_namelast							visitee_namelast
visitee_namefirst							visitee_namefirst
MEETING_LOC								MEETING_LOC
MEETING_ROOM								MEETING_ROOM
CALLER_NAME_LAST							CALLER_NAME_LAST
CALLER_NAME_FIRST							CALLER_NAME_FIRST
CALLER_ROOM							    <
Description								Description
RELEASE_DATE								RELEASE_DATE
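
A dropped column like this one is easy to miss when releases are loaded by column position. The following is a minimal sketch of a check that reports removed and added columns between two header lists while ignoring capitalization and underscore tweaks. The names `normalize` and `column_changes` are hypothetical and only illustrate the idea.

```python
def normalize(header):
    """Lowercase and drop spaces/underscores so naming tweaks compare equal."""
    return header.strip().lower().replace("_", "").replace(" ", "")


def column_changes(old, new):
    """Return (removed, added) headers, comparing normalized names."""
    old_keys = {normalize(h) for h in old}
    new_keys = {normalize(h) for h in new}
    removed = [h for h in old if normalize(h) not in new_keys]
    added = [h for h in new if normalize(h) not in old_keys]
    return removed, added


# The last four headers of 2009-2010 and 0511, as listed above.
removed, added = column_changes(
    ["CALLER_NAME_FIRST", "CALLER_ROOM", "Description", "RELEASE_DATE"],
    ["CALLER_NAME_FIRST", "Description", "RELEASE_DATE"],
)
print("removed:", removed, "added:", added)
```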

Version: 2011-Aug-26

On 2011-09-14, http://www.whitehouse.gov/briefing-room/disclosures/visitor-records said:

  • "To download Part 1 of the data released in 2011 in its raw format, click here. (.zip of a .csv, 7.4MB)"
  • "To download Part 2 of the data released in 2011 in its raw format, click here. (.zip of a .csv, 3.4MB)"
  • "To download an explanation of the column headers contained in the raw data file, click here. (.txt, 1.3KB)"

This version was named 2011-Aug-26 for the following reasons:

Cached versions of the URLs referenced above are available at:

The headers of Part 1 and Part 2 are inconsistent:

0 NAMELAST								0 NAMELAST
1 NAMEFIRST								1 NAMEFIRST
2 NAMEMID								2 NAMEMID
3 UIN									3 UIN
4 BDGNBR								4 BDGNBR
5 ACCESS_TYPE								5 ACCESS_TYPE
6 TOA									6 TOA
7 POA									7 POA
8 TOD									8 TOD
9 POD									9 POD
10 APPT_MADE_DATE						10 APPT_MADE_DATE
11 APPT_START_DATE						11 APPT_START_DATE
12 APPT_END_DATE						12 APPT_END_DATE
13 APPT_CANCEL_DATE						13 APPT_CANCEL_DATE
14 Total_People							14 Total_People
15 LAST_UPDATEDBY						15 LAST_UPDATEDBY
16 POST									16 POST
17 LastEntryDate							17 LastEntryDate
18 TERMINAL_SUFFIX							18 TERMINAL_SUFFIX
19 visitee_namelast							19 visitee_namelast
20 visitee_namefirst							20 visitee_namefirst
21 MEETING_LOC								21 MEETING_LOC
22 MEETING_ROOM								22 MEETING_ROOM
23 CALLER_NAME_LAST							23 CALLER_NAME_LAST
24 CALLER_NAME_FIRST							24 CALLER_NAME_FIRST
25 Description							    |	25 CALLER_ROOM
26 RELEASE_DATE							    |	26 description
27 								    |	27 release_date
28 									28
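
Because the two parts disagree on column positions and header capitalization, reading them by position would misalign the last few fields. One way to cope, sketched here under stated assumptions, is to key each row by a canonicalized header name and to ignore the empty trailing column. The function names `canonical` and `read_rows` are hypothetical, and the sample rows are made up for illustration; they are not taken from the released files.

```python
import csv
import io


def canonical(header):
    """Upper-case a header and use underscores, e.g. 'release_date' -> 'RELEASE_DATE'."""
    return header.strip().upper().replace(" ", "_")


def read_rows(text_file):
    """Yield one dict per data row, keyed by canonical header name.

    Columns whose header is empty (such as the trailing one above) are skipped.
    """
    reader = csv.reader(text_file)
    headers = [canonical(h) for h in next(reader)]
    for row in reader:
        yield {h: v for h, v in zip(headers, row) if h}


# Made-up miniature stand-ins for the two parts (not real data).
part_a = io.StringIO("NAMELAST,Description,RELEASE_DATE,\nDoe,Tour,08/26/2011,\n")
part_b = io.StringIO("NAMELAST,CALLER_ROOM,description,release_date,\nRoe,,Tour,08/26/2011,\n")

rows = list(read_rows(part_a)) + list(read_rows(part_b))
for row in rows:
    print(row.get("DESCRIPTION"), row.get("RELEASE_DATE"), row.get("CALLER_ROOM"))
```

Rows from the part without CALLER_ROOM simply lack that key, so `row.get("CALLER_ROOM")` returns `None` for them instead of silently picking up the description.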
