Skip to content

Characterizing table completeness

timrdf edited this page May 1, 2011 · 27 revisions
bash-3.2$ java edu.rpi.tw.data.csv.util.BinaryTable
usage: BinaryTable <file> [--comment-character char] [--header-line headerLineNumber] [--delimiter delimiter]

Column numbers along top, pattern occurrence frequency along right, completeness indication along bottom (for geonames US zip codes):

bash-3.2$ java edu.rpi.tw.data.csv.util.BinaryTable source/US.txt --header-line 0 --delimiter '\t'
123456789012
.....    ...| 3
.......  .. | 41940
..... .  .. | 1
... .    .. | 4
.......  ...| 147
... ...  .. | 10
.....    .. | 1408
......   .. | 84
... . .  .. | 5
|||_|____||_

See also

Clone this wiki locally