Reshaping Data¶
VisiData provides several ways to reshape your data:
- Creating pivot tables
- Melting datasets
- Transposing columns and rows
Creating pivot tables¶
VisiData’s pivot tables are similar to pivot tables you might have created in various spreadsheet programs. Pivot tables create a cross-tabulation of two or more columns in a dataset.
In VisiData, creating a pivot table involves the following steps:
- Use ! to designate the column(s) you want to serve as the pivot table’s rows.
- Optional: Use + to specify additional metrics you want the pivot table to calculate. (By default, the pivot table’s sole metric will be the overall count for each grouping.)
- Navigate to the column you want to serve as the pivot table’s columns.
- Press Shift-W
Let’s say we want to cross-tabulate species by whether their remains were collected. First, let’s designate the SPECIES
column as a key column:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men SPECIES ║<STATE │ AIRPORT │ PHASE_OF_FLT │ HEIGHT │ SPEED │ BIRDS_STRUCK │ > Unknown bird ║ FL │ VERO BEACH MUNICIP…│ APPROACH │ │ │ 1 │ N Unknown bird ║ AK │ KENAI MUNICIPAL AR…│ APPROACH │ │ │ 1 │ N Unknown bird ║ TX │ DAVID WAYNE HOOKS …│ │ │ │ 1 │ N Unknown bird ║ MO │ LAMBERT-ST LOUIS I…│ APPROACH │ │ │ 1 │ N Unknown bird ║ FL │ POMPANO BEACH AIRP…│ LANDING ROLL │ 0 │ │ 1 │ Unknown bird ║ VI │ HENRY E ROHLSEN AR…│ │ │ │ 1 │ O Unknown bird ║ TX │ SAN ANTONIO INTL │ APPROACH │ │ │ 1 │ N Unknown bird ║ TX │ LONE STAR EXECUTIV…│ DEPARTURE │ │ │ 1 │ P Unknown bird ║ FL │ TAMPA INTL │ APPROACH │ 6000 │ │ 1 │ N Owls ║ MO │ LAMBERT-ST LOUIS I…│ APPROACH │ │ │ 1 │ N Hawks ║ FL │ OPA-LOCKA EXECUTIV…│ APPROACH │ │ │ 1 │ N Gulls ║ CA │ NORMAN Y. MINETA S…│ │ │ │ 1 │ N Unknown bird - sma…║ FL │ FORT LAUDERDALE/HO…│ APPROACH │ 1500 │ │ 1 │ N Unknown bird - sma…║ AR │ FORT SMITH REGIONA…│ CLIMB │ │ │ 1 │ N Unknown bird - sma…║ AR │ BILL AND HILLARY …│ LANDING ROLL │ 0 │ │ 1 │ N Unknown bird ║ │ UNKNOWN │ En Route │ │ │ 1 │ P Unknown bird ║ CA │ METRO OAKLAND INTL │ │ │ │ 1 │ N Unknown bird ║ UT │ SALT LAKE CITY INTL│ │ │ │ 1 │ N Unknown bird ║ TX │ GEORGE BUSH INTERC…│ CLIMB │ │ │ 1 │ N Unknown bird ║ FL │ ORLANDO SANFORD IN…│ APPROACH │ │ │ 1 │ N Unknown bird ║ IL │ CHICAGO O'HARE INT…│ CLIMB │ 12000 │ │ 1 │ P 1› faa-wildlife-strikes| ! key-col 73448 rows
Then, navigate to the REMAINS_COLLECTED
column, either by tapping l or the right-arrow until we get there, or by typing c followed by REMAINS
and then Enter:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men SPECIES ║<EFFECT │ DAMAGE │ COST_REPAIRS │ PERSON │ REMAINS_COLLECTED │ RE… Unknown bird ║ NONE │ M │ │ Tower │ 0 │ N9… Unknown bird ║ NONE │ M │ │ Tower │ 0 │ ON… Unknown bird ║ NONE │ M │ │ Tower │ 0 │ N9… Unknown bird ║ NONE │ M │ │ Tower │ 0 │ NO… Unknown bird ║ │ M │ │ Tower │ 0 │ ME… Unknown bird ║ Other │ M? │ │ Tower │ 0 │ DA… Unknown bird ║ NONE │ M? │ │ Tower │ 0 │ DA… Unknown bird ║ PRECAUTIONARY LAND…│ M? │ │ Tower │ 0 │ N9… Unknown bird ║ NONE │ M? │ │ Tower │ 0 │ AA… Owls ║ NONE │ M? │ │ Tower │ 0 │ LO… Hawks ║ NONE │ M? │ │ Tower │ 0 │ N5… Gulls ║ NONE │ M? │ │ Tower │ 0 │ PO… Unknown bird - sma…║ NONE │ N │ │ Tower │ 0 │ PI… Unknown bird - sma…║ NONE │ N │ │ Tower │ 0 │ AS… Unknown bird - sma…║ NONE │ N │ │ Tower │ 0 │ AS… Unknown bird ║ PRECAUTIONARY LAND…│ N │ │ Tower │ 0 │ CM… Unknown bird ║ NONE │ N │ │ Tower │ 0 │ DA… Unknown bird ║ NONE │ N │ │ Tower │ 0 │ De… Unknown bird ║ NONE │ N │ │ Tower │ 0 │ DL… Unknown bird ║ NONE │ N │ │ Tower │ 0 │ AI… Unknown bird ║ PRECAUTIONARY LAND…│ N │ │ Tower │ 0 │ Ai… 1› faa-wildlife-strikes| c go-col-regex 73448 rows
Now, press Shift+W to create the pivot table:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men SPECIES ║ 0 #│ 1 #║ Unknown bird ║ 6677 │ 393 ║ Owls ║ 36 │ 23 ║ Hawks ║ 202 │ 106 ║ Gulls ║ 559 │ 525 ║ Unknown bird - sma…║ 12544│ 1523 ║ Sparrows ║ 346 │ 542 ║ Sandhill crane ║ 15 │ 26 ║ Unknown bird - med…║ 3585 │ 322 ║ Unknown bird - lar…║ 769 │ 58 ║ Barn swallow ║ 246 │ 2838 ║ Rock pigeon ║ 149 │ 1008 ║ Barn owl ║ 22 │ 517 ║ Bank swallow ║ 14 │ 267 ║ Swallows ║ 152 │ 302 ║ Horned lark ║ 56 │ 2634 ║ Northern pintail ║ 0 │ 93 ║ American robin ║ 32 │ 644 ║ American kestrel ║ 81 │ 2506 ║ Microbats ║ 43 │ 443 ║ Mourning dove ║ 200 │ 3909 ║ Free-tailed bats ║ 1 │ 50 ║ 2› faa-wildlife-strikes_pivot_REMAINS_COLLECTED_count| Shift+W 641 grouped rows
… and g_ to auto-adjust the column widths:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men SPECIES ║ 0 #│ 1 #║ Unknown bird ║ 6677 │ 393 ║ Owls ║ 36 │ 23 ║ Hawks ║ 202 │ 106 ║ Gulls ║ 559 │ 525 ║ Unknown bird - small ║ 12544 │ 1523 ║ Sparrows ║ 346 │ 542 ║ Sandhill crane ║ 15 │ 26 ║ Unknown bird - medium ║ 3585 │ 322 ║ Unknown bird - large ║ 769 │ 58 ║ Barn swallow ║ 246 │ 2838 ║ Rock pigeon ║ 149 │ 1008 ║ Barn owl ║ 22 │ 517 ║ Bank swallow ║ 14 │ 267 ║ Swallows ║ 152 │ 302 ║ Horned lark ║ 56 │ 2634 ║ Northern pintail ║ 0 │ 93 ║ American robin ║ 32 │ 644 ║ American kestrel ║ 81 │ 2506 ║ Microbats ║ 43 │ 443 ║ Mourning dove ║ 200 │ 3909 ║ Free-tailed bats ║ 1 │ 50 ║ 2› faa-wildlife-strikes_pivot_REMAINS_COLLECTED_count g_ resize-cols-max 641 grouped rows
The rows of the pivot table represent each species, while the columns represent the number of rows for each species that fall into each REMAINS_COLLECTED
category.
This is a simple pivot table, since REMAINS_COLLECTED
can only be either 0
or 1
, but pivot tables on more complex columns can end up much wider.
Note
The order of the columns in a pivot table is based on the order the relevant values appear in the source sheet. If you want them to appear, instead, in alphabetical order, sort the source sheet’s relevant column(s) first.
Melting datasets¶
To “melt” a dataset is to reshape it from a “wide” format to a “long” one, specifically by converting each value in each column into its own row. If that concept is unfamiliar, the example below should help clarify.
Melting a dataset in VisiData involves the following steps:
- Optional: Use ! to designate the column(s) you want to keep unmelted.
- Optional: Use - to hide the columns you don’t want to appear, at all, in the melted sheet.
- Press Shift+M
If you skip the optional steps, pressing Shift-M on the original faa-wildlife-strikes.csv
dataset creates this melted sheet:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men Variable │ Value OPERATOR │ BUSINESS ATYPE │ PA-28 INCIDENT_DATE │ 05/22/15 00:00:00 STATE │ FL AIRPORT │ VERO BEACH MUNICIPAL PHASE_OF_FLT │ APPROACH HEIGHT │ SPEED │ SPECIES │ Unknown bird BIRDS_STRUCK │ 1 EFFECT │ NONE DAMAGE │ M COST_REPAIRS │ PERSON │ Tower REMAINS_COLLECTED │ 0 REMARKS │ N9240F was right base to final on Runway 4, and he reported a bird strike to … OPERATOR │ BUSINESS ATYPE │ BE-1900 INCIDENT_DATE │ 06/18/15 00:00:00 STATE │ AK AIRPORT │ KENAI MUNICIPAL ARPT 3› faa-wildlife-strikes_melted| Shift+M 1175168 melted values
Now let’s examine how the optional steps affect melting. Press q to return to the source sheet, and press ! on each of the first two columns (OPERATOR
and ATYPE
):
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men OPERATOR │ ATYPE ║ INCIDENT_DATE │ STATE │ AIRPORT │ PHASE_OF_FLT> BUSINESS │ PA-28 ║ 05/22/15 00:00:00 │ FL │ VERO BEACH MUNICIP…│ APPROACH │ BUSINESS │ BE-1900 ║ 06/18/15 00:00:00 │ AK │ KENAI MUNICIPAL AR…│ APPROACH │ BUSINESS │ PA-46 MALIBU ║ 09/20/15 00:00:00 │ TX │ DAVID WAYNE HOOKS …│ │ DELTA AIR LINES │ B-717-200 ║ 11/07/15 00:00:00 │ MO │ LAMBERT-ST LOUIS I…│ APPROACH │ BUSINESS │ BE-90 KING ║ 12/17/15 00:00:00 │ FL │ POMPANO BEACH AIRP…│ LANDING ROLL │ DELTA AIR LINES │ B-757 ║ 07/17/15 00:00:00 │ VI │ HENRY E ROHLSEN AR…│ │ DELTA AIR LINES │ B-717-200 ║ 08/02/15 00:00:00 │ TX │ SAN ANTONIO INTL │ APPROACH │ BUSINESS │ C-414 ║ 08/03/15 00:00:00 │ TX │ LONE STAR EXECUTIV…│ DEPARTURE │ ALLEGIANT AIR │ MD-80 ║ 09/02/15 00:00:00 │ FL │ TAMPA INTL │ APPROACH │ TRANS STATES AIRLI…│ EMB-145 ║ 09/07/15 00:00:00 │ MO │ LAMBERT-ST LOUIS I…│ APPROACH │ BUSINESS │ C-172 ║ 11/28/15 00:00:00 │ FL │ OPA-LOCKA EXECUTIV…│ APPROACH │ GOVERNMENT │ EC120 ║ 12/08/15 00:00:00 │ CA │ NORMAN Y. MINETA S…│ │ AMERICAN AIRLINES │ A-321 ║ 05/06/15 00:00:00 │ FL │ FORT LAUDERDALE/HO…│ APPROACH │ EXPRESSJET AIRLINES│ CRJ100/200 ║ 05/06/15 00:00:00 │ AR │ FORT SMITH REGIONA…│ CLIMB │ MESA AIRLINES │ CRJ900 ║ 05/08/15 00:00:00 │ AR │ BILL AND HILLARY …│ LANDING ROLL │ BUSINESS │ HELICOPTER ║ 05/06/15 00:00:00 │ │ UNKNOWN │ En Route │ DELTA AIR LINES │ A-320 ║ 05/07/15 00:00:00 │ CA │ METRO OAKLAND INTL │ │ DELTA AIR LINES │ A-320 ║ 05/08/15 00:00:00 │ UT │ SALT LAKE CITY INTL│ │ LUFTHANSA │ A-380 ║ 05/10/15 00:00:00 │ TX │ GEORGE BUSH INTERC…│ CLIMB │ BUSINESS │ C-172 ║ 05/08/15 00:00:00 │ FL │ ORLANDO SANFORD IN…│ APPROACH │ SPIRIT AIRLINES │ A-319 ║ 05/10/15 00:00:00 │ IL │ CHICAGO O'HARE INT…│ CLIMB │ 1› faa-wildlife-strikes| ! key-col 73448 rows
Then use - (or the Columns Sheet) to hide all the other columns except for STATE
and AIRPORT
:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men OPERATOR │ ATYPE ║ STATE │ AIRPORT ║ BUSINESS │ PA-28 ║ FL │ VERO BEACH MUNICIP…║ BUSINESS │ BE-1900 ║ AK │ KENAI MUNICIPAL AR…║ BUSINESS │ PA-46 MALIBU ║ TX │ DAVID WAYNE HOOKS …║ DELTA AIR LINES │ B-717-200 ║ MO │ LAMBERT-ST LOUIS I…║ BUSINESS │ BE-90 KING ║ FL │ POMPANO BEACH AIRP…║ DELTA AIR LINES │ B-757 ║ VI │ HENRY E ROHLSEN AR…║ DELTA AIR LINES │ B-717-200 ║ TX │ SAN ANTONIO INTL ║ BUSINESS │ C-414 ║ TX │ LONE STAR EXECUTIV…║ ALLEGIANT AIR │ MD-80 ║ FL │ TAMPA INTL ║ TRANS STATES AIRLI…│ EMB-145 ║ MO │ LAMBERT-ST LOUIS I…║ BUSINESS │ C-172 ║ FL │ OPA-LOCKA EXECUTIV…║ GOVERNMENT │ EC120 ║ CA │ NORMAN Y. MINETA S…║ AMERICAN AIRLINES │ A-321 ║ FL │ FORT LAUDERDALE/HO…║ EXPRESSJET AIRLINES│ CRJ100/200 ║ AR │ FORT SMITH REGIONA…║ MESA AIRLINES │ CRJ900 ║ AR │ BILL AND HILLARY …║ BUSINESS │ HELICOPTER ║ │ UNKNOWN ║ DELTA AIR LINES │ A-320 ║ CA │ METRO OAKLAND INTL ║ DELTA AIR LINES │ A-320 ║ UT │ SALT LAKE CITY INTL║ LUFTHANSA │ A-380 ║ TX │ GEORGE BUSH INTERC…║ BUSINESS │ C-172 ║ FL │ ORLANDO SANFORD IN…║ SPIRIT AIRLINES │ A-319 ║ IL │ CHICAGO O'HARE INT…║ 1› faa-wildlife-strikes| gh go-leftmost 73448 rows
Now press Shift-M. In the resulting melted sheet, OPERATOR
and ATYPE
(the columns you keyed with !) are preserved as standard columns while STATE
and AIRPORT
have been converted to Variable-Value
pairs:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men OPERATOR │ ATYPE ║ Variable │ Value ║ BUSINESS │ PA-28 ║ STATE │ FL ║ BUSINESS │ PA-28 ║ AIRPORT │ VERO BEACH MUNICIPAL ║ BUSINESS │ BE-1900 ║ STATE │ AK ║ BUSINESS │ BE-1900 ║ AIRPORT │ KENAI MUNICIPAL ARPT ║ BUSINESS │ PA-46 MALIBU ║ STATE │ TX ║ BUSINESS │ PA-46 MALIBU ║ AIRPORT │ DAVID WAYNE HOOKS MEMORIAL ARPT ║ DELTA AIR LINES │ B-717-200 ║ STATE │ MO ║ DELTA AIR LINES │ B-717-200 ║ AIRPORT │ LAMBERT-ST LOUIS INTL ║ BUSINESS │ BE-90 KING ║ STATE │ FL ║ BUSINESS │ BE-90 KING ║ AIRPORT │ POMPANO BEACH AIRPARK ║ DELTA AIR LINES │ B-757 ║ STATE │ VI ║ DELTA AIR LINES │ B-757 ║ AIRPORT │ HENRY E ROHLSEN ARPT ║ DELTA AIR LINES │ B-717-200 ║ STATE │ TX ║ DELTA AIR LINES │ B-717-200 ║ AIRPORT │ SAN ANTONIO INTL ║ BUSINESS │ C-414 ║ STATE │ TX ║ BUSINESS │ C-414 ║ AIRPORT │ LONE STAR EXECUTIVE ARPT ║ ALLEGIANT AIR │ MD-80 ║ STATE │ FL ║ ALLEGIANT AIR │ MD-80 ║ AIRPORT │ TAMPA INTL ║ TRANS STATES AIRLI…│ EMB-145 ║ STATE │ MO ║ TRANS STATES AIRLI…│ EMB-145 ║ AIRPORT │ LAMBERT-ST LOUIS INTL ║ BUSINESS │ C-172 ║ STATE │ FL ║ 4› faa-wildlife-strikes_melted| Shift+M 146896 melted values
Transposing columns and rows¶
In VisiData, you can press Shift-T to “transpose” any given sheet, essentially rotating the struture 90 degrees, so that the rows are represented as columns (and vice versa).
Pressing Shift-T on the original faa-wildlife-strikes.csv
dataset should give you this result:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men ║ │ │ │ > OPERATOR ║ BUSINESS │ BUSINESS │ BUSINESS │ DELTA AIR LINES ATYPE ║ PA-28 │ BE-1900 │ PA-46 MALIBU │ B-717-200 INCIDENT_DATE ║ 05/22/15 00:00:00 │ 06/18/15 00:00:00 │ 09/20/15 00:00:00 │ 11/07/15 00:00… STATE ║ FL │ AK │ TX │ MO AIRPORT ║ VERO BEACH MUNICIP…│ KENAI MUNICIPAL AR…│ DAVID WAYNE HOOKS …│ LAMBERT-ST LOU… PHASE_OF_FLT ║ APPROACH │ APPROACH │ │ APPROACH HEIGHT ║ │ │ │ SPEED ║ │ │ │ SPECIES ║ Unknown bird │ Unknown bird │ Unknown bird │ Unknown bird BIRDS_STRUCK ║ 1 │ 1 │ 1 │ 1 EFFECT ║ NONE │ NONE │ NONE │ NONE DAMAGE ║ M │ M │ M │ M COST_REPAIRS ║ │ │ │ PERSON ║ Tower │ Tower │ Tower │ Tower REMAINS_COLLECTED ║ 0 │ 0 │ 0 │ 0 REMARKS ║ N9240F was right b…│ ON FINAL, PILOT RE…│ N952G, P46T/G, bir…│ NO EMERGENCY: … 5› faa-wildlife-strikes_T| processing… Shift+T 16 rows
If your source sheet has a key column, the values in that column will become the headers for the transposed sheet. For instance, here’s the frequency table (with Shift-F) for the dataset’s OPERATOR
column:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men OPERATOR ║↓count♯│ percent%│ histogram ~║ UNKNOWN ║ 23076 │ 31.42 │ ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ ║ SOUTHWEST AIRLINES ║ 7752 │ 10.55 │ ■■■■■■■■■■■■ ║ BUSINESS ║ 5868 │ 7.99 │ ■■■■■■■■■ ║ AMERICAN AIRLINES ║ 4337 │ 5.90 │ ■■■■■■■ ║ DELTA AIR LINES ║ 2817 │ 3.84 │ ■■■■ ║ FEDEX EXPRESS ║ 2709 │ 3.69 │ ■■■■ ║ UNITED AIRLINES ║ 2194 │ 2.99 │ ■■■ ║ US AIRWAYS ║ 1885 │ 2.57 │ ■■■ ║ UPS AIRLINES ║ 1773 │ 2.41 │ ■■ ║ SKYWEST AIRLINES ║ 1769 │ 2.41 │ ■■ ║ JETBLUE AIRWAYS ║ 1740 │ 2.37 │ ■■ ║ EXPRESSJET AIRLINES║ 1347 │ 1.83 │ ■■ ║ AMERICAN EAGLE AIR…║ 1041 │ 1.42 │ ■ ║ ENVOY AIR ║ 883 │ 1.20 │ ■ ║ ALASKA AIRLINES ║ 835 │ 1.14 │ ■ ║ REPUBLIC AIRLINES ║ 804 │ 1.09 │ ■ ║ MESA AIRLINES ║ 693 │ 0.94 │ ■ ║ AIR WISCONSIN AIRL…║ 623 │ 0.85 │ ■ ║ PSA AIRLINES ║ 577 │ 0.79 │ ║ PRIVATELY OWNED ║ 516 │ 0.70 │ ║ PHI INC ║ 491 │ 0.67 │ ║ 6› faa-wildlife-strikes_OPERATOR_freq| Shift+F 282 bins
It has OPERATOR
as its key column, so transposing this sheet should result in something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men OPERATOR ║ UNKNOWN │ SOUTHWEST AIRLINES │ BUSINESS │ AMERICAN AIRLINES │ D> count ║ [23076] [16] UNKNO…│ [7752] [16] SOUTHW…│ [5868] [16] BUSINE…│ [4337] [16] AMERIC…│ [… percent ║ 31.41814617144102 %│ 10.554405838143992%│ 7.989325781505283 %│ 5.904857858621066 %│ 3…% histogram ║ ■■■■■■■■■■■■■■■■■■…│ ■■■■■■■■■■■■ │ ■■■■■■■■■ │ ■■■■■■■ │ ■… 7› faa-wildlife-strikes_OPERATOR_freq_T| Shift+T 3 rows
Note
If your source sheet has multiple key columns, VisiData will join together the columns’ values with the _
character to create the header names.