Distinctive Birds¶
This chapter uses VisiData to answer this question: What birds account for the largest percentages of their state’s collisions?
In the process, we’ll cover the following skills:
- Selecting rows that match a phrase
- Toggling selected rows
- Creating filtered sheets
- Renaming sheets and columns
- Creating frequency tables
- Creating and removing key columns
- Creating new columns
- Joining sheets
Note
If you haven’t already, download faa-wildlife-strikes.csv
, a dataset of all aircraft-wildlife collisions reported to the Federal Aviation Adminsitration between 2010 and mid-2016.
Open the wildlife-strikes dataset in VisiData¶
Run this command in your terminal:
vd faa-wildlife-strikes.csv
If it worked, you should see something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men OPERATOR │ ATYPE │ INCIDENT_DATE │ STATE │ AIRPORT │ PHASE_OF_FLT> BUSINESS │ PA-28 │ 05/22/15 00:00:00 │ FL │ VERO BEACH MUNICIP…│ APPROACH │ BUSINESS │ BE-1900 │ 06/18/15 00:00:00 │ AK │ KENAI MUNICIPAL AR…│ APPROACH │ BUSINESS │ PA-46 MALIBU │ 09/20/15 00:00:00 │ TX │ DAVID WAYNE HOOKS …│ │ DELTA AIR LINES │ B-717-200 │ 11/07/15 00:00:00 │ MO │ LAMBERT-ST LOUIS I…│ APPROACH │ BUSINESS │ BE-90 KING │ 12/17/15 00:00:00 │ FL │ POMPANO BEACH AIRP…│ LANDING ROLL │ DELTA AIR LINES │ B-757 │ 07/17/15 00:00:00 │ VI │ HENRY E ROHLSEN AR…│ │ DELTA AIR LINES │ B-717-200 │ 08/02/15 00:00:00 │ TX │ SAN ANTONIO INTL │ APPROACH │ BUSINESS │ C-414 │ 08/03/15 00:00:00 │ TX │ LONE STAR EXECUTIV…│ DEPARTURE │ ALLEGIANT AIR │ MD-80 │ 09/02/15 00:00:00 │ FL │ TAMPA INTL │ APPROACH │ TRANS STATES AIRLI…│ EMB-145 │ 09/07/15 00:00:00 │ MO │ LAMBERT-ST LOUIS I…│ APPROACH │ BUSINESS │ C-172 │ 11/28/15 00:00:00 │ FL │ OPA-LOCKA EXECUTIV…│ APPROACH │ GOVERNMENT │ EC120 │ 12/08/15 00:00:00 │ CA │ NORMAN Y. MINETA S…│ │ AMERICAN AIRLINES │ A-321 │ 05/06/15 00:00:00 │ FL │ FORT LAUDERDALE/HO…│ APPROACH │ EXPRESSJET AIRLINES│ CRJ100/200 │ 05/06/15 00:00:00 │ AR │ FORT SMITH REGIONA…│ CLIMB │ MESA AIRLINES │ CRJ900 │ 05/08/15 00:00:00 │ AR │ BILL AND HILLARY …│ LANDING ROLL │ BUSINESS │ HELICOPTER │ 05/06/15 00:0┌──────────────────────────────────| statuses |─┐│ DELTA AIR LINES │ A-320 │ 05/07/15 00:0│ saul.pw/VisiData v3.0.2 ││ DELTA AIR LINES │ A-320 │ 05/08/15 00:0│ opening datasets/faa-wildlife-strikes.csv as ││ LUFTHANSA │ A-380 │ 05/10/15 00:0│ csv ││ BUSINESS │ C-172 │ 05/08/15 00:0│ Don't panic. ││ SPIRIT AIRLINES │ A-319 │ 05/10/15 00:0└───────────────────────────────────────────────┘│ 1› faa-wildlife-strikes| 73448 rows
Select only known species¶
For many of the wildlife strikes in the dataset, species is “unknown”. We want to focus just on the known species, so we’ll filter out the unknowns in this step.
First, navigate to the SPECIES
column. Then, do the following:
- Press | to raise the select-by-search prompt
- Type
unknown
- Press Enter
Once you do that, you should see something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men <STATE │ AIRPORT │ PHASE_OF_FLT │ HEIGHT │ SPEED │ SPECIES │ BIRDS_STRUCK │ > •FL │ VERO BEACH MUNICIP…│ APPROACH │ │ │ Unknown bird │ 1 │ N •AK │ KENAI MUNICIPAL AR…│ APPROACH │ │ │ Unknown bird │ 1 │ N •TX │ DAVID WAYNE HOOKS …│ │ │ │ Unknown bird │ 1 │ N •MO │ LAMBERT-ST LOUIS I…│ APPROACH │ │ │ Unknown bird │ 1 │ N •FL │ POMPANO BEACH AIRP…│ LANDING ROLL │ 0 │ │ Unknown bird │ 1 │ •VI │ HENRY E ROHLSEN AR…│ │ │ │ Unknown bird │ 1 │ O •TX │ SAN ANTONIO INTL │ APPROACH │ │ │ Unknown bird │ 1 │ N •TX │ LONE STAR EXECUTIV…│ DEPARTURE │ │ │ Unknown bird │ 1 │ P •FL │ TAMPA INTL │ APPROACH │ 6000 │ │ Unknown bird │ 1 │ N MO │ LAMBERT-ST LOUIS I…│ APPROACH │ │ │ Owls │ 1 │ N FL │ OPA-LOCKA EXECUTIV…│ APPROACH │ │ │ Hawks │ 1 │ N CA │ NORMAN Y. MINETA S…│ │ │ │ Gulls │ 1 │ N •FL │ FORT LAUDERDALE/HO…│ APPROACH │ 1500 │ │ Unknown bird - sma…│ 1 │ N •AR │ FORT SMITH REGIONA…│ CLIMB │ │ │ Unknown bird - sma…│ 1 │ N •AR │ BILL AND HILLARY …│ LANDING ROLL │ 0 │ │ Unknown bird - sma…│ 1 │ N • │ UNKNOWN │ En Route │ │ │ Unknown bird │ 1 │ P •CA │ METRO OAKLAND INTL │ │ │ ┌─────────────────────────────| statuses |─┐ •UT │ SALT LAKE CITY INTL│ │ │ │ search wrapped │ •TX │ GEORGE BUSH INTERC…│ CLIMB │ │ │ 25988 matches for /unknown/ │ •FL │ ORLANDO SANFORD IN…│ APPROACH │ │ │ selected 25988 rows │ •IL │ CHICAGO O'HARE INT…│ CLIMB │ 12000 │ └──────────────────────────────────────────┘ 1› faa-wildlife-strikes| | select-col-regex 73448 rows •25988
Now, all the unknown species are selected. But we want the opposite of that: only known species selected. To do that, let’s toggle the selection-ing for all rows, by typing gt (mnemonic: “global toggle”). Once you do that, you should see something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men <STATE │ AIRPORT │ PHASE_OF_FLT │ HEIGHT │ SPEED │ SPECIES │ BIRDS_STRUCK │ > FL │ VERO BEACH MUNICIP…│ APPROACH │ │ │ Unknown bird │ 1 │ N AK │ KENAI MUNICIPAL AR…│ APPROACH │ │ │ Unknown bird │ 1 │ N TX │ DAVID WAYNE HOOKS …│ │ │ │ Unknown bird │ 1 │ N MO │ LAMBERT-ST LOUIS I…│ APPROACH │ │ │ Unknown bird │ 1 │ N FL │ POMPANO BEACH AIRP…│ LANDING ROLL │ 0 │ │ Unknown bird │ 1 │ VI │ HENRY E ROHLSEN AR…│ │ │ │ Unknown bird │ 1 │ O TX │ SAN ANTONIO INTL │ APPROACH │ │ │ Unknown bird │ 1 │ N TX │ LONE STAR EXECUTIV…│ DEPARTURE │ │ │ Unknown bird │ 1 │ P FL │ TAMPA INTL │ APPROACH │ 6000 │ │ Unknown bird │ 1 │ N •MO │ LAMBERT-ST LOUIS I…│ APPROACH │ │ │ Owls │ 1 │ N •FL │ OPA-LOCKA EXECUTIV…│ APPROACH │ │ │ Hawks │ 1 │ N •CA │ NORMAN Y. MINETA S…│ │ │ │ Gulls │ 1 │ N FL │ FORT LAUDERDALE/HO…│ APPROACH │ 1500 │ │ Unknown bird - sma…│ 1 │ N AR │ FORT SMITH REGIONA…│ CLIMB │ │ │ Unknown bird - sma…│ 1 │ N AR │ BILL AND HILLARY …│ LANDING ROLL │ 0 │ │ Unknown bird - sma…│ 1 │ N │ UNKNOWN │ En Route │ │ │ Unknown bird │ 1 │ P CA │ METRO OAKLAND INTL │ │ │ │ Unknown bird │ 1 │ N UT │ SALT LAKE CITY INTL│ │ │ │ Unknown bird │ 1 │ N TX │ GEORGE BUSH INTERC…│ CLIMB │ │ │ Unknown bird │ 1 │ N FL │ ORLANDO SANFORD IN…│ APPROACH │ │ │ Unknown bird │ 1 │ N IL │ CHICAGO O'HARE INT…│ CLIMB │ 12000 │ │ Unknown bird │ 1 │ P 1› faa-wildlife-strikes| gt stoggle-rows 73448 rows •47460
Now that we’ve selected our desired rows, let’s create a new sheet containing only those rows, by pressing ". The result should look something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men <STATE │ AIRPORT │ PHASE_OF_FLT │ HEIGHT │ SPEED │ SPECIES │ BIRDS_STRUCK │ > MO │ LAMBERT-ST LOUIS I…│ APPROACH │ │ │ Owls │ 1 │ N FL │ OPA-LOCKA EXECUTIV…│ APPROACH │ │ │ Hawks │ 1 │ N CA │ NORMAN Y. MINETA S…│ │ │ │ Gulls │ 1 │ N TX │ GEORGE BUSH INTERC…│ LANDING ROLL │ 0 │ │ Sparrows │ 1 │ N FL │ VERO BEACH MUNICIP…│ DEPARTURE │ │ │ Sandhill crane │ 1 │ N CA │ ONTARIO INTL ARPT │ │ │ │ Owls │ 1 │ N TN │ MC GHEE TYSON ARPT │ APPROACH │ │ │ Barn swallow │ 1 │ N SC │ CHARLESTON AFB/INT…│ DEPARTURE │ │ │ Gulls │ 1 │ N CA │ WHITEMAN AIRPORT │ LANDING ROLL │ 0 │ │ Rock pigeon │ 1 │ N WA │ SEATTLE-TACOMA INTL│ │ │ │ Barn owl │ 1 │ N HI │ KAHULUI ARPT │ LANDING ROLL │ 0 │ │ Gulls │ 1 │ N TX │ CORPUS CHRISTI INT…│ TAKE-OFF RUN │ 0 │ │ Gulls │ 1 │ N FL │ TALLAHASSEE REGION…│ APPROACH │ │ │ Gulls │ 2-10 │ N TX │ JACK BROOKS REGION…│ APPROACH │ │ │ Owls │ 1 │ N PA │ CAPITAL CITY ARPT …│ LANDING ROLL │ 0 │ │ Bank swallow │ 1 │ N AL │ MOBILE REGIONAL │ APPROACH │ 2000 │ 200 │ Gulls │ 1 │ N PA │ HARRISBURG INTL │ DEPARTURE │ │ │ Swallows │ 1 │ N CT │ BRADLEY INTL │ │ │ │ Horned lark │ 2-10 │ MA │ GENERAL EDWARD LAW…│ │ │ │ Horned lark │ 1 │ CA │ SACRAMENTO INTL │ Climb │ 1500 │ 210 │ Northern pintail │ 2-10 │ P OH │ RICKENBACKER INTL │ │ │ │ American robin │ 1 │ 2› faa-wildlife-strikes_selectedref| " 47460 rows
Rename the filtered sheet¶
By default, our sheet will be titled “faa-wildlife-strikes_selectedref”. To make it slightly easier to distinguish from other sheets, let’s rename it. To rename a sheet, do the following:
- Press Space to raise the type-a-command prompt
- Type
rename-sheet
(the command we want to use) and press Enter - At the next prompt, type the new name we want; in this case
known_species
At this point, you should see something like this:
MA │ GENERAL EDWARD LAW…│ │ │ │ Horned lark │ 1 │ CA │ SACRAMENTO INTL │ Climb │ 1500 │ 210 │ Northern pintail │ 2-10 │ P OH │ RICKENBACKER INTL │ │ │ │ American robin │ 1 │ rename sheet to: known_species Space rename-sheet 47460 rows
When you’ve entered the name, press Enter to complete the edit (or Control-c to cancel the edit).
Count the number of collisions per state¶
To get the denominator for our calculations, we’ll want to know the total number of reported collisions for each state.
Back in our known_species
sheet, navigate to the STATE
column:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men <STATE │ AIRPORT │ PHASE_OF_FLT │ HEIGHT │ SPEED │ SPECIES │ BIRDS_STRUCK │ > MO │ LAMBERT-ST LOUIS I…│ APPROACH │ │ │ Owls │ 1 │ N FL │ OPA-LOCKA EXECUTIV…│ APPROACH │ │ │ Hawks │ 1 │ N CA │ NORMAN Y. MINETA S…│ │ │ │ Gulls │ 1 │ N TX │ GEORGE BUSH INTERC…│ LANDING ROLL │ 0 │ │ Sparrows │ 1 │ N FL │ VERO BEACH MUNICIP…│ DEPARTURE │ │ │ Sandhill crane │ 1 │ N CA │ ONTARIO INTL ARPT │ │ │ │ Owls │ 1 │ N TN │ MC GHEE TYSON ARPT │ APPROACH │ │ │ Barn swallow │ 1 │ N SC │ CHARLESTON AFB/INT…│ DEPARTURE │ │ │ Gulls │ 1 │ N CA │ WHITEMAN AIRPORT │ LANDING ROLL │ 0 │ │ Rock pigeon │ 1 │ N WA │ SEATTLE-TACOMA INTL│ │ │ │ Barn owl │ 1 │ N HI │ KAHULUI ARPT │ LANDING ROLL │ 0 │ │ Gulls │ 1 │ N TX │ CORPUS CHRISTI INT…│ TAKE-OFF RUN │ 0 │ │ Gulls │ 1 │ N FL │ TALLAHASSEE REGION…│ APPROACH │ │ │ Gulls │ 2-10 │ N TX │ JACK BROOKS REGION…│ APPROACH │ │ │ Owls │ 1 │ N PA │ CAPITAL CITY ARPT …│ LANDING ROLL │ 0 │ │ Bank swallow │ 1 │ N AL │ MOBILE REGIONAL │ APPROACH │ 2000 │ 200 │ Gulls │ 1 │ N PA │ HARRISBURG INTL │ DEPARTURE │ │ │ Swallows │ 1 │ N CT │ BRADLEY INTL │ │ │ │ Horned lark │ 2-10 │ MA │ GENERAL EDWARD LAW…│ │ │ │ Horned lark │ 1 │ CA │ SACRAMENTO INTL │ Climb │ 1500 │ 210 │ Northern pintail │ 2-10 │ P OH │ RICKENBACKER INTL │ │ │ │ American robin │ 1 │ 2› known_species| c go-col-regex 47460 rows
Then, to create a frequency table for the column, press Shift-F. Once you do, you should see something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE ║↓count♯│ percent%│ histogram ~║ TX ║ 4670 │ 9.84 │ ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ ║ ║ 4428 │ 9.33 │ ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ ║ CA ║ 3391 │ 7.14 │ ■■■■■■■■■■■■■■■■■■■■■■■■■■■ ║ FL ║ 3096 │ 6.52 │ ■■■■■■■■■■■■■■■■■■■■■■■■■ ║ CO ║ 2914 │ 6.14 │ ■■■■■■■■■■■■■■■■■■■■■■■ ║ NY ║ 2626 │ 5.53 │ ■■■■■■■■■■■■■■■■■■■■■ ║ IL ║ 2153 │ 4.54 │ ■■■■■■■■■■■■■■■■■ ║ NJ ║ 1681 │ 3.54 │ ■■■■■■■■■■■■■ ║ PA ║ 1665 │ 3.51 │ ■■■■■■■■■■■■■ ║ OH ║ 1515 │ 3.19 │ ■■■■■■■■■■■■ ║ MI ║ 1234 │ 2.60 │ ■■■■■■■■■■ ║ MO ║ 1140 │ 2.40 │ ■■■■■■■■■ ║ UT ║ 1049 │ 2.21 │ ■■■■■■■■ ║ WA ║ 874 │ 1.84 │ ■■■■■■■ ║ HI ║ 873 │ 1.84 │ ■■■■■■■ ║ GA ║ 824 │ 1.74 │ ■■■■■■ ║ NC ║ 813 │ 1.71 │ ■■■■■■ ║ MA ║ 801 │ 1.69 │ ■■■■■■ ║ TN ║ 774 │ 1.63 │ ■■■■■■ ║ IN ║ 773 │ 1.63 │ ■■■■■■ ║ LA ║ 737 │ 1.55 │ ■■■■■ ║ 3› known_species_STATE_freq| Shift+F 63 bins
Spruce up the frequency table¶
Because we’ll later be joining this sheet to another sheet, let’s hide the percent
and histogram
columns by navigating to each and pressing -.
Now the sheet should look something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE ║↓count♯║ TX ║ 4670 ║ ║ 4428 ║ CA ║ 3391 ║ FL ║ 3096 ║ CO ║ 2914 ║ NY ║ 2626 ║ IL ║ 2153 ║ NJ ║ 1681 ║ PA ║ 1665 ║ OH ║ 1515 ║ MI ║ 1234 ║ MO ║ 1140 ║ UT ║ 1049 ║ WA ║ 874 ║ HI ║ 873 ║ GA ║ 824 ║ NC ║ 813 ║ MA ║ 801 ║ TN ║ 774 ║ IN ║ 773 ║ LA ║ 737 ║ 3› known_species_STATE_freq| - hide-col 63 bins
Count the number of collisions per state and species¶
Now that we have the denominator — collisions per state — let’s calculate the numerator: collisions per species per state.
To do that, we’ll want to create a frequency table for the combination of the STATE
and SPECIES
columns. Here’s how:
- Use the Sheets Sheet (Shift-S) to navigate back to the
known_species
sheet - Navigate to the
STATE
column, and press ! to make it a “key” column - Do the same thing for the
SPECIES
columns
At this point, you should see something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE │ SPECIES ║<AIRPORT │ PHASE_OF_FLT │ HEIGHT │ SPEED │ BIRDS_STRUCK │ > MO │ Owls ║ LAMBERT-ST LOUIS I…│ APPROACH │ │ │ 1 │ N FL │ Hawks ║ OPA-LOCKA EXECUTIV…│ APPROACH │ │ │ 1 │ N CA │ Gulls ║ NORMAN Y. MINETA S…│ │ │ │ 1 │ N TX │ Sparrows ║ GEORGE BUSH INTERC…│ LANDING ROLL │ 0 │ │ 1 │ N FL │ Sandhill crane ║ VERO BEACH MUNICIP…│ DEPARTURE │ │ │ 1 │ N CA │ Owls ║ ONTARIO INTL ARPT │ │ │ │ 1 │ N TN │ Barn swallow ║ MC GHEE TYSON ARPT │ APPROACH │ │ │ 1 │ N SC │ Gulls ║ CHARLESTON AFB/INT…│ DEPARTURE │ │ │ 1 │ N CA │ Rock pigeon ║ WHITEMAN AIRPORT │ LANDING ROLL │ 0 │ │ 1 │ N WA │ Barn owl ║ SEATTLE-TACOMA INTL│ │ │ │ 1 │ N HI │ Gulls ║ KAHULUI ARPT │ LANDING ROLL │ 0 │ │ 1 │ N TX │ Gulls ║ CORPUS CHRISTI INT…│ TAKE-OFF RUN │ 0 │ │ 1 │ N FL │ Gulls ║ TALLAHASSEE REGION…│ APPROACH │ │ │ 2-10 │ N TX │ Owls ║ JACK BROOKS REGION…│ APPROACH │ │ │ 1 │ N PA │ Bank swallow ║ CAPITAL CITY ARPT …│ LANDING ROLL │ 0 │ │ 1 │ N AL │ Gulls ║ MOBILE REGIONAL │ APPROACH │ 2000 │ 200 │ 1 │ N PA │ Swallows ║ HARRISBURG INTL │ DEPARTURE │ │ │ 1 │ N CT │ Horned lark ║ BRADLEY INTL │ │ │ │ 2-10 │ MA │ Horned lark ║ GENERAL EDWARD LAW…│ │ │ │ 1 │ CA │ Northern pintail ║ SACRAMENTO INTL │ Climb │ 1500 │ 210 │ 2-10 │ P OH │ American robin ║ RICKENBACKER INTL │ │ │ │ 1 │ 2› known_species| ! key-col 47460 rows
Now, type gF, which will create a frequency table of all keyed columns. Once you do, you should see something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE │ SPECIES ║↓count♯│ percent%│ histogram ~║ CO │ Horned lark ║ 1117 │ 2.35 │ ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ ║ TX │ Mourning dove ║ 984 │ 2.07 │ ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ ║ TX │ Rock pigeon ║ 339 │ 0.71 │ ■■■■■■■■■■■ ║ HI │ Pacific golden-plo…║ 329 │ 0.69 │ ■■■■■■■■■■■ ║ FL │ Barn swallow ║ 316 │ 0.67 │ ■■■■■■■■■■ ║ CO │ Western meadowlark ║ 307 │ 0.65 │ ■■■■■■■■■■ ║ FL │ Mourning dove ║ 304 │ 0.64 │ ■■■■■■■■■■ ║ CO │ Mourning dove ║ 282 │ 0.59 │ ■■■■■■■■■ ║ │ Perching birds (y) ║ 274 │ 0.58 │ ■■■■■■■■■ ║ IL │ American kestrel ║ 273 │ 0.58 │ ■■■■■■■■■ ║ IL │ Killdeer ║ 266 │ 0.56 │ ■■■■■■■■■ ║ TX │ Killdeer ║ 260 │ 0.55 │ ■■■■■■■■ ║ UT │ Horned lark ║ 260 │ 0.55 │ ■■■■■■■■ ║ CA │ American kestrel ║ 258 │ 0.54 │ ■■■■■■■■ ║ FL │ Killdeer ║ 236 │ 0.50 │ ■■■■■■■■ ║ CA │ Red-tailed hawk ║ 229 │ 0.48 │ ■■■■■■■ ║ TX │ Barn swallow ║ 219 │ 0.46 │ ■■■■■■■ ║ CA │ Cliff swallow ║ 213 │ 0.45 │ ■■■■■■■ ║ NY │ Barn swallow ║ 208 │ 0.44 │ ■■■■■■■ ║ CA │ Barn owl ║ 200 │ 0.42 │ ■■■■■■ ║ NY │ American kestrel ║ 196 │ 0.41 │ ■■■■■■ ║ 4› known_species_STATE-SPECIES_freq| gShift+F 5135 bins
Just like we did with the state-frequency table, let’s simplify this table by removing the percent
and histogram
columns; navigate to each of those columns and press -, which should result in something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE │ SPECIES ║↓count♯║ CO │ Horned lark ║ 1117 ║ TX │ Mourning dove ║ 984 ║ TX │ Rock pigeon ║ 339 ║ HI │ Pacific golden-plo…║ 329 ║ FL │ Barn swallow ║ 316 ║ CO │ Western meadowlark ║ 307 ║ FL │ Mourning dove ║ 304 ║ CO │ Mourning dove ║ 282 ║ │ Perching birds (y) ║ 274 ║ IL │ American kestrel ║ 273 ║ IL │ Killdeer ║ 266 ║ TX │ Killdeer ║ 260 ║ UT │ Horned lark ║ 260 ║ CA │ American kestrel ║ 258 ║ FL │ Killdeer ║ 236 ║ CA │ Red-tailed hawk ║ 229 ║ TX │ Barn swallow ║ 219 ║ CA │ Cliff swallow ║ 213 ║ NY │ Barn swallow ║ 208 ║ CA │ Barn owl ║ 200 ║ NY │ American kestrel ║ 196 ║ 4› known_species_STATE-SPECIES_freq| - hide-col 5135 bins
Prepare the state-species frequency table for joining¶
There’s just one step left before we can join the tables: making sure that the two frequency tables share the exact same key column. (VisiData uses each sheet’s “key” columns to determine which rows to join.)
Because the key for the state-counts table is the STATE
column, this table should also have STATE
as its only key column. That means we need to un-key the SPECIES
column. Luckily, that’s easy. Just navigate to the SPECIES
column and press ! to toggle it’s status from keyed to un-keyed:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE ║ SPECIES │↓count♯║ CO ║ Horned lark │ 1117 ║ TX ║ Mourning dove │ 984 ║ TX ║ Rock pigeon │ 339 ║ HI ║ Pacific golden-plo…│ 329 ║ FL ║ Barn swallow │ 316 ║ CO ║ Western meadowlark │ 307 ║ FL ║ Mourning dove │ 304 ║ CO ║ Mourning dove │ 282 ║ ║ Perching birds (y) │ 274 ║ IL ║ American kestrel │ 273 ║ IL ║ Killdeer │ 266 ║ TX ║ Killdeer │ 260 ║ UT ║ Horned lark │ 260 ║ CA ║ American kestrel │ 258 ║ FL ║ Killdeer │ 236 ║ CA ║ Red-tailed hawk │ 229 ║ TX ║ Barn swallow │ 219 ║ CA ║ Cliff swallow │ 213 ║ NY ║ Barn swallow │ 208 ║ CA ║ Barn owl │ 200 ║ NY ║ American kestrel │ 196 ║ 4› known_species_STATE-SPECIES_freq| ! key-col 5135 bins
Join the two frequency tables¶
Now, for the moment we’ve all been waiting for: Let’s join the tables!
First, press Shift-S to open the Sheets Sheet:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men name ║ type │ pane#│ shortcut │ nRows#│ nCols#│ nVisibleCol> sheets ║ SheetsSheet │ 1 │ Shift+S │ 5 │ 11 │ 11 known_species_STATE-SPECIES_freq ║ FreqTableSheet │ 1 │ 4 │ 5135 │ 5 │ 3 known_species ║ CsvSheet │ 1 │ 2 │ 47460 │ 16 │ 16 known_species_STATE_freq ║ FreqTableSheet │ 1 │ 3 │ 63 │ 4 │ 2 faa-wildlife-strikes ║ CsvSheet │ 1 │ 1 │ 73448 │ 16 │ 16 Shift+S› sheets| gk go-top 5 sheets
Then navigate to the known_species_STATE-SPECIES_freq
row, and press s to select it. Do the same for the known_species_STATE_freq
, so that the Sheets Sheet now looks like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men name ║ type │ pane#│ shortcut │ nRows#│ nCols#│ nVisibleCol> sheets ║ SheetsSheet │ 1 │ Shift+S │ 5 │ 11 │ 11 •known_species_STATE-SPECIES_freq ║ FreqTableSheet │ 1 │ 4 │ 5135 │ 5 │ 3 known_species ║ CsvSheet │ 1 │ 2 │ 47460 │ 16 │ 16 •known_species_STATE_freq ║ FreqTableSheet │ 1 │ 3 │ 63 │ 4 │ 2 faa-wildlife-strikes ║ CsvSheet │ 1 │ 1 │ 73448 │ 16 │ 16 Shift+S› sheets| s select-row 5 sheets •2
Press & to raise the sheet-joining prompt:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men name ║ type │ pane#│ shortcut │ nRows#│ nCols#│ nVisibleCol> sheets ║ SheetsSheet │ 1 │ Shift+S │ 5 │ 11 │ 11 •known_species_STATE-SPECIES_freq ║ FreqTableSheet │ 1 │ 4 │ 5135 │ 5 │ 3 known_species ║ CsvSheet │ 1 │ 2 │ 47460 │ 16 │ 16 •known_species_STATE_freq ║ FreqTableSheet │ 1 │ 3 │ 63 │ 4 │ 2 faa-wildlife-strikes ║ CsvSheet │ 1 │ 1 │ 73448 │ 16 │ 16 inner - only rows with matching keys on all sheets outer - only rows with matching keys on first selected sheet full - all rows from all sheets (union) diff - only rows NOT in all sheets append - all rows from all sheets; columns from all sheets concat - all rows from all sheets; columns and type from first sheet extend - only rows from first sheet; type from first sheet; columns from all sheets merge - merge differences from other sheets into first sheet (includin| Join Help ┐ │ HELPTODO │ └───────────┘ choose jointype: & join-selected 5 sheets •2
Type inner
and press Enter to complete the action. After that, you should see something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE ║ SPECIES │ known♯│ known♯║ CO ║ Horned lark │ 1117 │ 2914 ║ CO ║ Western meadowlark │ 307 │ 2914 ║ CO ║ Mourning dove │ 282 │ 2914 ║ CO ║ Cliff swallow │ 148 │ 2914 ║ CO ║ Lark bunting │ 91 │ 2914 ║ CO ║ Rock pigeon │ 82 │ 2914 ║ CO ║ American kestrel │ 78 │ 2914 ║ CO ║ Red-tailed hawk │ 73 │ 2914 ║ CO ║ Black-tailed jackr…│ 61 │ 2914 ║ CO ║ Killdeer │ 41 │ 2914 ║ CO ║ Great horned owl │ 31 │ 2914 ║ CO ║ Black-tailed prair…│ 31 │ 2914 ║ CO ║ Sparrows │ 30 │ 2914 ║ CO ║ Western kingbird │ 25 │ 2914 ║ CO ║ Desert cottontail │ 24 │ 2914 ║ CO ║ Perching birds (y) │ 20 │ 2914 ║ CO ║ Burrowing owl │ 20 │ 2914 ║ CO ║ Mountain bluebird │ 16 │ 2914 ║ CO ║ Vesper sparrow │ 16 │ 2914 ║ CO ║ Red-winged blackbi…│ 15 │ 2914 ║ CO ║ Little brown bat │ 15 │ 2914 ║ 5› known_species_STATE-SPECIES_freq+known_species_STATE_freq| & 5135 rows
The columns that had previously been called count
in both sheets have been auto-prefixed with the name of their source sheet. Let’s clarify things by using the ^ shortcut to rename them to count
and state_total
, respectively. On the state_total
column, press _ to expand the width to see the full name:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE ║ SPECIES │ count♯│ state_total♯║ CO ║ Horned lark │ 1117 │ 2914 ║ CO ║ Western meadowlark │ 307 │ 2914 ║ CO ║ Mourning dove │ 282 │ 2914 ║ CO ║ Cliff swallow │ 148 │ 2914 ║ CO ║ Lark bunting │ 91 │ 2914 ║ CO ║ Rock pigeon │ 82 │ 2914 ║ CO ║ American kestrel │ 78 │ 2914 ║ CO ║ Red-tailed hawk │ 73 │ 2914 ║ CO ║ Black-tailed jackr…│ 61 │ 2914 ║ CO ║ Killdeer │ 41 │ 2914 ║ CO ║ Great horned owl │ 31 │ 2914 ║ CO ║ Black-tailed prair…│ 31 │ 2914 ║ CO ║ Sparrows │ 30 │ 2914 ║ CO ║ Western kingbird │ 25 │ 2914 ║ CO ║ Desert cottontail │ 24 │ 2914 ║ CO ║ Perching birds (y) │ 20 │ 2914 ║ CO ║ Burrowing owl │ 20 │ 2914 ║ CO ║ Mountain bluebird │ 16 │ 2914 ║ CO ║ Vesper sparrow │ 16 │ 2914 ║ CO ║ Red-winged blackbi…│ 15 │ 2914 ║ CO ║ Little brown bat │ 15 │ 2914 ║ 5› joined| _ resize-col-max 5135 rows
Finally, let’s give the sheet a shorter name:
- Press Space to raise the type-a-command prompt
- Type
rename-sheet
and press Enter - At the next prompt, type the new name we want; in this case
joined
Calculate each species’ state-level percentages¶
Now that we have the numerator and denominator in the same sheet, let’s calculate the percentage of known-species collisions to each species in each state.
Let’s say we want the new column to appear as the last column in the sheet, so let’s navigate to the currently-last column by typing gl. Then let’s create the new column by pressing =, typing count * 100 / state_total
, and then pressing Enter.
Once you do that, you should see something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE ║ SPECIES │ count♯│ state_total♯│ count * 100 / state_total ║ CO ║ Horned lark │ 1117 │ 2914 │ 38.33218943033631 %║ CO ║ Western meadowlark │ 307 │ 2914 │ 10.535346602608099 %║ CO ║ Mourning dove │ 282 │ 2914 │ 9.67741935483871 %║ CO ║ Cliff swallow │ 148 │ 2914 │ 5.078929306794784 %║ CO ║ Lark bunting │ 91 │ 2914 │ 3.1228551818805763 %║ CO ║ Rock pigeon │ 82 │ 2914 │ 2.8140013726835966 %║ CO ║ American kestrel │ 78 │ 2914 │ 2.676733013040494 %║ CO ║ Red-tailed hawk │ 73 │ 2914 │ 2.5051475634866165 %║ CO ║ Black-tailed jackr…│ 61 │ 2914 │ 2.0933424845573096 %║ CO ║ Killdeer │ 41 │ 2914 │ 1.4070006863417983 %║ CO ║ Great horned owl │ 31 │ 2914 │ 1.0638297872340425 %║ CO ║ Black-tailed prair…│ 31 │ 2914 │ 1.0638297872340425 %║ CO ║ Sparrows │ 30 │ 2914 │ 1.029512697323267 %║ CO ║ Western kingbird │ 25 │ 2914 │ 0.8579272477693891 %║ CO ║ Desert cottontail │ 24 │ 2914 │ 0.8236101578586136 %║ CO ║ Perching birds (y) │ 20 │ 2914 │ 0.6863417982155113 %║ CO ║ Burrowing owl │ 20 │ 2914 │ 0.6863417982155113 %║ CO ║ Mountain bluebird │ 16 │ 2914 │ 0.5490734385724091 %║ CO ║ Vesper sparrow │ 16 │ 2914 │ 0.5490734385724091 %║ CO ║ Red-winged blackbi…│ 15 │ 2914 │ 0.5147563486616334 %║ CO ║ Little brown bat │ 15 │ 2914 │ 0.5147563486616334 %║ 5› joined| = addcol-expr 5135 rows [M]
It worked! But the column name is a bit wonky and literal. Let’s make the following tweaks:
- Rename the column by navigating to it, pressing ^ to enter the column-name-editing mode, typing
pct_of_state
, and then pressing Enter. - Press % to tell VisiData that it’s a “float”-type numeric column. (By default, VisiData assumes that newly created columns are just plain-old text.)
- Press _ to resize the column to fit its contents more precisely
Now the sheet should look something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE ║ SPECIES │ count♯│ state_total♯│ pct_of_state%║ CO ║ Horned lark │ 1117 │ 2914 │ 38.33 ║ CO ║ Western meadowlark │ 307 │ 2914 │ 10.54 ║ CO ║ Mourning dove │ 282 │ 2914 │ 9.68 ║ CO ║ Cliff swallow │ 148 │ 2914 │ 5.08 ║ CO ║ Lark bunting │ 91 │ 2914 │ 3.12 ║ CO ║ Rock pigeon │ 82 │ 2914 │ 2.81 ║ CO ║ American kestrel │ 78 │ 2914 │ 2.68 ║ CO ║ Red-tailed hawk │ 73 │ 2914 │ 2.51 ║ CO ║ Black-tailed jackr…│ 61 │ 2914 │ 2.09 ║ CO ║ Killdeer │ 41 │ 2914 │ 1.41 ║ CO ║ Great horned owl │ 31 │ 2914 │ 1.06 ║ CO ║ Black-tailed prair…│ 31 │ 2914 │ 1.06 ║ CO ║ Sparrows │ 30 │ 2914 │ 1.03 ║ CO ║ Western kingbird │ 25 │ 2914 │ 0.86 ║ CO ║ Desert cottontail │ 24 │ 2914 │ 0.82 ║ CO ║ Perching birds (y) │ 20 │ 2914 │ 0.69 ║ CO ║ Burrowing owl │ 20 │ 2914 │ 0.69 ║ CO ║ Mountain bluebird │ 16 │ 2914 │ 0.55 ║ CO ║ Vesper sparrow │ 16 │ 2914 │ 0.55 ║ CO ║ Red-winged blackbi…│ 15 │ 2914 │ 0.51 ║ CO ║ Little brown bat │ 15 │ 2914 │ 0.51 ║ 5› joined| _ resize-col-max 5135 rows [M]
Sort by percentage¶
Of course, to answer our main question, we’ll need to sort the column. To sort it descendingly, press ]. Now you should see something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE ║ SPECIES │ count♯│ state_total♯│↓pct_of_state%║ SK ║ Gulls │ 1 │ 1 │ 100.00 ║ NS ║ Striped skunk │ 1 │ 1 │ 100.00 ║ NL ║ Snow bunting │ 1 │ 1 │ 100.00 ║ MB ║ Mallard │ 1 │ 2 │ 50.00 ║ MB ║ Cedar waxwing │ 1 │ 2 │ 50.00 ║ AB ║ Canada goose │ 1 │ 2 │ 50.00 ║ AB ║ Perching birds (y) │ 1 │ 2 │ 50.00 ║ PI ║ Yellow bittern │ 72 │ 173 │ 41.62 ║ CO ║ Horned lark │ 1117 │ 2914 │ 38.33 ║ HI ║ Pacific golden-plo…│ 329 │ 873 │ 37.69 ║ AZ ║ Mourning dove │ 134 │ 438 │ 30.59 ║ MS ║ Mourning dove │ 60 │ 197 │ 30.46 ║ WV ║ Killdeer │ 21 │ 78 │ 26.92 ║ VT ║ American kestrel │ 29 │ 109 │ 26.61 ║ IA ║ Killdeer │ 49 │ 189 │ 25.93 ║ UT ║ Horned lark │ 260 │ 1049 │ 24.79 ║ WY ║ Greater sage-grouse│ 10 │ 41 │ 24.39 ║ IN ║ Mourning dove │ 179 │ 773 │ 23.16 ║ LA ║ Barn swallow │ 168 │ 737 │ 22.80 ║ MD ║ European starling │ 90 │ 405 │ 22.22 ║ BC ║ Red-tailed hawk │ 2 │ 9 │ 22.22 ║ 5› joined| ] sort-desc 5135 rows [M]
Limit to rows with at least 20 collisions¶
Hmmm, many of the highest-ranking species-state combinations seem to come from “states” — like the striped skunk that was struck in Nova Scotia — with very few reported collisions. So let’s limit the results to species-state combinations with at least 20 reports.
To do that, we’ll use z|, VisiData’s “select by expression” command.
First, type z| to raise the selection prompt. Then, type count >= 20
:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE ║ SPECIES │ count♯│ state_total♯│↓pct_of_state%║ SK ║ Gulls │ 1 │ 1 │ 100.00 ║ NS ║ Striped skunk │ 1 │ 1 │ 100.00 ║ NL ║ Snow bunting │ 1 │ 1 │ 100.00 ║ MB ║ Mallard │ 1 │ 2 │ 50.00 ║ MB ║ Cedar waxwing │ 1 │ 2 │ 50.00 ║ AB ║ Canada goose │ 1 │ 2 │ 50.00 ║ AB ║ Perching birds (y) │ 1 │ 2 │ 50.00 ║ PI ║ Yellow bittern │ 72 │ 173 │ 41.62 ║ CO ║ Horned lark │ 1117 │ 2914 │ 38.33 ║ HI ║ Pacific golden-plo…│ 329 │ 873 │ 37.69 ║ AZ ║ Mourning dove │ 134 │ 438 │ 30.59 ║ MS ║ Mourning dove │ 60 │ 197 │ 30.46 ║ WV ║ Killdeer │ 21 │ 78 │ 26.92 ║ VT ║ American kestrel │ 29 │ 109 │ 26.61 ║ IA ║ Killdeer │ 49 │ 189 │ 25.93 ║ UT ║ Horned lark │ 260 │ 1049 │ 24.79 ║ WY ║ Greater sage-grouse│ 10 │ 41 │ 24.39 ║ IN ║ Mourning dove │ 179 │ 773 │ 23.16 ║ LA ║ Barn swallow │ 168 │ 737 │ 22.80 ║ MD ║ European starling │ 90 │ 405 │ 22.22 ║ BC ║ Red-tailed hawk │ 2 │ 9 │ 22.22 ║ select by expr: count >= 20 z| select-expr 5135 rows [M]
Next, press Enter to complete the action. Now you should see something like this:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE ║ SPECIES │ count♯│ state_total♯│↓pct_of_state%║ SK ║ Gulls │ 1 │ 1 │ 100.00 ║ NS ║ Striped skunk │ 1 │ 1 │ 100.00 ║ NL ║ Snow bunting │ 1 │ 1 │ 100.00 ║ MB ║ Mallard │ 1 │ 2 │ 50.00 ║ MB ║ Cedar waxwing │ 1 │ 2 │ 50.00 ║ AB ║ Canada goose │ 1 │ 2 │ 50.00 ║ AB ║ Perching birds (y) │ 1 │ 2 │ 50.00 ║ •PI ║ Yellow bittern │ 72 │ 173 │ 41.62 ║ •CO ║ Horned lark │ 1117 │ 2914 │ 38.33 ║ •HI ║ Pacific golden-plo…│ 329 │ 873 │ 37.69 ║ •AZ ║ Mourning dove │ 134 │ 438 │ 30.59 ║ •MS ║ Mourning dove │ 60 │ 197 │ 30.46 ║ •WV ║ Killdeer │ 21 │ 78 │ 26.92 ║ •VT ║ American kestrel │ 29 │ 109 │ 26.61 ║ •IA ║ Killdeer │ 49 │ 189 │ 25.93 ║ •UT ║ Horned lark │ 260 │ 1049 │ 24.79 ║ WY ║ Greater sage-grouse│ 10 │ 41 │ 24.39 ║ •IN ║ Mourning dove │ 179 │ 773 │ 23.16 ║ •LA ║ Barn swallow │ 168 │ 737 │ ┌─────────────────────────────| statuses |─┐ •MD ║ European starling │ 90 │ 405 │ │ selected 498 rows │ BC ║ Red-tailed hawk │ 2 │ 9 │ └──────────────────────────────────────────┘ 5› joined| z| select-expr 5135 rows [M] •498
Finally, press " to create a new sheet containing only the selected rows:
File Edit View Column Row Data Plot System Help | VisiData 3.0.2 | Alt+H for help men STATE ║ SPECIES │ count♯│ state_total♯│↓pct_of_state%║ PI ║ Yellow bittern │ 72 │ 173 │ 41.62 ║ CO ║ Horned lark │ 1117 │ 2914 │ 38.33 ║ HI ║ Pacific golden-plo…│ 329 │ 873 │ 37.69 ║ AZ ║ Mourning dove │ 134 │ 438 │ 30.59 ║ MS ║ Mourning dove │ 60 │ 197 │ 30.46 ║ WV ║ Killdeer │ 21 │ 78 │ 26.92 ║ VT ║ American kestrel │ 29 │ 109 │ 26.61 ║ IA ║ Killdeer │ 49 │ 189 │ 25.93 ║ UT ║ Horned lark │ 260 │ 1049 │ 24.79 ║ IN ║ Mourning dove │ 179 │ 773 │ 23.16 ║ LA ║ Barn swallow │ 168 │ 737 │ 22.80 ║ MD ║ European starling │ 90 │ 405 │ 22.22 ║ TX ║ Mourning dove │ 984 │ 4670 │ 21.07 ║ NH ║ American kestrel │ 24 │ 117 │ 20.51 ║ PI ║ Pacific golden-plo…│ 35 │ 173 │ 20.23 ║ SC ║ Killdeer │ 45 │ 229 │ 19.65 ║ ID ║ American kestrel │ 30 │ 172 │ 17.44 ║ VA ║ Mourning dove │ 92 │ 544 │ 16.91 ║ VA ║ Killdeer │ 89 │ 544 │ 16.36 ║ NE ║ Barn swallow │ 68 │ 419 │ 16.23 ║ AR ║ Mourning dove │ 30 │ 187 │ 16.04 ║ 6› joined_selectedref| " 498 rows
There you have it! The Yellow Bittern accounted for more than 40% of the reported, known-species collisions in “PI” (the FAA’s abbreviation for “USA-possessed Pacific Islands,” such as Guam). In Colorado, the Horned Lark has been nearly as collision-dominating, as has the Pacific Golden-Plover in Hawaii, and Mourning Doves in Arizona and Mississippi.
Take it one step further¶
What if we want to find the species that are disproportionately involved in collisions in their state? How would you do that? (Hint: It involves just one more frequency table and one more join.)