High-Flying Birds¶
This chapter uses VisiData to answer this question: Which birds that are struck, on average, at the highest altitudes?
In the process, we’ll cover the following skills:
- Assigning “types” to columns
- Creating new columns
- Using frequency tables to filter sheets
- Calculating aggregate statistics
Note
If you haven’t already, download faa-wildlife-strikes.csv
, a dataset of all aircraft-wildlife collisions reported to the Federal Aviation Adminsitration between 2010 and mid-2016.
Open the wildlife-strikes dataset in VisiData¶
Let’s start from the very beginning.
Run this command in your terminal:
vd faa-wildlife-strikes.csv
If it worked, you should see something like this:
File Edit View Column Row Data Plot System Help Ctrl+H for help menu OPERATOR | ATYPE | INCIDENT_DATE | STATE | AIRPORT | PHASE_OF_FLT> BUSINESS | PA-28 | 05/22/15 00:00:00 | FL | VERO BEACH MUNICIP…| APPROACH ║ BUSINESS | BE-1900 | 06/18/15 00:00:00 | AK | KENAI MUNICIPAL AR…| APPROACH ║ BUSINESS | PA-46 MALIBU | 09/20/15 00:00:00 | TX | DAVID WAYNE HOOKS …| ║ DELTA AIR LINES | B-717-200 | 11/07/15 00:00:00 | MO | LAMBERT-ST LOUIS I…| APPROACH ║ BUSINESS | BE-90 KING | 12/17/15 00:00:00 | FL | POMPANO BEACH AIRP…| LANDING ROLL ║ DELTA AIR LINES | B-757 | 07/17/15 00:00:00 | VI | HENRY E ROHLSEN AR…| ║ DELTA AIR LINES | B-717-200 | 08/02/15 00:00:00 | TX | SAN ANTONIO INTL | APPROACH ║ BUSINESS | C-414 | 08/03/15 00:00:00 | TX | LONE STAR EXECUTIV…| DEPARTURE ║ ALLEGIANT AIR | MD-80 | 09/02/15 00:00:00 | FL | TAMPA INTL | APPROACH ║ TRANS STATES AIRLI…| EMB-145 | 09/07/15 00:00:00 | MO | LAMBERT-ST LOUIS I…| APPROACH ║ BUSINESS | C-172 | 11/28/15 00:00:00 | FL | OPA-LOCKA EXECUTIV…| APPROACH ║ GOVERNMENT | EC120 | 12/08/15 00:00:00 | CA | NORMAN Y. MINETA S…| ║ AMERICAN AIRLINES | A-321 | 05/06/15 00:00:00 | FL | FORT LAUDERDALE/HO…| APPROACH ║ EXPRESSJET AIRLINES| CRJ100/200 | 05/06/15 00:00:00 | AR | FORT SMITH REGIONA…| CLIMB ║ MESA AIRLINES | CRJ900 | 05/08/15 00:00:00 | AR | BILL AND HILLARY …| LANDING ROLL ║ BUSINESS | HELICOPTER | 05/06/15 00:00:00 | | UNKNOWN | En Route ║ DELTA AIR LINES | A-320 | 05/07/15 00:00:00 | CA | METRO OAKLAND INTL | ║ DELTA AIR LINES | A-320 | 05/08/15 00:00:00 | UT | SALT LAKE CITY INTL| ║ LUFTHANSA | A-380 | 05/10/15 00:00:00 | TX | GEORGE BUSH INTERC…| CLIMB ║ BUSINESS | C-172 | 05/08/15 00:00:00 | FL | ORLANDO SANFORD IN…| APPROACH ║ SPIRIT AIRLINES | A-319 | 05/10/15 00:00:00 | IL | CHICAGO O'HARE INT…| CLIMB ║ 1› faa-wildlife-strikes| saul.pw/VisiData v2.8 | opening datasets/faa-wildli 73448 rows •0
Tell VisiData that HEIGHT
is an integer column¶
Navigate to the HEIGHT
column, and press #.
You should see something like this:
File Edit View Column Row Data Plot System Help Ctrl+H for help menu <STATE | AIRPORT | PHASE_OF_FLT | HEIGHT#| SPEED | SPECIES | BIRDS_STRUCK | > FL | VERO BEACH MUNICIP…| APPROACH | !| | Unknown bird | 1 | … AK | KENAI MUNICIPAL AR…| APPROACH | !| | Unknown bird | 1 | … TX | DAVID WAYNE HOOKS …| | !| | Unknown bird | 1 | … MO | LAMBERT-ST LOUIS I…| APPROACH | !| | Unknown bird | 1 | … FL | POMPANO BEACH AIRP…| LANDING ROLL | 0 | | Unknown bird | 1 | VI | HENRY E ROHLSEN AR…| | !| | Unknown bird | 1 | … TX | SAN ANTONIO INTL | APPROACH | !| | Unknown bird | 1 | … TX | LONE STAR EXECUTIV…| DEPARTURE | !| | Unknown bird | 1 | … FL | TAMPA INTL | APPROACH | 6000 | | Unknown bird | 1 | … MO | LAMBERT-ST LOUIS I…| APPROACH | !| | Owls | 1 | … FL | OPA-LOCKA EXECUTIV…| APPROACH | !| | Hawks | 1 | … CA | NORMAN Y. MINETA S…| | !| | Gulls | 1 | … FL | FORT LAUDERDALE/HO…| APPROACH | 1500 | | Unknown bird - sma…| 1 | … AR | FORT SMITH REGIONA…| CLIMB | !| | Unknown bird - sma…| 1 | … AR | BILL AND HILLARY …| LANDING ROLL | 0 | | Unknown bird - sma…| 1 | … | UNKNOWN | En Route | !| | Unknown bird | 1 | … CA | METRO OAKLAND INTL | | !| | Unknown bird | 1 | … UT | SALT LAKE CITY INTL| | !| | Unknown bird | 1 | … TX | GEORGE BUSH INTERC…| CLIMB | !| | Unknown bird | 1 | … FL | ORLANDO SANFORD IN…| APPROACH | !| | Unknown bird | 1 | … IL | CHICAGO O'HARE INT…| CLIMB | 12000 | | Unknown bird | 1 | … 1› faa-wildlife-strikes| # type-int 73448 rows •0
As you can see, many of the reported collisions are missing height data, or appear to have occurred on the ground (i.e., HEIGHT == 0
).
So, let’s focus only on collsions reported to have occurred above the ground.
Select only rows where HEIGHT
is greater than zero¶
To do that, we’ll use z|, VisiData’s “select by expression” command. Type z| and then, at the prompt, type HEIGHT > 0
. You should see something like this:
File Edit View Column Row Data Plot System Help Ctrl+H for help menu <STATE | AIRPORT | PHASE_OF_FLT | HEIGHT#| SPEED | SPECIES | BIRDS_STRUCK | > FL | VERO BEACH MUNICIP…| APPROACH | !| | Unknown bird | 1 | … AK | KENAI MUNICIPAL AR…| APPROACH | !| | Unknown bird | 1 | … TX | DAVID WAYNE HOOKS …| | !| | Unknown bird | 1 | … MO | LAMBERT-ST LOUIS I…| APPROACH | !| | Unknown bird | 1 | … FL | POMPANO BEACH AIRP…| LANDING ROLL | 0 | | Unknown bird | 1 | VI | HENRY E ROHLSEN AR…| | !| | Unknown bird | 1 | … TX | SAN ANTONIO INTL | APPROACH | !| | Unknown bird | 1 | … TX | LONE STAR EXECUTIV…| DEPARTURE | !| | Unknown bird | 1 | … FL | TAMPA INTL | APPROACH | 6000 | | Unknown bird | 1 | … MO | LAMBERT-ST LOUIS I…| APPROACH | !| | Owls | 1 | … FL | OPA-LOCKA EXECUTIV…| APPROACH | !| | Hawks | 1 | … CA | NORMAN Y. MINETA S…| | !| | Gulls | 1 | … FL | FORT LAUDERDALE/HO…| APPROACH | 1500 | | Unknown bird - sma…| 1 | … AR | FORT SMITH REGIONA…| CLIMB | !| | Unknown bird - sma…| 1 | … AR | BILL AND HILLARY …| LANDING ROLL | 0 | | Unknown bird - sma…| 1 | … | UNKNOWN | En Route | !| | Unknown bird | 1 | … CA | METRO OAKLAND INTL | | !| | Unknown bird | 1 | … UT | SALT LAKE CITY INTL| | !| | Unknown bird | 1 | … TX | GEORGE BUSH INTERC…| CLIMB | !| | Unknown bird | 1 | … FL | ORLANDO SANFORD IN…| APPROACH | !| | Unknown bird | 1 | … IL | CHICAGO O'HARE INT…| CLIMB | 12000 | | Unknown bird | 1 | … select by expr: HEIGHT > 0 z| select-expr 73448 rows •0
Then press Enter. Now you you should see the above-ground collisions selected:
File Edit View Column Row Data Plot System Help Ctrl+H for help menu <STATE | AIRPORT | PHASE_OF_FLT | HEIGHT#| SPEED | SPECIES | BIRDS_STRUCK | > FL | VERO BEACH MUNICIP…| APPROACH | !| | Unknown bird | 1 | … AK | KENAI MUNICIPAL AR…| APPROACH | !| | Unknown bird | 1 | … TX | DAVID WAYNE HOOKS …| | !| | Unknown bird | 1 | … MO | LAMBERT-ST LOUIS I…| APPROACH | !| | Unknown bird | 1 | … FL | POMPANO BEACH AIRP…| LANDING ROLL | 0 | | Unknown bird | 1 | VI | HENRY E ROHLSEN AR…| | !| | Unknown bird | 1 | … TX | SAN ANTONIO INTL | APPROACH | !| | Unknown bird | 1 | … TX | LONE STAR EXECUTIV…| DEPARTURE | !| | Unknown bird | 1 | … •FL | TAMPA INTL | APPROACH | 6000 | | Unknown bird | 1 | … MO | LAMBERT-ST LOUIS I…| APPROACH | !| | Owls | 1 | … FL | OPA-LOCKA EXECUTIV…| APPROACH | !| | Hawks | 1 | … CA | NORMAN Y. MINETA S…| | !| | Gulls | 1 | … •FL | FORT LAUDERDALE/HO…| APPROACH | 1500 | | Unknown bird - sma…| 1 | … AR | FORT SMITH REGIONA…| CLIMB | !| | Unknown bird - sma…| 1 | … AR | BILL AND HILLARY …| LANDING ROLL | 0 | | Unknown bird - sma…| 1 | … | UNKNOWN | En Route | !| | Unknown bird | 1 | … CA | METRO OAKLAND INTL | | !| | Unknown bird | 1 | … UT | SALT LAKE CITY INTL| | !| | Unknown bird | 1 | … TX | GEORGE BUSH INTERC…| CLIMB | !| | Unknown bird | 1 | … FL | ORLANDO SANFORD IN…| APPROACH | !| | Unknown bird | 1 | … •IL | CHICAGO O'HARE INT…| CLIMB | 12000 | | Unknown bird | 1 | … 1› faa-wildlife-strikes| "HEIGHT > 0" | selected 22883 rows z| select-expr 73448 rows •22883
Create a new sheet with only the selected rows¶
To do so, press ". Once you do that, you should see something like this:
File Edit View Column Row Data Plot System Help Ctrl+H for help menu <STATE | AIRPORT | PHASE_OF_FLT | HEIGHT#| SPEED | SPECIES | BIRDS_STRUCK | > FL | TAMPA INTL | APPROACH | 6000 | | Unknown bird | 1 | … FL | FORT LAUDERDALE/HO…| APPROACH | 1500 | | Unknown bird - sma…| 1 | … IL | CHICAGO O'HARE INT…| CLIMB | 12000 | | Unknown bird | 1 | … FL | TAMPA INTL | APPROACH | 50 | | Unknown bird | 1 | … OR | SOUTHWEST OREGON R…| APPROACH | 300 | | Unknown bird | 1 | … ME | PORTLAND INTL JETP…| CLIMB | 1000 | | Unknown bird | 1 | … VA | ROANOKE REGNL ARPT…| APPROACH | 3000 | | Unknown bird | 1 | … FL | MIAMI INTL | APPROACH | 400 | | Unknown bird | 1 | … NY | SYRACUSE HANCOCK I…| APPROACH | 200 | | Unknown bird | 1 | … | UNKNOWN | EN ROUTE | 8000 | | Unknown bird | 1 | … PA | PITTSBURGH INTL AR…| APPROACH | 6000 | | Unknown bird | 1 | … MD | BALTIMORE/WASH INT…| APPROACH | 1000 | | Unknown bird | 1 | … NC | RALEIGH-DURHAM INTL| APPROACH | 1000 | | Unknown bird | 1 | … MI | WILLOW RUN ARPT | APPROACH | 3000 | | Unknown bird - lar…| 1 | … AK | TED STEVENS ANCHOR…| APPROACH | 100 | | Unknown bird | 1 | … | UNKNOWN | LOCAL | 80 | | Unknown bird | 1 | … DC | RONALD REAGAN WASH…| APPROACH | 100 | | Unknown bird | 1 | … AL | BIRMINGHAM-SHUTTLE…| APPROACH | 1100 | | Unknown bird | 1 | … NY | LA GUARDIA ARPT | CLIMB | 400 | | Unknown bird | 1 | … CA | JOHN WAYNE-ORANGE …| APPROACH | 1700 | | Unknown bird - med…| 1 | … FL | TAMPA INTL | APPROACH | 1000 | | Unknown bird | 1 | … 2› faa-wildlife-strikes_selectedref| " 22883 rows •0
Get the average collision height for each species¶
This is a two-step process. First, navigate to the HEIGHT
column, and press + to add an aggregator. At the prompt at the bottom of the screen, type mean
…
File Edit View Column Row Data Plot System Help Ctrl+H for help menu <STATE | AIRPORT | PHASE_OF_FLT | HEIGHT#| SPEED | SPECIES | BIRDS_STRUCK | > FL | TAMPA INTL | APPROACH | 6000 | | Unknown bird | 1 | … FL | FORT LAUDERDALE/HO…| APPROACH | 1500 | | Unknown bird - sma…| 1 | … IL | CHICAGO O'HARE INT…| CLIMB | 12000 | | Unknown bird | 1 | … FL | TAMPA INTL | APPROACH | 50 | | Unknown bird | 1 | … OR | SOUTHWEST OREGON R…| APPROACH | 300 | | Unknown bird | 1 | … ME | PORTLAND INTL JETP…| CLIMB | 1000 | | Unknown bird | 1 | … VA | ROANOKE REGNL ARPT…| APPROACH | 3000 | | Unknown bird | 1 | … FL | MIAMI INTL | APPROACH | 400 | | Unknown bird | 1 | … NY | SYRACUSE HANCOCK I…| APPROACH | 200 | | Unknown bird | 1 | … | UNKNOWN | EN ROUTE | 8000 | | Unknown bird | 1 | … PA | PITTSBURGH INTL AR…| APPROACH | 6000 | | Unknown bird | 1 | … MD | BALTIMORE/WASH INT…| APPROACH | 1000 | | Unknown bird | 1 | … NC | RALEIGH-DURHAM INTL| APPROACH | 1000 | | Unknown bird | 1 | … MI | WILLOW RUN ARPT | APPROACH | 3000 | | Unknown bird - lar…| 1 | … AK | TED STEVENS ANCHOR…| APPROACH | 100 | | Unknown bird | 1 | … | UNKNOWN | LOCAL | 80 | | Unknown bird | 1 | … DC | RONALD REAGAN WASH…| APPROACH | 100 | | Unknown bird | 1 | … AL | BIRMINGHAM-SHUTTLE…| APPROACH | 1100 | | Unknown bird | 1 | … NY | LA GUARDIA ARPT | CLIMB | 400 | | Unknown bird | 1 | … CA | JOHN WAYNE-ORANGE …| APPROACH | 1700 | | Unknown bird - med…| 1 | … FL | TAMPA INTL | APPROACH | 1000 | | Unknown bird | 1 | … choose any of 16 options (Ctrl+X for menu): mean + aggregate-col 22883 rows •0
… and then press Enter.
Note
Adding an aggregator doesn’t change the appearance of the sheet. But if you want to confirm that you’ve added an aggregator correctly, you can press Shift-C to open up the Columns Sheet, and look at the aggregators
field.
Then, to get the average for each animal, navigate to the SPECIES
column, and press Shift-F to create a frequency sheet. It should look something like this:
File Edit View Column Row Data Plot System Help Ctrl+H for help menu SPECIES ║↓count♯| HEIGHT_mean%║ Unknown bird - sma…║ 8807 | 1408.82 ║ Unknown bird - med…║ 3056 | 2260.49 ║ Unknown bird ║ 2098 | 2253.93 ║ Unknown bird - lar…║ 693 | 2691.92 ║ Gulls ║ 550 | 696.03 ║ Barn swallow ║ 385 | 191.71 ║ Mourning dove ║ 312 | 229.82 ║ European starling ║ 277 | 181.48 ║ Sparrows ║ 252 | 554.77 ║ Rock pigeon ║ 219 | 376.45 ║ Perching birds (y) ║ 211 | 2018.73 ║ Canada goose ║ 208 | 848.81 ║ Killdeer ║ 199 | 189.23 ║ Red-tailed hawk ║ 190 | 370.29 ║ American robin ║ 183 | 2545.00 ║ Turkey vulture ║ 178 | 1119.58 ║ American kestrel ║ 159 | 202.04 ║ Horned lark ║ 159 | 290.84 ║ Chimney swift ║ 156 | 928.85 ║ Cliff swallow ║ 135 | 270.13 ║ Mallard ║ 133 | 1173.14 ║ 3› faa-wildlife-strikes_selectedref_SPECIES_freq| Shift+F 420 bins •0
Sort the species by collision height¶
By default, frequency tables are sorted by the raw count of matching rows. To sort by the highest average collision height, navigate to the mean_HEIGHT
column, and press ].
You should see something like this:
File Edit View Column Row Data Plot System Help Ctrl+H for help menu SPECIES ║ count♯|↓HEIGHT_mean%║ Eared grebe ║ 1 | 20000.00 ║ Long-billed curlew ║ 1 | 15000.00 ║ Whimbrel ║ 1 | 12000.00 ║ Lesser yellowlegs ║ 4 | 9925.00 ║ Dusky flycatcher ║ 1 | 9500.00 ║ Sanderling ║ 1 | 9500.00 ║ Red-necked phalaro…║ 4 | 8525.00 ║ Clark's grebe ║ 1 | 8500.00 ║ Northern saw-whet …║ 1 | 8000.00 ║ Grebes ║ 1 | 8000.00 ║ Spotted sandpiper ║ 1 | 7700.00 ║ Least flycatcher ║ 3 | 7566.67 ║ Lark sparrow ║ 2 | 7400.00 ║ Greater yellowlegs ║ 3 | 7333.33 ║ Pacific-slope flyc…║ 3 | 7000.00 ║ Seminole bat ║ 1 | 7000.00 ║ Solitary sandpiper ║ 2 | 6250.00 ║ Pocketed free-tail…║ 1 | 6000.00 ║ Red-naped sapsucker║ 2 | 5750.00 ║ Cinnamon teal ║ 1 | 5600.00 ║ Horned grebe ║ 2 | 5400.00 ║ 3› faa-wildlife-strikes_selectedref_SPECIES_freq| ] sort-desc 420 bins •0
Hrrrm, it seems like a lot of these species show up just a few times — too few to be particularly informative. Let’s do something about that.
Limit the results to relatively common species¶
This step will seem familiar; it’s a lot like how we selected only above-the-ground collisions.
First, type z| to raise the “select by expression” prompt. Then, type count >= 20
:
File Edit View Column Row Data Plot System Help Ctrl+H for help menu SPECIES ║ count♯|↓HEIGHT_mean%║ Eared grebe ║ 1 | 20000.00 ║ Long-billed curlew ║ 1 | 15000.00 ║ Whimbrel ║ 1 | 12000.00 ║ Lesser yellowlegs ║ 4 | 9925.00 ║ Dusky flycatcher ║ 1 | 9500.00 ║ Sanderling ║ 1 | 9500.00 ║ Red-necked phalaro…║ 4 | 8525.00 ║ Clark's grebe ║ 1 | 8500.00 ║ Northern saw-whet …║ 1 | 8000.00 ║ Grebes ║ 1 | 8000.00 ║ Spotted sandpiper ║ 1 | 7700.00 ║ Least flycatcher ║ 3 | 7566.67 ║ Lark sparrow ║ 2 | 7400.00 ║ Greater yellowlegs ║ 3 | 7333.33 ║ Pacific-slope flyc…║ 3 | 7000.00 ║ Seminole bat ║ 1 | 7000.00 ║ Solitary sandpiper ║ 2 | 6250.00 ║ Pocketed free-tail…║ 1 | 6000.00 ║ Red-naped sapsucker║ 2 | 5750.00 ║ Cinnamon teal ║ 1 | 5600.00 ║ Horned grebe ║ 2 | 5400.00 ║ select by expr: count >= 20 z| select-expr 420 bins •0
Next, press Enter to complete the action. Because there are no high-count species in the visible part of the sheet, you won’t notice much of effect at first; just a confirmation in the status bar at the bottom:
Cinnamon teal ║ 1 | 5600.00 ║ Horned grebe ║ 2 | 5400.00 ║ 3› faa-wildlife-strikes_selectedref_SPECIES_freq| "count >= 20 z| select-expr 420 bins •75
Now press " to create a new sheet containing only the selected rows. Tada!:
File Edit View Column Row Data Plot System Help Ctrl+H for help menu SPECIES ║ count♯| HEIGHT_mean%║ Yellow-bellied sap…║ 24 | 3585.42 ║ Swainson's thrush ║ 56 | 3454.46 ║ Snow goose ║ 34 | 3258.82 ║ Hermit thrush ║ 49 | 2770.10 ║ Red-eyed vireo ║ 39 | 2768.72 ║ Unknown bird - lar…║ 693 | 2691.92 ║ Northern shoveler ║ 23 | 2602.17 ║ American robin ║ 183 | 2545.00 ║ White-throated spa…║ 56 | 2426.79 ║ Greater white-fron…║ 32 | 2335.94 ║ Unknown bird - med…║ 3056 | 2260.49 ║ Unknown bird ║ 2098 | 2253.93 ║ Dark-eyed junco ║ 39 | 2031.28 ║ Perching birds (y) ║ 211 | 2018.73 ║ Northern pintail ║ 64 | 1980.19 ║ American coot ║ 53 | 1965.66 ║ Unknown bird or bat║ 43 | 1892.88 ║ Red bat ║ 20 | 1750.00 ║ Geese ║ 56 | 1696.88 ║ Gadwall ║ 25 | 1675.80 ║ Gray catbird ║ 37 | 1573.27 ║ 4› faa-wildlife-strikes_selectedref_SPECIES_freq_selectedref| " 75 bins •0
Finally, type g_ to widen all the columns, so that you can read all the names:
File Edit View Column Row Data Plot System Help Ctrl+H for help menu SPECIES ║ count♯| HEIGHT_mean%║ Yellow-bellied sapsucker ║ 24 | 3585.42 ║ Swainson's thrush ║ 56 | 3454.46 ║ Snow goose ║ 34 | 3258.82 ║ Hermit thrush ║ 49 | 2770.10 ║ Red-eyed vireo ║ 39 | 2768.72 ║ Unknown bird - large ║ 693 | 2691.92 ║ Northern shoveler ║ 23 | 2602.17 ║ American robin ║ 183 | 2545.00 ║ White-throated sparrow ║ 56 | 2426.79 ║ Greater white-fronted goose ║ 32 | 2335.94 ║ Unknown bird - medium ║ 3056 | 2260.49 ║ Unknown bird ║ 2098 | 2253.93 ║ Dark-eyed junco ║ 39 | 2031.28 ║ Perching birds (y) ║ 211 | 2018.73 ║ Northern pintail ║ 64 | 1980.19 ║ American coot ║ 53 | 1965.66 ║ Unknown bird or bat ║ 43 | 1892.88 ║ Red bat ║ 20 | 1750.00 ║ Geese ║ 56 | 1696.88 ║ Gadwall ║ 25 | 1675.80 ║ Gray catbird ║ 37 | 1573.27 ║ 4› faa-wildlife-strikes_selectedref_SPECIES_freq_selectedre g_ resize-cols-max 75 bins •0