High-Flying Birds

This chapter uses VisiData to answer this question: Which birds that are struck, on average, at the highest altitudes?

In the process, we’ll cover the following skills:

  • Assigning “types” to columns
  • Creating new columns
  • Using frequency tables to filter sheets
  • Calculating aggregate statistics

Note

If you haven’t already, download faa-wildlife-strikes.csv, a dataset of all aircraft-wildlife collisions reported to the Federal Aviation Adminsitration between 2010 and mid-2016.

Open the wildlife-strikes dataset in VisiData

Let’s start from the very beginning.

Run this command in your terminal:

vd faa-wildlife-strikes.csv

If it worked, you should see something like this:

 OPERATOR           | ATYPE          | INCIDENT_DATE      | STATE   | AIRPORT            | PHASE_O> 
 BUSINESS           | PA-28          | 05/22/15 00:00:00  | FL      | VERO BEACH MUNICIP…| APPROACH 
 BUSINESS           | BE-1900        | 06/18/15 00:00:00  | AK      | KENAI MUNICIPAL AR…| APPROACH 
 BUSINESS           | PA-46 MALIBU   | 09/20/15 00:00:00  | TX      | DAVID WAYNE HOOKS …|          
 DELTA AIR LINES    | B-717-200      | 11/07/15 00:00:00  | MO      | LAMBERT-ST LOUIS I…| APPROACH 
 BUSINESS           | BE-90 KING     | 12/17/15 00:00:00  | FL      | POMPANO BEACH AIRP…| LANDING… 
 DELTA AIR LINES    | B-757          | 07/17/15 00:00:00  | VI      | HENRY E ROHLSEN AR…|          
 DELTA AIR LINES    | B-717-200      | 08/02/15 00:00:00  | TX      | SAN ANTONIO INTL   | APPROACH 
 BUSINESS           | C-414          | 08/03/15 00:00:00  | TX      | LONE STAR EXECUTIV…| DEPARTU… 
 ALLEGIANT AIR      | MD-80          | 09/02/15 00:00:00  | FL      | TAMPA INTL         | APPROACH 
 TRANS STATES AIRLI…| EMB-145        | 09/07/15 00:00:00  | MO      | LAMBERT-ST LOUIS I…| APPROACH 
 BUSINESS           | C-172          | 11/28/15 00:00:00  | FL      | OPA-LOCKA EXECUTIV…| APPROACH 
 GOVERNMENT         | EC120          | 12/08/15 00:00:00  | CA      | NORMAN Y. MINETA S…|          
 AMERICAN AIRLINES  | A-321          | 05/06/15 00:00:00  | FL      | FORT LAUDERDALE/HO…| APPROACH 
 EXPRESSJET AIRLINES| CRJ100/200     | 05/06/15 00:00:00  | AR      | FORT SMITH REGIONA…| CLIMB    
 MESA AIRLINES      | CRJ900         | 05/08/15 00:00:00  | AR      | BILL AND  HILLARY …| LANDING… 
 BUSINESS           | HELICOPTER     | 05/06/15 00:00:00  |         | UNKNOWN            | En Route 
 DELTA AIR LINES    | A-320          | 05/07/15 00:00:00  | CA      | METRO OAKLAND INTL |          
 DELTA AIR LINES    | A-320          | 05/08/15 00:00:00  | UT      | SALT LAKE CITY INTL|          
 LUFTHANSA          | A-380          | 05/10/15 00:00:00  | TX      | GEORGE BUSH INTERC…| CLIMB    
 BUSINESS           | C-172          | 05/08/15 00:00:00  | FL      | ORLANDO SANFORD IN…| APPROACH 
 SPIRIT AIRLINES    | A-319          | 05/10/15 00:00:00  | IL      | CHICAGO O'HARE INT…| CLIMB    
 EXPRESSJET AIRLINES| EMB-145        | 05/11/15 00:00:00  | AL      | BIRMINGHAM-SHUTTLE…| LANDING… 
faa-wildlife-strikes| Do ya feel lucky, textpunk? | saul.pw/VisiData v1.2 | opening      73448 rows 

Tell VisiData that HEIGHT is an integer column

Navigate to the HEIGHT column, and press #.

You should see something like this:

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT  #| SPEED   | SPECIES            | BIRDS_S> 
 FL      | VERO BEACH MUNICIP…| APPROACH       |         ?|         | Unknown bird       | 1        
 AK      | KENAI MUNICIPAL AR…| APPROACH       |         ?|         | Unknown bird       | 1        
 TX      | DAVID WAYNE HOOKS …|                |         ?|         | Unknown bird       | 1        
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |         ?|         | Unknown bird       | 1        
 FL      | POMPANO BEACH AIRP…| LANDING ROLL   |        0 |         | Unknown bird       | 1        
 VI      | HENRY E ROHLSEN AR…|                |         ?|         | Unknown bird       | 1        
 TX      | SAN ANTONIO INTL   | APPROACH       |         ?|         | Unknown bird       | 1        
 TX      | LONE STAR EXECUTIV…| DEPARTURE      |         ?|         | Unknown bird       | 1        
 FL      | TAMPA INTL         | APPROACH       |     6000 |         | Unknown bird       | 1        
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |         ?|         | Owls               | 1        
 FL      | OPA-LOCKA EXECUTIV…| APPROACH       |         ?|         | Hawks              | 1        
 CA      | NORMAN Y. MINETA S…|                |         ?|         | Gulls              | 1        
 FL      | FORT LAUDERDALE/HO…| APPROACH       |     1500 |         | Unknown bird - sma…| 1        
 AR      | FORT SMITH REGIONA…| CLIMB          |         ?|         | Unknown bird - sma…| 1        
 AR      | BILL AND  HILLARY …| LANDING ROLL   |        0 |         | Unknown bird - sma…| 1        
         | UNKNOWN            | En Route       |         ?|         | Unknown bird       | 1        
 CA      | METRO OAKLAND INTL |                |         ?|         | Unknown bird       | 1        
 UT      | SALT LAKE CITY INTL|                |         ?|         | Unknown bird       | 1        
 TX      | GEORGE BUSH INTERC…| CLIMB          |         ?|         | Unknown bird       | 1        
 FL      | ORLANDO SANFORD IN…| APPROACH       |         ?|         | Unknown bird       | 1        
 IL      | CHICAGO O'HARE INT…| CLIMB          |    12000 |         | Unknown bird       | 1        
 AL      | BIRMINGHAM-SHUTTLE…| LANDING ROLL   |        0 |         | Unknown bird - sma…| 1        
faa-wildlife-strikes|                                                              #     73448 rows 

As you can see, many of the reported collisions are missing height data, or appear to have occurred on the ground (i.e., HEIGHT == 0).

So, let’s focus only on collsions reported to have occurred above the ground.

Create a column that says if HEIGHT is greater than zero

To do that, press = and then, at the prompt, type HEIGHT > 0 followed by Enter.

You should see something like this:

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT  #| HEIGHT > 0   | SPEED   | SPECIES      > 
 FL      | VERO BEACH MUNICIP…| APPROACH       |         ?|             !|         | Unknown bird   
 AK      | KENAI MUNICIPAL AR…| APPROACH       |         ?|             !|         | Unknown bird   
 TX      | DAVID WAYNE HOOKS …|                |         ?|             !|         | Unknown bird   
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |         ?|             !|         | Unknown bird   
 FL      | POMPANO BEACH AIRP…| LANDING ROLL   |        0 | False        |         | Unknown bird   
 VI      | HENRY E ROHLSEN AR…|                |         ?|             !|         | Unknown bird   
 TX      | SAN ANTONIO INTL   | APPROACH       |         ?|             !|         | Unknown bird   
 TX      | LONE STAR EXECUTIV…| DEPARTURE      |         ?|             !|         | Unknown bird   
 FL      | TAMPA INTL         | APPROACH       |     6000 | True         |         | Unknown bird   
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |         ?|             !|         | Owls           
 FL      | OPA-LOCKA EXECUTIV…| APPROACH       |         ?|             !|         | Hawks          
 CA      | NORMAN Y. MINETA S…|                |         ?|             !|         | Gulls          
 FL      | FORT LAUDERDALE/HO…| APPROACH       |     1500 | True         |         | Unknown bird … 
 AR      | FORT SMITH REGIONA…| CLIMB          |         ?|             !|         | Unknown bird … 
 AR      | BILL AND  HILLARY …| LANDING ROLL   |        0 | False        |         | Unknown bird … 
         | UNKNOWN            | En Route       |         ?|             !|         | Unknown bird   
 CA      | METRO OAKLAND INTL |                |         ?|             !|         | Unknown bird   
 UT      | SALT LAKE CITY INTL|                |         ?|             !|         | Unknown bird   
 TX      | GEORGE BUSH INTERC…| CLIMB          |         ?|             !|         | Unknown bird   
 FL      | ORLANDO SANFORD IN…| APPROACH       |         ?|             !|         | Unknown bird   
 IL      | CHICAGO O'HARE INT…| CLIMB          |    12000 | True         |         | Unknown bird   
 AL      | BIRMINGHAM-SHUTTLE…| LANDING ROLL   |        0 | False        |         | Unknown bird … 
faa-wildlife-strikes| "HEIGHT > 0"                                                 =     73448 rows 

Create a frequency table with our new column

Navigate one column to the right, so that our new column is selected, and press Shift-F to create a frequency sheet for it.

You should see something like this:

 HEIGHT > 0    count  #| percent  %| histogram                                                    
              ‖   35629 |     48.51 | ************************************************** ‖          
 True            22883 |     31.16 | ********************************                             
 False           14936 |     20.34 | ********************                                         
                                                                                                    
                                                                                                    

Select only the True rows

You can do that by navigating down one row …

 HEIGHT > 0    count  #| percent  %| histogram                                                    
                 35629 |     48.51 | **************************************************           
 True         ‖   22883 |     31.16 | ********************************                   ‖          
 False           14936 |     20.34 | ********************                                         
                                                                                                    
                                                                                                    

… and then pressing Enter. After that, you should see something like this:

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT  #| HEIGHT > 0   | SPEED   | SPECIES      > 
 FL      | TAMPA INTL         | APPROACH       |     6000 | True         |         | Unknown bird   
 FL      | FORT LAUDERDALE/HO…| APPROACH       |     1500 | True         |         | Unknown bird … 
 IL      | CHICAGO O'HARE INT…| CLIMB          |    12000 | True         |         | Unknown bird   
 FL      | TAMPA INTL         | APPROACH       |       50 | True         |         | Unknown bird   
 OR      | SOUTHWEST OREGON R…| APPROACH       |      300 | True         |         | Unknown bird   
 ME      | PORTLAND INTL JETP…| CLIMB          |     1000 | True         |         | Unknown bird   
 VA      | ROANOKE REGNL ARPT…| APPROACH       |     3000 | True         |         | Unknown bird   
 FL      | MIAMI INTL         | APPROACH       |      400 | True         |         | Unknown bird   
 NY      | SYRACUSE HANCOCK I…| APPROACH       |      200 | True         |         | Unknown bird   
         | UNKNOWN            | EN ROUTE       |     8000 | True         |         | Unknown bird   
 PA      | PITTSBURGH INTL AR…| APPROACH       |     6000 | True         |         | Unknown bird   
 MD      | BALTIMORE/WASH INT…| APPROACH       |     1000 | True         |         | Unknown bird   
 NC      | RALEIGH-DURHAM INTL| APPROACH       |     1000 | True         |         | Unknown bird   
 MI      | WILLOW RUN ARPT    | APPROACH       |     3000 | True         |         | Unknown bird … 
 AK      | TED STEVENS ANCHOR…| APPROACH       |      100 | True         |         | Unknown bird   
         | UNKNOWN            | LOCAL          |       80 | True         |         | Unknown bird   
 DC      | RONALD REAGAN WASH…| APPROACH       |      100 | True         |         | Unknown bird   
 AL      | BIRMINGHAM-SHUTTLE…| APPROACH       |     1100 | True         |         | Unknown bird   
 NY      | LA GUARDIA ARPT    | CLIMB          |      400 | True         |         | Unknown bird   
 CA      | JOHN WAYNE-ORANGE …| APPROACH       |     1700 | True         |         | Unknown bird … 
 FL      | TAMPA INTL         | APPROACH       |     1000 | True         |         | Unknown bird   
 NE      | EPPLEY AIRFIELD    |                |     3000 | True         |         | Unknown bird … 
faa-wildlife-strikes_True|                                                        ^J     22883 rows 

Get the average collision height for each species

This is a two-step process. First, navigate to the HEIGHT column, and press + to add an aggregator. At the prompt at the bottom of the screen, type mean

 FL      | TAMPA INTL         | APPROACH       |     1000 | True         |         | Unknown bird   
 NE      | EPPLEY AIRFIELD    |                |     3000 | True         |         | Unknown bird … 
min/max/avg/mean/median/sum/distinct/count/keymax: mean                                  22883 rows 

… and then press Enter.

Note

Adding an aggregator doesn’t change the appearance of the sheet. But if you want to confirm that you’ve added an aggregator correctly, you can press Shift-C to open up the Columns Sheet, and look at the aggregators field.

Then, to get the average for each animal, navigate to the SPECIES column, and press Shift-F to create a frequency sheet. It should look something like this:

 SPECIES             count  #| mean_HEIGHT  %                                                     
 Unknown bird - sma…‖    8807 |       1408.82 ‖                                                     
 Unknown bird - med…    3056 |       2260.49                                                      
 Unknown bird           2098 |       2253.93                                                      
 Unknown bird - lar…     693 |       2691.92                                                      
 Gulls                   550 |        696.03                                                      
 Barn swallow            385 |        191.71                                                      
 Mourning dove           312 |        229.82                                                      
 European starling       277 |        181.48                                                      
 Sparrows                252 |        554.77                                                      
 Rock pigeon             219 |        376.45                                                      
 Perching birds (y)      211 |       2018.73                                                      
 Canada goose            208 |        848.81                                                      
 Killdeer                199 |        189.23                                                      
 Red-tailed hawk         190 |        370.29                                                      
 American robin          183 |       2545.00                                                      
 Turkey vulture          178 |       1119.58                                                      
 American kestrel        159 |        202.04                                                      
 Horned lark             159 |        290.84                                                      
 Chimney swift           156 |        928.85                                                      
 Cliff swallow           135 |        270.13                                                      
 Mallard                 133 |       1173.14                                                      
 Hawks                   131 |        770.13                                                      
faa-wildlife-strikes_True_SPECIES_freq|                                            F       420 bins 

Sort the species by collision height

By default, frequency tables are sorted by the raw count of matching rows. To sort by the highest average collision height, navigate to the mean_HEIGHT column, and press ].

You should see something like this:

 SPECIES             count  #| mean_HEIGHT  %                                                     
 Eared grebe        ‖       1 |      20000.00                                                      
 Long-billed curlew        1 |      15000.00                                                      
 Whimbrel                  1 |      12000.00                                                      
 Lesser yellowlegs         4 |       9925.00                                                      
 Dusky flycatcher          1 |       9500.00                                                      
 Sanderling                1 |       9500.00                                                      
 Red-necked phalaro…       4 |       8525.00                                                      
 Clark's grebe             1 |       8500.00                                                      
 Northern saw-whet …       1 |       8000.00                                                      
 Grebes                    1 |       8000.00                                                      
 Spotted sandpiper         1 |       7700.00                                                      
 Least flycatcher          3 |       7566.67                                                      
 Lark sparrow              2 |       7400.00                                                      
 Greater yellowlegs        3 |       7333.33                                                      
 Pacific-slope flyc…       3 |       7000.00                                                      
 Seminole bat              1 |       7000.00                                                      
 Solitary sandpiper        2 |       6250.00                                                      
 Pocketed free-tail…       1 |       6000.00                                                      
 Red-naped sapsucker       2 |       5750.00                                                      
 Cinnamon teal             1 |       5600.00                                                      
 Horned grebe              2 |       5400.00                                                      
 Willow flycatcher         2 |       5200.00                                                      
faa-wildlife-strikes_True_SPECIES_freq|                                            ]       420 bins 

Hrrrm, it seems like a lot of these species show up just a few times — too few to be particularly informative. Let’s do something about that.

Limit the results to relatively common species

This step will seem familiar; it’s a lot like how we selected only above-the-ground collisions.

First, let’s create a new column indicating whether a species has at least 20 collision-height observations. To do that, press = and then, in the new column expr= prompt, type count >= 20.

You should see something like this:

 SPECIES             count  #| mean_HEIGHT  %| count >= 20                                        
 Eared grebe        ‖       1 |      20000.00 | False         ‖                                     
 Long-billed curlew        1 |      15000.00 | False                                              
 Whimbrel                  1 |      12000.00 | False                                              
 Lesser yellowlegs         4 |       9925.00 | False                                              
 Dusky flycatcher          1 |       9500.00 | False                                              
 Sanderling                1 |       9500.00 | False                                              
 Red-necked phalaro…       4 |       8525.00 | False                                              
 Clark's grebe             1 |       8500.00 | False                                              
 Northern saw-whet …       1 |       8000.00 | False                                              
 Grebes                    1 |       8000.00 | False                                              
 Spotted sandpiper         1 |       7700.00 | False                                              
 Least flycatcher          3 |       7566.67 | False                                              
 Lark sparrow              2 |       7400.00 | False                                              
 Greater yellowlegs        3 |       7333.33 | False                                              
 Pacific-slope flyc…       3 |       7000.00 | False                                              
 Seminole bat              1 |       7000.00 | False                                              
 Solitary sandpiper        2 |       6250.00 | False                                              
 Pocketed free-tail…       1 |       6000.00 | False                                              
 Red-naped sapsucker       2 |       5750.00 | False                                              
 Cinnamon teal             1 |       5600.00 | False                                              
 Horned grebe              2 |       5400.00 | False                                              
 Willow flycatcher         2 |       5200.00 | False                                              
faa-wildlife-strikes_True_SPECIES_freq| "count >= 20"                              =       420 bins 

Then, navigate one column the right, so that our new column is selected, and press Shift-F to create a frequency sheet for it.

 count >= 20    count  #| percent  %| histogram                                                   
 False         ‖     345 |     82.14 | ************************************************** ‖         
 True                75 |     17.86 | **********                                                  
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
faa-wildlife-strikes_True_SPECIES_freq_count_>=_20_freq|                           F         2 bins 

Finally, navigate to the True row, and press Enter.

Tada!:

 SPECIES             count  #| mean_HEIGHT  %| count >= 20                                        
 Yellow-bellied sap…‖      24 |       3585.42 | True                                               
 Swainson's thrush        56 |       3454.46 | True                                               
 Snow goose               34 |       3258.82 | True                                               
 Hermit thrush            49 |       2770.10 | True                                               
 Red-eyed vireo           39 |       2768.72 | True                                               
 Unknown bird - lar…     693 |       2691.92 | True                                               
 Northern shoveler        23 |       2602.17 | True                                               
 American robin          183 |       2545.00 | True                                               
 White-throated spa…      56 |       2426.79 | True                                               
 Greater white-fron…      32 |       2335.94 | True                                               
 Unknown bird - med…    3056 |       2260.49 | True                                               
 Unknown bird           2098 |       2253.93 | True                                               
 Dark-eyed junco          39 |       2031.28 | True                                               
 Perching birds (y)      211 |       2018.73 | True                                               
 Northern pintail         64 |       1980.19 | True                                               
 American coot            53 |       1965.66 | True                                               
 Unknown bird or bat      43 |       1892.88 | True                                               
 Red bat                  20 |       1750.00 | True                                               
 Geese                    56 |       1696.88 | True                                               
 Gadwall                  25 |       1675.80 | True                                               
 Gray catbird             37 |       1573.27 | True                                               
 New World Vultures       60 |       1541.50 | True                                               
faa-wildlife-strikes_True_SPECIES_freq_True|                                      ^J        75 bins 

Finally, type g_ to widen all the columns, so that you can read all the names:

 SPECIES                      count#| mean_HEIGHT%| count >= 20                                   
 Yellow-bellied sapsucker    ‖    24 |     3585.42 | True                                          
 Swainson's thrush               56 |     3454.46 | True                                          
 Snow goose                      34 |     3258.82 | True                                          
 Hermit thrush                   49 |     2770.10 | True                                          
 Red-eyed vireo                  39 |     2768.72 | True                                          
 Unknown bird - large           693 |     2691.92 | True                                          
 Northern shoveler               23 |     2602.17 | True                                          
 American robin                 183 |     2545.00 | True                                          
 White-throated sparrow          56 |     2426.79 | True                                          
 Greater white-fronted goose     32 |     2335.94 | True                                          
 Unknown bird - medium         3056 |     2260.49 | True                                          
 Unknown bird                  2098 |     2253.93 | True                                          
 Dark-eyed junco                 39 |     2031.28 | True                                          
 Perching birds (y)             211 |     2018.73 | True                                          
 Northern pintail                64 |     1980.19 | True                                          
 American coot                   53 |     1965.66 | True                                          
 Unknown bird or bat             43 |     1892.88 | True                                          
 Red bat                         20 |     1750.00 | True                                          
 Geese                           56 |     1696.88 | True                                          
 Gadwall                         25 |     1675.80 | True                                          
 Gray catbird                    37 |     1573.27 | True                                          
 New World Vultures              60 |     1541.50 | True                                          
faa-wildlife-strikes_True_SPECIES_freq_True|                                      g_        75 bins