Lesson 17: Sort with arrays and hashes


Using the sort function to sort array elements, hash keys and hash values

  • sort is a function that returns the elements of a list in sorted order. By default it compares the elements as strings.
  • The sort behavior can be customized with a comparison block.

Default sort behavior

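  • The default behavior of the sort function is to compare elements as strings in ASCII order, so uppercase letters sort before lowercase letters, and numbers are compared character by character rather than by value.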
print "Default -- ABC sort with strings\n";
my @array = ('After','food','best','Hiss');
my @sorted_array = sort @array;
 
foreach my $element (@sorted_array){
  print "$element\n";
}
 
print "Default -- ABC sort with numbers\n";
my @numbers = (3,400,1,100,4,20);
my @sorted_numbers = sort @numbers;
 
foreach my $element (@sorted_numbers){
  print "$element\n";
}

Output:

%% ./sort.pl
Default -- ABC sort with strings
After
Hiss
best
food
Default -- ABC sort with numbers
1
100
20
3
4
400
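
The string output above may look surprising: 'Hiss' lands before 'best' because every uppercase ASCII letter has a lower character code than any lowercase letter. The sketch below reuses the lesson's word list to show this. Lowercasing with lc inside the sort block is one common way to get a case-insensitive order, and the name @case_insensitive is only illustrative.

```perl
use strict;
use warnings;

my @array = ('After', 'food', 'best', 'Hiss');

# Character codes explain the default order: 'H' is 72, 'b' is 98
print ord('H'), " ", ord('b'), "\n";

# Compare lowercased copies; the original strings are returned unchanged
my @case_insensitive = sort { lc($a) cmp lc($b) } @array;
print "@case_insensitive\n";   # After best food Hiss
```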

Simple customization of sort behavior: Numeric order.

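  • use Perl's special $a and $b variables inside a sort block to customize the comparison; sort sets them to the two elements being compared
  • use <=> (the spaceship operator), which is the numeric counterpart of the string comparison operator cmp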
  print "Custom -- Numeric sort\n";
 
  @sorted_numbers = sort { $a <=> $b } @numbers;
 
  foreach my $element (@sorted_numbers){
    print "$element\n";
  }

Output:

%% ./sort.pl
Custom -- Numeric sort
1
3
4
20
100
400
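
To see the difference between the two comparison operators side by side, here is a small sketch that sorts the same list once with cmp and once with <=>. The variable names @as_strings and @as_numbers are chosen for illustration only.

```perl
use strict;
use warnings;

my @numbers = (3, 400, 1, 100, 4, 20);

# cmp compares as strings; this gives the same order as the default sort
my @as_strings = sort { $a cmp $b } @numbers;

# <=> compares as numbers
my @as_numbers = sort { $a <=> $b } @numbers;

print "cmp: @as_strings\n";   # cmp: 1 100 20 3 4 400
print "<=>: @as_numbers\n";   # <=>: 1 3 4 20 100 400
```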

Code: Sort biggest to smallest

  • swap $a and $b in the comparison so that larger values sort first
  print "Custom -- Reverse Numeric sort\n";
 
  @sorted_numbers = sort { $b <=> $a } @numbers;
 
  foreach my $element (@sorted_numbers){
    print "$element\n";
  }

Output:

%% ./sort.pl
Custom -- Reverse Numeric sort
400
100
20
4
3
1
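
Swapping $a and $b is not the only way to get descending order. As a sketch, applying Perl's built-in reverse to an ascending sort should give the same result. The names @swapped and @reversed are illustrative.

```perl
use strict;
use warnings;

my @numbers = (3, 400, 1, 100, 4, 20);

# Descending order by swapping $a and $b
my @swapped = sort { $b <=> $a } @numbers;

# Descending order by reversing an ascending sort
my @reversed = reverse sort { $a <=> $b } @numbers;

print "@swapped\n";    # 400 100 20 4 3 1
print "@reversed\n";   # 400 100 20 4 3 1
```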

Custom sort: Sort by length of array elements

order of operations

  1. the elements of the array are sorted by length. Remember that Perl works from the innermost parentheses out, and the sort is the innermost function, so it runs before the loop starts
  2. each element of the sorted list is stored, one at a time, in $element by the foreach loop
  3. $element and its length are printed

Code:

my @array = ( 'aaaaa' , 'aaa' , 'aa' , 'a' , 'aa' );
 
foreach my $element (sort { length($a) <=> length($b) } @array) {
  print "$element\t", length ($element), "\n";
}

Output:

%% perl sort.pl
a       1
aa      2
aa      2
aaa     3
aaaaa   5
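
When several elements have the same length, a second comparison can break the tie. The sketch below chains the length comparison with cmp using the low-precedence or operator, so the alphabetical comparison is only used when the lengths are equal. The word list and the name @by_length are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical data with several words of the same length
my @words = ('pear', 'fig', 'kiwi', 'apple', 'date');

# <=> returns 0 for equal lengths, so "or" falls through to cmp
my @by_length = sort { length($a) <=> length($b) or $a cmp $b } @words;

print "@by_length\n";   # fig date kiwi pear apple
```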

Custom sort: Sort by hash values

  • Remember that to retrieve a value from a hash you need to use its key.
  • $value = $hash{'key'}
  • $a and $b will hold keys of the hash, because the list being sorted is the list of keys
  • use $a and $b to retrieve the values to compare
  • $hash{$a} will evaluate to the value stored under the key $a

order of operations

  1. the keys are retrieved with the keys function
  2. the keys are used to retrieve the values ( $value = $hash{$key} )
  3. the keys are sorted and returned based on the values
  4. the newly sorted keys are passed one by one to $nt by the foreach loop

Code:

my %nt_count = (
                 'G' => 2,
                 'C' => 1,
                 'T' => 4,
                 'A' => 3,
                );
 
foreach my $nt (sort { $nt_count{$b} <=> $nt_count{$a} }
               keys %nt_count ){
  print "$nt\t$nt_count{$nt}\n";
}

Output:

%% ./sort.pl
T       4
A       3
G       2
C       1
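
For comparison, the same hash can be sorted by its keys, or by its values from smallest to largest. This is a minimal sketch reusing the lesson's %nt_count hash. The names @by_key and @ascending are illustrative.

```perl
use strict;
use warnings;

my %nt_count = ('G' => 2, 'C' => 1, 'T' => 4, 'A' => 3);

# Sort by key: the default string sort applied to the list of keys
my @by_key = sort keys %nt_count;
foreach my $nt (@by_key) {
    print "$nt\t$nt_count{$nt}\n";
}

# Sort by value, smallest count first: $a and $b hold keys,
# and the hash lookups supply the values to compare
my @ascending = sort { $nt_count{$a} <=> $nt_count{$b} } keys %nt_count;
print "@ascending\n";   # C G A T
```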

Exercises

  1. Create a hash, and print each key/value pair sorted by value. Be sure to use the keys function and a foreach loop.

5 thoughts on “Lesson 17: Sort with arrays and hashes”

  1. How to auto generate the count of nucleotides in a DNA sequence using a hash

    Code:

    my %nt_count;
    ### This is complicated
    my $DNA = 'ATAATCGTTG';
    my @DNA = split '' , $DNA;
    foreach my $nt (@DNA){
      $nt_count{$nt}++;
      ## if the nucleotide already exists in the hash, increment its count by 1
      ## if it does not exist yet, the missing value counts as 0, so ++ sets it to 1
    }
    

    Code in another way (does the same as above):

    my %nt_count;
    my $DNA = 'ATAATCGTTG';
    my @DNA = split '' , $DNA;
    foreach my $nt (@DNA){
      if (!exists $nt_count{$nt} ){
         $nt_count{$nt} = 1;
      }else { ## does exist
        my $count = $nt_count{$nt};
        my $new_count = $count +1 ;
        $nt_count{$nt} = $new_count;
      }
    }
    
  2. Hi Sofia,
    There are two bugs in the second script:

    if (!exist $nt_count{$nt} ){
    should be
    if (!exists $nt_count{$nt} ){

    and

    $nt_count{$nt} = 0;
    should be
    $nt_count{$nt} = 1;

    Cheers,
    Bert

  3. Hi Sofia,

    I always have trouble with hashes ….

    By what are we sorting in the “Custom sort: Sort by hash values” example? The title suggests we are sorting by value, but the description states “the keys are sorted with the sort function”, as does the code itself (sort …. keys %nt_count), while the output again suggests we have sorted by value …. :/

    Cheers,
    Bert

    • Hi Bert,
      Thanks for pointing this out. To make it more clear, it should say that the

      1. keys are retrieved with the keys function
      2. the keys are used to retrieve the values ( $value = $hash{‘key’} )
      3. the keys are sorted and returned based on the value

      I have added this correction to the notes. Thank you.

      -Sofia
