@lancegatlin, thanks for mentioning Norvig. I couldn't find downloads of the text (the content of the books) that Norvig used as input. On Project Gutenberg I found a couple of books dated within the last few years, though not enough volume.
One of my goals is to obtain n-grams for a more complete ASCII character set, to assess the layout of numbers, punctuation, space, newline, tab and others.
So as it stands, I'm using the Enron data set. I like that it's recent and based primarily on work-related emails. The ranking of single letters is fairly close to Norvig's results, with a few adjacent pairs of letters reversed in the mid-range of the ranking. This is a derived ranking covering 101-key keyboard keys:
SP e t a o n i r s l h d c RET u m p y g f w b 0 v k 1 3 5 ' 4 9 x 7 . 8 6 , q j TAB = z / 2 ; - \ ` ] [
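For anyone who wants to reproduce something similar, here is a rough Python sketch of how a ranking like the one above could be derived. It is not the exact script I ran: the names `KEY_NAMES`, `key_ngrams` and `rank_keys` are made up for illustration, and it assumes that uppercase letters fold to their lowercase key, that carriage returns and non-ASCII characters are dropped, and that shifted symbols are counted as-is rather than mapped back to their base key.

```python
from collections import Counter

# Hypothetical labels for whitespace keys, chosen to match the ranking above.
KEY_NAMES = {" ": "SP", "\n": "RET", "\t": "TAB"}

def key_ngrams(text, n=1):
    """Count n-grams of key labels in text.

    Assumptions: letters fold to lowercase, '\r' and non-ASCII
    characters are skipped, shifted symbols are left unmapped.
    """
    keys = [
        KEY_NAMES.get(ch, ch.lower())
        for ch in text
        if ch.isascii() and ch != "\r"
    ]
    return Counter(tuple(keys[i:i + n]) for i in range(len(keys) - n + 1))

def rank_keys(text):
    """Return key labels ordered from most to least frequent.

    Ties are broken by label so the result is deterministic.
    """
    counts = key_ngrams(text, 1)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [gram[0] for gram, _ in ordered]

if __name__ == "__main__":
    sample = "Meeting moved to 3pm.\n\tPlease confirm.\n"
    print(" ".join(rank_keys(sample)))
```

Calling `key_ngrams(text, 2)` gives bigram counts with the same key labels, which is the kind of input I'd want for assessing how numbers, punctuation and whitespace keys sit relative to each other.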
I'm using this to study metrics for assessing layouts. Note that the first 12 characters are mostly consistent with the set of single-key chords in the backspice layout, even though it's based on a different data set. That's encouraging.