The transcription above is in Pinyin with tone marks (instead of numbers).
Pinyin is an algorithmic system in that the pronunciation can always be
derived from the spelling by applying certain rules.
For example, after j, q, x and y (that is, after a |J| sound)
the letter u is pronounced not as |U| but as ü.
The 'syllables' qún, xù and
yú must therefore be read as if written with an ü.
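
The u rule just described can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it assumes that "after j, q, x and y" means directly after one of those letters, so that a syllable such as jiǔ is not affected.

```python
import unicodedata

# Tone diacritics used in Pinyin: macron, acute, caron, grave.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}

# Initials after which a written "u" is read as ü (the rule stated above).
UMLAUT_INITIALS = {"j", "q", "x", "y"}


def strip_tone_marks(syllable: str) -> str:
    """Return the syllable without tone marks, keeping the two dots of ü."""
    decomposed = unicodedata.normalize("NFD", syllable.lower())
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def u_reads_as_umlaut(syllable: str) -> bool:
    """True if a plain u stands directly after j, q, x or y (assumed reading of the rule)."""
    plain = strip_tone_marks(syllable)
    return len(plain) >= 2 and plain[0] in UMLAUT_INITIALS and plain[1] == "u"


for s in ["qún", "xù", "yú", "bù", "jiǔ"]:
    print(s, u_reads_as_umlaut(s))
```

The tone marks are stripped first so that the check works on the bare spelling; a syllable already written with ü, such as lǜ, keeps its dots and is left alone.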
Another example is the letter a, which is pronounced as |E| instead of
|AH| when it stands before a single n and after i, ü (or
u) or y. The a in the syllables miàn,
qián, jiān and xiàn therefore sounds like an |E|.
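
The a rule can be sketched in the same way. The code below is a rough illustration under two assumptions that the text does not spell out: "a single n" is taken to mean that the syllable ends in -an rather than -ang, and "(or u)" is taken to mean a u that stands for ü, that is, a u directly after j, q, x or y. The function names are hypothetical.

```python
import unicodedata

# Tone diacritics used in Pinyin: macron, acute, caron, grave.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tone_marks(syllable: str) -> str:
    """Return the syllable without tone marks, keeping the two dots of ü."""
    decomposed = unicodedata.normalize("NFD", syllable.lower())
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def a_reads_as_e(syllable: str) -> bool:
    """True if the a of the syllable is read as |E| under the rule above."""
    plain = strip_tone_marks(syllable)
    # Assumption: "before a single n" means the syllable ends in -an, not -ang.
    if len(plain) < 3 or not plain.endswith("an"):
        return False
    before_a = plain[-3]
    if before_a in ("i", "y", "\u00fc"):
        return True
    # Assumption: a written u counts only where it stands for ü,
    # i.e. directly after j, q, x or y (as in juan or yuan).
    return before_a == "u" and len(plain) == 4 and plain[0] in "jqxy"


for s in ["miàn", "qián", "jiān", "xiàn", "màn", "jiāng", "guān"]:
    print(s, a_reads_as_e(s))
```

Under these assumptions the four example syllables from the text are flagged, while màn, jiāng and guān keep their |AH|.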
The sound file of this poem contains a complete, manually produced
synthesis of its speech sounds. The standard fourth-tone bù
has been distinguished from the second-tone bú. However, third-tone
speech sounds and a few toneless ones have not been similarly adjusted on
the basis of the context in which they occur. Unfortunately, it was also
not possible to add intonation artificially at the level of phrases,
let alone whole sentences.
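
The bù/bú distinction mentioned above follows the usual Mandarin tone sandhi rule, under which bù is read in the second tone when the next syllable carries the fourth tone. The sketch below shows one way this could be derived automatically from a list of syllables with tone marks. It is not the procedure used for the sound file, which was produced manually, and the function names are hypothetical.

```python
import unicodedata

GRAVE = "\u0300"      # combining grave accent, the fourth-tone mark
BU_FOURTH = "b\u00f9"  # bù
BU_SECOND = "b\u00fa"  # bú


def is_fourth_tone(syllable: str) -> bool:
    """True if the syllable carries a grave accent (fourth tone)."""
    return GRAVE in unicodedata.normalize("NFD", syllable)


def apply_bu_sandhi(syllables: list[str]) -> list[str]:
    """Replace bù by bú wherever the following syllable is in the fourth tone."""
    result = list(syllables)
    for i in range(len(result) - 1):
        current = unicodedata.normalize("NFC", result[i])
        if current == BU_FOURTH and is_fourth_tone(result[i + 1]):
            result[i] = BU_SECOND
    return result


print(apply_bu_sandhi(["b\u00f9", "sh\u00ec"]))    # bù before a fourth tone
print(apply_bu_sandhi(["b\u00f9", "h\u01ceo"]))    # bù before a third tone
```

A bù at the end of the list is left unchanged, since there is no following syllable to trigger the change.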