Right, the thermocouple reading is directly related to the composition of the wire materials. Good meters and well-built thermocouples are essentially self-calibrating until they begin to fail.
That being said ...
(WARNING: This started off nicely, but ends up in huge windbag mode. I tried to edit it, but it sounds worse now.)
There are a lot of ways to mess up the accuracy, and positioning is probably the easiest. Very few things are at exactly the same temperature in more than one spot, and when you heat things quickly these differences get larger. At best you are getting an average over a relatively large sampling period. Thermocouple reaction time is a function of the initial temperature difference and the mass of the section of wire that carries the temperature differential, not just the bead. (And reaction time isn't everything, anyway.)
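To put rough numbers on the reaction time point, here is a minimal sketch that treats the sensing section as a simple first-order lag. That model is my assumption, not something measured, and `tau_s` (the time constant, which grows with the mass being heated) and `settle_time_s` are made-up names for illustration.

```python
import math

def settle_time_s(tau_s, initial_diff_c, tolerance_c):
    """Time for a first-order lag to get within tolerance_c of the true temperature.

    Assumes T(t) = T_true + (T_start - T_true) * exp(-t / tau_s).
    """
    if abs(initial_diff_c) <= tolerance_c:
        return 0.0  # already close enough
    return tau_s * math.log(abs(initial_diff_c) / tolerance_c)

# Same probe (tau = 2 s), different starting differences, 1 degree C tolerance.
small_step = settle_time_s(2.0, 10.0, 1.0)
large_step = settle_time_s(2.0, 100.0, 1.0)

# Heavier sensing section: assume the time constant doubles.
heavy_probe = settle_time_s(4.0, 100.0, 1.0)

print(f"10 C step:  {small_step:.1f} s")
print(f"100 C step: {large_step:.1f} s")
print(f"100 C step, heavier probe: {heavy_probe:.1f} s")
```

Under that assumption, a bigger initial difference or a heavier section of wire both stretch out the time before the reading is within a degree of the truth.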
To restate that, the voltages are not generated at the actual bead you made, but rather along the section of wire that has a temperature differential. Make sure that most of the differential occurs in the part of the TC wire where you expect it to. Metal sheathing can also distribute the gradient over a larger section of wire. Generally, longer gradients make for more accurate readings, since the wire impurities (and slight cross-section differences) get averaged out better. These things aren't very intuitive, but leaving a good length of the metal tip out on those nice commercial TCs can give you a much better reading than pushing it in up to the hilt and hoping for good lead wire composition.
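Here is a rough sketch of why a longer gradient averages out a bad spot in the wire. It models a single wire as 40 short segments, sums Seebeck coefficient times temperature step over each segment, and plants one hypothetical spot that is 5% off spec. The constant 41 microvolts per degree C (roughly type K), the segment count, and the function names are all assumptions for illustration, not a model of any particular probe.

```python
S_NOMINAL = 41e-6  # V per degree C, roughly type K; assumed constant for simplicity

def emf(seebeck, temps):
    """Sum S_i * (T[i+1] - T[i]) over the wire segments."""
    return sum(s * (t1 - t0) for s, t0, t1 in zip(seebeck, temps, temps[1:]))

def linear_gradient(n_segments, start, length, t_cold, t_hot):
    """Node temperatures: cold up to `start`, linear rise over `length` segments, hot after."""
    temps = []
    for node in range(n_segments + 1):
        frac = min(max((node - start) / length, 0.0), 1.0)
        temps.append(t_cold + frac * (t_hot - t_cold))
    return temps

N = 40
seebeck = [S_NOMINAL] * N
seebeck[20] = S_NOMINAL * 1.05  # hypothetical 5% off-spec spot in segment 20

ideal = S_NOMINAL * (500.0 - 25.0)

# Whole 475 C drop squeezed into the one bad segment.
short = emf(seebeck, linear_gradient(N, start=20, length=1, t_cold=25.0, t_hot=500.0))
# Same drop spread over 20 segments, bad one included.
long_ = emf(seebeck, linear_gradient(N, start=10, length=20, t_cold=25.0, t_hot=500.0))

short_err_c = (short - ideal) / S_NOMINAL
long_err_c = (long_ - ideal) / S_NOMINAL
print(f"short gradient error: {short_err_c:.2f} C")
print(f"long gradient error:  {long_err_c:.2f} C")
```

With the whole drop sitting on the bad segment, the reading is off by 5% of 475 degrees. Spread over 20 segments, the bad one only sees a twentieth of the drop, so the error shrinks by the same factor.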
If you have longish leads back to the meter and the insulated wires aren't twisted together, the leads can also act as an antenna, picking up stray signals from almost anything electrical operating nearby.
Seebeck coefficient voltages are really small: about 41 microvolts per degree C for a type K, so under 50 microvolts (0.00005 V) per degree C.
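To see how little stray pickup it takes to matter, here is a quick back-of-the-envelope sketch. It assumes a flat 41 microvolts per degree C (roughly type K near room temperature) and the noise figures are made up for illustration.

```python
SEEBECK_UV_PER_C = 41.0  # assumed: roughly type K near room temperature

def signal_uv(delta_t_c, seebeck_uv_per_c=SEEBECK_UV_PER_C):
    """Thermocouple signal in microvolts for a given temperature difference."""
    return delta_t_c * seebeck_uv_per_c

def apparent_error_c(noise_uv, seebeck_uv_per_c=SEEBECK_UV_PER_C):
    """Temperature error the meter would show for a given stray voltage."""
    return noise_uv / seebeck_uv_per_c

print(f"100 C difference gives {signal_uv(100.0) / 1000.0:.1f} mV")
for noise in (41.0, 100.0, 1000.0):  # hypothetical pickup levels in microvolts
    print(f"{noise:6.0f} uV of pickup reads as {apparent_error_c(noise):5.1f} C of error")
```

So a full 100 degree difference is only about 4 millivolts of signal, and a single millivolt of pickup looks like roughly 24 degrees.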
Unprotected bead-type TCs are very susceptible to corrosion. The relatively small wire surface area quickly gets coated with oils, and the wires tend to move around and vibrate quite a bit, which eventually damages them.