
LW - Natural Latents Are Not Robust To Tiny Mixtures by johnswentworth

8:08
 
Content provided by The Nonlinear Fund. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by The Nonlinear Fund or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ar.player.fm/legal.
Link to original article
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Natural Latents Are Not Robust To Tiny Mixtures, published by johnswentworth on June 7, 2024 on LessWrong.

In our previous natural latent posts, our core theorem typically says something like: Assume two agents have the same predictive distribution P[X] over variables X, but model that distribution using potentially-different latent variables. If the latents both satisfy some simple "naturality" conditions (mediation and redundancy), then the two agents' latents contain approximately the same information about X. So, insofar as the two agents both use natural latents internally, we have reason to expect that the internal latents of one can be faithfully translated into the internal latents of the other.

This post is about one potential weakness in that claim: what happens when the two agents' predictive distributions are only approximately the same? Following the pattern of our previous theorems, we'd ideally say something like: If the two agents' distributions are within ϵ of each other (as measured by some KL-divergences), then their natural latents contain approximately the same information about X, to within some O(ϵ) bound. But that turns out to be false.

The Tiny Mixtures Counterexample

Let's start with two distributions, P0 and Q0, over X. These won't be our two agents' distributions - we're going to construct our two agents' distributions by mixing these two together, as the name "tiny mixtures" suggests. P0 and Q0 will have extremely different natural latents. Specifically:

X1 consists of 1 million bits, and X2 consists of another 1 million bits.
Under P0, X1 is uniform, and X2 = X1. So, there is an exact natural latent Λ^P = X1 = X2 under P0.
Under Q0, X1 and X2 are independent and uniform. So, the empty latent Λ^Q is exactly natural under Q0.

Mental picture: we have a million-bit channel; under P0 the output (X2) is equal to the input (X1), while under Q0 the channel hardware is maintained by Comcast, so they're independent.

Now for our two agents' distributions, P and Q. P will be almost P0, and Q will be almost Q0, but each agent puts a 2^-50 probability on the other distribution:

P = (1 - 2^-50) P0 + 2^-50 Q0
Q = 2^-50 P0 + (1 - 2^-50) Q0

First key observation: D_KL(P||Q) and D_KL(Q||P) are both roughly 50 bits. Calculation:

D_KL(P||Q) = ∑_{X1,X2} P[X] (log P[X] - log Q[X]) ≈ ∑_{X1=X2} (1/2^1000000) (-1000000 - log(1/2^2000000 + (1/2^50)(1/2^1000000))) ≈ 50
D_KL(Q||P) = ∑_{X1,X2} Q[X] (log Q[X] - log P[X]) ≈ ∑_{X1≠X2} (1/2^2000000) (-2000000 - log((1/2^50)(1/2^2000000))) ≈ 50

Intuitively: since each distribution puts roughly 2^-50 on the other, it takes about 50 bits of evidence to update from either one to the other.

Second key observation: the empty latent is approximately natural under Q, and the latent Λ := X1 is approximately natural under P. Epsilons:

Under Q, the empty latent satisfies mediation to within about 2^-50 · 1000000 ≈ 2^-30 bits (this is just the mutual information of X1 and X2 under Q), and redundancy exactly (since the empty latent can always be exactly computed from any input).
Under P, Λ := X1 satisfies mediation exactly (since X1 mediates between X1 and anything else), redundancy with respect to X2 exactly (Λ = X1 can be exactly computed from just X1, without X2), and redundancy with respect to X1 to within about 2^-50 · 1000000 ≈ 2^-30 bits (since there's a 2^-50 chance that X2 doesn't tell us the relevant 1000000 bits).
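To sanity-check those numbers: the per-outcome probabilities involved (on the order of 2^-1000000) underflow ordinary floating point, but every outcome falls into one of just two classes, X1 = X2 or X1 ≠ X2, so both KL-divergences and the mediation epsilon can be evaluated in closed form. Here is a minimal Python sketch of that check; it is not code from the post, and the variable names and the base-2 log-space bookkeeping are my own.

```python
# Sketch: check D_KL(P||Q) ~= D_KL(Q||P) ~= 50 bits, and the mediation epsilon
# I(X1;X2) under Q ~= 2^-30 bits, for the tiny-mixtures construction. Work in
# log base 2 because per-outcome probabilities like 2^-1000000 underflow floats.
import math

N = 1_000_000   # bits in X1 (and in X2)
K = 50          # each agent puts 2^-K weight on the other distribution

log2_mix  = -K                                   # log2(2^-K)
log2_rest = math.log1p(-2.0**-K) / math.log(2)   # log2(1 - 2^-K)

def log2_add(a, b):
    """log2(2^a + 2^b), stable for very negative a and b."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log2(1.0 + 2.0**(lo - hi))

# Per-outcome log2-probabilities. "diag" outcomes have X1 = X2 (P0 gives each
# 2^-N, Q0 gives each 2^-2N); "off" outcomes have X1 != X2 (P0 gives 0, Q0 2^-2N).
lp_diag = log2_add(log2_rest - N, log2_mix - 2 * N)   # P = (1 - 2^-K) P0 + 2^-K Q0
lq_diag = log2_add(log2_mix - N, log2_rest - 2 * N)   # Q = 2^-K P0 + (1 - 2^-K) Q0
lp_off  = log2_mix  - 2 * N
lq_off  = log2_rest - 2 * N

# Total mass each distribution puts on each class: there are 2^N diagonal
# outcomes and 2^(2N) - 2^N ~= 2^(2N) off-diagonal ones.
p_diag_mass, p_off_mass = 2.0**(N + lp_diag), 2.0**(2 * N + lp_off)
q_diag_mass, q_off_mass = 2.0**(N + lq_diag), 2.0**(2 * N + lq_off)

kl_pq = p_diag_mass * (lp_diag - lq_diag) + p_off_mass * (lp_off - lq_off)
kl_qp = q_diag_mass * (lq_diag - lp_diag) + q_off_mass * (lq_off - lp_off)

# Mediation epsilon for the empty latent under Q: I(X1;X2). Both marginals of Q
# are uniform, so log2(Q[X1] * Q[X2]) = -2N for every outcome.
mi_q = q_diag_mass * (lq_diag + 2 * N) + q_off_mass * (lq_off + 2 * N)

print(f"D_KL(P||Q) ~ {kl_pq:.3f} bits")                      # ~ 50
print(f"D_KL(Q||P) ~ {kl_qp:.3f} bits")                      # ~ 50
print(f"I(X1;X2) under Q ~ 2^{math.log2(mi_q):.1f} bits")    # ~ 2^-30
```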
… and of course the information those two latents tell us about X differs by 1 million bits: one of them is empty, and the other directly tells us 1 million bits about X1. Now, let's revisit the claim we would've liked to make: If the two agents' distributions are within ϵ of each other (as measured by some KL-divergences), then their natural latents contain approximately the same information about X, to within some O(ϵ) bound. Tiny mixtures rule out any claim along those lines. Generalizing the counterexample to an N-bit channel (where N=1000000 above) and a mixin pr...
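To make concrete why no O(ϵ) bound can work, the same closed-form arithmetic can be re-run with the channel width N and the mixin exponent K as free parameters; the post only works the N = 1000000, K = 50 case, so the parameter sweep below is my own illustration. The KL-divergences stay near K bits however large N gets, while the information the two latents carry about X differs by the full N bits.

```python
# Sketch: the same construction with channel width N and mixin weight 2^-K.
# The KL-divergences between the agents stay near K bits regardless of N,
# while the two natural latents differ by the full N bits of information.
import math

def log2_add(a, b):
    """log2(2^a + 2^b), stable for very negative a and b."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log2(1.0 + 2.0**(lo - hi))

def kl_bits(N, K):
    """Return (D_KL(P||Q), D_KL(Q||P)) in bits, evaluated in closed form."""
    lmix, lrest = -K, math.log1p(-2.0**-K) / math.log(2)
    lp_diag = log2_add(lrest - N, lmix - 2 * N)    # P-prob of a single X1 = X2 outcome
    lq_diag = log2_add(lmix - N, lrest - 2 * N)    # Q-prob of a single X1 = X2 outcome
    lp_off, lq_off = lmix - 2 * N, lrest - 2 * N   # probs of a single X1 != X2 outcome
    pd, po = 2.0**(N + lp_diag), 2.0**(2 * N + lp_off)   # class masses under P
    qd, qo = 2.0**(N + lq_diag), 2.0**(2 * N + lq_off)   # class masses under Q
    return (pd * (lp_diag - lq_diag) + po * (lp_off - lq_off),
            qd * (lq_diag - lp_diag) + qo * (lq_off - lp_off))

for N, K in [(1_000_000, 50), (10**9, 50), (10**9, 20)]:
    kl_pq, kl_qp = kl_bits(N, K)
    print(f"N = {N:>13,}   K = {K:>3}   D_KL(P||Q) ~ {kl_pq:6.2f} bits   "
          f"D_KL(Q||P) ~ {kl_qp:6.2f} bits   latent info gap = {N:,} bits")
```

Increasing K shrinks both the KL-divergences and the naturality epsilons without shrinking the N-bit gap at all, which is the failure mode the post is pointing at.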

1690 episodes
