
Thanks CrazyStat. "sigma and mu are independent". But "s and x-bar are independent" is an assumption that does not hold in practice. It should also be noted that X and x are different: I defined p(x|mu) to be a Normal distribution, but p(X|mu) was never defined to be Normal.

However, I did make a mistake: I should have said that p(x-bar|mu), instead of p(X|mu), follows a normal distribution as N -> infinity. This means that p(x-bar|mu) only approximates a normal distribution but is never exactly normal. This also matches the conclusion of the central limit theorem.

I am going to fix the "p(x-bar|mu)" typo. Thanks for the discussion.


sigma and mu may or may not be independent depending on the specified priors. In this case you're treating sigma as fixed and only putting a prior on mu, so they are (trivially) independent--a fixed value is independent of every random variable.

x-bar and s are independent for data coming from a Normal distribution, which your model assumes to be the case. Like I said, this is a basic result in mathematical statistics. The first google result [1] for "independence of s and x bar" gives a proof.
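The independence claim is easy to sanity-check empirically. A quick simulation sketch (sample sizes and distribution parameters here are arbitrary, chosen only for illustration): for Normal data, the sample correlation between x-bar and s should be near zero.

```python
import numpy as np

# Simulation sketch: for Normal data, the sample mean x-bar and sample
# standard deviation s are independent, so their empirical correlation
# over many repetitions should be close to zero.
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(100_000, 10))  # 100k datasets of N=10

xbar = data.mean(axis=1)          # sample mean of each dataset
s = data.std(axis=1, ddof=1)      # sample SD of each dataset (n-1 denominator)

corr = np.corrcoef(xbar, s)[0, 1]
print(f"corr(x-bar, s) = {corr:.4f}")  # close to 0
```

Zero correlation alone doesn't prove independence, of course, but for Normal data the linked proof establishes full independence; the simulation is just a plausibility check.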

> It should also be noted that X and x are different. I defined p(x|mu) to be a Normal distribution but p(X|mu) was never defined to be normal.

From your writeup (apologies for formatting):

"We have N data points X={x_1,x_2,⋯,x_N} independently sampled from normal distribution N(μ,σ^2)."

This means X|mu has a multivariate Normal distribution with mean vector (mu, mu, mu, ..., mu) and covariance matrix sigma^2 * I_N, where I_N is the NxN identity matrix. p(X|mu) is (multivariate) Normal by definition.

Since X|mu has a multivariate Normal distribution, x-bar|mu has a univariate Normal distribution. This follows from a basic fact about Normal distributions: if a vector X has a multivariate Normal distribution, then MX, where M is any matrix of an appropriate size to be multiplied by X, also has a (multivariate or univariate) Normal distribution (in other words, linear combinations of Normally distributed random variables also have a Normal distribution). In the case of x-bar, M is the 1xN matrix [1/N, 1/N, ..., 1/N].
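The linear-map view of x-bar can be sketched in a few lines (N and the distribution parameters are arbitrary here): averaging is literally a 1xN matrix applied to the data vector.

```python
import numpy as np

# Sketch: x-bar as a linear map M @ X, with M = [1/N, ..., 1/N].
N = 5
rng = np.random.default_rng(1)
X = rng.normal(loc=0.0, scale=1.0, size=N)  # one draw of the data vector

M = np.full((1, N), 1.0 / N)   # 1xN averaging matrix
xbar_via_M = (M @ X)[0]

# The matrix product agrees with the ordinary sample mean.
assert np.isclose(xbar_via_M, X.mean())
```

Since X is multivariate Normal and M is a fixed matrix, MX is Normal with mean M*(mu,...,mu) = mu and variance M (sigma^2 I_N) M^T = sigma^2/N, which is exactly the familiar distribution of x-bar.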

The Central Limit Theorem is a far more general result. In this case since we're starting with a Normal distribution we don't need it, as we can use properties of the Normal distribution to arrive at the distribution of x-bar|mu directly, without needing the CLT.

[1] http://jekyll.math.byuh.edu/courses/m321/handouts/mean_var_i...


Thank you very much CrazyStat. I am now convinced that "x-bar and s are independent for data coming from a Normal distribution". So the conclusion for my example is: p(x|mu) is a normal distribution, and p(x-bar|mu), after derivation, is also a normal distribution. If p(x|mu) comes from some other distribution, however, p(x-bar|mu) might not be normal. Thanks for providing the proof; I will only be convinced by a proof :) Will update the post shortly. Thanks for the great conversation and suggestions.
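That last point is easy to see in a simulation sketch (N, the distributions, and the repetition count here are arbitrary): with a small N, x-bar of Normal data is exactly Normal (skewness 0), while x-bar of, say, Exponential data is still visibly skewed, and only the CLT makes it approximately Normal as N grows.

```python
import numpy as np

# Sketch: compare the skewness of x-bar for Normal vs Exponential data
# at a small sample size N=5.
rng = np.random.default_rng(2)
reps, N = 200_000, 5

xbar_norm = rng.normal(0.0, 1.0, size=(reps, N)).mean(axis=1)
xbar_expo = rng.exponential(1.0, size=(reps, N)).mean(axis=1)

def skewness(a):
    """Sample skewness (third standardized moment)."""
    a = a - a.mean()
    return (a**3).mean() / (a**2).mean() ** 1.5

print(skewness(xbar_norm))  # near 0: x-bar of Normal data is exactly Normal
print(skewness(xbar_expo))  # clearly positive (theory: 2/sqrt(5) ~ 0.89)
```

The mean of N iid Exponentials has skewness 2/sqrt(N), which shrinks only as N grows — the CLT at work — whereas the Normal case needs no limit at all.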


One small thing: in the proof they use n-1 instead of n as the denominator for the sample variance, which differs from my/Kevin's setting. Although this is probably trivial, I will look into whether it makes any difference. Overall I would say this (full) derivation is non-trivial, although it is not a new discovery.


I looked closely and confirmed that the denominator does not matter. Thanks for providing the proof.
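One way to see why the denominator cannot matter (a minimal sketch, with arbitrary n): the n-denominator variance is just a constant rescaling of the n-1 version, and rescaling by a constant cannot create or destroy independence from x-bar.

```python
import numpy as np

# Sketch: the two sample-variance conventions differ only by the
# deterministic factor (n-1)/n, so independence from x-bar is unaffected.
rng = np.random.default_rng(3)
n = 8
x = rng.normal(size=n)

var_n = x.var(ddof=0)    # denominator n
var_n1 = x.var(ddof=1)   # denominator n-1

assert np.isclose(var_n, (n - 1) / n * var_n1)
```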


You can use the key/token then delete to be safe.

