One of my team-mates introduced another interesting question over lunch. Working through it reminded me of some of the statistics problems I struggled with at Imperial, specifically during Intelligent Data and Probabilistic Inference. It reinforced that in spite of scoring 90 for that course I’m still not confident I knew what I was doing then (or now).
Suppose you have a symmetric triangular distribution of unknown width and mean. Given that the distribution has yielded three independent samples of 2, 4 and 5, what is the expectation of the mean?
The triangular distribution can be used as an estimate for something which is known to be bounded both above and below, while also taking into account a value known to be the most probable. Some argue that it is the simplest distribution satisfying these properties (though one could argue that a cosine or some kind of truncated normal might be more applicable).
The instinctive answer I have is simply the mean of the samples, or $\frac{2+4+5}{3} = \frac{11}{3} \approx 3.67$, though I was suspicious as probability and statistics often yield non-intuitive results.
The distribution is named as such because it has a triangular probability density function; because of the laws of probability (the area under the function must be 1), specifying the minimum, maximum and mode is enough to uniquely identify it. Supposing we have a minimum $a$, a maximum $b$ and a mode $c$, this yields a pdf of:

$$f(x) = \begin{cases} \frac{2(x-a)}{(b-a)(c-a)} & a \leq x \leq c \\ \frac{2(b-x)}{(b-a)(b-c)} & c < x \leq b \\ 0 & \textrm{otherwise} \end{cases}$$
We are only dealing with a symmetric case, so we can set $c = \frac{a+b}{2}$, which cleans things up a little:

$$f(x) = \begin{cases} \frac{4(x-a)}{(b-a)^2} & a \leq x \leq \frac{a+b}{2} \\ \frac{4(b-x)}{(b-a)^2} & \frac{a+b}{2} < x \leq b \\ 0 & \textrm{otherwise} \end{cases}$$
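As a quick sanity check on this density, here is a minimal sketch (in Python, with function names of my own choosing) confirming that it behaves like a pdf:

```python
# Symmetric triangular pdf with minimum a and maximum b; the mode (and
# mean) sits at the midpoint c = (a + b) / 2.
def triangular_pdf(x, a, b):
    c = (a + b) / 2.0
    if x < a or x > b:
        return 0.0
    if x <= c:
        return 4.0 * (x - a) / (b - a) ** 2
    return 4.0 * (b - x) / (b - a) ** 2

# The density should peak at the mode with height 2 / (b - a), and a
# crude Riemann sum over [a, b] should come out close to 1.
peak = triangular_pdf(3.0, 1.0, 5.0)  # 2 / (5 - 1) = 0.5
area = sum(triangular_pdf(1.0 + i * 0.001, 1.0, 5.0) * 0.001
           for i in range(4000))
```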
Based on our observations that we have three samples of 2, 4 and 5, we can construct the likelihood that a given triangular distribution gave rise to this result. While the probability that sampling a continuous distribution resolves to any precise value is zero, we can still compare candidate distributions in relative terms using their density functions. We can write this as

$$\mathcal{L}(a, b) = f(2) \cdot f(4) \cdot f(5)$$
Expanding this term will depend on where exactly 2, 4 and 5 fall in our triangle. Let’s work out the most promising case (where 2 falls on the left of $c$ while 4 and 5 fall on its right); the rest are left as an exercise to the reader. In this case, we have

$$\mathcal{L}(a, b) = \frac{4(2-a)}{(b-a)^2} \cdot \frac{4(b-4)}{(b-a)^2} \cdot \frac{4(b-5)}{(b-a)^2} = \frac{64(2-a)(b-4)(b-5)}{(b-a)^6}$$
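To guard against algebra slips, we can check this expanded form against a direct product of densities at an arbitrary point; a small Python sketch (names mine):

```python
# Symmetric triangular pdf, as before.
def pdf(x, a, b):
    c = (a + b) / 2.0
    if not a <= x <= b:
        return 0.0
    side = (x - a) if x <= c else (b - x)
    return 4.0 * side / (b - a) ** 2

# Raw product of densities at the three samples.
def likelihood(a, b):
    return pdf(2, a, b) * pdf(4, a, b) * pdf(5, a, b)

# The expanded closed form for the case where 2 lies left of the mode
# and 4, 5 lie right of it.
def closed_form(a, b):
    return 64.0 * (2 - a) * (b - 4) * (b - 5) / (b - a) ** 6

# (a, b) = (1, 6) puts the mode at 3.5, matching this case's assumptions.
direct, expanded = likelihood(1.0, 6.0), closed_form(1.0, 6.0)
```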
At this point, we notice the original question needs a bit more specification: we aren’t told what the distribution of possible values of $a$ and $b$ is. One way of getting around this is to pick a uniform distribution; however, such a distribution isn’t well-defined over the whole real line. For now, we can simply find the maximum likelihood estimate for $a$ and $b$.
Alternatively, if we gave prior probability distributions for $a$ and $b$, we could use the samples as information to derive a posterior distribution. Usually we would pick a conjugate prior, whose functional form is preserved when we update on the samples; I didn’t find one for the triangular distribution, though.
If we want to find the most likely distribution, we seek an extreme point of the likelihood; this can be done by taking partial derivatives (and the expression above actually lines up quite well with the quotient rule). There is a fairly standard ‘trick’ for handling these, though: since the logarithm is a strictly increasing function, we can compute the log-likelihood instead, and the maximum of the logarithm will also be the maximum of the original function. Using the laws of logarithms, we get something a lot more tractable:

$$\log \mathcal{L}(a, b) = \log 64 + \log(2-a) + \log(b-4) + \log(b-5) - 6\log(b-a)$$
Computing the partial derivatives is then straightforward:

$$\frac{\partial \log \mathcal{L}}{\partial a} = -\frac{1}{2-a} + \frac{6}{b-a} \qquad \frac{\partial \log \mathcal{L}}{\partial b} = \frac{1}{b-4} + \frac{1}{b-5} - \frac{6}{b-a}$$

We then set them to zero, and solve the resulting equations simultaneously; this yields

$$(a, b) = \left( \frac{20+\sqrt{10}}{15}, \frac{16-\sqrt{10}}{3} \right) \textrm{ or } \left( \frac{20-\sqrt{10}}{15}, \frac{16+\sqrt{10}}{3} \right)$$
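It’s easy to mis-solve a simultaneous system like this, so here is a small numerical check (Python) that both partial derivatives of the log-likelihood really do vanish at the pair of roots $a = \frac{20-\sqrt{10}}{15}$, $b = \frac{16+\sqrt{10}}{3}$:

```python
import math

# The candidate stationary point.
a = (20 - math.sqrt(10)) / 15
b = (16 + math.sqrt(10)) / 3

# The two partial derivatives of the log-likelihood with respect to a
# and b; both should be (numerically) zero at a stationary point.
dl_da = -1.0 / (2 - a) + 6.0 / (b - a)
dl_db = 1.0 / (b - 4) + 1.0 / (b - 5) - 6.0 / (b - a)
```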
We’ll want the second pair of solutions; the first actually has $b = \frac{16-\sqrt{10}}{3} \approx 4.28$, which is no good (we need $b \geq 5$, or the sample of 5 could never have been drawn). Interestingly, the mean of that triangular distribution is then $\frac{a+b}{2} = \frac{50+2\sqrt{10}}{15} \approx 3.75$, which is not quite the sample mean of $\frac{11}{3} \approx 3.67$.
That said, the two are close in likelihood terms: the log-likelihood we get with these values of $a$ and $b$ is about $-4.740$, while if we look at the family of distributions with mean $\frac{11}{3}$ (that is, with $b = \frac{22}{3} - a$), the best we can achieve is about $-4.747$.
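These figures can be reproduced numerically; a crude Python sketch (the grid step is arbitrary, and the roots assumed are the maximum-likelihood pair derived above):

```python
import math

# Log-likelihood for the symmetric triangular case considered above.
def log_likelihood(a, b):
    return (math.log(64) + math.log(2 - a) + math.log(b - 4)
            + math.log(b - 5) - 6 * math.log(b - a))

# Unconstrained maximum, at the closed-form roots.
a_mle = (20 - math.sqrt(10)) / 15
b_mle = (16 + math.sqrt(10)) / 3
best_unconstrained = log_likelihood(a_mle, b_mle)  # about -4.740

# Best log-likelihood over the family with mean 11/3, i.e. b = 22/3 - a;
# a crude grid search over 0 < a < 2 (we need a < 2 for the density at
# the sample 2 to be positive).
best_constrained = max(
    log_likelihood(i / 10000.0, 22 / 3 - i / 10000.0)
    for i in range(1, 20000)
)  # about -4.747
```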