I finally coded the correct answer to my maximum value problem in relational algebra. There are plenty of answers out there (see stack overflow, codeblow, stack overflow again, …) but my brain didn’t want to accept that this solution worked. It helps me to visualize what’s happening.
Say we have this simple relation of letters and numbers:
If we want to identify the maximum value in the number column, we can start by identifying what numbers aren’t the maximum. A number isn’t the maximum if it is less than another number in the relation. In other words, we need to find the theta-join of our relation and itself for Number1 < Number2. Here is the cross-product of the relation and itself (with renamed attributes):
If we only take the rows where Number1 < Number2 (i.e. a theta-join), we will get the values 1, 2, and 3 (but not 4) for the Number1 attribute. These are all the values that cannot be the maximum:
This is where my brain started to object. What about the row A1A1? it insisted. If you subtract this relation from the cross-product relation, that row will still be there! And while that’s true, we can solve that by looking only at the Number1 values, not at the entire rows.
So to get the maximum, we just have to take a projection of Number1 from the theta-join relation above and subtract it from the projection of Number from the original relation. The result:
4, our maximum.