Did I figure out z-translation of 2D coordinates correctly?

william_ · Oct 16, 2024

I was working out for myself how a screen's x- and y-coordinates are adjusted to produce the illusion of a 3D shape in 3D graphics.

The first page of the attached PDF shows how I went about this, and I'll explain it further here. Imagine a rectangular screen directly in front of us 'retreating' from view towards the centre-point of focus (as the upper figure shows): its corner vertices will each be drawn towards the screen's centre. At any given point there is an "original screen", which is always the size it would have at z = 0, and a "virtual screen", which is its shrunken size at a z-value greater than 0; in the upper figure on page 1, rectangle B is the "virtual screen" and A the "original screen".

This means that for every z-unit that the "virtual screen" retreats, its width decreases by a fixed number of x-units; I arbitrarily set this ratio to 1:1 in my workings here. So if the "original screen" has a width of 10 x-units and the "virtual screen" retreats by 1 z-unit, then the width of the "virtual screen" becomes 9 x-units. The lower figure on page 1 illustrates this, except that there the "virtual screen" has retreated by 2 z-units rather than 1.

I imagined that the "virtual screen" was split into 10 columns, and that it would always be split this way no matter how far it retreated. At z = 0 the "original screen" can fit only 10 columns; at z = 1 it can fit 10 + 1.1 columns, as there is now space between the edge of the "original screen" and the edge of the "virtual screen" for extra columns; at z = 4 it can fit 10 + 6.6; and so on. On that assumption, I set out to calculate how many such columns fit onto the "original screen" for each integer value of z and a given width of the "original screen". Equation 1 on the second page of the PDF shows the formula I used for this, and its graph is shown on page 3 ("n" being 10 in that particular instance). When I refer to "x-blocks" in the explanations attached to those equations, I mean the columns that the screen is split into.
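The PDF's equation 1 is not reproduced in this thread, so the following is only a sketch reconstructed from the verbal description above, and it may not match the actual equation 1. It assumes the virtual screen's width is the original width minus the z-depth times the shrink ratio, and that the virtual screen is always split into the same number of columns. The names `columns_fitting`, `original_width`, `n_columns` and `shrink_per_z` are made up for illustration.

```python
def columns_fitting(z, original_width=10.0, n_columns=10, shrink_per_z=1.0):
    """How many virtual-screen columns fit across the original screen.

    Assumption (not taken from the PDF): the virtual screen's width shrinks
    linearly, by `shrink_per_z` x-units for every z-unit it retreats.
    """
    virtual_width = original_width - shrink_per_z * z
    if virtual_width <= 0:
        raise ValueError("virtual screen has shrunk to zero width")
    # Each column of the virtual screen is this wide, in x-units.
    column_width = virtual_width / n_columns
    # Count how many columns of that width span the original screen.
    return original_width / column_width


if __name__ == "__main__":
    for z in (0, 1, 4):
        print(z, round(columns_fitting(z), 2))
```

With the defaults this gives about 11.1 columns at z = 1 and about 16.7 at z = 4, in line with the "10 + 1.1" and "10 + 6.6" figures quoted above.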

Equation 2 on page 2 is equation 1 re-arranged so that z-depth is a function of "x-blocks" or columns.
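As a hedged illustration of that kind of re-arrangement (again, the PDF's equation 2 is not shown in this thread, so this may differ from it): if the column count is assumed to be `original_width * n_columns / (original_width - shrink_per_z * z)`, solving for z gives the function below. All names are hypothetical.

```python
def depth_for_columns(columns, original_width=10.0, n_columns=10, shrink_per_z=1.0):
    """z-depth at which `columns` virtual-screen columns fit the original screen.

    Assumption (not taken from the PDF): this inverts
        columns = original_width * n_columns / (original_width - shrink_per_z * z)
    """
    if columns < n_columns:
        raise ValueError("fewer columns than at z = 0 would need a negative depth")
    return original_width * (1.0 - n_columns / columns) / shrink_per_z


if __name__ == "__main__":
    # 100/9 columns (about 11.1) should correspond to a depth of 1 z-unit.
    print(round(depth_for_columns(100 / 9), 6))
```

Under this assumed model, the depth approaches `original_width / shrink_per_z` as the column count grows without bound, but never reaches it.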

I then wanted to see how many columns could fit on an "original screen" 1920 pixels wide if each column were only 1 pixel wide when z is at its maximum value. Equation 3 on page 2 tells me that the answer is 920, with a maximum z-depth of 480 units, assuming as above that 1 z-unit equals 1 x-unit. [As an aside, I didn't exactly understand what the falling z-values of the inverted parabola for equation 3 represented...]

The graphs on page 3 for the first three equations made sense to me, and it seemed that equation 1 gave the correct relationship between x-adjustment and z-depth: i.e., x-coordinates shift quickly at first as z increases, but the shifting slows dramatically as z increases further.

But I couldn't see exactly how what I'd worked out so far translated to single pixels on a screen: i.e., I'd made a model for a "virtual screen" split into "n" columns, but couldn't see how that could reduce to describing any given pixel on a 1080x1920 screen. (I'm not considering the y-axis here, for the sake of simplicity; that will be trivial once I've worked out the x-z relationship.)

So I just assumed that equation 1 was the correct equation for how a single pixel's x-coordinate is adjusted with increasing z-value, and guessed that its two variables should be the pixel's original x-coordinate and the z-value. The z-value would range from 0 to infinity, or rather from 0 to whatever large value it takes when an x-coordinate's original position is -920 (the edge of the screen) and its final position is 0. I thought that a given pixel's original x-coordinate should be relative to a screen with a width of 1920 pixels, with 0 at the centre, so its minimum value would be -920 and its maximum 920. Plotting equation 1 again with a constant slider for "n" ranging from -920 to 920, I could see that I was probably right in my guess.

So, having explained all that, I'm asking whether I have got that right: is equation 4 on page 3, with "n" ranging from -920 to 920, the correct one to use to calculate a given pixel's adjusted x-value as its z-depth increases?

Also, could you explain how you would arrive at that conclusion from first principles, as I did with "x-blocks" and the figures on page 1? As I said, I had to make the leap of faith that equation 1 would work for single pixels as it does for "x-blocks", without being able to see exactly why that would be the correct thing to do.

berkeman · Oct 16, 2024

Welcome to PF.

It looks like your PDF did not attach successfully. Try using the "Attach files" link below the Edit window to try again?

william_ · Oct 18, 2024

Edit: as the 3 MB PDF was too large for the post, I've shared it here as JPGs instead. So wherever the OP says "first page of the PDF", etc., read "first image attached below", etc.; the attached images appear below, in order.

william_ · Oct 18, 2024

berkeman said:

Welcome to PF.

It looks like your PDF did not attach successfully. Try using the "Attach files" link below the Edit window to try again?

Worked around it. I would be glad of any constructive replies to the question itself.

Filip Larsen · Oct 19, 2024

You may want to check your work yourself by searching for terms like "pinhole camera" and "perspective transformation", and then return here with specific questions if you find something that doesn't quite match.

For a practical approach that is very commonly used in 3D computer graphics you may also want to read up on how the 4x4 homogeneous transformation matrix is a simple building block that can be used to numerically model almost all coordinate changes from general 3D points in some world frame onto screen coordinates if needed.
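A minimal sketch of that idea in plain Python, assuming a pinhole-style setup in which the eye sits a distance `d` in front of the screen plane z = 0 and depth z increases away from the viewer. The value of `d`, the function names and the sample coordinates are illustrative assumptions and do not come from the thread. The matrix places z/d + 1 in the fourth (w) component, and dividing by w yields the familiar x' = x * d / (d + z).

```python
def perspective_matrix(d):
    """4x4 homogeneous matrix for an eye at distance d in front of the z = 0 plane."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0 / d, 1.0],  # w = z/d + 1
    ]


def project(point, d):
    """Project a 3D point (x, y, z) onto screen coordinates (x', y')."""
    x, y, z = point
    v = (x, y, z, 1.0)  # homogeneous coordinates
    m = perspective_matrix(d)
    xh, yh, zh, w = (sum(m[r][c] * v[c] for c in range(4)) for r in range(4))
    if w <= 0:
        raise ValueError("point is at or behind the eye")
    # The perspective divide turns homogeneous coordinates into screen ones.
    return xh / w, yh / w


if __name__ == "__main__":
    d = 1000.0  # assumed viewing distance, in the same units as x and z
    for z in (0.0, 500.0, 1000.0, 4000.0):
        print(z, project((960.0, 540.0, z), d))
```

In this model a point's on-screen x falls off as d / (d + z), so it moves quickly towards the centre at first and then more slowly as z grows, and it only reaches the centre in the limit of infinite z.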
