You might have tried to explain why perspective couldn't explain it but you didn't succeed.
May I try?
1. Let this represent a cloud bank. The upper half (red with yellow squares) is cloud. The lower half (yellow with red squares) is the "gap" below the clouds. The gap is 500' from surface to ceiling. The clouds are an additional 500' above the gap. Together, they are 1000'.

2. Here is a view of that cloud+gap from a height of 100 feet (bridge of a ship) at a distance of 100 miles through a 50mm focal length lens. On the left is a flat earth view. On the right is a globe earth view. On a flat earth, we can't resolve the cloud from the gap, but we can see something. On a globe earth, we see nothing yet. But this isn't about whether or not something can be seen. This is about perspective. As far as perspective is concerned, 500 feet of cloud (or 500 feet of clearance below the cloud) subtends about 0.054 of a degree, or just over 3 arcminutes. A good eye should be able to resolve an angular gap of 3 arcminutes below the cloud, but this display/model isn't able to. Still, let's pretend that we're approximating the limits of eye resolution, since that is a companion piece of the perspective argument for a flat earth:
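As a rough check on that figure, here is a minimal Python sketch of the angle calculation. The function name `angular_size_arcmin` is just my own label for illustration, and the span is assumed to be viewed face-on from far away.

```python
import math

FEET_PER_MILE = 5280

def angular_size_arcmin(height_ft, distance_miles):
    """Angle, in arcminutes, subtended by a vertical span at a given distance."""
    distance_ft = distance_miles * FEET_PER_MILE
    angle_rad = math.atan(height_ft / distance_ft)
    return math.degrees(angle_rad) * 60

arcmin = angular_size_arcmin(500, 100)
print(f"{arcmin / 60:.3f} degrees, {arcmin:.2f} arcminutes")
```

For 500 feet at 100 miles this comes out at roughly 0.054 of a degree, a little over 3 arcminutes, and it is the same number whether the surface in between is flat or curved.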

3. Let's try to overcome the resolution limits of the shorter focal length by zooming in. This is a common flat earth argument for showing how "zooming" in can restore something to view. At 200mm "zoom" in the flat scenario we can now barely make out the lower 500' yellow segment representing the gap. Did zooming make the cloud bank appear to "rise up?" Or did it just help resolve the upper 500' from the lower 500'?
Zooming didn't work on the globe side. The entire 1000' of cloud+gap is still not visible. On the flat side, zooming didn't increase resolution of the upper 3 arcminutes of clouds more than the lower 3 arcminutes of gap. Both are still 3 arcminutes. Zooming just helped us begin to distinguish the two from each other. In other words, the clouds didn't come into view before the gap did. What had been one undifferentiated 6 arcminutes separated into two distinguishable bands of 3 arcminutes each. Zooming didn't change perspective. Perspective remained the same.
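To put numbers on the zoom step, here is a small sketch assuming an ideal pinhole-style projection, which is my simplification and not necessarily how the model that produced the images works. The helper name `image_height_mm` is hypothetical.

```python
FEET_PER_MILE = 5280

def image_height_mm(focal_mm, bottom_ft, top_ft, eye_ft, distance_miles):
    """Height on the sensor of a vertical band, assuming an ideal pinhole projection."""
    distance_ft = distance_miles * FEET_PER_MILE
    # Projected position is the focal length times the slope of the sight line.
    y_top = focal_mm * (top_ft - eye_ft) / distance_ft
    y_bottom = focal_mm * (bottom_ft - eye_ft) / distance_ft
    return y_top - y_bottom

for focal in (50, 200):
    gap = image_height_mm(focal, 0, 500, 100, 100)
    cloud = image_height_mm(focal, 500, 1000, 100, 100)
    print(f"{focal}mm: gap {gap:.4f} mm, cloud {cloud:.4f} mm, ratio {cloud / gap:.3f}")
```

Going from 50mm to 200mm makes both bands four times taller on the sensor, but the cloud-to-gap ratio stays at 1. Zoom scales everything together; it doesn't favor the upper band.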

4. Let's leave the focal length alone now and begin reducing distance. Here is the 200mm view from 75 miles (always keeping our observation height at 100 feet). The globe view still can't see the clouds, but now the flat earth view can clearly distinguish cloud from gap. Both are 500' in height, and they have increased in vertical angular height at the same rate. At 75 miles, 500' now subtends a little over 4 minutes of arc. That's perspective at work. Angular height is inversely proportional to distance: as distance decreases, the apparent (angular) height increases. But perspective isn't making the clouds "grow" more rapidly than the gap. 500' of gap isn't resolving more slowly than 500' of cloud. Perspective is operating on both at the same rate.
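Here is a sketch of that "same rate" claim for the flat case, using straight sight lines from a 100 foot eye height to the bottom and top of each band. The function name `band_angle_arcmin` is mine, and the distances are the ones used in this walkthrough.

```python
import math

FEET_PER_MILE = 5280

def band_angle_arcmin(bottom_ft, top_ft, eye_ft, distance_miles):
    """Angular height of a band between two heights, seen over a flat surface."""
    distance_ft = distance_miles * FEET_PER_MILE
    top = math.atan2(top_ft - eye_ft, distance_ft)
    bottom = math.atan2(bottom_ft - eye_ft, distance_ft)
    return math.degrees(top - bottom) * 60

for miles in (100, 75, 50, 40, 30, 14):
    gap = band_angle_arcmin(0, 500, 100, miles)
    cloud = band_angle_arcmin(500, 1000, 100, miles)
    print(f"{miles:>3} mi: gap {gap:5.2f}', cloud {cloud:5.2f}', ratio {cloud / gap:.4f}")
```

At every distance the two bands come out essentially equal, and halving the distance doubles both. Nothing in straight-line perspective lets one band lag behind the other.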

5. At 50 miles, the clouds have finally come into view on a globe earth. Are we seeing only the tops of the clouds or, like previously in the flat earth scenario, is the 500' of clouds merely unresolved from the lower 500' of gap? It must be the former, right? Because we're zoomed in to 200mm, so 500' should subtend the same degree of arc as in the clearly resolved flat earth scenario. At 50 miles, 500' now accounts for about 6.5 arcminutes. That's true whether we're on a flat earth or a globe. It's a function of distance, not surface topology. So on a globe, we MUST only be seeing the top of the 500' clouds. The lower 500' gap is still hidden from view.
But this didn't happen as the cloud + gap came into view on a flat earth. Perspective increased the size of the clouds and the gap at the same time. Zoom helped us resolve the difference earlier than in the globe scenario, but it didn't bring the clouds into view before the gap. Neither did perspective. Already, we can see something different is happening on a globe than on a flat earth in the way this cloud bank is coming into view.

6. This is a 40 mile view. On the flat earth, the cloud bank and the gap beneath it both just keep getting bigger. That's how perspective works. The angle subtended by a 500' vertical height is now over 8 arcminutes. But even so, on a globe at that distance, though the full 500' (8+ arcminutes) of cloud is visible, only a tiny sliver of the gap below the clouds is seen. Maybe about 20% (100 of the 500 feet of gap). If perspective were responsible for this disparate revelation of gap compared to cloud, then we should have seen the clouds resolve earlier than the gap in the flat earth scenario. But they didn't. They resolved together, equally, as you would expect with perspective. So something other than perspective (or resolution) must be responsible for the difference in how the gap below the clouds is revealed.

7. At 30 miles, we clearly see that there is a gap under the clouds, but it's still narrower than the band of clouds above. But we know that we set the model up so that they were the same vertical height of 500', which at this distance makes up almost 11 arcminutes. We see the full 11 arcminutes of cloud in both the globe and flat scenario. But we only see about 7.5 arcminutes of gap in the globe scenario. That's 3.5 minutes of arc difference between the cloud band and the gap band. (Gap Band. Ha!)
That delta never happens in the flat scenario. Perspective doesn't cause that. The gradual uncovering of the still-hidden part of the gap is what makes the upper cloud segment appear to rise, something which doesn't happen in a flat scenario and for which perspective is not responsible.
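For anyone who wants to check the globe side, here is a rough sketch of the hidden-height geometry. The mean radius of 3959 miles and the 7/6 "standard refraction" scaling are my assumptions, not something stated above, and `hidden_height_ft` is a name I made up for illustration.

```python
import math

FEET_PER_MILE = 5280
EARTH_RADIUS_MILES = 3959      # assumed mean radius
REFRACTION_FACTOR = 7 / 6      # assumed standard-refraction scaling of the radius

def hidden_height_ft(distance_miles, eye_ft, factor=REFRACTION_FACTOR):
    """Height hidden below the horizon at a given surface distance on a sphere."""
    radius = EARTH_RADIUS_MILES * factor
    eye_miles = eye_ft / FEET_PER_MILE
    # Surface distance from the observer to the horizon.
    horizon = radius * math.acos(radius / (radius + eye_miles))
    if distance_miles <= horizon:
        return 0.0
    angle_past_horizon = (distance_miles - horizon) / radius  # radians
    return radius * (1 / math.cos(angle_past_horizon) - 1) * FEET_PER_MILE

for miles in (100, 75, 50, 40, 30, 14):
    hidden = hidden_height_ft(miles, 100)
    gap_seen = min(500.0, max(0.0, 500 - hidden))
    cloud_seen = min(500.0, max(0.0, 1000 - hidden))
    print(f"{miles:>3} mi: hidden {hidden:7.1f} ft, gap seen {gap_seen:5.1f} ft, cloud seen {cloud_seen:5.1f} ft")
```

Under those assumptions, the whole 1000' is hidden at 100 and 75 miles, only part of the cloud shows at 50, roughly a fifth of the gap shows at 40, roughly two thirds at 30, and essentially all of it by 14. Passing `factor=1` drops the refraction allowance and hides somewhat more at each distance.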

8. Finally, we'll jump ahead and stop at a distance of about 14 miles. At this range, 500' takes up over 23 arcminutes and the full span of the gap below the clouds is fully visible on a globe. Perspective can now 'expand' the gap area inversely to the distance just as it's been doing on the flat earth side since the start.
What perspective is incapable of doing is making the gap come into view, and increase in size, more slowly than or after the upper clouds.

Note, too, that a distinguishing feature between flat and convex surface models is the "dip" in the objective from eye level.
In Rowbotham's Earth Not a Globe, he adds an extra feature to "help" perspective explain this phenomenon. He says surface irregularities, such as waves, at the horizon account for the difference between how higher and lower objects are revealed. The model above didn't account for surface irregularities. The surface was smooth.
But you can't have your perspective cake and forget about it too. Perspective works on the waves as well, causing them to diminish in angular height with distance. To account for the disparity between the revealing of the upper clouds and the lower gap, the waves would have to be either very large or very close to the observer: the smaller the wave, the closer it must be to obscure the same amount. Not only that, but they'd have to be ever-present, yet the "sinking ship" phenomenon happens regardless of sea state or surface smoothness.
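A quick sketch of the wave requirement on a flat surface, using nothing but a straight sight line from the eye over a wave crest. The function name and the example numbers (400 feet hidden at 40 miles, taken from the 40 mile step, seen from 100 feet) are my own choices for illustration.

```python
def required_wave_height_ft(eye_ft, hidden_ft, target_miles, wave_miles):
    """Crest height needed on a flat surface to hide everything below hidden_ft at the target."""
    # Straight sight line from the eye that grazes the crest and meets the target at hidden_ft.
    return eye_ft + (hidden_ft - eye_ft) * wave_miles / target_miles

for wave_miles in (1, 5, 10, 20):
    crest = required_wave_height_ft(100, 400, 40, wave_miles)
    print(f"wave at {wave_miles:>2} mi needs a crest of {crest:.1f} ft")
```

From a 100 foot eye height, every one of these crests has to stand taller than the observer, even when the wave is only a mile away. Ordinary swell seen from a ship's bridge doesn't come close.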
Another oft-claimed addition to the flat earth scenario is the atmo(layer) effects at low grazing angles. The atmo- is dense, and looking horizontally across a distance near-parallel to the surface is to look through ever denser amounts of particulates, moisture and other obscuring, light-extinction factors. We encounter haze, mirage, shimmering, diffusion...aspects which make distinguishing things more difficult. This is most pronounced close to the surface (usually). So "convergence zone" -- that band of air closest to the surface of earth -- becomes another possible explanation for why lower elements of an object or lower objects are lost to sight before higher ones. And that is true...sometimes. Not always. Like waves, atmospheric/atmolayer surface conditions can mask things from sight that just getting a little steeper angle or elevation can restore to sight.
But if that's a required component, then it needs to be consistent. The atmo- is anything but consistent. It's in constant flux. Yet even under perfectly clear and stable air conditions, the above phenomenon is observed.
So that's why citing perspective as the reason for the "sinking ship effect" is grossly flawed. Perspective doesn't work in that way. And trying to apply ad hoc rationalizations (waves, eye resolution, convergence zone) to salvage it only reveals its flaws.
But a curving surface does work as an explanation. (So does light curving in the opposite direction away from a flat earth surface, which is why I remained intrigued by Electromagnetic Accelerator in a flat earth model while disparaging Perspective as an explanation for the "sinking ship" phenomenon.)