From the processor's point of view, every address value addresses one byte. When you connect a 16-bit memory device to the processor, each 16-bit cell in that device holds 2 bytes. LA31 is the least significant address line and distinguishes between the two bytes of one 16-bit word. That is unnecessary when the memory device has a 16-bit data bus and can transfer all 16 bits in one bus transaction, so LA31 is not connected in 16-bit mode. Similarly, for a 32-bit device the two least significant address lines (LA30 and LA31) should not be connected.
By "16-bit" and "32-bit" I mean the data port size of the connected memory device.
For a functional description of the P5040 processor, please see the P5040 Reference Manual. It is available for download from the P5040 product page, under the "Documentation" tab:
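The address arithmetic described above can be sketched in C. This is only an illustration of the rule, not code from any NXP package; the function names are made up for this example, and it assumes the big-endian line numbering in which LA31 is the least significant address line.

```c
#include <assert.h>
#include <stdint.h>

/* Number of low-order byte-address bits that only select a byte inside one
 * device word: 0 for an 8-bit port, 1 for 16-bit, 2 for 32-bit. */
static unsigned unused_low_address_bits(unsigned port_bits)
{
    assert(port_bits == 8 || port_bits == 16 || port_bits == 32);
    return (port_bits == 8) ? 0u : (port_bits == 16) ? 1u : 2u;
}

/* Word index seen by the memory device for a given processor byte address. */
static uint32_t device_word_index(uint32_t byte_addr, unsigned port_bits)
{
    return byte_addr >> unused_low_address_bits(port_bits);
}

/* Number of the least significant LA line that is wired to the device.
 * Assumes LA31 is the least significant address line. */
static unsigned lowest_connected_la(unsigned port_bits)
{
    return 31u - unused_low_address_bits(port_bits);
}
```

For a 16-bit device, byte addresses 0x1000 and 0x1001 both map to device word 0x800, and the lowest line wired to the device is LA30.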
QorIQ® P5040|NXP
On the P5040DS board, the upper address lines LA7..LA5 are used to create "virtual banks": these lines pass through XOR gates controlled by the CPLD, which makes it possible to invert the upper 3 address lines if necessary.
Yes, the data bus numbering is also reversed: LAD0 is the most significant data bit. The data lanes should be connected as described in the P5040 Reference Manual, Table 13-2:
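As a rough sketch of what the XOR gates do, the three upper address bits can be modeled as a 3-bit value that is XORed with an inversion mask. The function name, the packing of LA5..LA7 into one byte, and the mask itself are assumptions made for illustration; the actual CPLD register layout is documented with the board and is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* la_5_7:      the three upper address bits as driven by the processor,
 *              packed into bits 2..0 (LA5 in bit 2, LA7 in bit 0).
 * invert_mask: hypothetical CPLD control value, one bit per XOR gate;
 *              a set bit inverts the corresponding address line.
 * Returns the three bits as seen by the memory device. */
static uint8_t bank_select(uint8_t la_5_7, uint8_t invert_mask)
{
    assert(la_5_7 <= 0x7u && invert_mask <= 0x7u);
    return (uint8_t)((la_5_7 ^ invert_mask) & 0x7u);
}
```

With a mask of 0 the address passes through unchanged; with a mask of 0x7 all three lines are inverted, so the processor's bank 0 lands in the device's bank 7.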
LAD[0:31]
Multiplexed address/data bus. For configuration of a port size in BRn[PS] as 32 bits, all of LAD[0:31]
must be connected to the external RAM data bus, with LAD[0:7] occupying the most significant byte lane
(at address offset 0). For a port size of 16 bits, LAD[0:7] connect to the most-significant byte lane (at
address offset 0), while LAD[8:15] connect to the least-significant byte lane (at address offset 1);
LAD[16:31] are unused for 16-bit port sizes. For a port size of 8 bits, only LAD[0:7] are connected to the
external RAM.
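The byte-lane rule quoted above can be written as a small C helper. This is an illustrative sketch with made-up function names. The table excerpt states the lanes for the 8-bit and 16-bit cases and for offset 0 of the 32-bit case; the lanes for offsets 1 to 3 of a 32-bit port are my assumption, obtained by continuing the same big-endian pattern.

```c
#include <assert.h>

/* First LAD index of the byte lane that carries the byte at byte_offset
 * within one bus word. Offset 0 is the most significant byte lane. */
static unsigned lad_lane_first(unsigned port_bits, unsigned byte_offset)
{
    assert(port_bits == 8 || port_bits == 16 || port_bits == 32);
    assert(byte_offset < port_bits / 8u);
    return 8u * byte_offset;
}

/* Last LAD index of the same byte lane (each lane is 8 bits wide). */
static unsigned lad_lane_last(unsigned port_bits, unsigned byte_offset)
{
    return lad_lane_first(port_bits, byte_offset) + 7u;
}
```

For a 16-bit port, offset 1 gives LAD[8:15], which matches the table; for a 32-bit port, offset 3 gives LAD[24:31] under the assumption stated above.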