A model consists of triangles. To define a triangle, you need 3 vertices (points). But if you store 3 full vertices for every triangle, a lot of them are redundant, because a vertex can belong to more than one triangle. So instead you keep a vertex list containing each vertex once, and you store your triangles as indices. An index tells you where in the vertex list to find a certain vertex for a certain triangle.
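To make that concrete, here is a minimal sketch of a quad built from two triangles that share two vertices. The `Vertex` and `Mesh` types and the `makeQuad` and `triangleCorners` helpers are made up for illustration and are not taken from any particular engine.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only.
struct Vertex {
    float x, y, z;
};

struct Mesh {
    std::vector<Vertex> vertices;        // each unique point is stored once
    std::vector<std::uint16_t> indices;  // three entries per triangle
};

// Build a unit quad: 4 unique vertices shared by 2 triangles.
Mesh makeQuad() {
    Mesh m;
    m.vertices = {
        {0.0f, 0.0f, 0.0f},  // 0: bottom-left
        {1.0f, 0.0f, 0.0f},  // 1: bottom-right
        {1.0f, 1.0f, 0.0f},  // 2: top-right
        {0.0f, 1.0f, 0.0f},  // 3: top-left
    };
    // Vertices 0 and 2 are used by both triangles.
    m.indices = {0, 1, 2, 0, 2, 3};
    return m;
}

// Look up the three corners of triangle number `tri` through the index list.
std::array<Vertex, 3> triangleCorners(const Mesh& m, std::size_t tri) {
    assert(3 * tri + 2 < m.indices.size());
    return {m.vertices[m.indices[3 * tri + 0]],
            m.vertices[m.indices[3 * tri + 1]],
            m.vertices[m.indices[3 * tri + 2]]};
}
```

Without indices the quad would need 6 stored vertices; with them it needs 4 vertices plus 6 small integers, and the saving grows as more triangles share each vertex.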
Adding to that: with 16 bits you have 2^16 = 65,536 possible values, a range of 0 to 65,535. With 32-bit indices you have 2^32 = 4,294,967,296 values, a range of 0 to 4,294,967,295. So with a 32-bit index list you can address a lot more vertices in one mesh buffer.
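The same arithmetic as a rough C++ sketch. The helper names `fitsIn16Bit` and `indexBytes` are hypothetical, and the check assumes every index value up to the type's maximum is usable, which may not hold for every graphics API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Largest vertex index each index type can hold.
constexpr std::uint32_t kMax16 = std::numeric_limits<std::uint16_t>::max();  // 65,535
constexpr std::uint32_t kMax32 = std::numeric_limits<std::uint32_t>::max();  // 4,294,967,295

// True if a buffer with `vertexCount` vertices can be addressed by 16-bit
// indices, i.e. every vertex has an index in the range 0..65,535.
bool fitsIn16Bit(std::size_t vertexCount) {
    return vertexCount <= static_cast<std::size_t>(kMax16) + 1;
}

// Memory used by the index list alone: 3 indices per triangle.
std::size_t indexBytes(std::size_t triangleCount, bool use32Bit) {
    const std::size_t indexSize =
        use32Bit ? sizeof(std::uint32_t) : sizeof(std::uint16_t);
    return triangleCount * 3 * indexSize;
}
```

The trade-off is that a 32-bit index list takes twice the memory of a 16-bit one for the same number of triangles, so 16-bit indices are still the sensible choice when the vertex count fits.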
My computer is 64-bit, and 32-bit applications run just as fast on my 64-bit machine with a 64-bit OS as they do on a 32-bit machine with a 32-bit OS. So the claimed "speed improvement" is not true.