Here is another 2 cents worth of opinion. If I repeat anyone else's comments, sorry for the extras. (EDIT: comments refer to BLDC but are applicable to a 1/2 bridge or H-bridge)
Fets do share reasonably well if they share a common electrical connection point and a common heat spreader. They are positive temperature coefficient devices, i.e. the Rdson goes up with temperature, which means that parallel mosfets will reach a current equilibrium point that depends on the die temperature and the circuit impedance between the devices. That is why you sometimes see ballast resistors on parallel IGBTs, since a lot of them are negative coefficient devices where Vce decreases with temperature. I have seen newer IGBTs that are now positive coefficient on the Vce sat.
The package of a mosfet defines its maximum long term continuous current, provided you are able to keep the heatspreader at a temperature that keeps the die below its maximum allowed operating temperature. Tj = Pd * Rjc + Tc, where Tc is the case (heatspreader) temperature, since Rjc is the junction to case thermal resistance. (Not considering life expectancy, since I am sure someone will comment about that.) TO220 is typically around 75 Amps max, though I have seen higher with the newer generation TO220 packages. In terms of short term current ratings, you can look at the transient thermal impedance curves in the datasheet for single pulses, and look at the heat capacity of the heatsink that your fets are attached to. Realistically, for long term average current, I think that you are limited by the thermal resistance path from the die to wherever you are trying to remove the heat to. So if that is from the die to ambient, then for steady state it will be less than the 75A or whatever the package is rated to. The short term current is kinda hard to characterize since it is really dependent on the design. It is the short term current that you probably want to figure out for motor acceleration purposes. That is like the 1 min rating that the motor controller manufacturers sometimes give.
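To put numbers on the steady state relation, here is a small Python sketch. It takes the reference temperature to be the case (heatspreader) temperature, since Rjc is a junction to case figure. The function names and the example values (20 W, 0.5 C/W, 80 C) are made up for illustration and do not come from any particular datasheet.

```python
def junction_temp_c(p_diss_w, r_jc_c_per_w, t_case_c):
    """Steady state die temperature: Tj = Pd * Rjc + Tc."""
    return p_diss_w * r_jc_c_per_w + t_case_c


def max_dissipation_w(t_j_max_c, t_case_c, r_jc_c_per_w):
    """Same relation rearranged for the power that just reaches Tj max."""
    return (t_j_max_c - t_case_c) / r_jc_c_per_w


# Hypothetical example values, not taken from a real part.
print(junction_temp_c(20.0, 0.5, 80.0))     # die temperature in C
print(max_dissipation_w(150.0, 80.0, 0.5))  # allowed power in W
```

With these assumed numbers the die sits 10 C above the case, and the part could dissipate up to 140 W before hitting 150 C, as long as the case really can be held at 80 C.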
The mosfet spec sheets are not always that easy to use. The headline specs like typical Rdson, voltage, and current are given at the ideal temperature of 25C and are for the device only, not the device within your controller. So you can't say "I want a 300A controller, so I will use one TO220 mosfet rated at 300A to do it", for the reasons that I gave above. My method for figuring out how many mosfets you need is only a swag, since you don't always know what your thermal design will be; it depends on the controller package and the thermal insulating tape used to isolate the mosfets.
1) Identify the maximum current that you want to push through the motor
2) Identify the average continuous current that you want to run the motor at
3) Identify the maximum voltage that you want to operate at (you want some margin to prevent avalanching)
4) Take a swag at how fast you want to switch the device on and off (Ton, Toff mosfet time). Chinese controllers are on the order of 3 usec or so.
5) Identify the switching frequency that you want to PWM at.
6) Identify the thermal insulating tape that you will use (if using a non isolated package)
7) Take a swag at how hot you want the heatspreader that your mosfets attach to to get.
8) Choose some candidate mosfets that you think are good
Ok, so now that you have that, you can swag the power losses.
Switching losses for a swag can be estimated as Psw = Ids * Vds * (Ton+Toff) * Fsw / 6. This assumes that the devices are being hard switched, i.e. it is the device itself interrupting or sourcing the current. I am sure there are better ways to estimate the switching losses, but this is the one I use. (EDIT: Ids will be the motor current (phase for BLDC or armature for DC). Vds is the battery voltage.)
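As a quick sketch of that switching loss estimate in Python: the function name and the example operating point (100 A, 48 V, 16 kHz) are assumptions for illustration, and I am also assuming that Ton and Toff are 3 usec each.

```python
def switching_loss_w(i_ds_a, v_ds_v, t_on_s, t_off_s, f_sw_hz):
    """Hard switched swag: Psw = Ids * Vds * (Ton + Toff) * Fsw / 6."""
    return i_ds_a * v_ds_v * (t_on_s + t_off_s) * f_sw_hz / 6.0


# Hypothetical operating point: 100 A motor current, 48 V battery,
# 3 usec turn on and turn off times, 16 kHz PWM.
p_sw = switching_loss_w(100.0, 48.0, 3e-6, 3e-6, 16e3)
print(round(p_sw, 1))  # about 76.8 W for the switching bank
```

Note how the loss scales linearly with the switching times and with the PWM frequency, so slow edges at a high frequency get expensive quickly.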
Conduction losses, assuming a BLDC configuration with synchronous rectification (i.e. one low side bank of mosfets always turned on during the sector and a pair of mosfet banks switching in a half bridge configuration), will include the half bridge that is switching and the bank that is not switching. We will ignore the deadtime and the losses associated with it for the purpose of figuring out the number of mosfets.
Pcond total ~= Pd(half bridge) + Pd(sync). Pd half bridge can be estimated as the sum of the upper and lower Pd, which are duty cycle dependent. So Pd upper = D * Ids^2 * Rdson and Pd lower = (1-D) * Ids^2 * Rdson, where D is the duty cycle, Ids is the motor current, and Rdson is the mosfet drain source resistance at the maximum die temperature that you want to operate it at. I use 150C Tjmax as another swag, ignoring the life expectancy stuff that others get concerned about. So now you can estimate the power that your main heat spreader needs to get rid of to the ambient environment. When calculating the power leaving the heatspreader for the controller, we assume that the commutation frequency is high enough that the power is not localized to one bank of mosfets during normal operation. In the worst case of motor stall, one sector could be energized for a long period of time, and that is what I normally use for design purposes.
Ptotal ~= Ids^2 * Rdson * 2 + Ids*Vds*(Ton+Toff)*Fsw/6 for an H-bridge with synchronous rectification in single quadrant operation.
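Here is a rough Python sketch of that loss budget, splitting out the upper, lower, and sync bank conduction terms plus the switching term. The function name and the example values are made up for illustration; in particular the 2 milliohm figure is an assumed hot Rdson for a whole bank, not a datasheet number.

```python
def bridge_losses_w(i_ds_a, r_ds_on_ohm, duty, v_ds_v, t_on_s, t_off_s, f_sw_hz):
    """Return (upper, lower, sync, switching, total) losses in watts."""
    i2r = i_ds_a ** 2 * r_ds_on_ohm
    upper = duty * i2r            # switching half bridge, upper bank
    lower = (1.0 - duty) * i2r    # switching half bridge, lower bank
    sync = i2r                    # bank that stays on for the whole sector
    switching = i_ds_a * v_ds_v * (t_on_s + t_off_s) * f_sw_hz / 6.0
    total = upper + lower + sync + switching
    return upper, lower, sync, switching, total


# Hypothetical example: 100 A, 2 milliohm hot bank resistance, 60% duty,
# 48 V battery, 3 usec edges, 16 kHz PWM.
upper, lower, sync, switching, total = bridge_losses_w(
    100.0, 0.002, 0.6, 48.0, 3e-6, 3e-6, 16e3
)
print(round(upper, 1), round(lower, 1), round(sync, 1),
      round(switching, 1), round(total, 1))
```

The upper and lower terms always add up to Ids^2 * Rdson regardless of duty cycle, which is where the factor of 2 in the total comes from once the sync bank is included.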
To figure out how many devices you need, you can now look at the worst case operating condition of 90% duty cycle on the bank of mosfets that is switching.
Pd max for bank will be ~= (Ids/n)^2*Rdson*0.9*n + Ids*Vds*(Ton+Toff)*Fsw/6.
n in this case is the number of parallel mosfets. For a single device in the bank, its Pd ~= (Ids/n)^2*Rdson + (Ids/n)*Vds*(Ton+Toff)*Fsw/6
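To see how the dissipation falls as devices are added, the bank and single device estimates can be tabulated against n, the number of parallel mosfets. This is only a sketch: it assumes perfectly equal current sharing, and the function names and example values (8 milliohm hot Rdson per device, 100 A, 48 V, 3 usec edges, 16 kHz) are hypothetical.

```python
def bank_loss_w(i_ds_a, n, r_ds_on_ohm, v_ds_v, t_on_s, t_off_s, f_sw_hz, duty=0.9):
    """Whole bank at worst case duty: (Ids/n)^2 * Rdson * D * n + Psw."""
    conduction = (i_ds_a / n) ** 2 * r_ds_on_ohm * duty * n
    switching = i_ds_a * v_ds_v * (t_on_s + t_off_s) * f_sw_hz / 6.0
    return conduction + switching


def device_loss_w(i_ds_a, n, r_ds_on_ohm, v_ds_v, t_on_s, t_off_s, f_sw_hz):
    """One device: (Ids/n)^2 * Rdson + (Ids/n) * Vds * (Ton+Toff) * Fsw / 6."""
    i_dev = i_ds_a / n  # assumes equal sharing between the parallel devices
    conduction = i_dev ** 2 * r_ds_on_ohm
    switching = i_dev * v_ds_v * (t_on_s + t_off_s) * f_sw_hz / 6.0
    return conduction + switching


# Hypothetical example values; Rdson is the per device value at 150C.
args = (0.008, 48.0, 3e-6, 3e-6, 16e3)
for n in range(1, 7):
    print(n,
          round(bank_loss_w(100.0, n, *args), 1),
          round(device_loss_w(100.0, n, *args), 1))
```

The conduction part per device drops with the square of n while the switching part only drops linearly, so at some point adding more fets mostly buys you thermal area and not much else.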
Now you can figure out how many devices you need.
1) use Rdson for Tj of 150C from the datasheet of your favorite mosfet
2) choose the maximum allowed heatspreader temperature that the fets are mounted to.
3) Calculate the maximum Pd for the device. Pdss ~= (Tj-Ths)/(Rjc + Rt + Rhs), where Tj is 150C, Ths is the maximum temperature you want your heat sink to get to, Rjc is the thermal resistance listed on the mosfet datasheet, Rt is the thermal resistance of your insulating tape, and Rhs is the thermal resistance to the back of your heat sink if you are bolting it to something else that has a lot of metal. So you can either rearrange the per device equation above to solve for n, or you can calculate the Pd for various numbers of parallel mosfets and choose a number whose power dissipation is less than the Pdss calculated from the thermal resistance. Later you will want to choose how to get rid of the heat from the heat spreader, based on your steady state continuous power dissipation requirement. That could be an attached finned heatsink or a larger mass of metal.
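The "try various numbers of parallel mosfets" approach is easy to sketch in Python. Everything here is illustrative: the function names, the thermal resistances (0.5, 1.0, and 0.25 C/W), the 80 C heat sink limit, and the electrical values are all assumptions, and equal current sharing is assumed as well.

```python
def allowed_device_loss_w(t_j_c, t_hs_c, r_jc, r_t, r_hs):
    """Pdss = (Tj - Ths) / (Rjc + Rt + Rhs), thermal resistances in C/W."""
    return (t_j_c - t_hs_c) / (r_jc + r_t + r_hs)


def device_loss_w(i_ds_a, n, r_ds_on_ohm, v_ds_v, t_on_s, t_off_s, f_sw_hz):
    """Per device loss, assuming the n devices share the current equally."""
    i_dev = i_ds_a / n
    return (i_dev ** 2 * r_ds_on_ohm
            + i_dev * v_ds_v * (t_on_s + t_off_s) * f_sw_hz / 6.0)


def min_parallel_devices(p_allowed_w, i_ds_a, r_ds_on_ohm, v_ds_v,
                         t_on_s, t_off_s, f_sw_hz, n_max=32):
    """Smallest n whose per device loss fits the thermal budget, else None."""
    for n in range(1, n_max + 1):
        loss = device_loss_w(i_ds_a, n, r_ds_on_ohm, v_ds_v,
                             t_on_s, t_off_s, f_sw_hz)
        if loss <= p_allowed_w:
            return n
    return None


# Hypothetical example: Tj 150C, heat sink limit 80C, Rjc 0.5, tape 1.0,
# heat sink path 0.25 C/W; 100 A, 8 milliohm hot Rdson, 48 V, 16 kHz.
p_allowed = allowed_device_loss_w(150.0, 80.0, 0.5, 1.0, 0.25)
n_needed = min_parallel_devices(p_allowed, 100.0, 0.008, 48.0, 3e-6, 3e-6, 16e3)
print(round(p_allowed, 1), n_needed)
```

With these assumed numbers each device is allowed about 40 W, and three devices in parallel is the first count that fits. I would treat that as a floor and add margin, since real devices never share perfectly.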
The short term current requirement will be driven by how much heat capacity the main heat sink (spreader) has, which comes down to how much metal there is in terms of volume. The heat capacity is like a big thermal capacitor. The temperature will rise in a similar fashion to the voltage on a capacitor being charged through a resistor. This is what gives you the ability to drive a higher current through your controller for a short time: if the thermal resistance stays the same, the maximum power that you can dissipate in a mosfet is set by the temperature differential between the mosfet die and the main heatspreader. If that main heat spreader takes some time to rise in temperature, then you can operate longer at higher currents. That is where those 1 minute ratings from the controller manufacturers come in. They have a certain amount of metal in their heat spreaders that allows them to run higher currents for a short time.
I think that I have laid out a fairly conservative method for calculating the number of mosfets you need. I have used it in the past and it has worked reasonably well as a starting point. Once you have real hardware, you can figure out the real ratings by monitoring the various temperatures.
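The capacitor analogy can be sketched as a single lumped thermal RC in Python. This is a simplification I am assuming for illustration (one thermal mass, one resistance to ambient, heat spreader starting at ambient), and the 0.5 kg of aluminum, 2 C/W, and 100 W values are made up. The specific heat of aluminum is taken as roughly 900 J/(kg*K).

```python
import math

ALUMINUM_J_PER_KG_K = 900.0  # approximate specific heat of aluminum


def spreader_rise_c(p_w, r_th_c_per_w, c_th_j_per_c, t_s):
    """Rise above ambient after t_s seconds: P * R * (1 - exp(-t / (R * C)))."""
    tau_s = r_th_c_per_w * c_th_j_per_c
    return p_w * r_th_c_per_w * (1.0 - math.exp(-t_s / tau_s))


def time_to_rise_s(p_w, r_th_c_per_w, c_th_j_per_c, rise_limit_c):
    """Seconds until the rise reaches rise_limit_c, or None if it never does."""
    final_rise_c = p_w * r_th_c_per_w
    if rise_limit_c >= final_rise_c:
        return None  # steady state stays at or below the limit
    tau_s = r_th_c_per_w * c_th_j_per_c
    return -tau_s * math.log(1.0 - rise_limit_c / final_rise_c)


# Hypothetical example: 0.5 kg aluminum spreader, 2 C/W to ambient, 100 W.
c_th = 0.5 * ALUMINUM_J_PER_KG_K
print(round(spreader_rise_c(100.0, 2.0, c_th, 60.0), 1))  # rise after 1 minute
print(round(time_to_rise_s(100.0, 2.0, c_th, 40.0), 1))   # seconds to a 40 C rise
```

In this example 100 W would eventually cook the spreader (200 C steady state rise), but after one minute it has only risen about 13 C, and it takes a bit over 200 seconds to rise 40 C. That is the sort of headroom a 1 minute rating leans on.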