Bah, I didn't want to do this but I'll try...
Do you remember linear algebra?

(I do and wish I didn't but that's beside the point)
So you have a matrix of units with weights and it looks like this:
Unit Weight
Militia 10%
LI 10%
MI 40%
HI 25%
X-bow 15%
Then you choose a rapid expander type personality and each weight in the matrix gets multiplied by the matching entry in:
(.5
1.5
2
0
1)
And then it gets normalized to 100%...
You further have an SC pretender, so that multiplies the matrix by:
(.5
.5
3
1
1)
And it gets normalized again...
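
To make the arithmetic concrete, here is a minimal sketch in Python of the two steps above, using the numbers from this post. The function name apply_operator and the dictionary layout are made up for illustration, not taken from the game. Each weight is multiplied by the matching entry of the operator (element by element, not a true matrix product), and the result is rescaled so it sums to 100%.

```python
def apply_operator(weights, operator):
    """Multiply each unit weight by its operator entry, then renormalize to 100%."""
    # Units missing from the operator are left unchanged (multiplier 1.0).
    scaled = {unit: w * operator.get(unit, 1.0) for unit, w in weights.items()}
    total = sum(scaled.values())
    if total == 0:
        raise ValueError("operator zeroed out every unit")
    return {unit: 100.0 * w / total for unit, w in scaled.items()}


# Base weights and operators copied from the post.
base = {"Militia": 10, "LI": 10, "MI": 40, "HI": 25, "X-bow": 15}
rapid_expander = {"Militia": 0.5, "LI": 1.5, "MI": 2, "HI": 0, "X-bow": 1}
sc_pretender = {"Militia": 0.5, "LI": 0.5, "MI": 3, "HI": 1, "X-bow": 1}

after_personality = apply_operator(base, rapid_expander)
after_pretender = apply_operator(after_personality, sc_pretender)

for unit, weight in after_pretender.items():
    print(f"{unit}: {weight:.1f}%")
```

With these numbers the rapid expander step gives roughly Militia 4.3%, LI 13.0%, MI 69.6%, HI 0%, X-bow 13.0%, and the SC pretender step then gives roughly Militia 0.9%, LI 2.8%, MI 90.6%, HI 0%, X-bow 5.7%. Normalizing after every step or only once at the end gives the same final percentages, since normalizing just rescales everything by one common factor.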
Anyway, you have a large matrix for unit type, then you have these operators (if that's the right term) for different conditions, like personality, like theme, like nation that multiply the values in the big matrix.
Actually, if you did it this way you could have one huge list of all units, multiplied by national operators (to zero out the unallowed units), multiplied by theme and personality operators, multiplied by existing indie units (to put back in desired indie units when you get them), multiplied by...
Do you see it now? It's not that difficult to set up; it is difficult to balance.
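
A rough sketch of the "one huge list times a chain of operators" idea, again in Python. Everything here is an assumption for illustration: the names apply_all and normalize, the extra unit "Knight", and the nation operator that bans it. One thing to keep in mind is that in a purely multiplicative scheme a weight that has been zeroed stays zero, so the "put indie units back in" step would need something other than plain multiplication (or would have to run before the zeroing); this sketch leaves that step out.

```python
from functools import reduce


def normalize(weights):
    """Rescale weights so they sum to 100%."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("every unit was zeroed out")
    return {unit: 100.0 * w / total for unit, w in weights.items()}


def apply_all(weights, operators):
    """Apply each operator in turn (element by element), normalizing once at the end."""
    def step(current, operator):
        # Units an operator does not mention keep their weight (multiplier 1.0).
        return {unit: w * operator.get(unit, 1.0) for unit, w in current.items()}

    return normalize(reduce(step, operators, weights))


# Hypothetical master list: the five units from the post plus a made-up "Knight".
all_units = {"Militia": 10, "LI": 10, "MI": 40, "HI": 25, "X-bow": 15, "Knight": 20}

# Hypothetical national operator: this nation cannot build Knights.
nation_op = {"Knight": 0}
# Personality operator from the post.
rapid_expander = {"Militia": 0.5, "LI": 1.5, "MI": 2, "HI": 0, "X-bow": 1}

result = apply_all(all_units, [nation_op, rapid_expander])
```

Because each operator only lists the units it cares about, a national operator can be a handful of zeros against the full unit list rather than a complete column.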

The size of these files is tiny, though there may be 100 of them. The tricky part then comes in how you set up your algorithm for which operators are applied to your specified matrices. But again, it's not difficult to conceive how you would set up those algorithms; it's just difficult to balance. That's where the wish to externalize all this comes from: let the players who want to fiddle, fiddle. Eventually people will arrive at settings that work 'best', and the devs can choose to use them for the vanilla game or not.
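
For what it's worth, here is one way the externalized operators and the "which operators apply" step might look. The JSON layout, the "kind:name" keys, and the function select_operators are all hypothetical; the game has no such file format that I know of. The operator values are the two from this post.

```python
import json

# Stand-in for the contents of a small external file that players could edit.
OPERATOR_FILE = """
{
  "personality:rapid_expander": {"Militia": 0.5, "LI": 1.5, "MI": 2, "HI": 0, "X-bow": 1},
  "pretender:SC": {"Militia": 0.5, "LI": 0.5, "MI": 3, "HI": 1, "X-bow": 1}
}
"""


def select_operators(all_operators, conditions):
    """Return the operators whose key matches one of the AI's current conditions.

    Conditions with no matching entry are skipped, so a missing file entry
    simply means "no adjustment" rather than an error.
    """
    return [all_operators[c] for c in conditions if c in all_operators]


operators = json.loads(OPERATOR_FILE)

# Hypothetical AI state: a rapid expander with an SC pretender and a theme
# that has no operator defined for it.
conditions = ["personality:rapid_expander", "pretender:SC", "theme:unknown"]
active = select_operators(operators, conditions)
```

The selected list can then be multiplied into the unit weights one operator at a time and normalized, exactly as described earlier in the post.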