Sound symbolism emerged as a prevalent component in the origin and development of language. However, as previous studies have either been lacking in scope or in phonetic granularity, the present study investigates the phonetic and semantic features involved from a bottom-up perspective. By analyzing the phonemes of 344 near-universal concepts in 245 language families, we establish 125 sound-meaning associations. The results also show that between 19 and 40 of the items of the Swadesh-100 list are sound symbolic, which calls into question the list’s ability to determine genetic relationships. In addition, by combining co-occurring semantic and phonetic features between the sound symbolic concepts, 20 macro-concepts can be identified, e. g. basic descriptors, deictic distinctions and kinship attributes. Furthermore, all identified macro-concepts can be grounded in four types of sound symbolism: (a) unimodal imitation (onomatopoeia); (b) cross-modal imitation (vocal gestures); (c) diagrammatic mappings based on relation (relative); or (d) situational mappings (circumstantial). These findings show that sound symbolism is rooted in the human perception of the body and its interaction with the surrounding world, and could therefore have originated as a bootstrapping mechanism, which can help us understand the bio-cultural origins of human language, the mental lexicon and language diversity.