Aspectuality of events has been shown to be construed through various means in typologically diverse languages, ranging from mainly grammatical devices to conventionalized lexical means. The rise of multimodal studies in linguistics makes it possible to incorporate yet another semiotic layer into the description. In this context, we present a cross-linguistic study of multimodal event construals in spontaneous Czech and English conversations, based on multimodal corpora. We follow Croft’s (2012) cognitive model of aspectual types in order to take into account the multiple parameters that determine the particular aspectual contour of a verb in a given context, the most prominent of which are (un)boundedness and directedness. We investigate which feature combinations are associated with the (un)boundedness of the corresponding co-speech gestures. A multivariate analysis revealed that in English, gesture boundedness is predicted by the predicate’s general aspectual type, whereas in Czech, the more fine-grained features of directedness and incrementality are stronger predictors.