Both the face and the voice provide not only linguistic information but also a wealth of paralinguistic information, including gender cues. However, how we integrate these two sources when perceiving gender has remained largely unexplored. In this study, we used a bimodal perception paradigm in which varying degrees of incongruence were created between facial and vocal information within audiovisual stimuli. We found that, in general, participants combined both sources of information: the perceived gender of the face was influenced by that of the voice, and vice versa. However, in conditions that directed attention to a single modality, participants were unable to ignore the gender of the voice, even when instructed to do so. Overall, our results point to a larger role for the voice in gender perception when more controlled visual stimuli are used.